Advanced Data Analysis Of In-Situ Bioremediation Site Data Using Dimensionality Reduction Techniques

Freedman, Matan
Data Analysis, Bioremediation, Principal Component Analysis, Self-Organising Map, Groundwater, Machine Learning
Chlorinated solvents are among the most prevalent groundwater contaminants in North America. Many of these contaminants are very difficult to clean up because of their physical properties, and they can pollute drinking water aquifers for decades to centuries. Tens of billions of dollars have been spent cleaning up these sites, and thousands more sites remain for the foreseeable future. In-situ bioremediation (ISB) has emerged as a popular remediation technology for treating tetrachloroethene (PCE) and trichloroethene (TCE) because of its low cost and its flexibility in low-strength remediation applications. A typical ISB implementation involves collecting large amounts of data to address the many possible problems and optimization decisions. Currently, these data are analysed with rather primitive tools and techniques, and the sheer volume collected limits the analyses that can be performed. Dimensionality reduction algorithms from computer science are common in research and in certain industries that use “big data”; however, these techniques have yet to be adapted to the needs of the environmental industry, or to performance monitoring of ISB applications in particular. In this study, a new method of multivariate spatiotemporal analysis was developed using principal component analysis (PCA), with the purpose of including multiple analytes and multiple intermediate sampling results in a single analysis. The new PCA “state-trajectory” method visualizes temporal evolution in PCA space by connecting the successive samples from each well in the space of two principal components. Three dimensionality reduction techniques were then compared using data from one TCE-contaminated site: (1) the PCA state-trajectory method, (2) a self-organizing map (SOM) state-trajectory method, and (3) a Mann-Kendall (MK) trend analysis method. The PCA state-trajectory method was able to separate monitoring wells into useful categories that generally agreed with practitioner analysis.
The main benefits of the PCA state-trajectory method were its speed and ease of analysis. Of the three dimensionality reduction methods compared, the PCA state-trajectory method gave the best results. This research gives practical ISB data interpretation a basis for using computer science algorithms to include multiple variables in a single analysis method.
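The state-trajectory idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function name `pca_state_trajectories`, the well labels, the sampling times and the concentration values below are all hypothetical, and standardizing each analyte before PCA is an assumption made here for illustration. The sketch projects every sample onto the first two principal components and then groups the scores by well in time order, so that each well yields an ordered path in PCA space.

```python
import numpy as np


def pca_state_trajectories(samples, well_ids, times, n_components=2):
    """Project samples onto the leading principal components and return,
    for each well, its scores ordered by sampling time.

    Hypothetical helper for illustration only.
    """
    X = np.asarray(samples, dtype=float)
    well_ids = np.asarray(well_ids)
    times = np.asarray(times)

    # Standardize each analyte (column) so that no single unit dominates.
    # This preprocessing choice is an assumption of the sketch.
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0
    Z = (X - mean) / std

    # PCA via singular value decomposition of the standardized data.
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T
    explained = (S ** 2) / np.sum(S ** 2)

    # One trajectory per well: its samples in PCA space, in time order.
    trajectories = {}
    for well in np.unique(well_ids):
        idx = np.where(well_ids == well)[0]
        order = idx[np.argsort(times[idx])]
        trajectories[str(well)] = scores[order]
    return trajectories, explained[:n_components]


# Made-up data: 2 wells x 3 sampling events, 3 hypothetical analytes.
samples = [
    [100.0, 5.0, 0.1],
    [60.0, 30.0, 2.0],
    [10.0, 20.0, 15.0],
    [90.0, 4.0, 0.2],
    [85.0, 6.0, 0.3],
    [80.0, 8.0, 0.4],
]
well_ids = ["MW-1", "MW-1", "MW-1", "MW-2", "MW-2", "MW-2"]
times = [0, 180, 360, 0, 180, 360]  # days since first sampling event

trajectories, explained = pca_state_trajectories(samples, well_ids, times)
for well, path in trajectories.items():
    print(well, np.round(path, 2).tolist())
```

Plotting each returned array as a connected line (first component on the x-axis, second on the y-axis) would give one trajectory per well. How wells are then grouped into categories is not specified in the abstract, so it is left out of the sketch.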