Hope Hauptman

UC Merced

“Using machine learning to predict 1,2,3-Trichloropropane contamination from legacy non-point source pollution of groundwater in California’s Central Valley”

1,2,3-Trichloropropane (TCP) contaminates drinking water wells beneath agricultural land. To address the worldwide lack of monitoring data for TCP contamination, our research uses hybrid GIS and machine learning models built from spatial predictor variables to predict TCP levels in groundwater.


1,2,3-Trichloropropane (TCP) is an impurity common in nematicides that were applied to agricultural soils from the 1940s to the 1980s. Evidence from animal studies indicates that TCP is a probable human carcinogen. TCP leaches through soil into groundwater, where it persists and contaminates thousands of wells in Asia, Europe, and North America. In California, TCP contaminates drinking water wells, with the highest levels found beneath agricultural land used to grow grapevines. Our study performs a mass balance and evaluates the ability of three types of tree-based machine learning models to predict TCP concentrations in California’s Central Valley aquifer system: classification and regression trees (CART), Random Forest (RF), and Boosted Regression Trees (BRT). Decision tree models can predict TCP contamination levels in areas where monitoring is lacking, help target future TCP monitoring efforts, and help identify areas to avoid when drilling new drinking water wells.
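
As a rough illustration of how the three tree-based model types named above might be compared, the following Python sketch fits each one to synthetic data and reports cross-validated R². Everything in it is an assumption made for illustration: the predictor variables, the synthetic response, the hyperparameters, and the use of scikit-learn (with `GradientBoostingRegressor` standing in for BRT) are hypothetical and are not taken from the study.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 400

# Hypothetical spatial predictors (not the study's variables).
X = np.column_stack([
    rng.uniform(0, 1, n),    # fraction of nearby land in vineyards
    rng.uniform(5, 100, n),  # depth to groundwater (m)
    rng.uniform(0, 1, n),    # soil permeability index
])

# Synthetic response standing in for a TCP concentration measure.
y = 0.5 * X[:, 0] * X[:, 2] - 0.002 * X[:, 1] + rng.normal(0, 0.05, n)

# One model per tree type; hyperparameters are arbitrary placeholders.
models = {
    "CART": DecisionTreeRegressor(max_depth=5, random_state=0),
    "RF": RandomForestRegressor(n_estimators=200, random_state=0),
    "BRT": GradientBoostingRegressor(
        n_estimators=200, learning_rate=0.05, max_depth=3, random_state=0
    ),
}

# Mean 5-fold cross-validated R^2 for each model.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="r2").mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: mean CV R^2 = {score:.2f}")
```

With real well data, spatially blocked cross-validation would likely be a better choice than random folds, because nearby wells tend to have correlated concentrations.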
