Across the U.S., hundreds of sites on land or in lakes and rivers are heavily contaminated with hazardous waste produced by human activity. Many of these places, designated as Superfund sites by the Environmental Protection Agency, can be found in Houston, Texas, the city where my colleagues and I live and work.
Author
- Andres B. Sanchez Alvarado
Ph.D. Candidate in Chemistry, Rice University
Hazardous contaminants present at these sites that can increase the risk of cancer - such as polycyclic aromatic hydrocarbons, or PAHs - are pervasive in soil and water. Detecting these contaminants is only the first step to cleaning them up and keeping the environment safe.
The EPA's standard methods for analyzing water samples from a well, for example, involve expensive techniques that must be carried out in a separate location, taking weeks.
Our chemistry research group develops more accessible and portable methods to detect toxic pollutants in soil, water and even blood.
My colleagues and I use machine learning methods to detect individual compounds in mixtures without separating them and to automatically identify those compounds by comparing them to a digital database. With machine learning we can streamline analysis of a contaminated site, detecting hazardous pollutants faster and on-site, for more efficient environmental monitoring.
Nanomaterials are extra sensitive
Imagine trying to look at the end of a strand of your hair head on. You would barely see the width of the tiny filament. Now try to imagine a material that is 1,000 times smaller than the width of that hair strand. You wouldn't see anything at all. My research uses microscopic objects known as nanoparticles that are about that size.
These nanoparticles interact with light in unique ways - kind of like how a magnifying glass focuses sunlight. Any substances near the nanoparticles are exposed to this focused light. We take advantage of this property by shining a beam of infrared light on the nanoparticles, so the substances around them absorb the intense light and generate a signal. We can detect the signal with a spectrophotometer: an instrument that measures the amount of light at a specific frequency.
Any toxic pollutant near the nanoparticles will absorb more of that infrared light than it normally would, enhancing the signal that we can measure. This process occurs only when the pollutant is close to the nanoparticles' surface. But even the smallest concentrations of these pollutants can be detected using the nanoparticles' enhancement, if they're nearby.
In our laboratory, I make the nanoparticles using solutions of metal salts. I then dissolve them in a liquid to make an ink, which I paint onto glass microscope plates. After the ink dries, I am left with nanoparticles packed together on the surface of the glass, like the beads in a diamond painting kit.
Once the nanoparticle painting is ready, I add a drop of contaminated water on top of the tinted glass and let it dry again. During this process, the contaminant molecules stick to the nanoparticles. Once dry, I slide the glass inside a spectrophotometer and measure the light absorbed and emitted by the pollutants on the nanoparticles.
The specific frequencies of light that a compound absorbs and emits are like a signature. Each contaminant has a different signature, which we can use to identify it in the water.
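To make the idea of a signature concrete, here is a minimal sketch using synthetic data. It assumes each absorption band can be approximated by a Gaussian curve and that the spectrum is sampled on an evenly spaced wavenumber axis. The two compounds, their band positions and the helper names (`gaussian_band`, `make_signature`, `peak_positions`) are all hypothetical and chosen only for illustration; they are not measurements or code from the research described here.

```python
import numpy as np

def gaussian_band(wavenumbers, center, width, height):
    """One absorption band modeled as a Gaussian (an illustrative simplification)."""
    return height * np.exp(-0.5 * ((wavenumbers - center) / width) ** 2)

def make_signature(wavenumbers, bands):
    """Sum several bands into one spectrum; `bands` holds (center, width, height) tuples."""
    spectrum = np.zeros_like(wavenumbers, dtype=float)
    for center, width, height in bands:
        spectrum += gaussian_band(wavenumbers, center, width, height)
    return spectrum

def peak_positions(wavenumbers, spectrum, min_height=0.1):
    """Return the wavenumbers of local maxima that rise above `min_height`."""
    interior = spectrum[1:-1]
    is_peak = (
        (interior > spectrum[:-2])
        & (interior > spectrum[2:])
        & (interior > min_height)
    )
    return wavenumbers[1:-1][is_peak]

# Hypothetical axis: 600 to 1800 in steps of 1 (units of cm^-1 assumed).
wavenumbers = np.linspace(600.0, 1800.0, 1201)

# Two made-up compounds with different band positions.
compound_a = make_signature(wavenumbers, [(750, 8, 1.0), (1240, 10, 0.6), (1600, 12, 0.8)])
compound_b = make_signature(wavenumbers, [(820, 8, 0.9), (1180, 10, 0.7), (1450, 12, 0.5)])

print("compound A peaks:", peak_positions(wavenumbers, compound_a))
print("compound B peaks:", peak_positions(wavenumbers, compound_b))
```

The two lists of peak positions differ, and that difference is what lets one compound be told apart from another.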
Machine learning simplifies the analysis
Sometimes, the contaminated water contains a mix of many different compounds, which complicates the analysis. Each compound will absorb light, and they might absorb similar wavelengths. To prevent this interference, scientists usually need to use sophisticated techniques to physically separate out each compound. These techniques can be time-consuming, so our team wanted to figure out how to circumvent this step.
We partnered with computer scientists who have been designing tailored algorithms that use machine learning. These programs take the data from our measurements and find patterns so subtle that even the most skilled analyst would miss them.
These methods can simplify the data and extract the most significant characteristics from each compound. These distinctive characteristics help the computer distinguish the individual compounds present in a mixture, bypassing any physical separation stage in the analysis. Computer scientists can make these algorithms so sophisticated that we don't even need to train the machine before analyzing a sample.
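One family of techniques that works without labeled training data is matrix factorization. The sketch below is not the algorithm developed by our collaborators; it swaps in a textbook method, non-negative matrix factorization with multiplicative updates, to show how overlapping spectra can be pulled apart numerically. The spectra, concentrations and function names are synthetic and hypothetical.

```python
import numpy as np

def gaussian_band(x, center, width, height):
    """One absorption band modeled as a Gaussian (an illustrative simplification)."""
    return height * np.exp(-0.5 * ((x - center) / width) ** 2)

def nmf(mixtures, n_components, n_iter=2000, seed=0):
    """Factor `mixtures` (samples x channels) into nonnegative weights @ components.

    Uses the classic multiplicative update rules that reduce the squared
    reconstruction error. No labeled training data is involved.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_channels = mixtures.shape
    weights = rng.random((n_samples, n_components)) + 0.1
    components = rng.random((n_components, n_channels)) + 0.1
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        components *= (weights.T @ mixtures) / (weights.T @ weights @ components + eps)
        weights *= (mixtures @ components.T) / (weights @ components @ components.T + eps)
    return weights, components

# Two made-up pure spectra whose first bands overlap.
wavenumbers = np.linspace(600.0, 1800.0, 301)
pure = np.vstack([
    gaussian_band(wavenumbers, 1000, 30, 1.0) + gaussian_band(wavenumbers, 1400, 40, 0.6),
    gaussian_band(wavenumbers, 1050, 30, 0.8) + gaussian_band(wavenumbers, 1500, 40, 1.0),
])

# Twenty synthetic samples, each a different blend of the two compounds.
rng = np.random.default_rng(42)
concentrations = rng.random((20, 2))
mixtures = concentrations @ pure

weights, components = nmf(mixtures, n_components=2)
relative_error = np.linalg.norm(mixtures - weights @ components) / np.linalg.norm(mixtures)
print(f"relative reconstruction error: {relative_error:.4f}")
```

A factorization like this is not unique: the recovered components can be rescaled or partly blended versions of the true spectra, which is one reason real measurements need more carefully tailored algorithms than this toy example.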
We can use our nanoparticles to measure water or soil polluted with a toxic contaminant and feed the data into the algorithms. The machine then finds the most important features and matches them to a reference database. This analysis takes only a few hours, making it at least twice as fast as standard methods.
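The matching step can be pictured as scoring a measured spectrum against every entry in a reference library and keeping the best score. The sketch below uses cosine similarity as the score, which is an assumption made for illustration and not necessarily the metric our method uses. The library entries (`compound_A`, `compound_B`, `compound_C`) and their band positions are placeholders, not real reference data.

```python
import numpy as np

def gaussian_band(x, center, width, height):
    """One absorption band modeled as a Gaussian (an illustrative simplification)."""
    return height * np.exp(-0.5 * ((x - center) / width) ** 2)

def cosine_similarity(a, b):
    """Similarity of two spectra, insensitive to overall signal strength."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_match(measured, library):
    """Score `measured` against each reference spectrum; return the top name and all scores."""
    scores = {name: cosine_similarity(measured, ref) for name, ref in library.items()}
    best = max(scores, key=scores.get)
    return best, scores

wavenumbers = np.linspace(600.0, 1800.0, 601)

# Hypothetical reference library with placeholder names and band positions.
library = {
    "compound_A": gaussian_band(wavenumbers, 750, 10, 1.0) + gaussian_band(wavenumbers, 1600, 10, 0.8),
    "compound_B": gaussian_band(wavenumbers, 900, 10, 1.0) + gaussian_band(wavenumbers, 1300, 10, 0.7),
    "compound_C": gaussian_band(wavenumbers, 1100, 10, 0.9) + gaussian_band(wavenumbers, 1450, 10, 1.0),
}

# Simulated measurement: a weaker, noisy copy of compound_B.
rng = np.random.default_rng(7)
measured = 0.4 * library["compound_B"] + rng.normal(0.0, 0.01, wavenumbers.size)

name, scores = best_match(measured, library)
print("best match:", name)
for candidate, score in scores.items():
    print(f"  {candidate}: {score:.3f}")
```

Because cosine similarity ignores overall intensity, a dilute sample can still match its reference entry as long as the shape of the spectrum survives the noise.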
However, our method is far from perfect. One of the biggest challenges we face is optimizing the nanoparticles' composition for different classes of contaminants. It can take different nanoparticles to enhance the detection of different pollutants. We also have to tweak the algorithm to look more closely for different signatures in the data.
This method could screen a site for broad classes of contaminants that are similar in chemical structure. In the future, a specific type of nanoparticle and a more refined model could then be used to identify each specific pollutant molecule.
Streamlined analysis can get the job done
Analyzing contaminants in the environment helps detect the presence of hazardous pollutants, and doing so efficiently can prevent people from being exposed to them. The techniques our group uses to detect contaminants and analyze the data have been used in the field with portable instrumentation by other researchers. These portable instruments are still cheaper than those required for standard techniques.
Currently, our team is exploring the use of these machine learning-enhanced methods in different environmental contexts. We've analyzed other types of samples, such as water and air from contaminated sites. We are working on expanding the scope of analysis to a wider range of hazardous pollutants. We also collaborate with toxicologists and environmental engineers in the Texas Medical Center, with the goal of transferring this technology as an alternative method for environmental and public health agencies.
To that end, we've filed a patent for our method that combines spectroscopy and machine learning to analyze complex samples. While our team is not currently pursuing commercialization of this technology, it is a possibility down the road.
Still, detection is not the end for environmental safety. After a hazardous pollutant has been identified, a site must be investigated to decide how to clean it up. Our motivation is to streamline the process of detecting and identifying contaminants. The faster we can detect a hazardous substance, the faster we can prevent future emissions and begin cleanups.
Andres B. Sanchez Alvarado participated in research into combining spectroscopy and ML to analyze complex samples, which has a patent pending.