New machine learning method to analyze complex scientific data of proteins

Scientists have developed a method using machine learning to better analyze data from a powerful scientific tool: nuclear magnetic resonance (NMR). One way NMR data can be used is to understand proteins and chemical reactions in the human body. NMR is closely related to magnetic resonance imaging (MRI), which is used for medical diagnosis.

NMR spectrometers allow scientists to characterize the structure of molecules, such as proteins, but it can take highly skilled human experts a significant amount of time to analyze that data. This new machine learning method can analyze the data much more quickly and just as accurately.

In a study recently published in Nature Communications, the scientists described their process, which essentially teaches computers to untangle complex data about atomic-scale properties of proteins, parsing them into individual, readable images.

“To be able to use these data, we need to separate them into features from different parts of the molecule and quantify their specific properties,” said Rafael Brüschweiler, senior author of the study, Ohio Research Scholar and a professor of chemistry and biochemistry at The Ohio State University. “And before this, it was very difficult to use computers to identify these individual features when they overlapped.”

The process, developed by Dawei Li, lead author of the study and a research scientist at Ohio State’s Campus Chemical Instrument Center, teaches computers to scan images from NMR spectrometers. Those images, known as spectra, appear as hundreds or even thousands of peaks and valleys, which can show, for example, atomic-level changes to proteins or complex metabolite mixtures in a biological sample such as blood or urine. The NMR data provide important information about a protein’s function and clues about what is happening in a person’s body.

But deconstructing the spectra into readable peaks can be difficult because the peaks often overlap. The effect is almost like a mountain range, where closer, larger peaks obscure smaller ones that may also carry important information.
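The overlap problem can be illustrated with a toy calculation. The sketch below is not the method from the study; it only assumes that each peak follows an idealized Lorentzian line shape, a common textbook approximation for NMR signals, and the peak positions, widths and heights are made-up values. It counts how many separate maxima remain visible once two peaks are added together.

```python
def lorentzian(x, center, half_width, height):
    """Idealized peak: a Lorentzian line shape with the given height at its center."""
    return height * half_width**2 / ((x - center) ** 2 + half_width**2)


def spectrum(x, peaks):
    """Sum of all peaks at position x, i.e. the combined signal."""
    return sum(lorentzian(x, c, w, h) for c, w, h in peaks)


def count_visible_maxima(peaks, start=-5.0, stop=10.0, step=0.01):
    """Sample the combined signal on a grid and count its local maxima."""
    n = int(round((stop - start) / step))
    ys = [spectrum(start + i * step, peaks) for i in range(n + 1)]
    return sum(
        1 for i in range(1, n) if ys[i] > ys[i - 1] and ys[i] >= ys[i + 1]
    )


# Hypothetical peaks as (center, half_width, height); values are illustrative only.
separated = [(0.0, 1.0, 1.0), (5.0, 1.0, 0.3)]   # small peak far from the large one
overlapped = [(0.0, 1.0, 1.0), (1.0, 1.0, 0.3)]  # small peak inside the large one's slope

if __name__ == "__main__":
    print("separated: ", count_visible_maxima(separated))
    print("overlapped:", count_visible_maxima(overlapped))
```

With these values, the well-separated pair shows two distinct maxima, while the overlapped pair shows only one: the smaller peak is still present in the signal but no longer stands out on its own, which is why simply looking for maxima is not enough to recover it.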
