Scientists have begun the search for extraterrestrial life in the Solar System in earnest, but such life may be subtly or profoundly different from Earth-life, and methods based on detecting particular molecules as biosignatures may not apply to life with a different evolutionary history. A new study by a joint Japan/US-based team, led by researchers at the Earth-Life Science Institute (ELSI) at the Tokyo Institute of Technology, has developed a machine learning technique which assesses complex organic mixtures using mass spectrometry to reliably classify them as biological or abiological.
In season 1, episode 29 (“Operation: Annihilate!”) of Star Trek, which aired in 1966, the human-Vulcan hybrid character Spock made the observation “It is not life as we know or understand it. Yet it is obviously alive; it exists.” This now 55-year-old pop-culture meme still makes a point: how can we detect life if we fundamentally don’t know what life is, and if that life is really different from life as we know it?
The question of “Are we alone?” as living beings in the Universe has fascinated humanity for centuries, and humankind has been looking for ET life in the Solar System since NASA’s Viking 2 mission to Mars in 1976. There are presently numerous ways scientists are searching for ET life. These include listening for radio signals from advanced civilisations in deep space, looking for subtle differences in the atmospheric composition of planets around other stars, and trying to detect life directly in soil and ice samples collected by spacecraft in our own Solar System. This last approach allows scientists to bring their most advanced chemical analytical instrumentation directly to bear on ET samples, and perhaps even bring some of the samples back to Earth, where they can be carefully scrutinised.
Exciting missions such as NASA’s Perseverance rover will look for life this year on Mars; NASA’s Europa Clipper, launching in 2024, will try to sample ice ejected from Jupiter’s moon Europa, and its Dragonfly mission will attempt to land an “octacopter” on Saturn’s moon Titan starting in 2027. These missions will all attempt to answer the question of whether we are alone.
Mass spectrometry (MS) is a principal technique that scientists will rely on in spacecraft-based searches for ET life. MS has the advantage that it can simultaneously measure multitudes of compounds present in samples, and thus provide a sort of “fingerprint” of the composition of the sample. Nevertheless, interpreting those fingerprints may be tricky.
As best as scientists can tell, all life on Earth is based on the same highly coordinated molecular principles, which gives scientists confidence that all Earth-life is derived from a common ancient terrestrial ancestor. However, in simulations of the primitive processes that scientists believe may have contributed to life’s origins on Earth, many similar but slightly different versions of the particular molecules terrestrial life uses are often detected. Furthermore, naturally occurring chemical processes are also able to produce many of the building blocks of biological molecules. Since we still have no known sample of alien life, this leaves scientists with a conceptual paradox: did Earth-life make some arbitrary choices early in evolution which got locked in, and thus life could be constructed otherwise, or should we expect that all life everywhere is constrained to be exactly the same way it is on Earth? How can we know that the detection of a particular molecule type is indicative of whether it was or was not produced by ET life?
It has long troubled scientists that biases in how we think life should be detectable, which are largely based on how Earth-life is presently, might cause our detection methods to fail. Viking 2 in fact returned odd results from Mars in 1976. Some of the tests it conducted gave signals considered positive for life, but the MS measurements provided no evidence for life as we know it. More recent MS data from NASA’s Mars Curiosity rover suggest there are organic compounds on Mars, but they still do not provide evidence for life. A related problem has plagued scientists attempting to detect the earliest evidence for life on Earth: how can we tell if signals detected in ancient terrestrial samples are from the original living organisms preserved in those samples or derived from contamination from the organisms which presently pervade our planet?
Scientists at the Earth-Life Science Institute at the Tokyo Institute of Technology in Japan and the National High Magnetic Field Laboratory (the National MagLab) in the US decided to address this problem using a combined experimental and machine learning approach. The National MagLab is supported by the US National Science Foundation through NSF/DMR-1644779 and the State of Florida to provide cutting-edge technologies for research. Using ultrahigh-resolution MS, specifically a technique known as Fourier-Transform Ion Cyclotron Resonance Mass Spectrometry (FT-ICR MS), they measured the mass spectra of a wide variety of complex organic mixtures. These included abiological samples made in the lab (which they are fairly certain are not living); organic mixtures found in meteorites (~4.5-billion-year-old samples of abiologically produced organic compounds which appear never to have become living); laboratory-grown microorganisms (which fit all the modern criteria of being living, including novel microbial organisms isolated and cultured by ELSI co-author Tomohiro Mochizuki); and unprocessed petroleum (raw natural crude oil, the kind we pump out of the ground and process into gasoline, which is derived from organisms that lived long ago on Earth and so provides an example of how the “fingerprint” of known living organisms might change over geological time). These samples each contained tens of thousands of discrete molecular compounds, which provided a large set of mass spectra that could be compared and classified.
In contrast to approaches that use the accuracy of MS measurements to uniquely identify each peak with a particular molecule in a complex organic mixture, the researchers instead aggregated their data and looked at the broad statistics and distribution of signals. Complex organic mixtures, such as those derived from living things, petroleum, and abiological samples, present very different “fingerprints” when viewed in this way. Such patterns are much more difficult for a human to detect than the presence or absence of individual molecule types.
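To make the idea of an aggregated “fingerprint” concrete, here is a minimal Python sketch. It is not the team’s code: the input format (a list of m/z and intensity pairs), the function name `coarse_fingerprint`, the m/z range, and the number of bins are all assumptions chosen for illustration. It shows one simple way to describe a spectrum by the overall distribution of its signal rather than by individually identified peaks.

```python
from typing import List, Sequence, Tuple

# Hypothetical input format: one (m/z, intensity) pair per detected peak.
Peak = Tuple[float, float]


def coarse_fingerprint(
    peaks: Sequence[Peak],
    mz_min: float = 100.0,   # assumed lower edge of the m/z window
    mz_max: float = 1000.0,  # assumed upper edge of the m/z window
    n_bins: int = 9,         # assumed number of equal-width bins
) -> List[float]:
    """Sum peak intensities into equal-width m/z bins and normalise to 1.

    The result describes how signal is distributed across the spectrum
    without assigning any single peak to a particular molecule.
    """
    if n_bins <= 0 or mz_max <= mz_min:
        raise ValueError("need n_bins > 0 and mz_max > mz_min")

    width = (mz_max - mz_min) / n_bins
    histogram = [0.0] * n_bins

    for mz, intensity in peaks:
        # Peaks outside the window are ignored.
        if mz_min <= mz < mz_max:
            index = min(int((mz - mz_min) / width), n_bins - 1)
            histogram[index] += intensity

    total = sum(histogram)
    if total == 0:
        return histogram  # empty spectrum: return all zeros
    return [value / total for value in histogram]
```

Because each bin pools every peak that falls inside it, widening the bins mimics, in a rough way, what a lower-resolution instrument would report.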
The researchers fed their raw data into a machine learning algorithm and, surprisingly, found that it could classify the samples as living or non-living with ~95% accuracy. Importantly, they did so after simplifying the raw data considerably, making it plausible that lower-precision instruments (spacecraft-based instruments are often low-power) could obtain data of sufficient resolution to achieve the classification accuracy the team obtained.
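The article does not say which machine learning algorithm the team used, so the following Python sketch substitutes a deliberately simple stand-in, a nearest-centroid classifier scored by leave-one-out testing. All names here (`centroids`, `predict`, `leave_one_out_accuracy`) are hypothetical, and each sample is assumed to be a short numeric vector summarising a simplified spectrum, paired with a label.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical sample format: (feature vector, label such as "biological").
Sample = Tuple[Sequence[float], str]


def centroids(samples: Sequence[Sample]) -> Dict[str, List[float]]:
    """Average the feature vectors of each label into one centroid."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for vector, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vector)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vector)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(cents: Dict[str, List[float]], vector: Sequence[float]) -> str:
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(label: str) -> float:
        return sum((c - v) ** 2 for c, v in zip(cents[label], vector))
    return min(cents, key=squared_distance)


def leave_one_out_accuracy(samples: List[Sample]) -> float:
    """Hold out each sample in turn, train on the rest, and score the guess."""
    correct = 0
    for i, (vector, label) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        if predict(centroids(training), vector) == label:
            correct += 1
    return correct / len(samples)
```

Holding each sample out before classifying it is one common way to estimate how a classifier would perform on samples it has never seen, which is the situation a spacecraft instrument would face. The accuracy such a toy classifier reaches on any given data set says nothing about the ~95% figure reported by the team.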
The underlying reasons this classification accuracy is possible remain to be explored, but the team suggests it is because of the ways biological processes, which modify organic compounds differently than abiological processes, relate to the processes which enable life to propagate itself. Living processes have to make copies of themselves, while abiological processes have no internal process controlling this.
“This work opens many exciting avenues for using ultra-high resolution mass spectrometry for astrobiological applications,” says co-author Huan Chen of the US National MagLab.
Lead author Nicholas Guttenberg adds, “While it is difficult if not impossible to characterise every peak in a complex chemical mixture, the broad distribution of components can contain patterns and relationships which are informative about the process by which that mixture came about or developed. If we’re going to understand complex prebiotic chemistry, we need ways of thinking in terms of these broad patterns – how they come about, what they imply, and how they change – rather than the presence or absence of individual molecules. This paper is an initial investigation into the feasibility and methods of characterisation at that level and shows that even discarding high-precision mass measurements, there is significant information in peak distribution that can be used to identify samples by the type of process that produced them.”
Co-author Jim Cleaves of ELSI adds, “This sort of relational analysis may offer broad advantages for searching for life in the Solar System, and perhaps even in laboratory experiments designed to recreate the origins of life.” The team plans to follow up with further studies to understand exactly what aspects of this type of data analysis allow for such successful classification.