Researchers have described a large set of previously unrecognised enzymatic domains – named the Lipocone superfamily – and outlined their evolutionary pathway from bacterial defence molecules to key players in human development.
Most members of the superfamily are either reported for the first time or are functionally poorly understood. The work therefore provides unprecedented insights into the evolutionary origins, structure and function of a diverse group of physiologically significant molecules across several biological kingdoms.
Published as a Reviewed Preprint in eLife and appearing today as the final version, the research is described by editors as a fundamental study presenting a compelling and comprehensive analysis of the newly defined Lipocone superfamily. They add that the work is exemplary in its use of sequence analysis and structural modelling and will be of broad interest to researchers studying protein evolution and biochemistry.
An important member of this superfamily is Wnt, a protein with central roles in animal and human development and associated with a range of diseases, including many cancers. Although the Wnt protein was discovered 40 years ago, its evolutionary origins have, until recently, remained mysterious.
"In 2020, we reported the discovery of the first bacterial versions of Wnt and showed that they have the characteristics of toxins or effectors used in biological conflict systems, such as fending off viruses," explains co-lead author A. Maxwell Burroughs, Staff Scientist at the Division of Intramural Research, National Library of Medicine, National Institutes of Health (NIH), Bethesda, US. "Prompted by these initial observations, we set out to identify and characterise the proteins' evolutionary relationships and predict their functions."
The team's discovery of bacterial Wnt helped to establish that the protein has a unique ancestral structure: an alpha-helical core. Having established that bacterial and metazoan Wnt share this core, they carried out sequence-based searches to identify other remote versions of these proteins across biological kingdoms. After completing the searches and inspecting the results, they found a shared core of four alpha helices across a total of 30 distinct families, which together comprise a large superfamily. Of these families, 17 had no previous functional description.
One of the team's first observations was that the shared core structure of the families varied in hydrophobicity, that is, how strongly it repels water. Of the 30 families, 18 were sufficiently hydrophobic to suggest they would naturally sit across the lipid-rich membrane of cells. This association with the lipid membrane, combined with the cone shape of the conserved core, led the authors to name the group the Lipocone superfamily.
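For readers curious how hydrophobicity can be quantified from sequence alone, the sketch below computes a mean hydropathy (GRAVY) score using the widely used Kyte-Doolittle scale. This is an illustration only: the article does not say which hydrophobicity measure or cut-off the authors used, and the function name `gravy` and the two toy sequences are hypothetical, not taken from the study.

```python
# Illustrative sketch only: the study's actual hydrophobicity measure is not
# stated in this article. The Kyte-Doolittle scale is assumed here as one
# common choice; positive values indicate hydrophobic residues.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy (GRAVY) of a protein sequence."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    # Average the per-residue hydropathy values over the whole sequence.
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


if __name__ == "__main__":
    # Hypothetical toy sequences, not real Lipocone domains.
    toy_sequences = {
        "hydrophobic_toy": "LLIVAALLIVGA",
        "polar_toy": "DEKRSNQDEKRS",
    }
    for name, seq in toy_sequences.items():
        print(f"{name}: GRAVY = {gravy(seq):+.2f}")
```

A higher score suggests a segment is more likely to sit within a lipid membrane, although real predictions of membrane-spanning regions typically rely on dedicated tools rather than a single average.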
They then used AlphaFold, an artificial intelligence-powered structural biology tool, to predict the likely 3D structure of members of the superfamily from their amino acid sequences. Based on these predictions and a contextual analysis of the proteins' operons and domain fusions, they inferred a unified biochemical mechanism for the Lipocones. The head group of a lipid (that is, the polar portion that interacts with water) is positioned in an active site, while the hydrophobic tail of the lipid sits in the hydrophobic core region, allowing the Lipocone to remove or swap phosphate-linked chemical groups.
Next, the authors used algorithms to reconstruct the most likely evolutionary scenario for the diversification of the Lipocone superfamily over time. This analysis resolved the 30 distinct families as descendants of a single ancestral protein. Across these families, they found statistically significant connections to roles in modifying lipid head groups in various membranes or in the production of important cell-structure molecules such as peptidoglycan and lipopolysaccharides. Further analysis narrowed down these functional predictions and identified roles in a range of cell processes, from inter-bacterial conflicts, stress response, and lipid metabolism and transport, to antibiotic resistance, sensory functions and immune protection.
This led the authors to conclude that the early diversification of the Lipocone superfamily had distinct drivers at different stages. Initially, in the bacterial kingdom, the emergence of an extensive repertoire of molecules called exopolysaccharides in the cell wall, cell surface and outer membrane would have driven the diversification of the early groups. Later diversification accompanied more specialised roles in inter-organismal conflict and antibiotic resistance.
The authors say one of the remarkable discoveries about this newly identified superfamily is that several families have lost certain ancestral characteristics, such as their initial hydrophobicity. This loss transformed them over time from integral membrane proteins into diffusible effector molecules mediating a range of biological activities.
"This is the case for Wnt, which became inactive as an enzyme but retained its ancient involvement in cell communication," says co-lead author Gianlucca Nicastro, Research Fellow at the Division of Intramural Research, National Library of Medicine, NIH. "Although inactive as an enzyme, it retains its binding pocket, raising the possibility that Wnt proteins might be involved in as-yet unexplored interactions with other molecules such as lipids."
"Using sequence and structure analysis, we've identified a large, previously unrecognised superfamily, the Lipocone, and predicted the catalytic activity and potential biochemical pathways of numerous family members for the first time, including some proteins that have remained enigmatic for over two decades," concludes senior author L. Aravind (Aravind Iyer), Senior Investigator in the Protein and Genome Evolution Research Group, National Library of Medicine, NIH. "Our findings establish a unifying theme in lipid biochemistry, explain the origins of Wnt signalling and provide new leads regarding immunity and inter-organismal conflicts across the Tree of Life."
This study was first posted as a preprint on bioRxiv and has since been peer-reviewed by Review Commons and submitted to eLife, where it was published as a Reviewed Preprint. It appears in eLife today as the final version.
Accompanying figures are available to access here.