HSE Unveils Genome-wide Quadruplex Map

National Research University Higher School of Economics

An international team, including researchers from HSE University, has created the first comprehensive map of quadruplexes—unstable DNA structures involved in gene regulation. For the first time, scientists have shown that these structures function in pairs: one is located in a DNA region that initiates gene transcription, while the other lies in a nearby region that enhances this process. In healthy tissues, quadruplexes regulate tissue-specific genes, whereas in cancerous tissues they influence genes responsible for cell growth and division. These findings may contribute to the development of new anticancer drugs that target quadruplexes. The study has been published in Nucleic Acids Research.

The word 'DNA' usually brings to mind a double helix, but this molecule can adopt other forms as well. In some regions, the DNA strand can unwind, bend, and form a small knot. If such a fragment is rich in guanine (denoted by the letter G in the DNA sequence), it can fold into a quadruplex—a three-dimensional structure in which several layers of guanine are stacked on top of one another. For proteins that regulate gene activity, these structures serve as prominent landmarks, helping them locate the correct regions of DNA.

Quadruplexes are short-lived structures: they form rapidly, perform their function, and then disappear, meaning that experiments can capture only a fraction of them. Moreover, different experimental methods detect different types of quadruplexes. As a result, it had not previously been possible to create a genome-wide map that includes all DNA regions where quadruplexes can form.

To address this problem, the HSE researchers fine-tuned the genomic language model DNABERT on quadruplex data and used its predictions to reconstruct where these structures arise across the genome.

'In our study, we trained DNABERT on EndoQuad, the world's largest database of experimentally validated quadruplexes, resulting in the GQ-DNABERT model. This model evaluates DNA sequences to predict where a quadruplex is likely to form,' comments Maria Poptsova, Director of the Centre for Biomedical Research and Technology at the HSE Faculty of Computer Science.

Unlike simple algorithms that search only for sequences capable of forming a quadruplex, GQ-DNABERT also considers the surrounding DNA context, which determines whether a region actually folds into a quadruplex. As a result, the model was able to predict about 360,000 quadruplexes—far more than have been identified by individual experimental methods.
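For readers curious what a 'simple algorithm' of this kind looks like, here is a minimal Python sketch of a sequence-only search. It assumes the widely used canonical motif of four runs of at least three guanines separated by loops of one to seven bases; the function name `find_g4_motifs` is hypothetical, and this is not the GQ-DNABERT method, which also weighs the surrounding context.

```python
import re

# Assumed canonical motif: four runs of 3+ guanines (G) separated by
# loops of 1-7 bases. This is a common textbook pattern, not the
# criterion used by GQ-DNABERT.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def find_g4_motifs(sequence):
    """Return (start, end, motif) for each non-overlapping match.

    Coordinates are 0-based with an exclusive end. Only the given strand
    is scanned; the opposite strand would need a C-rich pattern.
    """
    seq = sequence.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence made up for illustration.
    for hit in find_g4_motifs("TTGGGAGGGTGGGAGGGTT"):
        print(hit)
```

A search like this only reports where a quadruplex could form; it says nothing about whether the region actually folds in a living cell, which is the gap the context-aware model is meant to close.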

The model confirmed the well-known fact that quadruplexes frequently occur in promoters—DNA regions upstream of genes where transcription is initiated. Unexpectedly, many quadruplexes were also found in nearby enhancers, genomic elements that boost gene transcription and influence how much protein is produced. The researchers discovered that quadruplexes often form simultaneously in both the promoter and the enhancer, creating pairs that jointly regulate gene activity.

To investigate the role of these pairs in cells, the researchers used experimental single-cell sequencing data, in which DNA regions statistically associated with the activity of specific genes have already been identified. By overlaying the GQ-DNABERT map for six tissue types onto these data, the scientists found that promoter–enhancer pairs are more often associated with genes responsible for tissue-specific functions: in the brain, for neuronal development and function; in the blood, for immune cell activity; and in the intestine, for epithelial functions. The researchers then examined these pairs in tumour tissues and compared them with those in healthy tissues. While the number of promoter–enhancer pairs containing quadruplexes was similar, the functions of the associated genes differed dramatically.
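The overlay step can be pictured as a simple interval comparison. The Python sketch below is an illustration under stated assumptions, not the study's pipeline: the `Interval` and `Link` types, the function `paired_g4_links`, and all coordinates and gene names are hypothetical. It keeps only those promoter–enhancer links in which both regions overlap at least one predicted quadruplex.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


class Link(NamedTuple):
    gene: str
    promoter: Interval
    enhancer: Interval


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def paired_g4_links(links: List[Link], g4s: List[Interval]) -> List[Link]:
    """Keep links whose promoter AND enhancer each overlap a predicted quadruplex."""
    def has_g4(region: Interval) -> bool:
        return any(overlaps(region, g) for g in g4s)

    return [ln for ln in links if has_g4(ln.promoter) and has_g4(ln.enhancer)]


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    g4s = [Interval("chr1", 120, 150), Interval("chr1", 5050, 5080)]
    links = [
        Link("GENE_A", Interval("chr1", 100, 200), Interval("chr1", 5000, 5100)),
        Link("GENE_B", Interval("chr1", 100, 200), Interval("chr1", 9000, 9100)),
    ]
    print([ln.gene for ln in paired_g4_links(links, g4s)])
```

The linear scan is fine for a toy example; a genome-scale comparison against hundreds of thousands of predicted quadruplexes would normally use sorted intervals or an interval tree.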

'In normal cells, these pairs are associated with tissue-specific programmes, whereas in cancer cells they are linked to universal processes of cell division and growth that drive tumour proliferation regardless of the tissue of origin,' explains Poptsova. 'In other words, in healthy cells, these pairs support tissue specialisation, while in cancer they become part of general programmes for rapid cell division.'

The resulting map of quadruplexes provides a clearer understanding of how these structures regulate gene activity in both normal and tumour cells. In the future, this information could be used to develop new anticancer drugs that selectively target quadruplexes.

The study was supported by a grant from the HSE AI Research Centre.
