AI Maps 3D Super-Enhancers, IDs Cell Regulators

St. Jude Children's Research Hospital

(MEMPHIS, Tenn. – March 09, 2026) Scientists usually study the molecular machinery that controls gene expression from the perspective of a linear, one-dimensional genome, even though DNA and its bound proteins function in three dimensions (3D). To better understand how key components of this machinery, such as super-enhancers, regulate genes in this 3D reality, scientists at St. Jude Children's Research Hospital developed a new algorithm called BOUQUET. Using machine learning, BOUQUET reveals that sets of genes and their regulatory elements can interact within protein condensates, high-density membraneless droplets, in cells' nuclei. The findings, which provide new insight into how cells regulate the genes that control their specialized identities, were published today in Nucleic Acids Research.

Cells express certain sets of genes to carry out specific functions; for example, a blood cell and a brain cell express different context-specific genes. There are 3 billion base pairs of human DNA, and the genes involved in cell identity are scattered throughout. Even more challenging, enhancers, DNA elements that activate gene expression, can be thousands of DNA bases away from their target genes. Scientists led by Brian Abraham, PhD, St. Jude Department of Computational Biology, saw a problem in finding the full sets of enhancers and their accompanying proteins relevant to each gene's expression across these large distances. To address this issue, they created BOUQUET to consider 3D enhancer architecture in a machine learning-based graph theory framework. Using this approach, researchers can identify which genes may be located inside transcriptional protein condensates.

"With BOUQUET, we can quantify the activating protein apparatus that is associated with each gene," said Abraham, who is the corresponding author of the study. "This assignment gave us two major advances: predicting gene expression from protein binding maps and finding which genes are likely interacting with transcriptional condensates."

Mapping controllers of cell identity

Enhancers activate gene expression by binding specific proteins and contacting target genes. Abraham's previous work showed that sets of enhancers, called "super-enhancers," were linearly proximal to genes encoding proteins with outsized roles in cell identity, such as regulators of differentiation or proteins that enable cells to carry out identity-specific tasks.

"The idea that linear groups of enhancers, super-enhancers, play big roles in controlling cell identity has helped scientists understand many disease processes, but it's been known for years that enhancers operate in 3D, so we sought to marry these two concepts," added co-first author Kelsey Maher, PhD, Department of Computational Biology. "The data that measure these 3D interactions are complicated and noisy, so we had to use more sophisticated methods to find groups of enhancers and their target genes; that's how we ended up using graph theory and machine learning to take in the whole network context and learn enhancer communities."

While others have successfully grouped enhancers, the Abraham lab went one step further by incorporating protein binding maps. "It's been presupposed that the amount of activating protein we can link with a certain gene should track with that gene's expression, but finding a correlation like this is tricky without knowing which genome regions are important for each gene's expression," Abraham added. To their knowledge, his team is the first to show that enhancer/protein binding patterns do in fact quantitatively correlate with gene expression.

Multi-gene transcriptional condensates

The Abraham lab scientists called their enhancer groupings communities. "The data argue that communities are fundamental units of gene regulation because their parts show correlated activities, and perturbations made to one part of the community affect the whole community," said co-first author Jie Lu, PhD, Department of Computational Biology.

Each community has different levels of associated protein. The communities with the most protein were termed "3D-super-enhancers" to reflect their relationship to linear super-enhancers. Results showed that all genes previously found to interact with transcriptional condensates were within 3D-super-enhancers, and the number of these protein-rich communities matched earlier counts of transcriptional condensates.
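The grouping-and-ranking idea described above can be illustrated with a toy sketch. This is not BOUQUET's method: the release does not specify its graph or machine learning details, so plain connected components stand in for community detection here, and every element name, contact, and signal value below is hypothetical.

```python
from collections import defaultdict

# Hypothetical 3D contacts between regulatory elements and genes.
contacts = [
    ("enhA", "gene1"),
    ("enhB", "gene1"),
    ("enhB", "gene2"),
    ("enhC", "gene3"),
]

# Hypothetical protein-binding signal per element (arbitrary units).
signal = {
    "enhA": 5.0, "enhB": 7.0, "enhC": 1.0,
    "gene1": 2.0, "gene2": 1.0, "gene3": 0.5,
}


def find_communities(edges):
    """Group nodes into connected components.

    Assumption: a simple stand-in for the community detection
    that the study performs with graph theory and machine learning.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, communities = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from start.
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(adjacency[node] - members)
        seen |= members
        communities.append(members)
    return communities


def rank_by_signal(communities, signal):
    """Sum the protein signal in each community and sort highest first."""
    scored = [(sum(signal.get(n, 0.0) for n in c), sorted(c)) for c in communities]
    return sorted(scored, reverse=True)


ranked = rank_by_signal(find_communities(contacts), signal)
for total, members in ranked:
    print(f"{total:.1f}\t{', '.join(members)}")
```

In this toy data, the community containing gene1 and gene2 carries the most signal, so it would be the candidate "3D-super-enhancer" under this simplified ranking.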

"We thought 3D-super-enhancers might connect in some way with condensates since both have lots of protein," Lu added. "Not only did we predict and confirm a new condensate-associated gene, but we also observed two genes sharing the same condensate and being co-transcribed within it." These genes from the same community, separated by half a million base pairs, were exposed to the same biochemical and transcriptional environment at the same time.

"All of our work here seeks to understand the machinery that controls cell identity through controlling transcription," Lu continued. Dysregulated transcription is central to malignant cell identity, so understanding how this dysregulation occurs is paramount. "If disease-causing genes are being aberrantly expressed, it's important to know if that is being controlled by specific proteins and/or specific protein assemblies," Abraham said. "Now we have a handhold into a multifaceted space to ask whether condensates might control disease gene expression."

Authors and funding

The study's other authors are Li Dong, Virginia Valentine, Seth Staller, Alaguraj Veluchamy, Li Tian, Yuna Kim, Bensheng Ju, Marcus Valentine, John Easton, Stanley Pounds and Steven Burden, of St. Jude.

The study was supported by grants from the Transcription Collaborative of St. Jude Children's Research Hospital and the American Lebanese Syrian Associated Charities (ALSAC), the fundraising and awareness organization of St. Jude.
