Scientists have identified how specific genetic changes function in cells to influence disease risk and other human health traits. By probing regions of DNA previously linked to disease, the work has created high-resolution maps of DNA variant activity, helping pinpoint the exact changes that shape blood pressure, cholesterol levels, blood sugar and other complex human traits.
The study, published today in Nature and led by researchers from The Jackson Laboratory (JAX), the Broad Institute, and Yale University, takes on a long-standing challenge in human genetics. Scientists have known for years that certain regions of the genome—often spanning tens of thousands to millions of DNA letters—are associated with diseases. But because these regions usually contain many variants that could potentially drive those associations, performing the necessary experiments to pinpoint which specific DNA changes truly matter has been difficult and time-consuming.
The solution was scale. Using a method capable of testing thousands of such variants at once, the team tested more than 220,000 previously identified DNA changes in five different cell types. By doing so, they resolved about 20 percent of these regions across the genome, revealing new insights into what these variants do, which in turn can help improve risk prediction and guide the development of new therapies.
"For nearly two decades, genetic studies have identified where in the genome we need to look for disease risk, but not which specific DNA changes are responsible," said Ryan Tewhey, a geneticist and an associate professor who led the team at JAX. "Our study helps close this gap by working at the scale needed to confidently pinpoint the specific DNA changes that matter across thousands of regions all at once, rather than one by one."
Tewhey explained that making these connections used to be like searching for a single typo on one page of a massive book. The new experimental approach is akin to speed reading: scanning thousands of pages at once and flagging the exact letters that change meaning, dramatically accelerating genetic discovery.
"What excites me is that this is a bridge from association to biology," said Layla Siraj, first author of the study, which she spearheaded while in the Lander Lab at the Broad Institute, and now in her residency in obstetrics and gynecology at Columbia University/New York Presbyterian. "By uncovering the patterns underlying how single-letter changes affect gene regulation, we can start mechanistically connecting genetic risk to the pathways therapies could target."
In addition to Tewhey and Siraj, the study was co-led by Jacob Ulirsch, currently a group leader at Illumina. Key authors also include Steven Reilly, assistant professor at Yale School of Medicine; and Hilary Finucane, associate member at the Broad Institute and assistant professor at Harvard Medical School and Massachusetts General Hospital.
Building a foundation for better disease risk prediction
Most DNA changes linked to common diseases like heart disease and type 2 diabetes occur not within genes themselves—which only constitute about 2 percent of the genome—but in the vast stretches of non-coding DNA, where regulatory elements exist that control when, where and how strongly our genes are expressed. Genetic studies conducted over the last two decades have identified millions of such non-coding disease-related variants throughout the genome. The challenge has been identifying which of the many single-letter changes in these regulatory DNA regions affect gene activity, fine-tuning protein production and in turn shaping disease risk.
To meet this challenge, the researchers used a technology called a massively parallel reporter assay, which allowed them to test the effects of more than 220,000 single-letter DNA variants at the same time across different cell types, including brain, liver and blood cells. Each stretch of DNA was paired with a molecular tag, or reporter, that they could directly measure to see whether a variant increased, decreased, or had no effect on gene activity—an important step in understanding how regulatory DNA changes may affect health.
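As a rough illustration of that readout, the sketch below compares reporter activity for the reference and variant versions of a DNA stretch and labels the variant as increasing, decreasing, or not changing activity. It is a minimal example under stated assumptions, not the study's analysis pipeline: the function names, the pseudocount, and the 0.5 cutoff are hypothetical, and a real analysis would rely on replicate measurements and statistical testing rather than a fixed threshold.

```python
import math


def log2_activity(rna_count: int, dna_count: int, pseudocount: float = 1.0) -> float:
    """Reporter activity as log2(RNA / DNA); the pseudocount avoids log(0)."""
    return math.log2((rna_count + pseudocount) / (dna_count + pseudocount))


def classify_variant(ref_rna: int, ref_dna: int, alt_rna: int, alt_dna: int,
                     threshold: float = 0.5) -> tuple[str, float]:
    """Label the variant allele's effect relative to the reference allele.

    The threshold is an illustrative assumption, not a value from the study.
    """
    skew = log2_activity(alt_rna, alt_dna) - log2_activity(ref_rna, ref_dna)
    if skew > threshold:
        return "increased", skew
    if skew < -threshold:
        return "decreased", skew
    return "no effect", skew


# Hypothetical counts: the variant allele yields twice the reporter RNA
# of the reference allele for the same amount of DNA.
label, skew = classify_variant(ref_rna=399, ref_dna=99, alt_rna=799, alt_dna=99)
print(label, round(skew, 2))
```

Working in log space means a doubling and a halving of activity are treated symmetrically, which is why the same cutoff is applied in both directions here.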
The results revealed over 13,000 single-letter variants that influence how strongly a gene is expressed. While most act independently, the team found that about 11 percent behaved differently than expected when combined with a nearby variant. This surprising result suggests some genetic risk of disease depends on specific combinations of variants whose joint effect is not simply the sum of their individual effects.
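One way to make "differently than expected" concrete is to compare the measured activity of a sequence carrying both variants against what the two single-variant effects would predict if they simply added up. The sketch below assumes activities are expressed in log2 units and uses made-up numbers; the function name and the additive model are illustrative assumptions, not the method reported in the study.

```python
def interaction_score(ref: float, alt_a: float, alt_b: float, both: float) -> float:
    """Observed double-variant activity minus the additive expectation.

    All inputs are reporter activities in log2 units (an assumption of
    this sketch). A score near zero means the variants act independently.
    """
    effect_a = alt_a - ref            # effect of variant A alone
    effect_b = alt_b - ref            # effect of variant B alone
    expected_both = ref + effect_a + effect_b
    return both - expected_both


# Hypothetical activities: each variant alone raises activity modestly,
# but together they raise it more than the two effects added up.
score = interaction_score(ref=1.0, alt_a=1.5, alt_b=1.25, both=2.5)
print(score)
```

A positive score indicates a combined effect larger than the additive expectation, and a negative score indicates a smaller one.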
These insights revealed potential links to human health. In some cases, pairs of variants were associated with gene activity linked to lower levels of LDL, or "bad" cholesterol. Other combinations appear to affect a gene associated with blood pressure. The team also identified two variants near the ESS2 gene (associated with developmental disorders) whose combined effect on gene expression was greater than would be expected from either variant alone.
Improving equity in genetics-driven advances
In another example, the researchers pinpointed a single variant associated with long-term blood sugar control discovered in people of European ancestry. Based on its molecular behavior, they predicted that similar but previously understudied variants, found predominantly in people of African ancestry, would show a similar association. Follow-up analysis confirmed that prediction, underscoring the importance of understanding genetic mechanisms across diverse populations.
While the study identified which DNA variants regulate specific protein-coding genes in brain, liver and blood cells, additional experiments will be needed to determine how those variants ultimately influence traits and disease risk. Given the body's many tissues and thousands of distinct cell types, switching genes on or off in a single cell type is only one piece of a much larger puzzle in determining health outcomes. In addition, millions of genetic variants remain untested. Even so, the researchers say the findings can already begin strengthening how scientists study genetic variation and how it influences health traits.
"These findings do more than explain known disease associations. They provide training data to build predictive models of the effects of variants we haven't yet studied or that remain undiscovered," Tewhey said.
Tewhey, Reilly, and their colleagues recently created such a model with this data. In work published in Nature in 2024, they used the model to design synthetic DNA sequences that could selectively turn genes on in distinct tissue types one at a time. It also builds on work Tewhey and Ulirsch published in 2016 while colleagues at the Broad Institute. Together, these advances point toward a future where genetic risk can be more accurately predicted and where therapies can be designed to act only in the tissues where they are needed most.