A new study finds that the proteins responsible for controlling which genes are expressed in a genome do more than simply turn a gene on or off. Each type of protein that interacts with a gene produces a distinct pattern of gene expression - a finding with ramifications for everything from biomedical therapeutics to biological computing.
At issue are "epigenome regulators." Every organism's genome is made up of DNA. But that DNA is bound up with many different proteins into very compact structures. The proteins that are bound to the DNA are called the epigenome, and they control which parts of the DNA get expressed. Your blood cells, nerve cells and skin cells all have the same DNA, but perform very different functions. That's because different parts of the DNA sequence are being expressed in each cell - and that is largely controlled by which proteins are bound to different parts of the DNA in each cell.
"We already knew that the proteins in the epigenome control the way DNA is expressed," says Albert Keung, corresponding author of the study and an associate professor of chemical and biomolecular engineering at North Carolina State University. "Our goal here was to look at a single gene and quantify the full range of ways that the gene could be expressed by different proteins." Keung is the Goodnight Distinguished Scholar in Innovation in Biotechnology and Biomolecular Engineering and director of biotechnology programs in NC State's Integrative Sciences Initiative.
"The results were fascinating," says Leandra Caywood, co-first author of the study and a recent Ph.D. graduate from NC State. "For example, one protein may turn the gene on quickly; a second protein may take slightly longer to turn the gene on - but then keep it on for a long time; and a third protein might have a long time delay before turning the gene on, at which point it spikes up quickly and then turns off right away."
For this study, the researchers focused on a single gene from a yeast organism. The research team exposed the DNA from that gene to 87 different proteins, which were selected as a representative subset of the hundreds of proteins found in that yeast's epigenome. Each protein-gene interaction was tested in approximately 100 yeast cells.
The researchers used light to control the binding of each protein to the gene, and microscopy and analytical tools to measure the resultant gene expression in real time for 12 hours.
"We designed this study in a way that allowed us to capture the dynamics of this entire process," says Jessica Lee, co-first author of the study and recent Ph.D. graduate from NC State. "We could control and measure how long the protein was exposed to the gene and we could observe and measure the dynamic behavior of the gene in response to the protein."
"The big finding here was that each protein produced a uniquely patterned response of gene expression from the gene," says Keung. "The proteins are far more than an on/off switch.
"We also found that some proteins produced the same gene response across all of the yeast cells we tested - the pattern of gene expression they produced was very consistent. But other proteins produced a wide range of responses that varied from cell to cell - there was a lot of noise in the signal they produced."
In analyzing the gene expression patterns produced by each protein, the researchers found a strong association between what is already known in the literature about each protein's function and the gene expression pattern that protein produces.
"For example, proteins that are known to recruit polymerase tend to produce similar gene expression patterns," Keung says.
The researchers then ran a wide variety of computational models to see whether any of them were able to account for all of their experimental data.
"Ideally, you want a model that helps you understand what is happening in terms of the gene's response to each of the proteins, not just some of the proteins," says Keung. "We initially thought this would be difficult, because there were so many different gene expression patterns. But it turns out that a relatively simple model - a three-state model with positive feedback - was able to capture all of the data."
Altogether, the findings of this study hold significant promise for cellular engineering.
"From a cell biology standpoint, this work gives us a much deeper understanding of how genes are regulated and expressed," says Keung. "From an engineering standpoint, our findings can be used to more dynamically control cellular behavior.
"For example, if you are biomanufacturing proteins or cell therapies for the pharmaceutical or biomedical sectors, our work can be used to fine-tune activities related to protein production.
"By the same token, even the proteins that produce random patterns of gene expression could be useful. For example, if you are trying to optimize a bioproduction pathway in a cell, there's real value in testing the full range of protein levels in the cell," says Keung. "Which ratio of proteins produces the best output? In that scenario, it would be helpful to know how to induce random gene expression, essentially creating a way to get cells to produce varying levels of proteins.
"And this is where the computational model is also valuable. By understanding not only what each protein does, but how it does it, you can make more informed decisions about how to accomplish your goals from an engineering standpoint."
A paper on the study, "Epigenome Regulators Imbue a Single Eukaryotic Promoter with Diverse Gene Expression Dynamics," is published open access in the journal iScience. The paper was co-authored by Riley Basinger, an NC State undergraduate; Lucas Abbott, a Ph.D. student at NC State; and Nicholas Levering, a former undergraduate at NC State.
This work was done with support from the National Institutes of Health under grants 5T32GM133366 and 5F31CA268873; and from the National Science Foundation under grants 2144539 and 1830910.