Boost Study Efficiency for Complex Treatment Interactions

Massachusetts Institute of Technology

MIT researchers have developed a new theoretical framework for studying the mechanisms of treatment interactions. Their approach allows scientists to efficiently estimate how combinations of treatments will affect a group of units, such as cells, enabling a researcher to perform fewer costly experiments while gathering more accurate data.

As an example, to study how interconnected genes affect cancer cell growth, a biologist might need to use a combination of treatments to target multiple genes at once. But because there could be billions of potential combinations for each round of the experiment, choosing a subset of combinations to test might bias the data their experiment generates.

In contrast, the new framework considers the scenario where the user can efficiently design an unbiased experiment by assigning all treatments in parallel, and can control the outcome by adjusting the rate of each treatment.

The MIT researchers proved that a particular strategy is near-optimal in this framework and ran a series of simulations to test it in a multiround experiment. Their method minimized the error rate in each instance.

This technique could someday help scientists better understand disease mechanisms and develop new medicines to treat cancer or genetic disorders.

"We've introduced a concept people can think more about as they study the optimal way to select combinatorial treatments at each round of an experiment. Our hope is this can someday be used to solve biologically relevant questions," says graduate student Jiaqi Zhang, an Eric and Wendy Schmidt Center Fellow and co-lead author of a paper on this experimental design framework.

She is joined on the paper by co-lead author Divya Shyamal, an MIT undergraduate; and senior author Caroline Uhler, the Andrew and Erna Viterbi Professor of Engineering in EECS and the MIT Institute for Data, Systems, and Society (IDSS), who is also director of the Eric and Wendy Schmidt Center and a researcher at MIT's Laboratory for Information and Decision Systems (LIDS). The research was recently presented at the International Conference on Machine Learning.

Simultaneous treatments

Treatments can interact with each other in complex ways. For instance, a scientist trying to determine whether a certain gene contributes to a particular disease symptom may have to target several genes simultaneously to study the effects.

To do this, scientists use what are known as combinatorial perturbations, where they apply multiple treatments at once to the same group of cells.

"Combinatorial perturbations will give you a high-level network of how different genes interact, which provides an understanding of how a cell functions," Zhang explains.

Since genetic experiments are costly and time-consuming, the scientist aims to select the best subset of treatment combinations to test, which is a steep challenge due to the huge number of possibilities.
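To get a feel for the scale, here is a rough back-of-the-envelope sketch. It assumes each treatment is simply either applied or not applied to a unit, which is a simplification for illustration and not necessarily the model used in the paper; the function name is hypothetical.

```python
def n_combinations(n_treatments: int) -> int:
    """Count on/off combinations of n treatments (assumes each is binary)."""
    return 2 ** n_treatments

# Four drugs give a manageable number of combinations...
print(n_combinations(4))    # 16
# ...but thirty candidate treatments already exceed one billion.
print(n_combinations(30))   # 1073741824
```

Under this assumption the count doubles with every added treatment, which is why testing every combination quickly becomes impractical.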

Picking a suboptimal subset can generate biased results by focusing only on combinations the user selected in advance.

The MIT researchers approached this problem differently by looking at a probabilistic framework. Instead of focusing on a selected subset, each unit randomly takes up combinations of treatments based on user-specified dosage levels for each treatment.

The user sets dosage levels based on the goal of their experiment; perhaps a scientist wants to study the effects of four different drugs on cell growth. The probabilistic approach generates less biased data because it does not restrict the experiment to a predetermined subset of treatments.

The dosage levels are like probabilities, and each cell receives a random combination of treatments. If the user sets a high dosage, it is more likely most of the cells will take up that treatment. A smaller subset of cells will take up that treatment if the dosage is low.
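The idea of dosages acting like probabilities can be illustrated with a minimal simulation. This is only a sketch: it assumes each cell takes up each treatment independently with probability equal to that treatment's dosage, and the dosage values, function name, and number of cells are all made up for illustration. The paper's actual model may differ.

```python
import random

def assign_treatments(dosages, n_units, seed=0):
    """Give each unit a random combination of treatments.

    dosages[i] is treated as the probability that a unit takes up
    treatment i, independently of the others (an illustrative assumption).
    Returns one tuple of 0/1 flags per unit.
    """
    rng = random.Random(seed)
    return [
        tuple(int(rng.random() < p) for p in dosages)
        for _ in range(n_units)
    ]

# Hypothetical dosage levels for four drugs: one high, two medium, one low.
dosages = [0.9, 0.5, 0.1, 0.5]
units = assign_treatments(dosages, n_units=10_000)

# Fraction of units that took up each treatment; should track the dosages.
uptake = [sum(u[i] for u in units) / len(units) for i in range(len(dosages))]

# Number of distinct combinations that appeared (at most 2**4 = 16).
distinct = len(set(units))

print("uptake per treatment:", uptake)
print("distinct combinations observed:", distinct)
```

In this sketch no combination is picked by hand: the high-dosage treatment shows up in most cells, the low-dosage one in few, and many different combinations appear in a single round.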

"From there, the question is how do we design the dosages so that we can estimate the outcomes as accurately as possible? This is where our theory comes in," Shyamal adds.

Their theoretical framework shows the best way to design these dosages so one can learn the most about the characteristic or trait they are studying.

After each round of the experiment, the user collects the results and feeds those back into the experimental framework. It will output the ideal dosage strategy for the next round, and so on, actively adapting the strategy over multiple rounds.

Optimizing dosages, minimizing error

The researchers proved their theoretical approach generates optimal dosages, even when the dosage levels are affected by a limited supply of treatments or when noise in the experimental outcomes varies at each round.

In simulations, this new approach had the lowest error rate when comparing estimated and actual outcomes of multiround experiments, outperforming two baseline methods.

In the future, the researchers want to enhance their experimental framework to consider interference between units and the fact that certain treatments can lead to selection bias. They would also like to apply this technique in a real experimental setting.

"This is a new approach to a very interesting problem that is hard to solve. Now, with this new framework in hand, we can think more about the best way to design experiments for many different applications," Zhang says.

This research is funded, in part, by the Advanced Undergraduate Research Opportunities Program at MIT, Apple, the National Institutes of Health, the Office of Naval Research, the Department of Energy, the Eric and Wendy Schmidt Center at the Broad Institute, and a Simons Investigator Award.
