MIT engineers' new model could help researchers glean insights from genomic data and other huge datasets
CAMBRIDGE, MA -- Over the past two decades, new technologies have helped scientists generate a vast amount of biological data. Large-scale experiments in genomics, transcriptomics, proteomics, and cytometry can produce enormous quantities of data from a given cellular or multicellular system.
However, making sense of this information is not always easy. This is especially true when trying to analyze complex systems such as the cascade of interactions that occur when the immune system encounters a foreign pathogen.
MIT biological engineers have now developed a new computational method for extracting useful information from these datasets. Using their new technique, they showed that they could unravel a series of interactions that determine how the immune system responds to tuberculosis vaccination and subsequent infection.
This strategy could be useful to vaccine developers and to researchers who study any kind of complex biological system, says Douglas Lauffenburger, the Ford Professor of Engineering in the departments of Biological Engineering, Biology and Chemical Engineering.
"We've landed on a computational modeling framework that allows prediction of effects of perturbations in a highly complex system, including multiple scales and many different types of components," says Lauffenburger, the senior author of the new study.
Shu Wang, a former MIT postdoc who is now an assistant professor at the University of Toronto, and Amy Myers, a research manager in the lab of University of Pittsburgh School of Medicine Professor JoAnne Flynn, are the lead authors of a new paper on the work, which appears today in the journal Cell Systems.
Modeling complex systems
When studying complex biological systems such as the immune system, scientists can extract many different types of data. Sequencing cell genomes tells them which gene variants a cell carries, while analyzing messenger RNA transcripts tells them which genes are being expressed in a given cell. Using proteomics, researchers can measure the proteins found in a cell or biological system, and cytometry allows them to quantify the many cell types present.
Using computational approaches such as machine learning, scientists can use this data to train models to predict a specific output based on a given set of inputs -- for example, whether a vaccine will generate a robust immune response. However, that type of modeling doesn't reveal anything about the steps that happen in between the input and the output.
"That AI approach can be really useful for clinical medical purposes, but it's not very useful for understanding biology, because usually you're interested in everything that's happening between the inputs and outputs," Lauffenburger says. "What are the mechanisms that actually generate outputs from inputs?"
To create models that can identify the inner workings of complex biological systems, the researchers turned to a type of model known as a probabilistic graphical network. These models represent each measured variable as a node, generating maps of how each node is connected to the others.
Probabilistic graphical networks are often used for applications such as speech recognition and computer vision, but they have not been widely used in biology.
Lauffenburger's lab has previously used this type of model to analyze intracellular signaling pathways, which required analyzing just one kind of data. To adapt this approach to analyze many datasets at once, the researchers applied a mathematical technique that can filter out correlations between variables that are not directly affecting each other. This technique, known as graphical lasso, is an adaptation of lasso, a regularization method often used in machine learning models to strip away results that are likely due to noise.
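The difference between a direct and an indirect link can be illustrated with a small simulation. The sketch below is not the researchers' code and does not run graphical lasso itself. It computes a partial correlation, the quantity that graphical lasso in effect pushes to zero for pairs of variables with no direct link. The three variables (upstream, middle, downstream), the noise levels, and the sample size are all invented for illustration.

```python
import math
import random


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(x, y, z):
    """Correlation between x and y after controlling for z."""
    r_xy = pearson(x, y)
    r_xz = pearson(x, z)
    r_yz = pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Simulated chain (hypothetical data): upstream -> middle -> downstream.
# "upstream" never acts on "downstream" directly, only through "middle".
rng = random.Random(0)
n_samples = 5000
upstream = [rng.gauss(0, 1) for _ in range(n_samples)]
middle = [u + rng.gauss(0, 0.5) for u in upstream]
downstream = [m + rng.gauss(0, 0.5) for m in middle]

# The ordinary correlation is strong even though the link is indirect.
marginal = pearson(upstream, downstream)
# Controlling for the intermediate variable removes almost all of it.
direct = partial_corr(upstream, downstream, middle)

print(f"correlation, upstream vs. downstream: {marginal:.2f}")
print(f"partial correlation given middle:     {direct:.2f}")
```

In this toy chain the two end variables are strongly correlated, yet the partial correlation is close to zero once the intermediate variable is taken into account, which marks the link as indirect. With many more variables than samples, partial correlations cannot be estimated reliably this way, which is one reason a regularized method such as graphical lasso is used instead.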
"With correlation-based network models generally, one of the problems that can arise is that everything seems to be influenced by everything else, so you have to figure out how to strip down to the most essential interactions," Lauffenburger says. "Using probabilistic graphical network frameworks, one can really boil down to the things that are most likely to be direct and throw out the things that are most likely to be indirect."
Mechanism of vaccination
To test their modeling approach, the researchers used data from studies of a tuberculosis vaccine. This vaccine, known as BCG, is an attenuated form of Mycobacterium bovis. It is used in many countries where TB is common, but it isn't always effective, and its protection can weaken over time.
In hopes of developing more effective TB protection, researchers have been testing whether delivering the BCG vaccine intravenously or by inhalation might provoke a better immune response than injecting it. Those studies, performed in animals, found that the vaccine did work much better when given intravenously. In the MIT study, Lauffenburger and his colleagues attempted to discover the mechanism behind this success.
The data that the researchers examined in this study included measurements of about 200 variables, including levels of cytokines, antibodies, and different types of immune cells, from about 30 animals.
The measurements were taken before vaccination, after vaccination, and after TB infection. By analyzing the data using their new modeling approach, the MIT team was able to determine the steps needed to generate a strong immune response. They showed that the vaccine stimulates a subset of T cells, which produce a cytokine that activates a set of B cells that generate antibodies targeting the bacterium.
"Almost like a roadmap or a subway map, you could find what were really the most important paths. Even though a lot of other things in the immune system were changing one way or another, they were really off the critical path and didn't matter so much," Lauffenburger says.
The researchers then used the model to make predictions for how a specific disruption, such as suppressing a subset of immune cells, would affect the system. The model predicted that if B cells were nearly eliminated, there would be little impact on the vaccine response, and experiments showed that prediction was correct.
This modeling approach could be used by vaccine developers to predict the effect their vaccines may have, and to make tweaks that would improve them before testing them in humans. Lauffenburger's lab is now using the model to study the mechanism of a malaria vaccine that has been given to children in Kenya, Ghana, and Malawi over the past few years.
His lab is also using this type of modeling to study the tumor microenvironment, which contains many types of immune cells and cancerous cells, in hopes of predicting how tumors might respond to different kinds of treatment.
###
The research was funded by the National Institute of Allergy and Infectious Diseases.