(2cq) From Causal Discovery to Multiscale Modeling in Biological Signaling Networks | AIChE

Authors 

Gregg, R. - Presenter, University of Pittsburgh
Research Interests

Modeling complex dynamic systems requires detailed knowledge of which system components interact, as well as how quickly, efficiently, and robustly those components communicate with one another. In biological networks, proteins, enzymes, and other molecules are the system components that propagate signals of interest, while their communication is dictated by reaction rates between interacting species. Unfortunately, these signaling events are noisy and difficult to measure because they are commonly driven by reactions at nanomolar concentrations. This often leads to contradictory experimental findings and large uncertainties in model predictions. Reliable model predictions for biological signaling networks are crucial because these networks are the foundation for more complex biological behaviors like angiogenesis in tumor microenvironments or airway remodeling in chronic obstructive pulmonary disease (COPD). My research interests revolve around computational methods that allow us to gain insight into system architecture (causal discovery algorithms) and to use those findings to build models that provide new insights into these complex systems (ordinary differential equation and multiscale Cellular Potts modeling).

Graph-based causal discovery algorithms can help infer potential cause-effect relationships from measured variables. Given observational data, these algorithms perform conditional independence tests to distinguish between true direct interactions and simple correlations. The result is a directed acyclic graph (DAG) that is consistent with all conditional independence tests performed. These algorithms are proven to asymptotically approach the true causal graph, provided several assumptions are met: the Markov assumption, faithfulness, and acyclicity. The increasing volume and interoperability of large multi-omics datasets enable causal discovery algorithms to better identify important relationships across biological systems. My main project as a postdoctoral fellow has been to use these methods to identify cause-and-effect relationships between clinical measurements and COPD progression. Our preliminary findings suggest that airway disease, as opposed to emphysema, plays a more important role in initial disease progression. One of my goals as a future faculty researcher is to use causal discovery algorithms to investigate causal relationships within poorly understood signaling pathways.
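To make the role of conditional independence tests concrete, the following is a minimal sketch, not the method used in the COPD project. It assumes linear Gaussian data and uses a partial-correlation test with a Fisher z-transform, one common choice in constraint-based discovery. The chain X -> Y -> Z, the coefficients, and the function names `partial_corr` and `fisher_z_pvalue` are all hypothetical and chosen only for illustration.

```python
import math
import numpy as np

def partial_corr(data, i, j, cond):
    """Partial correlation of columns i and j given the columns in cond."""
    idx = [i, j] + list(cond)
    corr = np.corrcoef(data[:, idx], rowvar=False)
    precision = np.linalg.inv(corr)
    # Standard identity: partial correlation from the precision matrix.
    return -precision[0, 1] / math.sqrt(precision[0, 0] * precision[1, 1])

def fisher_z_pvalue(r, n_samples, n_cond):
    """Two-sided p-value for H0: the (partial) correlation is zero."""
    r = max(min(r, 0.999999), -0.999999)  # guard against log(0)
    z = 0.5 * math.log((1.0 + r) / (1.0 - r))
    stat = math.sqrt(n_samples - n_cond - 3) * abs(z)
    # 2 * (1 - Phi(stat)) written with the complementary error function.
    return math.erfc(stat / math.sqrt(2.0))

# Simulated chain X -> Y -> Z (illustrative data, not measurements).
rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)
z = 0.8 * y + rng.normal(size=n)
data = np.column_stack([x, y, z])

r_marginal = partial_corr(data, 0, 2, [])   # X vs Z, no conditioning
r_given_y = partial_corr(data, 0, 2, [1])   # X vs Z, conditioning on Y
p_marginal = fisher_z_pvalue(r_marginal, n, 0)
p_given_y = fisher_z_pvalue(r_given_y, n, 1)

print(f"X-Z marginal:  r={r_marginal:.3f}, p={p_marginal:.3g}")
print(f"X-Z given Y:   r={r_given_y:.3f}, p={p_given_y:.3g}")
```

In this toy chain, X and Z are clearly correlated on their own, but the dependence largely vanishes once Y is conditioned on, which is the kind of evidence a discovery algorithm would use to omit a direct X-Z edge.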

Ordinary differential equation (ODE) modeling is useful when one is confident about the architecture of the biological signaling pathway being modeled. ODE models can be difficult to fit to experimental data due to the large number of parameters they require, but Bayesian methods like Markov chain Monte Carlo (MCMC) can aid in parameterization because they circumvent some parameter identifiability issues. The main benefit of ODE modeling is the discovery of emergent behavior. For example, in a study investigating how the innate immune system detects foreign cytosolic DNA, I discovered a negative feedback mechanism that was robust to knock-down simulations. Complete shutdown of this feedback loop was necessary to develop a chronic inflammatory state. Of course, ODE modeling has limitations, including the “well-mixed” assumption, which prevents these models from capturing spatial variation such as cytokine diffusion.
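As an illustration of how a knock-down simulation can be set up in an ODE model, here is a minimal sketch of a generic two-species negative feedback loop. It is not the cytosolic DNA sensing model described above: the species (a signal `s` and an inhibitor `i`), the Hill-type repression term, and every parameter value are assumptions made for this example, and a hand-written fixed-step Runge-Kutta integrator stands in for a production solver.

```python
import numpy as np

def rhs(state, k_ind, k_act=1.0, K=1.0, hill=2, d_s=0.1, d_i=0.5):
    """Hypothetical feedback loop: s induces i, and i represses production of s."""
    s, i = state
    ds = k_act / (1.0 + (i / K) ** hill) - d_s * s
    di = k_ind * s - d_i * i
    return np.array([ds, di])

def simulate(k_ind, t_end=200.0, dt=0.05):
    """Integrate from zero initial conditions with classical fourth-order Runge-Kutta."""
    state = np.zeros(2)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(state, k_ind)
        k2 = rhs(state + 0.5 * dt * k1, k_ind)
        k3 = rhs(state + 0.5 * dt * k2, k_ind)
        k4 = rhs(state + dt * k3, k_ind)
        state = state + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return state

# Knock-down is modeled by scaling the inhibitor induction rate k_ind.
full = simulate(k_ind=0.5)[0]        # intact feedback
knockdown = simulate(k_ind=0.05)[0]  # 90% knock-down of the feedback
shutdown = simulate(k_ind=0.0)[0]    # feedback removed entirely

print(f"late-time signal, intact feedback: {full:.2f}")
print(f"late-time signal, 90% knock-down:  {knockdown:.2f}")
print(f"late-time signal, no feedback:     {shutdown:.2f}")
```

Comparing the late-time signal level across the three runs shows how the strength of the feedback loop sets the steady state in this toy system; fitting such a model to data (for example with MCMC) would be a separate step.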

Fortunately, methods like Cellular Potts modeling (CPM) provide a way to simulate cellular and subcellular behaviors using previously established ODE models. CPMs are typically defined on a grid of integers, where adjacent sites with the same value belong to the same cell. The model uses a Metropolis–Hastings algorithm to propose updates in which a grid site takes on the value of one of its neighbors. Updates are accepted or rejected according to a set of penalties that can, for example, encourage cells to adhere together or maintain a desired size. As these steps are applied to the grid, patterns observed in real cellular systems begin to emerge, including cell migration, chemotaxis, and diffusion. I am currently developing custom open-source software capable of simulating these models at large scales.
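The update rule described above can be sketched as follows. This is a minimal toy example, not the software under development: it assumes a small periodic square grid, a four-site neighborhood, an adhesion penalty `J` for each unlike neighbor pair, and a quadratic size penalty with strength `LAMBDA` around a target area `TARGET`. All names and values are hypothetical.

```python
import math
import random

N = 12          # grid is N x N with periodic boundaries (assumed)
J = 2.0         # adhesion penalty per unlike neighbor pair (assumed)
LAMBDA = 1.0    # strength of the size penalty (assumed)
TARGET = 16     # target cell area in grid sites (assumed)
T = 1.0         # "temperature" controlling how often uphill moves are accepted
NEIGHBORS = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def boundary_energy(grid, r, c, value):
    """Adhesion penalty at site (r, c) if it held the given cell id."""
    return sum(J for dr, dc in NEIGHBORS
               if grid[(r + dr) % N][(c + dc) % N] != value)

def volume_energy(volumes, cell):
    """Size penalty for one cell; id 0 is medium and is unconstrained."""
    return 0.0 if cell == 0 else LAMBDA * (volumes[cell] - TARGET) ** 2

def copy_attempt(grid, volumes, rng):
    """Propose that a random site copies a random neighbor's id; accept by Metropolis rule."""
    r, c = rng.randrange(N), rng.randrange(N)
    dr, dc = rng.choice(NEIGHBORS)
    old, new = grid[r][c], grid[(r + dr) % N][(c + dc) % N]
    if old == new:
        return False
    before = (boundary_energy(grid, r, c, old)
              + volume_energy(volumes, old) + volume_energy(volumes, new))
    volumes[old] -= 1
    volumes[new] += 1
    after = (boundary_energy(grid, r, c, new)
             + volume_energy(volumes, old) + volume_energy(volumes, new))
    delta = after - before
    if delta <= 0 or rng.random() < math.exp(-delta / T):
        grid[r][c] = new
        return True
    volumes[old] += 1   # rejected: undo the tentative volume change
    volumes[new] -= 1
    return False

# Two adjacent 4 x 4 cells (ids 1 and 2) surrounded by medium (id 0).
grid = [[0] * N for _ in range(N)]
for r in range(4, 8):
    for c in range(2, 6):
        grid[r][c] = 1
    for c in range(6, 10):
        grid[r][c] = 2
volumes = {0: N * N - 32, 1: 16, 2: 16}

rng = random.Random(0)
accepted = sum(copy_attempt(grid, volumes, rng) for _ in range(5000))
print(f"accepted {accepted} of 5000 copy attempts; volumes: {volumes}")
```

Only the energy terms touching the proposed site are recomputed for each attempt, which keeps the cost per update constant regardless of grid size.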

My research interests have converged on an incremental workflow that starts with examining causal relationships within biological systems and leads to modeling their dynamics to investigate emergent behaviors. I currently focus this workflow on pulmonary diseases including tuberculosis, influenza, and COPD; however, it can be readily generalized to other systems such as cancer, heart disease, or metabolism.

Teaching Interests

Over my career, I have gained extensive experience with teaching and mentoring. Much of this experience began as an undergraduate working with Upward Bound, a federally funded program aimed at preparing low-income, first-generation, and ESL high school students for college. After several years with the program, I became a summer course instructor, teaching calculus and differential equations to high school seniors. Developing my own curricula and materials for these courses was extremely rewarding and cemented my desire to pursue an academic career.

On the mentorship and leadership side, as a graduate student I managed a team of undergraduate software developers. The goal was to develop an educational virtual reality app that teaches high school students about my research in innate immune signaling. The key skill I learned from this experience was how to condense and communicate complex biological phenomena for a general audience while keeping the material approachable and engaging.

Finally, as a postdoctoral fellow, I have volunteered with the Pittsburgh Literacy program to help non-traditional students prepare for their high school equivalency tests. This was especially challenging because I often taught students with math anxiety, limited literacy skills, or a complete lack of interest in the subject. Learning with these students allowed me to develop strategies to overcome those obstacles and eventually help them further their education.

In addition to teaching the core pillars of chemical engineering (particularly reactor design and control theory), I see a need for a course that teaches graduate and undergraduate students scientific data visualization so they can successfully communicate their research. This course would focus on developing skill sets and learning tools to create high-quality figures for academic publications and presentations. The tools used (e.g., Inkscape, R/Python, Blender) would be open source to ensure students can transfer these skills if they choose to leave academia or cannot afford the relevant software licenses. Students who succeed in this course would not only master these skills but also become better scientific communicators, a severely neglected soft skill in engineering.

Overall, my teaching interests coincide with my teaching philosophy: every student brings a diverse background that affects how they learn. The goal of a teacher is to hear those unique perspectives and determine what may or may not motivate an individual to learn. Is this always practical in a large classroom setting? Of course not, but we as teachers can work toward this ideal through simple changes like making office hours a more welcoming environment or explaining the same concept using multiple strategies. Once we understand what motivates a student, they can actively engage with the material and extract the information they find useful for their future success.