(3ep) Integrating Multi-Omics Datasets and Big Mechanistic Models to Select Experiments Intelligently | AIChE

Authors 

Erdem, C. - Presenter, CLEMSON UNIVERSITY
The incidence and mortality of all cancers are predicted to increase by 10 million by 2040. This is partly because we still do not know enough about the molecular signaling mechanisms of cancer cells. There are 69 approved drugs for breast cancer treatment, reflecting the heterogeneity of patient tumors and the need for specialized interventions to block tumor growth. With very few exceptions so far, adaptation mechanisms emerge after drug treatment and relapse eventually occurs. Drugs are discovered through large trial-and-error experiments with tens of thousands of chemicals, followed by refinement of their structures for safety and toxicity and by years of clinical trials. If we can reduce this burden by understanding the underlying molecular mechanisms, the result would be more efficient drug discovery, the right treatment for the right patient (precision medicine), and even prevention of certain cancer types.

Building computational models that explain and predict such mechanisms can help in this task. Some models are mechanistic: detailed equations describing what is known (or supposed) to be happening. Others are statistical: descriptive and exploratory correlations of what might be occurring within cells or patients. However, these two types of modeling are rarely combined, which misses the opportunity to generate new knowledge while explaining what is already known. I have worked on both approaches, and I now propose to combine them to develop better models that accurately represent the generated biological knowledge. In the proposed framework, statistical regression modeling will provide only a handful of interpretable descriptors of the phenotypic outputs. These possibly new interactions between genes or proteins will be used as new connections in the constructed mechanistic models. Many candidate model topologies will be explored with multiple parameter sets to recapitulate the data. Models that successfully explain the training data will be checked for the novel interactions they contain. These new connections will be the focus of further experiment selection.

Initially, I will combine a novel application of Gaussian linear models, which generates robust statistical associations, with one of the largest mechanistic models in the literature. For training data, I will use the big data cube recently generated by the NIH-LINCS Consortium and the MCF10A Common Project. The dataset consists of multiple omics assay types on the breast epithelial MCF10A cell line. My experience with all three of these components makes this work a strong proof of concept, which I plan to extend to other cancer types and diseases.
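
The descriptor-selection step can be illustrated with a small sketch. It assumes lasso (L1-penalized) regression, the technique named in the proteomic study cited under Research Interests, rather than the Gaussian linear models proposed above, whose details are not given here. The data are synthetic, and the feature names, the penalty value, and the function names are hypothetical choices made only for illustration.

```python
import random


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the lasso proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_iter=200):
    """Minimize (1/2n)*||y - Xw||^2 + penalty*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that excludes j's own contribution
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


# Synthetic example: six hypothetical measured features, 60 samples.
random.seed(0)
features = ["protein_1", "protein_2", "protein_3", "protein_4", "protein_5", "protein_6"]
X = [[random.gauss(0, 1) for _ in features] for _ in range(60)]
# By construction, the synthetic phenotype depends only on protein_1 and protein_3.
y = [2.0 * row[0] - 1.5 * row[2] + random.gauss(0, 0.1) for row in X]

weights = lasso_coordinate_descent(X, y, penalty=0.2)
selected = [name for name, w in zip(features, weights) if abs(w) > 1e-8]
print(selected)
```

In this toy setting the penalty drives the weights of uninformative features to exactly zero, so the nonzero weights serve as the short list of candidate associations to carry into a mechanistic model.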

Research Interests

Computational tools are becoming indispensable in medical research, where a cycle of experimentation and computation is used to generate and test new hypotheses. In systems biology, computational modeling guides experimental hypothesis generation, and experimental observations enable fine-tuning of computational models to understand biological phenomena. Owing to advances in wet-lab experimental techniques and tools, repositories of “Big Data” become larger each year. These databases include genomic, proteomic, epigenetic, and clinical information. To advance our understanding of the underlying biology, we should become more practical in analyzing these datasets. Divide-and-conquer strategies have so far yielded an understanding of certain context-dependent, scope-limited biological events. Now, with the wealth of big datasets described above, we should extract as much information as possible and combine it with existing detailed models of signaling pathways. Only then can we start to personalize our models to specific tumor data.

During my past and current work, I have developed and refined methods for combining biological data to reveal details of aberrant signaling in cancer cells. In my doctoral work, I studied the mechanisms of differential pathway activation by two similar hormone receptors, insulin-like growth factor 1 receptor (IGF1R) and insulin receptor (InsR), in breast cancer [1]. Contrary to the common belief that these two receptors are indistinguishable, we showed that they respond differently to the same downstream perturbation and activate cell survival pathways in distinct ways. In my postdoctoral work, I studied non-tumorigenic, normal-like breast epithelial MCF10A cells and showed that computational modeling can aid in simulating normal functions as well as disease modes. With the help of our lab members, I re-coded a custom-made large mechanistic model into an open-source, human-interpretable, and easy-to-alter structured format. We have released the model specifics publicly on GitHub. By studying the cellular and phenotypic responses of these normal-like cells under different conditions and at multiple levels (genomic, proteomic, epigenetic), we showed how these cells integrate signal inputs and make different growth-related decisions [1, 2].

Based on my previous experience, the need for better models to explain different disease contexts, and the availability of large datasets, my research proposal is to merge existing models with information from big data resources to create larger computational models for exploring pan-disease biology. (1) I want to analyze and integrate multiple big datasets using statistical regression modeling to generate new association candidates. This will allow data-driven exploration of existing information to narrow down possible non-canonical, unknown interactions. (2) I want to create even larger mechanistic computational models, using the interaction candidates from the previous analysis and from the literature, to study in silico clinical intervention strategies. By combining new non-canonical interactions with known mechanistic models, I can create hundreds of different model candidates and explore which mechanisms better explain the experimental observations. (3) I would like to further delineate the signaling pathways of the IGF1 and insulin receptors, with the ultimate goal of finding better therapies for cancer (IGF1 related) and diabetes (insulin related). Here, I can use the models constructed during project (2) to study the details of mechanistic differences while exploring the association candidates from (1). Through this effort, I can contribute large mechanistic models that better explain different scenarios, together with numerous context-dependent interactions, while carrying out as few wet-lab experiments as possible.
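
The way a few interaction candidates expand into many model candidates, as in project (2), can be sketched as follows. This is a minimal illustration that treats a model topology as a set of directed edges and adds up to a fixed number of candidate edges to a known base network. The node names, the edges, the function name, and the `max_new` limit are all hypothetical, and parameter fitting and scoring against data are left out.

```python
from itertools import combinations


def candidate_topologies(base_edges, candidate_edges, max_new=2):
    """Yield edge sets formed by adding up to max_new candidate edges to the base model."""
    base = frozenset(base_edges)
    # Ignore candidates that the base model already contains.
    extras = [edge for edge in candidate_edges if edge not in base]
    for k in range(max_new + 1):
        for subset in combinations(extras, k):
            yield base | frozenset(subset)


# Hypothetical toy network: two receptors (R1, R2), two kinases (K1, K2), one output (TF).
base_edges = [("R1", "K1"), ("R2", "K1"), ("K1", "K2"), ("K2", "TF")]
# Hypothetical non-canonical candidates, e.g. taken from a regression analysis.
candidate_edges = [("R1", "K2"), ("R2", "K2"), ("R1", "TF"), ("R2", "TF")]

topologies = list(candidate_topologies(base_edges, candidate_edges, max_new=2))
print(len(topologies))
```

The number of topologies grows combinatorially with the number of candidate edges, which is why capping the edges added per model (here `max_new`) or filtering candidates statistically beforehand keeps the search tractable.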

References

  1. Erdem C, Nagle AM, Casa AJ, Litzenburger BC, Wang Y, Taylor DL, et al. Proteomic Screening and Lasso Regression Reveal Differential Signaling in Insulin and Insulin-like Growth Factor I (IGF1) Pathways. Mol Cell Proteomics. 2016;15:3045–57. doi:10.1074/mcp.M115.057729.
  2. Bouhaddou M, Barrette AM, Stern AD, Koch RJ, DiStefano MS, Riesel EA, et al. A mechanistic pan-cancer pathway model informed by multi-omics data interprets stochastic cell fate responses to drugs and mitogens. PLoS Comput Biol. 2018;14. doi:10.1371/journal.pcbi.1005985.

Teaching Interests

As an international and intercontinental learner, I have experienced a variety of instructors, classes, techniques, tools, and approaches to teaching. Every individual learns differently and at a different pace, so making the learning experience personal makes all the difference. To do so, I will start with project- and homework-based classrooms. I will incorporate active learning methods wherever possible, especially since classes with active learning have been shown to have higher student scores and lower failure rates than traditional lecturing [1].

In addition to TA duties, I have experience mentoring individuals as well as groups of students on specific projects. In particular, the Creative Inquiry system at Clemson University has enabled me to advertise a project, recruit students from multiple departments, and lead their weekly efforts. This experience, together with the summer interns and graduate students I have guided over the years, helped me learn where to set the bar for short-term projects compared to PhD-level research. Such experiences will help me guide future students on how to manage their time.

I can teach any course in the curriculum, such as thermodynamics or heat transfer. Additionally, I would like to take on or develop courses in Bioinformatics, Systems Biology, Basic Coding, and Introduction to Engineering. A course combining bioinformatics, systems biology, and mathematical modeling of biological systems is becoming a must for chemical engineering departments. It will enable students to see and understand the abstract definitions they are taught in related courses such as Biotechnology and Numerical Methods.

References

  1. Freeman S, Eddy SL, McDonough M, Smith MK, Okoroafor N, Jordt H, et al. Active learning increases student performance in science, engineering, and mathematics. Proc Natl Acad Sci USA. 2014 Jun 10;111(23):8410.
