(636b) System Identification and the “Omics” – Challenges in the Time of Big Data | AIChE

Authors 

Shoemaker, J. E. - Presenter, University of Tokyo
Kawaoka, Y., University of Wisconsin-Madison
Kitano, H., Systems Biology Institute


Network and systems biology combined with “omics” technologies, e.g., genomics and proteomics, have often been cited as the key to understanding complex diseases and to changing the fundamentals of drug discovery from large-scale screening processes to streamlined, predicted protein targeting. However, current approaches to model identification are not well suited to big data, because they attempt to overlay small subsets of the data onto incomplete models mined from the literature. Here, we present a novel approach to modeling biological systems in which we attempted to reconstruct the innate immune response using only gene expression data. From gene expression in the lungs of mice infected with influenza virus, we constructed modules of co-expressed genes, which on further analysis were found to represent distinct immune-related pathways and functions. Each module was then taken to represent the output of its associated pathway or function, and its gene expression was summarized as the scaled average of the expression of all genes in the module. We then applied a collection of network inference algorithms, e.g., time-delay ARACNE, to the modules to predict relationships between them, and compared the resulting networks to the expectations of immunology experts. We were able to accurately reconstruct several aspects of the immune response and to identify some unexpected trends. Key events such as the timing and magnitude of interferon production and the influx of macrophages and natural killer cells were all accurately predicted from gene expression data. The most counterintuitive prediction, later validated by FACS analysis, was that the number of B cells declines early in the infection. Of the network inference algorithms applied, time-delay ARACNE performed best, identifying events known to occur early during an infection and correctly identifying interferon’s regulatory role in leukocyte migration.
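As an illustration of the module summarization and the time-delayed comparison described above, the following is a minimal sketch and not the authors' pipeline. It assumes that "scaled average" means z-scoring each gene across time points before averaging, and it uses a simple lagged Pearson correlation as a stand-in for time-delay ARACNE, which relies on mutual information and additional pruning steps. The function names `module_summary` and `best_lagged_correlation` and the input layout are hypothetical.

```python
import numpy as np


def module_summary(expr, module_genes):
    """Summarize a module as the mean of per-gene z-scored expression.

    expr: dict mapping gene name -> sequence of expression values over time.
    module_genes: gene names belonging to one co-expression module.
    Assumption: "scaled" is read here as z-scoring each gene over time.
    """
    rows = np.array([expr[g] for g in module_genes], dtype=float)
    mean = rows.mean(axis=1, keepdims=True)
    std = rows.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # constant genes contribute zeros instead of NaN
    scaled = (rows - mean) / std
    return scaled.mean(axis=0)


def best_lagged_correlation(x, y, max_lag):
    """Find the lag in 0..max_lag that maximizes |corr(x[t], y[t + lag])|.

    This is a simplified stand-in for a time-delay dependency score;
    it is NOT the time-delay ARACNE algorithm itself.
    Returns (lag, r); (0, 0.0) if no lag could be evaluated.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        a = x[: len(x) - lag]  # earlier values of the candidate regulator
        b = y[lag:]            # later values of the candidate target
        if len(a) < 3 or a.std() == 0 or b.std() == 0:
            continue  # too few points or no variation to correlate
        r = float(np.corrcoef(a, b)[0, 1])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best
```

With module summaries computed for every module, calling `best_lagged_correlation` on each ordered pair would give a rough picture of which module tends to lead which. Any resulting edge should be read as a hypothesis to check against expert knowledge or experiment, as the abstract describes.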
In conclusion, the work presented here supports the use of omics data to create medically relevant biological models as an alternative to time-consuming approaches such as literature mining. This work is now being extended to construct a complete, dynamic mapping between a large spectrum of pathogens and the body’s immune response.