(53b) Deepmetabolism: A Deep Learning System to Predict Phenotype from Genome Sequencing | AIChE

Authors 

Guo, W. - Presenter, Virginia Polytechnic Institute and State University
Xu, Y., ai.codes, Inc.
Feng, X., Virginia Polytechnic Institute and State University
High-throughput sequencing technology has brought life science into a “big data” era, with an unrivaled explosion in the amount of genomic and transcriptomic data. The falling cost (<$1,000 per human genome) and increasing speed (<1 day per human genome) of high-throughput sequencing have led to data accumulating at the petabyte scale. However, it is still difficult to translate such “big data” into valuable biological insights such as cell growth rate and metabolic pathway activities. The gap between genome sequencing and cell phenotypes is one of the biggest challenges that must be solved to achieve “Data-to-Insight”. Over the past five years, the rapid development of artificial intelligence, especially deep learning, has provided a novel option for overcoming this challenge. Deep learning has been found to be extremely effective at learning and modeling complex systems when run on graphics processing units (GPUs). In this study, we developed DeepMetabolism, a deep learning system that predicts cell phenotypes from genome sequencing data such as transcriptomics data. DeepMetabolism uses biological knowledge to design a neural network model and integrates unsupervised learning with supervised learning to predict multiple phenotypes. In a prototypic application on E. coli, DeepMetabolism predicts phenotypes with high accuracy (Pearson correlation coefficient, PCC > 0.92), high speed (<30 min for >100 GB of data using a single GPU), and high robustness (it tolerates up to 75% noise). These features could facilitate the discovery of natural products through rapid DeepMetabolism-based analysis. We envision DeepMetabolism bridging the gap between genotype and phenotype and serving as a springboard for applications in synthetic biology and precision medicine.
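
The accuracy figure above is reported as a PCC, which we take to mean the Pearson correlation coefficient between predicted and measured phenotype values. As a minimal sketch of how such a score might be computed, the Python below implements the standard PCC formula. The function name and the growth-rate numbers are hypothetical and made up for illustration; they are not data or code from DeepMetabolism.

```python
import math


def pearson_correlation(predicted, measured):
    """Return the Pearson correlation coefficient between two paired sequences."""
    if len(predicted) != len(measured):
        raise ValueError("sequences must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("need at least two paired values")

    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n

    # Sum of products of deviations (unnormalized covariance).
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    # Sums of squared deviations (unnormalized variances).
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)

    if var_p == 0 or var_m == 0:
        raise ValueError("PCC is undefined when either sequence is constant")
    return cov / math.sqrt(var_p * var_m)


# Hypothetical growth rates (1/h), invented purely for illustration.
measured_growth = [0.21, 0.35, 0.48, 0.62, 0.80]
predicted_growth = [0.25, 0.33, 0.50, 0.58, 0.83]

pcc = pearson_correlation(predicted_growth, measured_growth)
print(f"PCC = {pcc:.3f}")
```

A PCC close to 1 indicates that predictions rise and fall with the measurements; it does not by itself show that the predicted values match in absolute scale.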