(187c) DeepMetabolism: A Deep Learning Algorithm to Predict Phenotype from Genome Sequencing
AIChE Annual Meeting
2017 Annual Meeting
Computing and Systems Technology Division
CAST Rapid Fire Session IV
Monday, October 30, 2017 - 5:20pm to 5:25pm
High-throughput sequencing technology has brought life science into a “big data” era, with an unrivaled explosion in the amount of genomic and transcriptomic data. The falling cost (<$1,000 per human genome) and increasing speed (<1 day per human genome) of high-throughput sequencing have led to data accumulating at the petabyte scale. However, it remains difficult to translate such “big data” into valuable biological insights such as cell growth rates and metabolic pathway activities. The gap between genome sequencing and cell phenotypes is one of the biggest challenges that must be solved to achieve “Data-to-Insight”. Over the past five years, the rapid development of artificial intelligence, especially deep learning, has provided a novel option for overcoming this challenge. Deep learning, accelerated by graphics processing unit (GPU) computation, has been found to be extremely effective at learning and modeling complex systems. In this study, we developed DeepMetabolism, a deep learning algorithm that predicts cell phenotypes from genome sequencing data such as transcriptomics data. DeepMetabolism uses biological knowledge to design a neural network model and integrates unsupervised learning with supervised learning to predict multiple phenotypes. In a prototypic application to E. coli, DeepMetabolism predicted phenotypes with high accuracy (Pearson correlation coefficient, PCC > 0.92), high speed (<30 min for >100 GB of data on a single GPU), and high robustness (tolerating up to 75% noise). We envision that DeepMetabolism will bridge the gap between genotype and phenotype and serve as a springboard for applications in synthetic biology and precision medicine.
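The abstract does not give implementation details, so the following is only a minimal sketch of two ideas it mentions, not the authors' code. It assumes that "using biological knowledge to design a neural network" can be illustrated by a layer whose connections are restricted by a 0/1 mask, and it reads PCC as the Pearson correlation coefficient between predicted and measured phenotypes. The mask, weights, and expression values are hypothetical and chosen purely for illustration.

```python
import math

# Hypothetical connectivity mask: rows are 3 genes, columns are 2 proteins.
# A 1 means the gene is assumed to be associated with that protein.
GENE_TO_PROTEIN_MASK = [
    [1, 0],
    [1, 0],
    [0, 1],
]


def masked_layer(inputs, weights, mask):
    """Dense tanh layer that ignores every connection whose mask entry is 0."""
    n_out = len(mask[0])
    outputs = []
    for j in range(n_out):
        total = sum(
            x * w_row[j] * m_row[j]
            for x, w_row, m_row in zip(inputs, weights, mask)
        )
        outputs.append(math.tanh(total))
    return outputs


def pearson_cc(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(var_p * var_m)


# Illustrative weights and expression values, not taken from the study.
weights = [[0.4, -0.2], [0.1, 0.7], [-0.5, 0.3]]
expression = [0.5, 1.0, -0.3]
proteins = masked_layer(expression, weights, GENE_TO_PROTEIN_MASK)
```

In this sketch, the first protein output depends only on the first two genes and the second only on the third gene, however the weights are set. A trained model would learn the unmasked weights, and `pearson_cc` would then be applied to its predictions against measured phenotypes.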