Gapseq: A Novel Approach for in silico Prediction and Analysis of Bacterial Metabolic Pathways and Genome-Scale Networks | AIChE

Authors 

Waschina, S. - Presenter, Kiel University
Kaleta, C., Christian-Albrechts-University Kiel (UKSH Campus)
Zimmermann, J., Kiel University
In silico biochemical pathway analysis and metabolic phenotype prediction based on bacterial genome sequences have become powerful research tools in biotechnology, medicine, and ecology. The accuracy of genome sequence-derived predictions of catalytic capabilities and of metabolic models is largely hindered by two factors: (i) false-positive and false-negative enzymatic reaction predictions due to incomplete knowledge of enzyme mechanisms, especially in non-model organisms, and (ii) inconsistencies in reaction databases, which frequently cause thermodynamically infeasible futile cycles and, hence, false phenotype predictions.

Here we present a new tool for pathway analysis and automated reconstruction of metabolic networks, called GapSeq, that addresses these obstacles. First, pathways are predicted by combining sequence homology search results, which provide evidence for the presence of each pathway’s individual reactions, with pathway topologies and information about the pathways’ key reactions. Second, reactions predicted with high evidence scores are consolidated into draft genome-scale networks using a manually curated reaction database that is completely free of futile cycles. This ensures that all possible reconstructions are also free of thermodynamically infeasible substrate cycles. Third, a novel gap-filling algorithm was implemented that adds candidate reactions based on their evidence scores. Moreover, the gap-filling algorithm not only enables biomass production under the defined media composition but also fills gaps in alternative resource utilization pathways and in anabolic pathways. This has the advantage that gaps are also filled with candidate reactions that the organism employs only under specific environmental conditions or when interacting with other organisms.
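
To make the first and third steps concrete, the following is a minimal Python sketch, not GapSeq's actual implementation. It assumes each reaction carries a single numeric evidence score. The cutoff values, the `Reaction` class, and the function names are all hypothetical. The gap-filler is a simple greedy, reachability-based stand-in that prefers better-supported candidates. The abstract does not specify the real algorithm, and this greedy version may add reactions that are not strictly required.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    name: str
    substrates: frozenset
    products: frozenset
    evidence: float  # hypothetical homology-derived score; higher means better supported


def pathway_present(reactions, key_names, hit_cutoff=50.0, min_fraction=0.8):
    """Call a pathway present if enough of its reactions are supported
    and every key reaction is among the supported ones (assumed rule)."""
    if not reactions:
        return False
    supported = {r.name for r in reactions if r.evidence >= hit_cutoff}
    fraction = len(supported) / len(reactions)
    return fraction >= min_fraction and set(key_names) <= supported


def producible(reactions, seed):
    """Network expansion: metabolites reachable from the seed (medium) compounds."""
    pool = set(seed)
    changed = True
    while changed:
        changed = False
        for r in reactions:
            if r.substrates <= pool and not r.products <= pool:
                pool |= r.products
                changed = True
    return pool


def greedy_gapfill(draft, candidates, seed, target):
    """Add candidates, best evidence first, until the target is producible.
    Returns the names of added reactions, or None if the gap cannot be filled."""
    model, added = list(draft), []
    remaining = sorted(candidates, key=lambda r: r.evidence, reverse=True)
    while target not in producible(model, seed):
        reach = producible(model, seed)
        # Candidates that can fire now and contribute a new metabolite.
        usable = [r for r in remaining
                  if r.substrates <= reach and not r.products <= reach]
        if not usable:
            return None
        best = usable[0]  # highest evidence, since `remaining` is sorted
        model.append(best)
        added.append(best.name)
        remaining.remove(best)
    return added


# Toy data (invented for illustration only).
hk = Reaction("HK", frozenset({"glc"}), frozenset({"g6p"}), 200.0)
pgi_strong = Reaction("PGI_strong", frozenset({"g6p"}), frozenset({"f6p"}), 40.0)
pgi_weak = Reaction("PGI_weak", frozenset({"g6p"}), frozenset({"f6p"}), 5.0)
lower = Reaction("LOWER", frozenset({"f6p"}), frozenset({"pyr"}), 150.0)

added = greedy_gapfill([hk, lower], [pgi_weak, pgi_strong], seed={"glc"}, target="pyr")
present = pathway_present([hk, pgi_strong, lower], key_names={"HK"})
print(added)    # ['PGI_strong']
print(present)  # False: only 2 of 3 reactions reach the assumed cutoff
```

In this toy case the draft network cannot reach pyruvate from glucose, and the sketch bridges the gap with the better-supported of two equivalent candidates. Such a candidate falls below the cutoff used for the draft network but is still the most plausible way to complete it.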

We tested GapSeq on a large set of physiologically well-described bacterial strains. We show that GapSeq correctly predicts the presence of strain-specific pathways and, in combination with Flux Balance Analysis, also accurately predicts metabolic phenotypes such as the production of specific fermentation products.
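
For reference, Flux Balance Analysis is commonly stated as the linear program below. This is the standard textbook formulation. The abstract does not specify the objective or constraints used in the evaluation, so the symbols here are generic: S is the stoichiometric matrix of the reconstructed network, v the vector of reaction fluxes, c the objective weights (typically selecting the biomass reaction), and l and u the lower and upper flux bounds, which encode the growth medium and reaction reversibility.

```latex
\[
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{s.t.}\quad & S v = 0, \\
& l \le v \le u
\end{aligned}
\]
```

Under this formulation, a predicted fermentation product corresponds to a positive flux through the exchange reaction that exports that metabolite in an optimal solution.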