(120f) dGPredictor: Automated Fragmentation Method for Gibbs Energy Change Prediction of Metabolic Reaction and De Novo Pathway Design | AIChE

Thermodynamic analysis of metabolic pathways is important for identifying novel pathways for biochemical synthesis. To this end, Group Contribution (GC) methods are widely applied to estimate the standard Gibbs energy change (ΔrG'°) of enzymatic reactions from limited experimental measurements. However, these methods depend on manually curated functional groups and cannot include stereochemical information, which limits their reaction coverage. Here, we present a moiety-based automated fragmentation method that uses molecular fingerprints, and we use it to build a thermodynamic analysis tool called dGPredictor. The method allows stereochemistry to be included within chemical structures and thus broadens biochemical reaction coverage. dGPredictor shows accuracy comparable to that of current GC methods and, in particular, can capture the Gibbs energy change for reactions that undergo only stereochemical changes, such as isomerase and transferase reactions, which show no overall group change. We apply dGPredictor to predict the Gibbs energy change for reactions involving novel structures, and we integrate it with de novo metabolic pathway design tools such as novoStoic to exclude reaction steps whose directionality is thermodynamically infeasible. We also demonstrate a graphical user interface for dGPredictor that makes it easy to predict the Gibbs energy change of reactions at different pH values and ionic strengths. dGPredictor accepts customized user input of molecules as KEGG IDs or as InChI strings (for novel metabolites) and is open source (https://github.com/maranasgroup/dGPredictor). Relative to the widely used GC method, dGPredictor improves the goodness of fit of the linear regression used to estimate moiety contributions by 78.8%, measured as Mean Squared Error (MSE) over training data from the TECRDB database. Finally, dGPredictor increases the coverage of ΔfG'° and ΔrG'° estimation for metabolites and reactions present in the KEGG database by 17.2% and 102%, respectively, over GC by allowing for stereochemical considerations not captured in previous expert-defined chemical groups.
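
The point about reactions with "no overall group change" can be illustrated with a minimal sketch. It is not the dGPredictor implementation: the moiety labels (`center_R`, `center_S`, `carboxyl`), the function names, and the contribution values are hypothetical placeholders chosen only to show the arithmetic. The sketch assumes each metabolite is described by a count of moieties and that a reaction's Gibbs energy change is estimated as the sum of moiety contributions weighted by the net moiety change.

```python
from collections import Counter


def moiety_change(substrates, products):
    """Net moiety counts (products minus substrates).

    Each side is a list of (stoichiometric coefficient, moiety-count dict).
    Moieties whose counts cancel are dropped from the result.
    """
    change = Counter()
    for coeff, moieties in products:
        for name, count in moieties.items():
            change[name] += coeff * count
    for coeff, moieties in substrates:
        for name, count in moieties.items():
            change[name] -= coeff * count
    return {name: n for name, n in change.items() if n != 0}


def estimate_dg(change, contributions):
    """Linear estimate: sum of (net count x contribution) over moieties."""
    return sum(n * contributions[name] for name, n in change.items())


# Hypothetical isomerization: only one stereocenter flips.
# Stereo-blind description: both metabolites decompose identically.
blind_substrate = {"center": 1, "carboxyl": 1}
blind_product = {"center": 1, "carboxyl": 1}

# Stereo-aware description: the stereocenter gets a distinct label.
aware_substrate = {"center_S": 1, "carboxyl": 1}
aware_product = {"center_R": 1, "carboxyl": 1}

blind_change = moiety_change([(1, blind_substrate)], [(1, blind_product)])
aware_change = moiety_change([(1, aware_substrate)], [(1, aware_product)])

# Placeholder contributions (arbitrary numbers, not fitted values).
contributions = {"center_S": -10.0, "center_R": -9.5, "carboxyl": -300.0}
aware_estimate = estimate_dg(aware_change, contributions)
```

With the stereo-blind labels the net change is empty, so a linear model of this kind has nothing to fit or predict for the reaction. With stereo-aware labels the change vector is nonzero and the reaction can contribute to, and be covered by, the regression.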