(394b) Uncovering Interaction Pathways in Protein Dynamics Using Machine Learning Models | AIChE

Authors 

Bhethanabotla, V. - Presenter, University of Pennsylvania
Machine learning approaches to protein characterization and design typically rely on either sequence-based or structure-based representations. These representations can overlook protein dynamics, which are crucial to protein properties and function. Analysis of molecular dynamics simulations offers insight into dynamic interactions and into the effects of minimal modifications to protein sequence and structure, including changes in conformational entropy and long-range allosteric effects, that traditional sequence- and structure-based models often miss. Such interactions are particularly relevant to engineering strategies like directed evolution, which is of interest in the design of enzymatic systems [1]. Machine learning models provide an avenue for analyzing rich dynamics data and, even without knowledge of the underlying equations of motion, can uncover directed relational interactions between the variables under study [2].

Directed evolution, a particularly successful strategy for engineering proteins, navigates sequence space from a starting sequence to optimize enzyme properties. Mutations can be introduced randomly or targeted to specific sites. However, traditional iterative optimization struggles with epistatic interactions, in which the effect of a mutation depends on higher-order interactions such as modulation by ligands, substrates, allostery, or conformational dynamics [3]. Understanding these interactions can inform the selection of sites for directed evolution libraries (sites chosen for their predicted importance in modifying the desired property or function of the protein system), improving the efficiency of enzyme design strategies. Our system of interest is the beta subunit of tryptophan synthase, an enzyme responsible for the conversion of the amino acid serine to tryptophan. In nature, this subunit is bound to a partner protein and depends on allosteric activation for its function; this activation can be recapitulated by directed evolution toward non-natural, standalone activity [4]. Using a deep-learning approach to analyze molecular dynamics simulations of this system, we generate directed graph representations that encode dynamic, pairwise interactions between residues in the wild-type enzyme and its variants, guiding the choice of target sites. We also compare these results with traditional methods used in experimental settings; the comparisons suggest potential extensions of this approach to computationally assisted library design in other systems.
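As an illustration only, one simple way a directed residue-interaction graph could be turned into a shortlist of candidate sites is to rank residues by the total weight of their outgoing and incoming edges. The sketch below is not the method used in this work: the function name `rank_candidate_sites`, the scoring rule, the residue labels, and the toy weights are all assumptions made for the example.

```python
def rank_candidate_sites(edge_weights, residue_ids, top_k=3):
    """Rank residues by summed outgoing and incoming edge weight.

    edge_weights[i][j] is a hypothetical score for a directed
    interaction from residue i to residue j (self-edges are ignored).
    Returns (residue_id, out_strength, in_strength) tuples,
    strongest first.
    """
    n = len(residue_ids)
    # Total weight of edges leaving each residue.
    out_strength = [
        sum(edge_weights[i][j] for j in range(n) if j != i) for i in range(n)
    ]
    # Total weight of edges arriving at each residue.
    in_strength = [
        sum(edge_weights[j][i] for j in range(n) if j != i) for i in range(n)
    ]
    # Sort indices by combined strength, largest first (ties keep input order).
    order = sorted(range(n), key=lambda i: -(out_strength[i] + in_strength[i]))
    return [(residue_ids[i], out_strength[i], in_strength[i]) for i in order[:top_k]]


# Toy 4-residue graph with made-up weights and placeholder labels.
weights = [
    [0.0, 0.9, 0.1, 0.0],
    [0.2, 0.0, 0.8, 0.1],
    [0.0, 0.1, 0.0, 0.0],
    [0.0, 0.7, 0.0, 0.0],
]
residues = ["res_A", "res_B", "res_C", "res_D"]
ranking = rank_candidate_sites(weights, residues, top_k=2)
for residue, out_w, in_w in ranking:
    print(f"{residue}: out={out_w:.2f} in={in_w:.2f}")
```

Keeping outgoing and incoming strength separate in the output is a deliberate choice in this sketch: in a directed graph, a residue that mostly influences others and one that is mostly influenced may call for different interpretations when choosing mutation sites.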

1. Romero, P. A.; Arnold, F. H. Exploring Protein Fitness Landscapes by Directed Evolution. Nat Rev Mol Cell Biol 2009, 10 (12), 866–876.

2. Löwe, S.; Madras, D.; Zemel, R.; Welling, M. Amortized Causal Discovery: Learning to Infer Causal Graphs from Time-Series Data. arXiv February 21, 2022.

3. Starr, T. N.; Thornton, J. W. Epistasis in Protein Evolution. Protein Science 2016, 25 (7), 1204–1218.

4. Buller, A. R.; Brinkmann-Chen, S.; Romney, D. K.; Herger, M.; Murciano-Calles, J.; Arnold, F. H. Directed Evolution of the Tryptophan Synthase β-Subunit for Stand-Alone Function Recapitulates Allosteric Activation. PNAS 2015.