(72g) Statistical Learning Theory for Protein Dynamics | AIChE

Proteins and allosteric enzymes undergo large-scale conformational transitions and exhibit wide variations in their functionality and reactivity. Our research aims to uncover the elusive underlying mechanisms and dynamics that govern the physical properties of these biomolecules. We developed a novel statistical learning theory that extracts free energy surfaces and diffusion profiles, along the physical coordinate employed for measurement, from the traditionally convoluted data of single-molecule FRET and pulling experiments. Our method generalizes the Hidden Markov Model (HMM) to continuous state-space Markovian dynamics in order to extract probabilistic information about the actual dynamic trajectory along the measured coordinate from indirect measurements, namely photon emission trajectories and the time histories of pulling forces.
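As an illustration of the inference step described above, the following is a minimal sketch, not the authors' implementation. It approximates the continuous coordinate by a grid, builds a detailed-balance rate matrix from a free energy F(x) (in units of kT) and a diffusion profile D(x), and runs a scaled forward-backward pass over binned donor/acceptor photon counts. The double-well F(x), the constant D, the Förster radius R0 = 5 nm, the bin width dt, and all function names are assumptions chosen for the example.

```python
import numpy as np

def rate_matrix(F, D, dx):
    """Rate matrix K[j, i] = rate i -> j for diffusion on a grid (F in kT)."""
    n = len(F)
    K = np.zeros((n, n))
    for i in range(n - 1):
        D_mid = 0.5 * (D[i] + D[i + 1])
        # Symmetric splitting of the free energy difference enforces detailed balance.
        K[i + 1, i] = D_mid / dx**2 * np.exp(-0.5 * (F[i + 1] - F[i]))
        K[i, i + 1] = D_mid / dx**2 * np.exp(-0.5 * (F[i] - F[i + 1]))
    K -= np.diag(K.sum(axis=0))  # each column sums to zero
    return K

def propagator(K, dt):
    """Approximate exp(K dt) by repeated small explicit steps (assumed scheme)."""
    n_sub = max(1, int(np.ceil(2.0 * dt * np.max(-np.diag(K)))))
    step = np.eye(K.shape[0]) + (dt / n_sub) * K  # nonnegative, column-stochastic
    return np.linalg.matrix_power(step, n_sub)

def fret_efficiency(x, R0=5.0):
    """Foerster efficiency as a function of donor-acceptor distance."""
    return 1.0 / (1.0 + (x / R0) ** 6)

def emission_loglik(n_acc, n_don, E):
    """Binomial log-likelihood of acceptor/donor counts, up to an x-independent constant."""
    E = np.clip(E, 1e-12, 1.0 - 1e-12)
    return n_acc[:, None] * np.log(E)[None, :] + n_don[:, None] * np.log1p(-E)[None, :]

def forward_backward(T, p0, logb):
    """Scaled forward-backward; returns posteriors gamma[t, i] and the log-likelihood."""
    n_t, n = logb.shape
    shift = logb.max(axis=1, keepdims=True)
    b = np.exp(logb - shift)
    alpha = np.zeros((n_t, n))
    c = np.zeros(n_t)
    a = p0 * b[0]
    c[0] = a.sum()
    alpha[0] = a / c[0]
    for t in range(1, n_t):
        a = b[t] * (T @ alpha[t - 1])
        c[t] = a.sum()
        alpha[t] = a / c[t]
    beta = np.ones((n_t, n))
    for t in range(n_t - 2, -1, -1):
        beta[t] = T.T @ (b[t + 1] * beta[t + 1]) / c[t + 1]
    gamma = alpha * beta
    gamma /= gamma.sum(axis=1, keepdims=True)
    loglik = np.log(c).sum() + shift.sum()
    return gamma, loglik

def demo(seed=0, n_t=200, dt=0.05):
    """Simulate synthetic photon counts and infer the hidden coordinate."""
    rng = np.random.default_rng(seed)
    x = np.linspace(3.0, 7.0, 41)            # distance grid in nm (assumed)
    F = 3.0 * ((x - 5.0) ** 2 - 1.0) ** 2    # assumed double-well free energy, kT
    D = np.ones_like(x)                      # assumed constant diffusion profile
    T = propagator(rate_matrix(F, D, x[1] - x[0]), dt)
    p_eq = np.exp(-F) / np.exp(-F).sum()
    E = fret_efficiency(x)
    # Sample a hidden grid trajectory, then photon counts per time bin.
    states = np.empty(n_t, dtype=int)
    states[0] = rng.choice(len(x), p=p_eq)
    for t in range(1, n_t):
        col = T[:, states[t - 1]]
        states[t] = rng.choice(len(x), p=col / col.sum())
    n_tot = rng.poisson(20, size=n_t)
    n_acc = rng.binomial(n_tot, E[states])
    gamma, loglik = forward_backward(T, p_eq, emission_loglik(n_acc, n_tot - n_acc, E))
    return gamma, loglik, T, p_eq
```

Calling `demo()` returns the posterior `gamma[t, i]` over grid points at each time bin; `gamma @ x` would give a posterior-mean trajectory. The grid-based propagator is only a stand-in for the continuous state-space dynamics named in the abstract.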

With a highly resolved and accurate picture of the system dynamics inferred from indirect measurements, we can maximize the statistical likelihood with respect to the underlying free energy surface and diffusion profile. We begin with an abstract path-integral representation of the HMM and show how a self-consistent application of functional optimization leads to the Expectation-Maximization (EM) equations. The EM protocol creates progressively better approximations of the observed dynamic trajectory and converges to the dynamic parameters from single-molecule experimental data.
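The generic structure of such an EM iteration can be written schematically as follows. This is the standard EM form expressed in path-integral notation, not the authors' specific equations. The symbols are assumptions introduced here: x(t) is the hidden trajectory, y the measured data, F and D the free energy surface and diffusion profile, and k the iteration index. The measurement model P[y | x] is assumed not to depend on F or D.

```latex
% Requires amsmath. Schematic EM iteration for a path-integral HMM (assumed notation).
\begin{align}
  \mathcal{L}[F, D]
    &= \int \mathcal{D}x \; P[y \mid x] \, P[x \mid F, D] \\
  \text{E-step:}\quad
  Q\big(F, D \mid F^{(k)}, D^{(k)}\big)
    &= \int \mathcal{D}x \; P\big[x \mid y;\, F^{(k)}, D^{(k)}\big] \,
       \ln\!\big( P[y \mid x] \, P[x \mid F, D] \big) \\
  \text{M-step:}\quad
  \big(F^{(k+1)}, D^{(k+1)}\big)
    &= \operatorname*{arg\,max}_{F,\,D} \; Q\big(F, D \mid F^{(k)}, D^{(k)}\big) \\
  \ln \mathcal{L}\big[F^{(k+1)}, D^{(k+1)}\big]
    &\ge \ln \mathcal{L}\big[F^{(k)}, D^{(k)}\big]
\end{align}
```

In this form the M-step amounts to setting the functional derivatives of Q with respect to F(x) and D(x) to zero, and the last line is the usual guarantee that each EM iteration does not decrease the likelihood.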

We apply this method to the dual helical lid opening transitions of the tyrosine phosphatase PtpB, observed with FRET, to understand the protein's resilience to oxidative degradation. We observe the equilibrium probability density and reaction rate of lid dynamics and the coupling between the two domains. We propose and investigate the feasibility of a 2-D experimental technique to probe this coupling in order to understand the mechanism and the accuracy of the reaction coordinate.
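For readers who want to see how an equilibrium density and a rate estimate can follow from a one-dimensional free energy surface and diffusion profile, here is a minimal sketch. It assumes overdamped diffusion with a reflecting boundary at the left end of the grid and uses the standard mean first-passage time integral; whether the authors computed rates this way is not stated in the abstract. The function names and the trapezoid-rule integration are choices made for this example.

```python
import numpy as np

def cumtrapz(y, x):
    """Cumulative trapezoid integral of y over x, starting at zero."""
    out = np.zeros(len(y), dtype=float)
    out[1:] = np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))
    return out

def equilibrium_density(x, F):
    """p_eq(x) proportional to exp(-F(x)), with F in kT, normalized on the grid."""
    w = np.exp(-(F - F.min()))
    return w / cumtrapz(w, x)[-1]

def mfpt(x, F, D, i_start, i_end):
    """Mean first-passage time from x[i_start] to x[i_end] (i_end > i_start).

    tau = int_{x_start}^{x_end} dx exp(F(x)) / D(x) * int_{x[0]}^{x} dy exp(-F(y)),
    with a reflecting boundary at x[0].
    """
    inner = cumtrapz(np.exp(-F), x)
    outer = cumtrapz(np.exp(F) / D * inner, x)
    return outer[i_end] - outer[i_start]
```

For a high barrier, the inverse of the mean first-passage time from the reactant well to the product well is one common estimate of the transition rate. As a sanity check, a flat surface with D = 1 on [0, L] gives the familiar value L²/2.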

The thrust of our research is two-fold. First, all-atom molecular dynamics simulations and free energy calculations will be applied to acquire atomic details of protein dynamics. Advanced computational methods, such as the calculation of minimum free energy paths, will allow us to analyze microscopic mechanisms in full atomic detail. Second, the free-energy profiles calculated from the bottom up will be contrasted with those determined from the statistical learning of indirect experimental measurements. The matching of bottom-up and top-down analyses will provide unprecedented knowledge and understanding of protein machines for modulating their actions and functions.