(233b) An Extended-Ensemble Relative Entropy Approach to Sequence-Specific Coarse-Grained Models for Peptide Aggregation | AIChE

Understanding the mechanisms behind aggregation of intrinsically disordered proteins is an important and practically relevant problem: for instance, tau protein aggregation is responsible for the class of neurodegenerative diseases known as tauopathies. However, aggregating systems pose unique challenges for molecular simulation; the ranges of length and time scales involved necessitate the use of multiscale modeling. Coarse-grained models capable of accurately predicting disordered protein aggregation, and its dependence on the protein sequence and aggregation conditions, must reproduce the relevant balance of interactions driving assembly. Empirically developed top-down models can reproduce features of an aggregation process at low computational cost. However, it can be difficult to use these models to understand how specific changes in the molecular details of systems can influence their behavior. Hybrid models incorporating information from both simulations and experimentally known structures can be useful if a previously characterized protein sequence is to be studied, but are less applicable for studying novel mutations. Purely bottom-up models, in principle, can be made completely flexible in this regard. These, however, must be able to capture the particular differences in interaction strengths between pairs of residues to allow modeled disordered proteins to sample the correct ensemble of disordered configurations as well as specific ordered aggregates.

Here, we present a novel approach to developing bottom-up coarse-grained protein models using relative entropy optimization. This “extended ensemble” method allows models to be optimized simultaneously against multiple atomistic reference systems, each containing multiple interacting polypeptide chains, over a range of temperatures. The interactions in the resulting models are thus transferable to new sequences. We demonstrate this method first on simplified reduced-alphabet systems containing representative amino acids, showing how the temperatures of the reference systems can be chosen to optimize the balance of disordered configurations and ordered secondary structures in the reference ensemble. The sequence-specific coarse-grained models made with this strategy accurately predict secondary structures as well as end-to-end distance distributions, even for sequences outside of the training set. Next, we apply the method to aggregation-prone fragments of tau protein that form parts of the ordered cores of many disease-associated fibrillar tau structures, specifically the PHF6 hexapeptide and the hairpin-forming HP1 19-mer. Models of these systems can assemble large numbers of monomers into paired helical aggregates known to be characteristic of experimentally observed fibrils. Overall, we demonstrate that this approach provides a robust, systematic bottom-up path to sequence-specific coarse-grained models useful for studying protein aggregation and assembly.
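As a rough illustration of the extended-ensemble idea, the sketch below minimizes a relative entropy summed over several reference ensembles at different temperatures. It is a toy under stated assumptions, not the authors' implementation: the state space is a one-dimensional grid so that ensemble averages are exact sums rather than molecular dynamics averages, the coarse-grained potential has a single hypothetical well-depth parameter `eps`, and the "atomistic" reference distributions are generated from a known value `EPS_TRUE` so that recovery can be checked. For a potential U = eps * phi, the gradient of the relative entropy in each ensemble is beta * (<phi>_ref - <phi>_CG), and the extended-ensemble gradient is the sum of these terms over all reference systems.

```python
import math

def boltzmann(energies, beta):
    """Normalized Boltzmann probabilities for a list of discrete-state energies."""
    u_min = min(energies)  # shift for numerical stability
    weights = [math.exp(-beta * (u - u_min)) for u in energies]
    z = sum(weights)
    return [w / z for w in weights]

def mean(values, probs):
    """Ensemble average of `values` under the distribution `probs`."""
    return sum(v * p for v, p in zip(values, probs))

def extended_gradient(eps, phi, references):
    """d S_rel / d eps, summed over all (p_ref, beta) reference ensembles."""
    grad = 0.0
    for p_ref, beta in references:
        p_cg = boltzmann([eps * f for f in phi], beta)
        grad += beta * (mean(phi, p_ref) - mean(phi, p_cg))
    return grad

def extended_srel(eps, phi, references):
    """Relative entropy sum_i p_ref ln(p_ref / p_cg), summed over ensembles."""
    total = 0.0
    for p_ref, beta in references:
        p_cg = boltzmann([eps * f for f in phi], beta)
        total += sum(p * math.log(p / q) for p, q in zip(p_ref, p_cg))
    return total

# Hypothetical toy system: a Gaussian attractive well on a 1D grid.
r = [1.0 + 0.02 * i for i in range(101)]
phi = [-math.exp(-(x - 1.5) ** 2) for x in r]  # basis function, values in [-1, 0)

EPS_TRUE = 2.0           # assumed "atomistic" well depth (illustrative only)
BETAS = [0.5, 1.0, 2.0]  # inverse temperatures of the reference ensembles
references = [(boltzmann([EPS_TRUE * f for f in phi], b), b) for b in BETAS]

def fit(eps0=0.0, step=0.5, n_iter=500):
    """Plain gradient descent on the extended-ensemble relative entropy."""
    eps = eps0
    for _ in range(n_iter):
        eps -= step * extended_gradient(eps, phi, references)
    return eps

eps_fit = fit()
print("fitted eps:", eps_fit)
print("S_rel at start:", extended_srel(0.0, phi, references))
print("S_rel at fit:  ", extended_srel(eps_fit, phi, references))
```

Because the toy potential is linear in its parameter, the objective is convex and simple gradient descent suffices. A practical model would have many parameters, would estimate the averages from sampled trajectories, and would likely need a more careful optimizer.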