(474b) Using Deep Learning to Accelerate Molecular Simulations of RNA Folding | AIChE

Authors 

Ma, H., Argonne National Laboratory
Ramanathan, A., Argonne National Laboratory
Zerze, G., Princeton University
In vitro experiments on RNA folding have reported non-native, low-free-energy structures along the folding pathways [1]. One bottleneck in using conventional molecular dynamics (MD) simulations to sample the folding landscape is that the simulation tends to get stuck in metastable states, because the non-native structures have free energies similar to that of the folded native structure. To address this, advanced sampling methods can be employed to accelerate the simulation. Many of these methods require defining a set of reaction coordinates or collective variables (CVs), i.e., low-dimensional variables that can differentiate between the metastable states visited by the simulation. Moreover, the success of the sampling often depends on the quality of the CVs, i.e., their ability to capture the slowest modes of the process. Obtaining such high-quality CVs, however, is a cumbersome task. Neural networks are widely used to obtain a non-linear set of CVs that can capture the slowest modes of a process, but current approaches require a training data set from a reference biased or unbiased simulation to determine the CVs. To circumvent this requirement, we have adapted a deep learning-driven adaptive MD simulation package called DeepDrive MD, which finds undersampled states from an ensemble of running simulations and guides new simulations to visit unexplored regions of phase space. For this, we use a convolutional variational autoencoder (CVAE) to represent the visited states in a low-dimensional latent space [2]. Outliers in this learned latent space are identified and used as starting points for new simulations, driving exploration of new regions of phase space. We demonstrate the efficiency of our approach by simulating the folding pathways of a simple RNA tetraloop, GGCGAGAGCC.
Our results show that the CVAE learns the slowest degrees of freedom involved in the folding process and that the adaptive simulations successfully sample the transition pathways from the unfolded to the folded state.
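
The outlier-selection step described above can be illustrated with a minimal sketch. It is not the DeepDrive MD implementation. It assumes the CVAE has already embedded each simulation frame as a point in a 2D latent space, and it swaps in a simple stand-in for the outlier detector: the mean distance to the k nearest neighbours. The function names `knn_outlier_scores` and `select_restart_frames`, the parameters `k` and `n_restarts`, and the toy latent coordinates are all hypothetical and chosen only for illustration.

```python
import math
import random


def knn_outlier_scores(points, k=5):
    """Score each latent point by its mean distance to its k nearest neighbours.

    A large score means the point lies in a sparsely visited region.
    This is an assumed, simplified stand-in for a real outlier detector.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def select_restart_frames(points, n_restarts=3, k=5):
    """Return the indices of the n_restarts most isolated latent points."""
    scores = knn_outlier_scores(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_restarts]


# Toy latent space (hypothetical data): one densely sampled metastable basin
# plus three rarely visited conformations at indices 50, 51, and 52.
random.seed(0)
latent = [(random.gauss(0.0, 0.1), random.gauss(0.0, 0.1)) for _ in range(50)]
latent += [(3.0, 3.0), (-3.0, 2.5), (2.5, -3.0)]

# Frames whose indices are returned here would seed the next round of simulations.
restart_indices = select_restart_frames(latent, n_restarts=3, k=5)
print(sorted(restart_indices))
```

In this toy case the three isolated points are selected as restart candidates, while frames from the densely sampled basin are not. In an adaptive workflow, the embedding and selection would be repeated as new frames arrive from the running ensemble.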

References:

1. Zerze, G. H., Piaggi, P. M., & Debenedetti, P. G. (2021). A Computational Study of RNA Tetraloop Thermodynamics, Including Misfolded States. The Journal of Physical Chemistry B, 125(50), 13685-13695.

2. Brace, A., Yakushin, I., Ma, H., Trifan, A., Munson, T., Foster, I., ... & Jha, S. (2022, May). Coupling streaming AI and HPC ensembles to achieve 100–1000× faster biomolecular simulations. In 2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (pp. 806-816). IEEE.