(118c) Bayesian Manifold Crawling (BMAC): An Efficient Data-Driven Algorithm for Automated Discovery of Phase Transitions in Molecular Dynamics Simulations of Polymer Systems
AIChE Annual Meeting
2024
Computing and Systems Technology Division
10D: Advances in Computational Methods and Numerical Analysis I
Monday, October 28, 2024 - 1:06pm to 1:24pm
When performing molecular simulations, the adjustable parameters can be broadly classified into three categories according to the role they play in the simulation process. (1) State parameters, such as temperature, pressure, and volume, define essential characteristics of the simulated system and are typically predefined based on experimental data or specific theoretical considerations. (2) Force field parameters make up the functional form of the force field that governs the potential energy landscape and thereby implicitly define the dynamics and equilibrium properties of the simulated system. These parameters may be derived by fitting to experimental data or quantum mechanical calculations. Additionally, various optimization and machine learning (ML) methods, such as parameter optimization algorithms and neural networks, can be employed to refine force field parameters and improve their accuracy [1] [2]. (3) System parameters specify the molecular system being considered; for our current example of diblock copolymers in solvents of varying quality, these include solvent quality and hydrophobic fraction. Current approaches typically explore these types of system parameters in simulation using a grid search over the entire parameter space. However, if the goal is to determine the conditions under which a specific molecular behavior occurs, these approaches can be computationally expensive. As such, there is increasing interest in using machine learning and active learning to speed up the parameter exploration process. Current research primarily focuses on two aspects: evaluating the performance of various ML models in accelerating the search process [3] or leveraging optimization methods to identify the optimal design parameters for specific properties [4] [5].
However, standard deterministic ML-based methods require collecting substantial amounts of training data, especially when the data are subject to noise due to hidden/unresolved details in the measurements.
To address these challenges, we propose an automated manifold crawling algorithm called BMAC that is inspired by the principles of Bayesian optimization. The goal of BMAC is to efficiently search the parameter space of molecular dynamics simulations so that desired behaviors are discovered. In general, the behaviors of interest are described as a "manifold" (a locally Euclidean topological space), so we must devise an efficient sampling strategy for tracing out this manifold. BMAC does this by first constructing a heteroskedastic Gaussian process (HGP) model from noisy simulation data. The HGP serves two purposes: (i) it provides a systematic way to filter noise from independent observations of the simulated outcomes and (ii) it enables a probabilistic prediction of features that can be used to define the behaviors from time series data (e.g., the derivative of the radius of gyration needed to compute the coil-to-globule transition point). To ensure that the measurements derived from the simulation are statistically independent, we also propose a strategy that uses the autocorrelation function to detect when a new sample is no longer correlated with past measurements. Using the HGP, our BMAC algorithm decides which configuration of new simulations is most informative for growing the "size" of the manifold. We illustrate the effectiveness of BMAC on a simulation of a block copolymer composed of hydrophobic polystyrene (PS) and hydrophilic polyethylene glycol (PEG) in a small-molecule solvent [7]. Our algorithm automatically identifies the coil-to-globule transition point (which occurs at some value of the solvent-hydrophilic bead interaction strength) as a function of hydrophobic bead fraction. To the best of our knowledge, BMAC is the first approach that can identify such behavior without any starting domain knowledge or human intervention.
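The two statistical ingredients described above can be illustrated with a short Python sketch. It is not the authors' implementation. The function names, the 0.05 autocorrelation cutoff, the RBF kernel, and the fixed hyperparameters are all assumptions made for illustration. The Gaussian process shown here takes the per-point noise variances as given (for example, estimated from decorrelated replicate samples), which is a simplification of a full HGP that would typically learn input-dependent noise.

```python
import numpy as np


def autocorrelation(x):
    """Normalized autocorrelation of a 1-D time series (lags 0..n-1)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Full linear correlation; keep only the non-negative lags.
    acf = np.correlate(x, x, mode="full")[n - 1:]
    return acf / acf[0]


def decorrelation_lag(x, threshold=0.05):
    """First lag at which the ACF drops below `threshold` (assumed cutoff)."""
    acf = autocorrelation(x)
    below = np.nonzero(acf < threshold)[0]
    return int(below[0]) if below.size else len(acf)


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_predict(x_train, y_train, noise_var, x_test,
               length_scale=1.0, signal_var=1.0):
    """GP posterior mean and variance of the latent function.

    `noise_var` holds one noise variance per training point, so noisy
    observations are down-weighted relative to precise ones.
    """
    K = rbf_kernel(x_train, x_train, length_scale, signal_var)
    K = K + np.diag(noise_var)  # heteroskedastic noise on the diagonal
    K_s = rbf_kernel(x_test, x_train, length_scale, signal_var)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = signal_var - np.sum(v ** 2, axis=0)
    return mean, var


# Hypothetical usage: one observation is much noisier than the others.
x_train = np.array([0.0, 1.0, 2.0, 3.0])
y_train = np.sin(x_train)
noise_var = np.array([1e-6, 1e-6, 1.0, 1e-6])
mean, var = gp_predict(x_train, y_train, noise_var, x_train)
```

In a workflow like the one described, a time series such as the radius of gyration would first be subsampled at the decorrelation lag, and the resulting sample means and variances would then supply `y_train` and `noise_var`. The posterior variance is what an acquisition rule could use to choose the next simulation, although the specific rule used by BMAC is not given here.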
References
[1] Razi, M., Narayan, A., Kirby, R. M., & Bedrov, D. (2020). Force-field coefficient optimization of coarse-grained molecular dynamics models with a small computational budget. Computational Materials Science, 176, 109518.
[2] Sestito, J. M., Thatcher, M. L., Shu, L., Harris, T. A., & Wang, Y. (2020). Coarse-grained force field calibration based on multiobjective Bayesian optimization to simulate water diffusion in poly-ε-caprolactone. The Journal of Physical Chemistry A, 124(24), 5042-5052.
[3] Patra, T. K., Loeffler, T. D., & Sankaranarayanan, S. K. (2020). Accelerating copolymer inverse design using Monte Carlo tree search. Nanoscale, 12(46), 23653-23662.
[4] Aoyagi, T. (2022). Optimization of the elastic properties of block copolymers using coarse-grained simulation and an artificial neural network. Computational Materials Science, 207, 111286.
[5] Bejagam, K. K., An, Y., Singh, S., & Deshmukh, S. A. (2018). Machine-learning enabled new insights into the coil-to-globule transition of thermosensitive polymers using a coarse-grained model. The Journal of Physical Chemistry Letters, 9(22), 6480-6488.
[6] Kanada, R., Tokuhisa, A., Tsuda, K., Okuno, Y., & Terayama, K. (2020). Exploring successful parameter region for coarse-grained simulation of biomolecules by Bayesian optimization and active learning. Biomolecules, 10(3), 482.
[7] Hpone Myint, K., Brown, J. R., Shim, A. R., Wyslouzil, B. E., & Hall, L. M. (2016). Encapsulation of nanoparticles during polymer micelle formation: a dissipative particle dynamics study. The Journal of Physical Chemistry B, 120(44), 11582-11594.