(376aa) A Bayesian Approach to Model Selection and Parameterization for Non-Bonded Interactions | AIChE

Authors 

Madin, O. - Presenter, University of Colorado Boulder
Shirts, M., University of Colorado Boulder
Messerly, R. A., National Renewable Energy Laboratory
We explore the use of Bayesian inference as a force field parameterization paradigm, along with the techniques required to make Bayesian model decisions effectively. Current gradient-based strategies can optimize certain parameters, but they have difficulty with multimodal or shallow distributions. Another critical deficiency of these methods is their inability to compare multiple functional forms. Deciding which functional forms or levels of complexity are required will be an important consideration in developing the next generation of biomolecular force fields.

Reversible jump Monte Carlo (RJMC) facilitates equilibrium sampling of multiple models and their respective parameters, allowing for the estimation of probability distributions over both models and parameters. Statistical evidence in the form of Bayes factors can be computed from these distributions, which allows the relative fitness of force fields to be compared. We test this strategy on previously developed physical property surrogate models for the 2-center Lennard-Jones plus quadrupole (2CLJQ) force field, which allow for fast comparison between surrogate model outputs and experimental data. RJMC sampling shows that, while the variable bond length of the 2CLJQ force field is justified for the studied diatomics and other similar molecules, the quadrupole parameter is not justified for reproducing physical properties such as density, saturation pressure, and surface tension. The analysis also shows that for other molecules, with larger expected quadrupole moments, the quadrupole parameter is required to reproduce these physical properties. This study demonstrates that RJMC can distinguish cases where a given parameter or functional form is justified from cases where it is not.
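As an illustration of how RJMC yields a Bayes factor, the following is a minimal sketch on a toy problem, not the 2CLJQ surrogate models used in this work. It compares two nested Gaussian models: model 0 fixes an extra parameter `mu` at zero (loosely analogous to switching a quadrupole term off), while model 1 samples `mu` under a normal prior. All function names, the data, the proposal widths, and the assumption of equal prior model probabilities are hypothetical choices made for this example. Because the toy problem has a closed-form Bayes factor, the sampler can be checked against it.

```python
import math
import random


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def log_likelihood(data, mu):
    """Data are modeled as independent N(mu, 1) draws (unit noise is assumed)."""
    return sum(log_normal_pdf(y, mu, 1.0) for y in data)


def rjmc_bayes_factor(data, prior_sd=1.0, n_steps=50_000, seed=0):
    """Estimate the Bayes factor of model 1 (free mu) over model 0 (mu = 0).

    With equal prior model probabilities (assumed here), the ratio of
    visit counts of the two models estimates the Bayes factor.
    """
    rng = random.Random(seed)
    n = len(data)
    # Cross-model proposal for mu, centered on the sample mean (a heuristic).
    jump_mean, jump_sd = sum(data) / n, 1.0 / math.sqrt(n)
    loglik0 = log_likelihood(data, 0.0)
    model, mu = 0, 0.0
    counts = [0, 0]
    for _ in range(n_steps):
        log_u = math.log(1.0 - rng.random())  # log of a uniform draw in (0, 1]
        if rng.random() < 0.5:
            # Cross-model move: the proposal density enters the acceptance ratio.
            if model == 0:
                new_mu = rng.gauss(jump_mean, jump_sd)
                log_alpha = (log_likelihood(data, new_mu)
                             + log_normal_pdf(new_mu, 0.0, prior_sd)
                             - loglik0
                             - log_normal_pdf(new_mu, jump_mean, jump_sd))
                if log_u < log_alpha:
                    model, mu = 1, new_mu
            else:
                log_alpha = (loglik0
                             + log_normal_pdf(mu, jump_mean, jump_sd)
                             - log_likelihood(data, mu)
                             - log_normal_pdf(mu, 0.0, prior_sd))
                if log_u < log_alpha:
                    model, mu = 0, 0.0
        elif model == 1:
            # Within-model random-walk Metropolis move on mu.
            new_mu = mu + rng.gauss(0.0, 0.3)
            log_alpha = (log_likelihood(data, new_mu)
                         + log_normal_pdf(new_mu, 0.0, prior_sd)
                         - log_likelihood(data, mu)
                         - log_normal_pdf(mu, 0.0, prior_sd))
            if log_u < log_alpha:
                mu = new_mu
        counts[model] += 1
    return counts[1] / counts[0]


def analytic_bayes_factor(data, prior_sd=1.0):
    """Closed-form Bayes factor for this conjugate toy problem."""
    n = len(data)
    ybar = sum(data) / n
    a = n * prior_sd ** 2
    return math.exp(0.5 * n * ybar ** 2 * a / (1.0 + a)) / math.sqrt(1.0 + a)


# Made-up illustrative data with sample mean 0.5.
DATA = [0.1, 0.9, -0.3, 1.2, 0.6, 0.4, 0.8, -0.1, 1.0, 0.4]

if __name__ == "__main__":
    print("RJMC estimate:", rjmc_bayes_factor(DATA))
    print("Analytic value:", analytic_bayes_factor(DATA))
```

A Bayes factor near 1 means the data do not clearly favor the extra parameter, while values much greater or much less than 1 indicate that the parameter is or is not justified. In a real application the likelihood would compare surrogate model outputs with experimental data rather than use a Gaussian toy model.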

To generalize this strategy, we also propose and test techniques to generate surrogate models “on-the-fly” for more general sets of parameters, using a multi-fidelity approach of equilibrium simulation, mixture reweighting with MBAR, and Gaussian process modeling to generate smooth surfaces of physical properties as a function of force field parameters. These surfaces can then be used in Bayesian inference via Monte Carlo sampling. We discuss the rational design of a multi-fidelity sampling hierarchy and demonstrate the utility of this technique by sampling van der Waals interaction parameters for small molecules, comparing surrogate model outputs to experiment for simple physical properties (such as density, dielectric permittivity, and enthalpies of mixing). We also discuss challenges associated with scale-up and the techniques required to overcome them, as well as future plans for using these techniques to rationally design better small-molecule biomolecular force fields as part of the Open Force Field Initiative.
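The Gaussian process step of such a surrogate can be sketched as follows. This is a minimal one-dimensional example under stated assumptions, not the authors' multi-fidelity pipeline: the simulation and MBAR reweighting tiers are replaced by a cheap analytic stand-in, `toy_property`, and the class name `GPSurrogate`, the squared-exponential kernel, and its fixed hyperparameters are all hypothetical choices for illustration.

```python
import math


def toy_property(sigma):
    """Hypothetical stand-in for an expensive simulated property (unitless)."""
    return 50.0 / sigma ** 3


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def cholesky_solve(low, b):
    """Solve (L L^T) x = b by forward and then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


class GPSurrogate:
    """Gaussian process regression with a constant mean and an RBF kernel."""

    def __init__(self, xs, ys, length_scale=0.4, signal_sd=1.0, noise_sd=1e-3):
        self.xs = list(xs)
        self.length_scale = length_scale
        self.signal_sd = signal_sd
        self.mean_y = sum(ys) / len(ys)
        n = len(self.xs)
        # Kernel matrix with a small noise term on the diagonal for stability.
        k = [[self._kernel(self.xs[i], self.xs[j])
              + (noise_sd ** 2 if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        self.low = cholesky(k)
        self.alpha = cholesky_solve(self.low, [y - self.mean_y for y in ys])

    def _kernel(self, a, b):
        return self.signal_sd ** 2 * math.exp(-0.5 * ((a - b) / self.length_scale) ** 2)

    def predict(self, x):
        """Return the posterior mean and variance of the property at x."""
        k_star = [self._kernel(x, xi) for xi in self.xs]
        mean = self.mean_y + sum(k * a for k, a in zip(k_star, self.alpha))
        v = cholesky_solve(self.low, k_star)
        var = self._kernel(x, x) - sum(k * vi for k, vi in zip(k_star, v))
        return mean, max(var, 0.0)


if __name__ == "__main__":
    # Six "expensive" evaluations on a grid of a single parameter.
    grid = [3.0 + 0.2 * i for i in range(6)]
    gp = GPSurrogate(grid, [toy_property(s) for s in grid])
    print(gp.predict(3.5), toy_property(3.5))
```

The predictive variance is what makes this kind of surrogate useful inside a Monte Carlo sampler: where it is large, the surrogate should not be trusted, and additional simulation or reweighting data would be needed before continuing.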