(346bb) Bayesian Model Selection for Non-Covalent Interactions | AIChE

Authors 

Madin, O. - Presenter, University of Colorado Boulder
Messerly, R. A., National Renewable Energy Laboratory
Boothroyd, S., University of Colorado Boulder
Shirts, M., University of Colorado Boulder
Non-covalent interactions, commonly divided into electrostatic and van der Waals interactions, play critical roles in many molecular processes, so it is crucial to model them accurately in molecular dynamics (MD) simulations. They are generally difficult to parameterize because relatively simple models must describe complex interactions between many particles. We focus on modeling van der Waals interactions, which requires decisions about functional forms, parameters, combining rules, and atom typing.
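To make these four decisions concrete, here is a minimal Python sketch of one common set of choices: the 12-6 Lennard-Jones functional form with Lorentz-Berthelot combining rules. The abstract does not say which forms or rules the authors compare, so the function names and the numbers in the usage lines are illustrative assumptions only.

```python
import math


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy at separation r (r and sigma in the same units)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lorentz_berthelot(eps_i, sigma_i, eps_j, sigma_j):
    """Cross-interaction (epsilon, sigma) from two like-atom parameter pairs.

    Berthelot rule: geometric mean of the well depths.
    Lorentz rule: arithmetic mean of the diameters.
    """
    return math.sqrt(eps_i * eps_j), 0.5 * (sigma_i + sigma_j)


# Hypothetical parameters for two atom types (arbitrary units).
eps_ij, sigma_ij = lorentz_berthelot(1.0, 3.0, 4.0, 4.0)
energy_at_minimum = lennard_jones(2 ** (1 / 6) * sigma_ij, eps_ij, sigma_ij)
```

Each choice here (the 12-6 exponents, the mixing rule, how many distinct atom types share a parameter pair) is a modeling decision of the kind the abstract describes.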

As part of the Open Force Field Initiative, we are developing force fields, including non-covalent interaction models, using data-driven techniques. To this end, we explore the use of Bayesian inference to choose between dispersion-repulsion parameters and functional forms by calculating Bayes factors, which are ratios of marginal likelihoods and act as ‘odds’ between competing models and sets of parameters. This strategy requires repeated evaluation of parameter sets, which has previously been a large source of computational expense.
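For reference, the standard definition of the Bayes factor between two models can be written as follows. The symbols (data D, models M_1 and M_2, parameters θ_1 and θ_2) are notation introduced here for illustration and do not come from the abstract.

```latex
\[
  B_{12}
  = \frac{p(D \mid M_1)}{p(D \mid M_2)}
  = \frac{\int p(D \mid \theta_1, M_1)\, p(\theta_1 \mid M_1)\, d\theta_1}
         {\int p(D \mid \theta_2, M_2)\, p(\theta_2 \mid M_2)\, d\theta_2},
  \qquad
  \frac{p(M_1 \mid D)}{p(M_2 \mid D)}
  = B_{12}\, \frac{p(M_1)}{p(M_2)} .
\]
```

Each integral runs over a model's full parameter space, which is why many parameter-set evaluations are needed. Because the marginal likelihood averages over the prior, a model with an extra parameter is favored only if that parameter improves the fit enough to offset the larger prior volume.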

In this study, we test this strategy on the two-center Lennard-Jones plus quadrupole (2CLJQ) model for simple fluids, as its simple functional form is easily modified and analytical ‘surrogate models’ exist in the literature, allowing for fast evaluation of parameter sets. In this way, we can sample over the entire distribution of parameters and calculate Bayes factors without incurring the computational cost of equilibrium simulations. Using the reversible jump Monte Carlo (RJMC) algorithm, we sample the posterior probability distributions of both the models and the parameters.
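The idea behind RJMC can be sketched on a toy problem that is much simpler than 2CLJQ. In the Python sketch below, model 0 fixes a Gaussian mean at zero (loosely analogous to switching a parameter off) and model 1 lets it vary under a Normal(0, prior_sd) prior. Everything here is an assumption made for illustration: the Gaussian likelihood with unit variance, equal prior model probabilities, the proposal widths, and all function names. It is not the authors' implementation. The toy has a closed-form Bayes factor, so the sampler can be checked against it.

```python
import math
import random


def log_normal_pdf(x, mean, sd):
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2


def log_likelihood(data, mu):
    # Unit-variance Gaussian likelihood (toy stand-in for a surrogate model).
    return sum(log_normal_pdf(y, mu, 1.0) for y in data)


def analytic_bayes_factor(data, prior_sd):
    """Closed-form Bayes factor of model 1 (free mean) over model 0 (mean = 0)."""
    n, s = len(data), sum(data)
    shrink = 1.0 + n * prior_sd ** 2
    return math.exp(0.5 * s * s * prior_sd ** 2 / shrink) / math.sqrt(shrink)


def rjmc_bayes_factor(data, prior_sd=1.0, n_steps=40000, seed=0):
    """Estimate the same Bayes factor from the fraction of steps spent in each model."""
    rng = random.Random(seed)
    n = len(data)
    # Jump proposal for mu: centred on the conjugate posterior and widened by 1.5x.
    post_var = 1.0 / (n + 1.0 / prior_sd ** 2)
    prop_mean, prop_sd = post_var * sum(data), 1.5 * math.sqrt(post_var)

    def log_post1(mu):  # unnormalised log posterior within model 1
        return log_likelihood(data, mu) + log_normal_pdf(mu, 0.0, prior_sd)

    log_post0 = log_likelihood(data, 0.0)  # model 0 has no free parameter
    model, mu = 0, 0.0
    counts = [0, 0]
    for _ in range(n_steps):
        if rng.random() < 0.5:  # attempt a between-model jump
            if model == 0:
                new_mu = rng.gauss(prop_mean, prop_sd)
                log_alpha = (log_post1(new_mu) - log_post0
                             - log_normal_pdf(new_mu, prop_mean, prop_sd))
                if rng.random() < math.exp(min(0.0, log_alpha)):
                    model, mu = 1, new_mu
            else:
                log_alpha = (log_post0 + log_normal_pdf(mu, prop_mean, prop_sd)
                             - log_post1(mu))
                if rng.random() < math.exp(min(0.0, log_alpha)):
                    model, mu = 0, 0.0
        elif model == 1:  # within-model random-walk Metropolis move
            new_mu = mu + rng.gauss(0.0, 0.5)
            log_alpha = log_post1(new_mu) - log_post1(mu)
            if rng.random() < math.exp(min(0.0, log_alpha)):
                mu = new_mu
        counts[model] += 1
    # With equal prior model probabilities, the visit ratio estimates the Bayes factor.
    return counts[1] / counts[0]
```

With equal prior model probabilities, the ratio of visit counts is a direct estimate of the Bayes factor, and the mu samples collected while in model 1 approximate that model's parameter posterior. The estimate is undefined if the chain never visits model 0, so a strongly one-sided comparison would need unequal prior model weights or a longer run.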

We ask whether including the model’s quadrupole parameter is justified when reproducing temperature-dependent density, saturation pressure, and surface tension data for simple molecules. In general, we find that the quadrupole is not justified for reproducing these properties (with several notable exceptions). Additionally, we produce parameter probability distributions for these compounds, which are valuable for guiding future parameterization; through these distributions we identify targets for future dimensionality reduction. This work demonstrates the utility of Bayesian inference as a tool for model selection and paves the way for future application of this technique to more complex decisions required in fitting complete force fields.