Sampling from the equilibrium distribution over the phase space of atomic-resolution models of molecular systems (and of other complex systems modeled by stochastic differential equations) is a crucial step in many workflows in drug discovery, materials science, and other fields. These workflows require computing free energy differences and ensemble averages with respect to the equilibrium distribution. Such calculations are usually carried out with molecular dynamics simulations combined with enhanced sampling techniques, which often require, as an intermediate step, the ability to sample from the equilibrium distribution conditioned on the values of some order parameters of the system (e.g., dihedral angles or principal components). Meanwhile, in machine learning, generative adversarial networks (GANs) have demonstrated state-of-the-art capabilities in generating samples from probability distributions that are concentrated on lower-dimensional manifolds embedded in high-dimensional ambient spaces [1]. For example, GANs have recently succeeded in creating images that look like real photographs of human faces, with unique, detailed, and realistic fine features that are consistent with the coarse features that make up a human face [2]. In this work, we present a computational technique that combines physics-based biasing methods for sampling conditional distributions with deep learning-based conditional generative adversarial networks (cGANs) [1, 3]. cGANs extend GANs to sample from conditional distributions, and they have recently been used in molecular modeling to interpolate continuous molecular trajectories and configurations between two points in simulation time [4]. On the one hand, cGANs provide a generative model for the target distribution that does not suffer from the slow mixing times that afflict traditional biased simulation methods. On the other hand, cGANs are typically trained offline and require large datasets. Physics-based simulation methods, by contrast, rely solely on a model of the system and its dynamics, without requiring any other prior knowledge. We put forward an approach that couples cGANs with biased physics-based simulations; we believe this can bring out the best aspects of both techniques and allow more effective sampling. Additionally, we explore the relationship between cGANs and flat-histogram methods [5, 6, 7, 8].
[1] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative Adversarial Networks. arXiv:1406.2661 [cs, stat], June 2014.
[2] T. Karras, S. Laine, and T. Aila. A style-based generator architecture for generative adversarial networks, 2019.
[3] M. Mirza and S. Osindero. Conditional Generative Adversarial Nets. arXiv:1411.1784 [cs, stat], November 2014.
[4] H. Sidky, W. Chen, and A. L. Ferguson. Molecular latent space simulators. Chem. Sci., 11:9459–9467, 2020.
[5] S. Kumar, J. M. Rosenberg, D. Bouzida, R. H. Swendsen, and P. A. Kollman. The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method. Journal of Computational Chemistry, 13(8):1011–1021, October 1992.
[6] B. Roux. The calculation of the potential of mean force using computer simulations. Computer Physics Communications, 91(1-3):275–282, 1995.
[7] P. E. Jacob and R. J. Ryder. The Wang-Landau algorithm reaches the flat histogram criterion in finite time. The Annals of Applied Probability, 24(1):34–53, February 2014.
[8] Z. Tan, J. Xia, B. W. Zhang, and R. M. Levy. Locally weighted histogram analysis and stochastic solution for large-scale multi-state free energy estimation. Journal of Chemical Physics, 2016.