Reproducibility in scientific research has become a prominent issue [1]. Computational scientists, along with the rest of the scientific community, are grappling with a central question: how can a computational study be performed in such a way that it can be replicated by others? To this end, we discuss the Molecular Simulation and Design Framework (MoSDeF) [2], a suite of Python tools designed to facilitate reproducible molecular simulations, with an emphasis on large-scale screening of structural and chemical space. The core tool, mBuild [3,4], is used to generate configurations of chemical systems of interest. mBuild leverages the concept of generative modeling, in which complex systems are constructed from smaller, interchangeable components, allowing for simple parameterization of structures. Repetitive structures such as polymers, crystals, and spherical or planar tiling patterns can be expressed declaratively. These features allow users to programmatically vary the parameters of a family of systems (e.g., polymer chain length) and to swap interchangeable components (polymer subunits, crystal basis particles, etc.) with minimal changes to the underlying code. mBuild integrates with Foyer [5,6], a tool for defining force field parameter usage and applying force fields to molecular systems (i.e., atom-typing). Foyer provides a method for defining parameter usage that is agnostic to both force field and simulation engine; it relies upon SMARTS-based [7] annotations of chemical context and on override statements that set rule precedence. These annotations serve as both human- and machine-readable documentation of parameter usage, reducing ambiguity about how parameters should be used and removing rule order as a source of error. Because the rules are separated from the code that evaluates them, force field files can be constructed, modified, version controlled, and easily disseminated.
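The idea of setting rule precedence through override statements rather than rule order can be illustrated with a minimal, standard-library-only sketch. This is not Foyer's API: the `Atom` and `Rule` classes, the `assign_type` function, and the type names `C_generic`, `C_methyl`, and `H_generic` are hypothetical, and plain Python predicates stand in for SMARTS matching.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


@dataclass(frozen=True)
class Atom:
    element: str
    neighbors: Tuple[str, ...]  # element symbols of bonded atoms


@dataclass
class Rule:
    name: str
    matches: Callable[[Atom], bool]  # hypothetical stand-in for a SMARTS pattern
    overrides: Set[str] = field(default_factory=set)


def assign_type(atom: Atom, rules: List[Rule]) -> str:
    """Return the single type left after removing overridden matches."""
    matched = {r.name for r in rules if r.matches(atom)}
    overridden: Set[str] = set()
    for r in rules:
        if r.name in matched:
            overridden |= r.overrides
    candidates = matched - overridden
    if len(candidates) != 1:
        raise ValueError(f"ambiguous or missing type: {sorted(candidates)}")
    return candidates.pop()


# Hypothetical rule set: the specific rule names the general rule it overrides.
RULES = [
    Rule("C_generic", lambda a: a.element == "C"),
    Rule(
        "C_methyl",
        lambda a: a.element == "C" and a.neighbors.count("H") == 3,
        overrides={"C_generic"},
    ),
    Rule("H_generic", lambda a: a.element == "H"),
]

methyl = Atom("C", ("C", "H", "H", "H"))
methylene = Atom("C", ("C", "C", "H", "H"))

print(assign_type(methyl, RULES))
print(assign_type(methylene, RULES))
print(assign_type(methyl, list(reversed(RULES))))
```

Because the methyl carbon matches both carbon rules, the result is decided by the explicit override and not by the position of either rule in the list, so reversing the list gives the same assignment.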
We have coupled this framework with the workflow management tool Signac-flow [8], enabling MoSDeF to perform large-scale parameter screening while capturing the exact MoSDeF procedures, inputs, and other relevant metadata. Combined, these tools provide a framework for performing simulations that are Transparent, Reproducible, Usable by others, and Extensible (TRUE). We demonstrate the capabilities of MoSDeF with a case study examining the tribological properties of functionalized monolayer films, screening over chain length and terminal group functionality to discover structure-property relationships that should aid in the design of better-performing films.
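As a rough illustration of what screening over chain length and terminal group involves, the sketch below enumerates a parameter grid and gives each state point a deterministic identifier derived from its contents. It uses only the Python standard library and does not call Signac-flow; the function names, the chain lengths, and the terminal group labels are assumptions chosen for illustration and are not the values used in the case study.

```python
import hashlib
import itertools
import json
from typing import Dict, List


def statepoint_id(statepoint: Dict[str, object]) -> str:
    """Hash a state point so that identical inputs always map to the same id."""
    blob = json.dumps(statepoint, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def build_grid(chain_lengths: List[int], terminal_groups: List[str]) -> Dict[str, dict]:
    """Enumerate every (chain length, terminal group) combination."""
    grid = {}
    for n, group in itertools.product(chain_lengths, terminal_groups):
        sp = {"chain_length": n, "terminal_group": group}
        grid[statepoint_id(sp)] = sp
    return grid


# Hypothetical screening values, for illustration only.
CHAIN_LENGTHS = [6, 12, 18]
TERMINAL_GROUPS = ["methyl", "hydroxyl"]

grid = build_grid(CHAIN_LENGTHS, TERMINAL_GROUPS)
for sp_id, sp in grid.items():
    print(sp_id[:8], sp)
```

Keying each simulation by a hash of its inputs means the record of what was run is tied to the parameters themselves rather than to a hand-written directory name.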
References
[1] Baker, M. 1,500 Scientists Lift the Lid on Reproducibility. Nature 2016, 533(7604), 452-454.
[2] "MoSDeF" [Online]. Available: https://github.com/mosdef-hub.
[3] C. Klein, J. Sallai, T. J. Jones, C. R. Iacovella, C. McCabe, and P. T. Cummings, "A Hierarchical, Component Based Approach to Screening Properties of Soft Matter", Foundations of Molecular Modeling and Simulation, 2016, pp. 79-92.
[4] "mBuild" [Online]. Available: https://github.com/mosdef-hub/mbuild.
[5] Iacovella, C. R.; Sallai, J.; Klein, C.; Ma, T. "Idea Paper: Development of a Software Framework for Formalizing Forcefield Atom-Typing for Molecular Simulation", 4th Workshop on Sustainable Software for Science: Practice and Experiences (WSSSPE4), 2016.
[6] "foyer" [Online]. Available: https://github.com/mosdef-hub/foyer.
[7] "SMARTS" [Online]. Available: http://www.daylight.com/dayhtml/doc/theory/theory.smarts.html.
[8] "Signac-flow" [Online]. Available: https://bitbucket.org/glotzer/signac-flow.