(342x) Managing a High-Throughput Screening Workflow with Open-Source Software: A Study of Tribological Properties of Thin Films | AIChE

Authors 

Quach, C. D. - Presenter, Vanderbilt University
Gilmer, J., Vanderbilt University
Iacovella, C., Vanderbilt University
Cummings, P., Vanderbilt University
McCabe, C., Vanderbilt University
High-throughput screening studies rely heavily on automated workflow-management tools to scan over large parameter spaces. Here, we present a case study of high-throughput screening using molecular dynamics (MD) simulations that scan over monolayer film designs in search of those that reduce friction and wear for surfaces in contact. We will discuss the workflow used in the study to build and manage nearly 40,000 systems using the Molecular Simulation Design Framework (MoSDeF), as well as the processing of the data generated. MoSDeF includes tools to enable the automatic initialization of chemical systems [2] via two Python libraries, mBuild [3, 4] and Foyer [5, 6]. These tools automate the system initialization process and enable the batch creation of tens of thousands of systems. These systems can then be run in simulation codes under various ensembles before their final trajectories are analyzed and the desired properties calculated using data analysis libraries such as MDTraj and MDAnalysis [7, 8]. Tools from the Signac framework were used to generate and manage the workspace and to control the workflow, including system initialization, simulation, data analysis, and the remote submission of jobs [1]. The random forest regressor as implemented in the scikit-learn library was also used to determine quantitative structure-property relationships (QSPR) for the data set [9, 10]. The project workflow is hosted on GitHub with the scripts necessary to generate the systems, following the guidelines suggested by the TRUE standard [11].
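
As a rough illustration of the QSPR step mentioned above, the sketch below fits a scikit-learn RandomForestRegressor to a small synthetic data set and reads off the feature importances. It is not the study's actual model: the four descriptor columns, the synthetic target standing in for a friction-like property, the sample count, and the hyperparameters are all assumptions made only for this example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a screening data set (assumption: the real study
# uses descriptors computed from the simulated monolayer chemistries).
rng = np.random.default_rng(0)
n_systems = 500
X = rng.uniform(size=(n_systems, 4))  # four hypothetical descriptors

# Hypothetical target: depends only on the first two descriptors, plus noise.
y = (
    0.10
    + 0.05 * X[:, 0]
    - 0.03 * X[:, 1] ** 2
    + rng.normal(scale=0.002, size=n_systems)
)

# Hold out part of the data to check how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

r2 = model.score(X_test, y_test)           # coefficient of determination
importances = model.feature_importances_   # one value per descriptor

print(f"held-out R^2: {r2:.3f}")
for idx, value in enumerate(importances):
    print(f"descriptor {idx}: importance {value:.3f}")
```

In this toy setup the first two descriptors should dominate the importances, since the synthetic target ignores the other two. With real screening data, the importances would instead indicate which chemical descriptors are most strongly associated with the computed tribological properties.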

References

  1. Signac Framework: https://signac.io.
  2. MoSDeF: https://mosdef.org.
  3. mBuild: https://github.com/mosdef-hub/mbuild.
  4. C. Klein, J. Sallai, T. J. Jones, C. R. Iacovella, C. McCabe, P. T. Cummings. (2016) A Hierarchical, Component Based Approach to Screening Properties of Soft Matter. In: Snurr R., Adjiman C., Kofke D. (eds) Foundations of Molecular Modeling and Simulation. Molecular Modeling and Simulation (Applications and Perspectives). Springer, Singapore. https://doi.org/10.1007/978-981-10-1128-3_5
  5. Foyer: https://github.com/mosdef-hub/foyer.
  6. C. Klein, A. Z. Summers, M. W. Thompson, J. B. Gilmer, C. McCabe, P. T. Cummings, J. Sallai, C. R. Iacovella. “Formalizing atom-typing and the dissemination of force fields with foyer”. Computational Materials Science, Volume 167, 2019, Pages 215-227, https://doi.org/10.1016/j.commatsci.2019.05.026.
  7. R. T. McGibbon, K. A. Beauchamp, M. P. Harrigan, C. Klein, J. M. Swails, C. X. Hernández, C. R. Schwantes, L. Wang, T. J. Lane, V. S. Pande. “MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories”. Biophysical Journal, Volume 109, Issue 8, 2015, Pages 1528-1532, https://doi.org/10.1016/j.bpj.2015.08.015.
  8. R. J. Gowers, M. Linke, J. Barnoud, T. J. E. Reddy, M. N. Melo, S. L. Seyler, D. L. Dotson, J. Domanski, S. Buchoux, I. M. Kenney, and O. Beckstein. MDAnalysis: A Python package for the rapid analysis of molecular dynamics simulations. In S. Benthall and S. Rostrup, editors, Proceedings of the 15th Python in Science Conference, pages 98-105, Austin, TX, 2016. SciPy, doi:10.25080/majora-629e541a-00e.
  9. Y. L. Pavlov. (2019) ‘Random forests’, Random Forests, pp. 1–122. doi: 10.1201/9780429469275-8.
  10. scikit-learn: https://github.com/scikit-learn/scikit-learn
  11. Matthew W. Thompson, Justin B. Gilmer, Ray A. Matsumoto, Co D. Quach, Parashara Shamaprasad, Alexander H. Yang, Christopher R. Iacovella, Clare McCabe & Peter T. Cummings (2020) Towards molecular simulations that are transparent, reproducible, usable by others, and extensible (TRUE), Molecular Physics, 118:9-10, DOI: 10.1080/00268976.2020.1742938