(710d) Software Strategies for Increasing Throughput and Reproducibility While Lowering Cognitive Load in Molecular Simulation | AIChE

Authors 

Jankowski, E. - Presenter, Boise State University
Thomas, S., Boise State University
Henry, M., Boise State University
Learning molecular simulations and reproducing someone else's molecular simulations are both hindered by software dependency issues: acquiring the version of Software A needed to accomplish Task B results in a chain of installation and compilation steps that require significant expertise to navigate the nuances of a user's specific system architecture, operating system, and software stack. The effort associated with setting up and resolving software conflicts is one example of "cognitive load" that burdens novice molecular simulators disproportionately. To make molecular simulations a more mainstream engineering tool, software developers can use the minimization of cognitive load as a guiding principle to inform design decisions.

In this work we discuss three examples from molecular simulation infrastructure development aimed at lowering cognitive load. First, we discuss the use of Docker containers for deploying software stacks to new computational scientists. Maintaining tutorial information and software dependencies through Dockerfiles ensures that versions always match, and this approach is foundational to campus-wide computing efforts now in their third year at Boise State University. Second, we detail continuous integration and regression testing during the development of a simulation plugin to model epoxy crosslinking dynamics. Because software changes that affect scientific correctness are caught automatically, developers can focus their efforts on optimizing performance more effectively. Finally, we show examples of using signac-flow to manage the execution of studies with thousands of jobs. By automating post-processing and the handling of job scripts, users can focus on understanding their simulation results instead of tracking and manually editing thousands of files. In concert, we demonstrate how an emphasis on lowering cognitive load leads to shorter training times for new scientists and, through enhanced reproducibility, to improved transfer of information between researchers.
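
The regression-testing idea in the second example can be illustrated with a minimal sketch. It is not the authors' plugin or test suite: the toy first-order "cure fraction" model, the function names, the parameter sets, and the tolerance are all assumptions chosen for illustration. The pattern is simply that a trusted baseline result is compared against a performance-oriented rewrite, so that a change which alters the science is flagged automatically.

```python
import math


def cure_fraction_reference(rate, dt, n_steps):
    """Trusted baseline (hypothetical toy model): explicit Euler steps of
    dx/dt = rate * (1 - x), starting from x = 0."""
    x = 0.0
    for _ in range(n_steps):
        x += rate * (1.0 - x) * dt
    return x


def cure_fraction_optimized(rate, dt, n_steps):
    """Faster rewrite: closed form of the same Euler recurrence,
    1 - x_n = (1 - rate * dt) ** n."""
    return 1.0 - (1.0 - rate * dt) ** n_steps


def passes_regression(candidate, reference, cases, rel_tol=1e-9):
    """Return True if candidate matches reference on every parameter set."""
    return all(
        math.isclose(candidate(*case), reference(*case), rel_tol=rel_tol)
        for case in cases
    )


# Illustrative (rate, dt, n_steps) parameter sets; assumed, not from the study.
CASES = [(0.1, 0.01, 100), (0.5, 0.001, 1000), (1.0, 0.01, 500)]

if __name__ == "__main__":
    ok = passes_regression(cure_fraction_optimized, cure_fraction_reference, CASES)
    print("regression check passed" if ok else "regression check FAILED")
```

In a continuous integration setup, a check like this would typically be wrapped in a test function and run by a test runner on every commit, so that an optimization which changes the computed observable fails the build before it reaches users.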