(595c) Integrated Upstream and Downstream Data Curation Tools As a Key to Enabling Reproducibility, Usability and Data Sharing
AIChE Annual Meeting
2016
Computational Molecular Science and Engineering Forum
Making Molecular Simulation a Mainstream Chemical Engineering Tool: Reproducibility, Robustness, and Usability
Wednesday, November 16, 2016 - 4:05pm to 4:25pm
In this presentation, we describe the development of a computational "workbench" that provides an integrated computational and data environment to support multiscale modeling of soft materials for the Materials Genome Initiative (MGI). The design has three essential elements: a modular program structure that supports the addition of new functionality through Python scripting and run-time plugins; a hierarchical data structure that enables a unified representation of materials at different levels of granularity; and integration of the NIST Materials Data Curation System (MDCS) [1-2] into the environment to support ontology-based materials descriptions. This presentation emphasizes the database element of the design. The XML-schema-based database environment allows us to visualize the relationships between data elements and enables automated curation of both upstream and downstream data in the workflow. We show how controlling the data in this manner is essential for ensuring reproducibility, greatly enhances usability, and allows users to progressively build materials reference libraries that can be pushed or shared by various means. We illustrate this with examples that include tools under development for coarse-grained force-field development and for property calculation.
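As a rough illustration of what curating upstream and downstream data in a single XML record could look like, the following minimal Python sketch stores a calculation's inputs and outputs together and derives a fingerprint from the inputs alone, so that a repeated run with identical inputs can be recognized. It is not the workbench's or the MDCS's actual schema or API: the element names (`WorkflowRecord`, `Upstream`, `Downstream`, `Parameter`, `Result`), the helper functions, and all values are hypothetical placeholders chosen for this example.

```python
import hashlib
import xml.etree.ElementTree as ET


def build_record(material, level, inputs, outputs):
    """Build one curated workflow record as an XML element.

    All element and attribute names are hypothetical; they are not taken
    from the MDCS schema or from the workbench described in the abstract.
    """
    record = ET.Element("WorkflowRecord", material=material, level=level)

    # Upstream data: everything needed to reproduce the calculation.
    upstream = ET.SubElement(record, "Upstream")
    for name, value in sorted(inputs.items()):
        ET.SubElement(upstream, "Parameter", name=name).text = str(value)

    # Downstream data: what the calculation produced.
    downstream = ET.SubElement(record, "Downstream")
    for name, value in sorted(outputs.items()):
        ET.SubElement(downstream, "Result", name=name).text = str(value)

    return record


def input_fingerprint(record):
    """Hash only the upstream block, so equal inputs give an equal identifier."""
    payload = ET.tostring(record.find("Upstream"), encoding="utf-8")
    return hashlib.sha256(payload).hexdigest()


# Illustrative placeholder values, not results from the presentation.
record = build_record(
    material="example-polymer",
    level="coarse-grained",
    inputs={"temperature_K": 300, "bead_mapping": "3:1"},
    outputs={"example_property": "placeholder"},
)
xml_text = ET.tostring(record, encoding="unicode")
fingerprint = input_fingerprint(record)
```

Sorting the parameters before serialization keeps the fingerprint independent of the order in which inputs were supplied, which is one simple way to make "same inputs" checkable in a shared reference library.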
References
[1] Materials Data and Informatics, http://www.nist.gov/itl/ssd/is/materials-data-and-informatics.cfm
[2] Materials Data Curation System, https://github.com/usnistgov/MDCS