(663c) Catalysts Discovery Via Curated Computational Databases: Past, Present and Future of Catalysis-Hub.Org | AIChE

Authors 

Winther, K., SLAC National Accelerator Laboratory
Bajdich, M., SLAC STANFORD
Computational catalysis and its underlying data play an increasing role in guiding and understanding catalyst operation and optimization and, ultimately, in the discovery of new catalysts. Scientific tasks are gradually shifting from fundamental atomistic simulations to the acquisition of large datasets in order to optimize over an ever larger chemical space. Several new data projects have recently been launched that make this mode of operation possible. However, many gaps remain, particularly in the availability, reproducibility, and reusability of data, as mandated by the DOE's recent Public Reusable Research (PuRe) Data efforts. These issues are amplified by the lack of a suitable catalyst genome format.

In this talk, I will introduce the current datasets and formatting used by Catalysis-hub.org and discuss examples of their real-world use.
In the first case study, I will highlight the use of our large OER datasets together with MaterialsProject.org stability data to rapidly down-select candidate metal oxide catalysts for acidic OER [1]. In the second case study, we computed highly accurate adsorption energies across the 3d, 4d, and 5d rutile series and other oxides, and extracted universal electronic descriptors based on the Crystal Orbital Hamilton Population (COHP) method [2]. For our most popular use case, I will describe the variety of possible deployments of the extended bimetallic alloy dataset [3] in thermodynamic and structural ML models. These specific use cases allowed us to test our current approach and identify its limitations. The main limiting factors relate to reproducibility and reusability, specifically accuracy, uniqueness, and storage formats. In closing, I will open a discussion about aspects of future catalyst genome formats that could address these challenges.
I will also discuss opportunities to collaborate on and contribute to a large DOE-wide computational database currently under consideration.

[1] Chem. Mater., 1c04120 (2022).
[2] Journal of Physical Chemistry C, 8 (2022).
[3] npj Computational Materials 6(1) (2020).