(310c) Scaling Molecular Modeling to Millions of Reactions with Neural Network Potentials
AIChE Annual Meeting
2024
Computational Molecular Science and Engineering Forum
Recent Advances in Molecular Simulation Methods II
Tuesday, October 29, 2024 - 12:54pm to 1:06pm
Developments in high-throughput experimentation, automated chemistry platforms, and chemical generative models have created an urgent need to rapidly predict reaction outcomes so that synthetic planning and evaluations of degradability can match the emerging pace of molecule/material discovery. Machine-learned interatomic potentials (MLIPs), whereby potential energy surface representations are learned from datasets of first-principles calculations, present an attractive opportunity to overcome the expensive and/or time-consuming tasks required to characterize and refine reactions experimentally or with quantum chemistry models. In this talk, I will introduce two models, RxnAIMNet and AIMNet2-Pd, that illustrate how MLIPs can be used to predict thermodynamics of general organic reactions and carbon-carbon cross couplings that are catalyzed by Pd organometallic complexes. By learning from a newly constructed and exhaustive dataset of ~10 million molecular conformers, RxnAIMNet is shown to reliably perform minimum energy pathway searches, transition state optimization, and intrinsic reaction coordinate calculations, leading to predicted activation barriers within ~2 kcal/mol of reference range-separated hybrid density functional theory (DFT) calculations. Despite our reactive training dataset being constructed from systematic enumeration of bond breaking and forming rules, we observe that RxnAIMNet implicitly learns key reaction mechanisms, such as Diels-Alder, hydrogenation, and triazole formation, supporting its broad transferability across general organic chemistry. To meet the needs of high-throughput reaction characterization, we introduce a method for batched nudged elastic band calculations, which is shown to allow RxnAIMNet to identify ~350,000 minimum energy pathways daily on a single medium-end GPU.
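To make the idea of batching nudged elastic band (NEB) calculations concrete, the following is a minimal sketch and not the method presented in the talk. It assumes a toy analytical 2D surface in place of RxnAIMNet, plain NumPy in place of a GPU framework, a central-difference tangent, and steepest-descent updates; the names `energy_and_grad` and `batched_neb` and all parameter values are hypothetical. The only point it illustrates is that many independent paths can share one array of shape (batch, images, coordinates), so every force evaluation and update is a single vectorized call.

```python
import numpy as np

def energy_and_grad(xy, a, c=5.0):
    """Toy surface V = (x^2 - 1)^2 + c * (y - a * (x^2 - 1))^2.

    xy has shape (B, N, 2); a has shape (B, 1), one parameter per path.
    Minima sit at (-1, 0) and (1, 0); the saddle sits at (0, -a) with V = 1.
    This is an assumed stand-in for a learned potential, not RxnAIMNet.
    """
    x, y = xy[..., 0], xy[..., 1]
    u = x**2 - 1.0
    w = y - a * u
    energy = u**2 + c * w**2
    grad_x = 4.0 * x * u - 4.0 * c * a * x * w
    grad_y = 2.0 * c * w
    return energy, np.stack([grad_x, grad_y], axis=-1)

def batched_neb(a, n_images=11, k_spring=1.0, step=0.01, n_steps=2000):
    """Relax B paths at once; returns (paths, barriers)."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    n_paths = a.shape[0]

    # Every path starts as a straight line between the two minima.
    t = np.linspace(0.0, 1.0, n_images)[:, None]
    line = np.array([-1.0, 0.0]) + t * np.array([2.0, 0.0])
    paths = np.broadcast_to(line, (n_paths, n_images, 2)).copy()

    for _ in range(n_steps):
        _, grad = energy_and_grad(paths, a)

        # Segment vectors on either side of each interior image.
        fwd = paths[:, 2:] - paths[:, 1:-1]
        bwd = paths[:, 1:-1] - paths[:, :-2]

        # Central-difference tangent, normalized per image.
        tangent = fwd + bwd
        tangent /= np.linalg.norm(tangent, axis=-1, keepdims=True)

        # True force with its component along the path projected out.
        force = -grad[:, 1:-1]
        parallel = np.sum(force * tangent, axis=-1, keepdims=True)
        f_perp = force - parallel * tangent

        # Spring force acts only along the tangent to keep images spaced.
        stretch = (np.linalg.norm(fwd, axis=-1, keepdims=True)
                   - np.linalg.norm(bwd, axis=-1, keepdims=True))
        f_spring = k_spring * stretch * tangent

        # Endpoints stay fixed; only interior images move.
        paths[:, 1:-1] += step * (f_perp + f_spring)

    energies, _ = energy_and_grad(paths, a)
    barriers = energies.max(axis=1) - energies[:, 0]
    return paths, barriers
```

Calling `batched_neb(np.array([0.0, 0.5, 1.0]))` relaxes three differently curved paths in the same loop. In a production setting the toy surface would be replaced by a batched MLIP energy and force call, and the fixed step count by a per-path convergence check.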
Beyond concerted organic chemistry pathways, I will discuss the recently introduced AIMNet2-Pd, a model for 3D molecular modeling of Pd-containing organometallics, representing crucial first steps toward the underexplored areas of MLIPs for hybrid organic-inorganic species and systems undergoing chemical reactions. Considering Suzuki coupling as a representative reaction class, we demonstrate that the intermediate structures and main mechanisms (oxidative addition, transmetalation, and reductive elimination) can be evaluated for diverse ligands at a rate ~10^6 times faster than the standard quantum chemistry calculations used in homogeneous catalyst screening. Taken together, these two models support an enhanced ability, especially on accelerated computing hardware, to tailor synthetic outcomes and design new reaction pathways by traversing approximate machine-learned potential energy surfaces.