(346m) Navigating Combinatorial Challenges in High-Throughput Transition Metal Complex Discovery | AIChE

Authors 

Arunachalam, N. - Presenter, Massachusetts Institute of Technology
Nandy, A., Massachusetts Institute of Technology
Taylor, M., Massachusetts Institute of Technology
Harper, D., Massachusetts Institute of Technology
Computational high-throughput virtual screening workflows based on first-principles methods such as density functional theory (DFT) have emerged as important tools for discovering new materials, including transition metal complexes (TMCs). Even with such workflows, however, exhaustive exploration of the design space remains intractable because the number of possible complexes grows combinatorially with the number of ligands, metals, oxidation states, and ligand field symmetries considered. To address this challenge, machine learning models trained on topology-based molecular descriptors such as revised autocorrelations (RACs) have been used to assess molecular properties rapidly and accurately. Neural networks and kernel ridge regression models trained on RACs achieve test set errors of 1-3 kcal/mol with respect to DFT training data, in part because RACs are relatively sparse descriptors that average together the properties of ligands in the same plane (e.g., equatorial or axial ligands in an octahedral mononuclear transition metal complex). This approximation has enabled good accuracy even on comparatively small data sets (ca. 100s to 1000s of points). However, RACs can fail to distinguish low-symmetry complexes or other spatial isomers for which geometric information is crucial. This limitation suggests a need to better understand how molecular properties vary between ligand field symmetries that are poorly distinguished by current featurization methods. To that end, we fully enumerate the space of transition metal complexes constructed from up to two ligands drawn from a pool of nine monodentate ligands, resulting in over 8,000 attempted DFT geometry relaxations.
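The combinatorial growth described above can be illustrated with a small enumeration. The following is a minimal sketch, not the authors' workflow: it assumes six-coordinate octahedral complexes only, uses hypothetical placeholder ligand names (`L1` to `L9`) because the abstract does not list the ligand pool, and treats two arrangements as the same isomer when a proper rotation of the octahedron maps one onto the other. Metals, oxidation states, and spin states are not modeled here.

```python
from itertools import combinations, permutations, product

# Six octahedral coordination sites as unit vectors along +/-x, +/-y, +/-z.
SITES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def permutation_sign(p):
    """Return +1 for an even permutation and -1 for an odd one."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


def octahedral_rotations():
    """Build the 24 proper rotations of the octahedron as site permutations.

    Each rotation is a signed axis permutation with determinant +1.
    """
    rotations = []
    for axes in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            det = permutation_sign(axes) * signs[0] * signs[1] * signs[2]
            if det != 1:
                continue  # skip improper operations (reflections, inversion)
            mapping = []
            for v in SITES:
                image = tuple(signs[k] * v[axes[k]] for k in range(3))
                mapping.append(SITES.index(image))
            rotations.append(tuple(mapping))
    return rotations


def canonical_form(assignment, rotations):
    """Pick one representative (the smallest tuple) from a rotation orbit."""
    return min(tuple(assignment[r[i]] for i in range(6)) for r in rotations)


def enumerate_arrangements(ligand_pool):
    """Enumerate rotationally distinct ML6 arrangements with at most two ligand types."""
    rotations = octahedral_rotations()
    unique = set()
    for a in ligand_pool:  # homoleptic complexes
        unique.add((a,) * 6)
    for a, b in combinations(ligand_pool, 2):  # complexes using exactly two types
        for assignment in product((a, b), repeat=6):
            if a in assignment and b in assignment:
                unique.add(canonical_form(assignment, rotations))
    return unique


# Hypothetical placeholder names; the abstract does not name the nine ligands.
LIGAND_POOL = [f"L{i}" for i in range(1, 10)]
arrangements = enumerate_arrangements(LIGAND_POOL)
print(len(arrangements))
```

Under these assumptions each pair of ligand types contributes 8 distinct arrangements (one each for the 5:1 and 1:5 ratios, and a cis/trans or fac/mer pair for each of 4:2, 3:3, and 2:4), so a nine-ligand pool gives 9 + 36 × 8 = 297 arrangements before any metal, oxidation state, or spin state is chosen. The sketch does not attempt to reproduce the number of calculations reported in the abstract.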
We quantify the extent to which properties such as adiabatic spin-splitting energy, ionization potential, and frontier orbital energetics vary between isomers within this design space, and we evaluate the performance of topological descriptors for predicting these property variations. This analysis is then used to make recommendations for the strategic exploration of broader design spaces.
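To make the isomer problem concrete, the sketch below compares a cis and a trans arrangement of the same ligands using a simplified graph autocorrelation. It is an illustration only and not the RAC implementation referenced in the abstract: the property used is nuclear charge alone, the ligands are monatomic halides chosen purely for illustration (the abstract does not name the ligand pool), and the metal-ligand distance is set to an arbitrary unit length.

```python
from collections import deque
from math import dist

# Idealized octahedral sites at unit distance from the metal (arbitrary units).
SITES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
NUCLEAR_CHARGE = {"Fe": 26, "Cl": 17, "F": 9}


def build_complex(ligands):
    """Return atoms, coordinates, and a bond graph for a metal with six monatomic ligands."""
    atoms = ["Fe"] + list(ligands)
    coords = [(0, 0, 0)] + SITES
    bonds = {0: list(range(1, 7))}  # the metal is bonded to every ligand atom
    for i in range(1, 7):
        bonds[i] = [0]  # each ligand atom is bonded only to the metal
    return atoms, coords, bonds


def graph_distances(bonds, start):
    """Breadth-first search giving the bond-path distance from start to each atom."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in bonds[node]:
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen


def autocorrelation(atoms, bonds, max_depth=2):
    """Sum Z_i * Z_j over ordered atom pairs at each bond-path distance d <= max_depth."""
    result = [0] * (max_depth + 1)
    for i in range(len(atoms)):
        for j, d in graph_distances(bonds, i).items():
            if d <= max_depth:
                result[d] += NUCLEAR_CHARGE[atoms[i]] * NUCLEAR_CHARGE[atoms[j]]
    return result


def fluorine_separation(atoms, coords):
    """Through-space distance between the two F atoms (geometric information)."""
    first, second = [i for i, atom in enumerate(atoms) if atom == "F"]
    return dist(coords[first], coords[second])


# trans: F on opposite sites (+x, -x); cis: F on adjacent sites (+x, +y).
trans_atoms, trans_coords, trans_bonds = build_complex(["F", "F", "Cl", "Cl", "Cl", "Cl"])
cis_atoms, cis_coords, cis_bonds = build_complex(["F", "Cl", "F", "Cl", "Cl", "Cl"])

same_descriptor = autocorrelation(trans_atoms, trans_bonds) == autocorrelation(cis_atoms, cis_bonds)
print("graph descriptors identical:", same_descriptor)
print("F-F separation (trans, cis):",
      fluorine_separation(trans_atoms, trans_coords),
      fluorine_separation(cis_atoms, cis_coords))
```

Because both isomers share the same bond connectivity, any descriptor computed purely from the molecular graph returns the same vector for each, while a through-space quantity such as the F-F separation differs. Descriptors that separately average axial and equatorial ligands can recover some of this distinction, which this simplified sketch does not model.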