(595e) Mapping Transition Metal Chemical Space for Machine Learning Models
AIChE Annual Meeting
2017 Annual Meeting
Computational Molecular Science and Engineering Forum
Data Mining and Machine Learning in Molecular Sciences I
Wednesday, November 1, 2017 - 4:21pm to 4:33pm
The unique and tunable electronic properties of transition metal complexes make them ideal targets for molecular design. However, the high complexity and dimensionality of transition metal chemical space pose challenges for virtual screening and necessitate new approaches to it. Data-driven models from machine learning can circumvent the high computational cost of first-principles simulation, but predictive machine learning models require knowledge of how to optimally map heuristic chemical and topological properties of transition metal complexes to energetic outputs. We have recently trained the first artificial neural network (ANN) to predict electronic structure properties of transition metal complexes to 3 kcal/mol accuracy, outperforming descriptors previously developed for organic molecules by an order of magnitude [1]. We have also implemented this ANN in our virtual high-throughput screening toolkit, molSimplify [2], to enable prediction of both structure and electronic properties prior to first-principles simulation. We will discuss this descriptor set and recent modifications to it through our development of fully continuous variable representations. We will describe how we have applied established feature selection techniques to this widened space of candidate descriptors to identify the most important subsets of variables for predictive models. We analyze the selected feature sets by how well they resolve differences and similarities among representative transition metal complexes, using self-organizing maps and principal component analysis. Finally, we conclude with our approaches for decoding continuous variable representations into lead candidate transition metal complexes, which allows our predictive machine learning models to be integrated into multi-level molecular design workflows.
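As a rough illustration of what selecting features from a pool of candidate descriptors can look like, the sketch below ranks descriptors by the absolute Pearson correlation of each one with a target property. This univariate filter is only one simple, well-known technique and is not necessarily the one used in this work. The descriptor names, the toy data, and the function names `pearson` and `rank_descriptors` are all hypothetical and chosen for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_descriptors(names, rows, targets):
    """Return (name, |r|) pairs sorted from most to least correlated with the target."""
    scores = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scores.append((name, abs(pearson(column, targets))))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical descriptors and made-up values; not data from the abstract.
names = ["metal_ox_state", "ligand_denticity", "random_feature"]
rows = [
    [2, 1, 0.3],
    [3, 1, 0.9],
    [2, 2, 0.1],
    [3, 2, 0.5],
    [2, 4, 0.7],
    [3, 4, 0.2],
]
# Toy target built as 10 * oxidation state + denticity, so the first
# descriptor dominates by construction.
targets = [21, 31, 22, 32, 24, 34]

ranking = rank_descriptors(names, rows, targets)
for name, score in ranking:
    print(f"{name}: |r| = {score:.3f}")
```

A filter like this scores each descriptor independently, so it cannot detect descriptors that are only informative in combination; model-based or regularized selection methods address that at higher cost.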
[1] J. P. Janet and H. J. Kulik, "Predicting Electronic Structure Properties of Transition Metal Complexes with Neural Networks," arXiv preprint arXiv:1702.05771 (2017).
[2] E. I. Ioannidis, T. Z. H. Gani, and H. J. Kulik, "molSimplify: A toolkit for automating discovery in inorganic chemistry," J. Comput. Chem. 37, 2106-2117 (2016).