(595j) Stability Prediction of Hypervalent Compounds Based on Data-Centric Modelling
AIChE Annual Meeting
2017
2017 Annual Meeting
Computational Molecular Science and Engineering Forum
Data Mining and Machine Learning in Molecular Sciences I
Wednesday, November 1, 2017 - 5:21pm to 5:33pm
In this study we examine hundreds of reagents of the general type shown in Figure 1 with respect to their stability, i.e., whether they can be used to transfer the substituent marked L.
To this end, a support vector machine (SVM) was trained using simple descriptors based on the molecular structure and the atomic charges observed in the reactive center. The SVM, a popular machine learning tool for yes/no-type classification, was shown, once properly trained, to successfully predict the (in)stability of these compounds based on the selected descriptors.
In this talk we will focus on the selection of descriptors, the training of the SVM, and its application to a large array of known and (still) unknown compounds. Compared with the explicit determination of the kinetic and thermodynamic stability, this approach allows a prediction at a small fraction of the cost, since only the molecular structures and the atomic charges need to be available.
We will also show that in machine learning the availability of "negative" information, i.e., information on non-existing compounds, is essential. The fact that negative information is much more difficult to find presents a challenge and calls for a rethinking of the existing, success-based publication culture.
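The role of negative examples can be illustrated with a small sketch. It is not the model used in this study: the two descriptors (a charge at the reactive center and a steric index), all numerical values, the linear kernel, the plain subgradient training loop, and the names `train_linear_svm` and `predict` are assumptions made purely for illustration. The sketch trains one linear SVM on stable and unstable examples and a second one on stable examples only, then asks both about a candidate that resembles the unstable set.

```python
# Hypothetical descriptors per reagent: [charge at reactive center, steric index].
# All numbers are invented for illustration; none come from the study.
STABLE = [[0.9, 0.2], [1.0, 0.4], [0.8, 0.3], [1.1, 0.1]]
UNSTABLE = [[0.2, 0.8], [0.1, 0.9], [0.3, 0.7], [0.0, 1.0]]


def train_linear_svm(X, y, lam=0.01, lr=0.1, steps=2000):
    """Minimize lam/2 * |w|^2 + mean hinge loss by batch subgradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w = [lam * wj for wj in w]  # gradient of the regularizer
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # point violates the margin: hinge term is active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(model, x):
    """Return +1 (predicted stable) or -1 (predicted unstable)."""
    w, b = model
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Model trained on both classes.
X_both = STABLE + UNSTABLE
y_both = [1] * len(STABLE) + [-1] * len(UNSTABLE)
model_both = train_linear_svm(X_both, y_both)

# Model trained on "successes" only, i.e. without negative information.
model_pos = train_linear_svm(STABLE, [1] * len(STABLE))

candidate = [0.1, 1.0]  # hypothetical reagent resembling the unstable set
print(predict(model_both, candidate))  # -1
print(predict(model_pos, candidate))   # 1
```

Without any unstable examples, nothing in the training data pushes the decision boundary away from the unstable region, so the positives-only model labels every candidate as stable. A kernel SVM from an established library would behave analogously; the hand-written linear version is used here only to keep the sketch self-contained.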
Figure 1: