(356j) Infrared Spectra Prediction with Machine Learning
AIChE Annual Meeting
2020
2020 Virtual AIChE Annual Meeting
Topical Conference: Applications of Data Science to Molecules and Materials
Applications of Data Science in Molecular Sciences II
Tuesday, November 17, 2020 - 10:15am to 10:30am
A model for IR spectral prediction has been developed that requires only a molecular graph structure as input, provided as a SMILES string. The chemical representation is processed by a message-passing neural network followed by a feed-forward neural network that predicts the IR spectrum; the architecture is an adaptation of the Chemprop code for molecular property prediction. The model is pretrained on quantum chemistry calculations for molecules sampled from the PubChem database to learn the molecular harmonic vibrational modes. During pretraining, active learning is used to explore chemical space more efficiently and to prioritize the molecules that will most improve the model. Further training is performed using experimentally collected spectra available in open-access databases from the National Institute of Standards and Technology (NIST) and the National Institute of Advanced Industrial Science and Technology (AIST). The model allows for predictions of spectra in the gas phase and in supported condensed phases.
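The active learning step described above can be illustrated with a minimal sketch. The abstract does not state which acquisition criterion is used, so the sketch assumes one common choice: an ensemble of models predicts a spectrum for each candidate molecule, and the candidates on which the ensemble members disagree most are sent for quantum chemistry calculation. The function names, the toy candidate pool, and the four-bin spectra are all hypothetical and are not taken from the Chemprop code.

```python
from statistics import mean, pvariance


def ensemble_disagreement(spectra):
    """Mean per-bin variance across the ensemble members' predicted spectra.

    `spectra` is a list with one predicted spectrum (a list of floats)
    per ensemble member; all spectra share the same wavenumber bins.
    """
    # zip(*spectra) groups the predictions made for the same bin.
    return mean(pvariance(bin_values) for bin_values in zip(*spectra))


def select_for_labeling(candidates, k):
    """Return the k candidate SMILES with the largest ensemble disagreement."""
    ranked = sorted(
        candidates,
        key=lambda smiles: ensemble_disagreement(candidates[smiles]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical toy pool: 3 ensemble members, 4 spectral bins per molecule.
pool = {
    # All members agree, so this molecule would add little information.
    "CCO": [[0.1, 0.5, 0.3, 0.1], [0.1, 0.5, 0.3, 0.1], [0.1, 0.5, 0.3, 0.1]],
    # Members disagree strongly.
    "CC(=O)O": [[0.2, 0.2, 0.4, 0.2], [0.4, 0.1, 0.3, 0.2], [0.1, 0.3, 0.5, 0.1]],
    # Members disagree slightly.
    "c1ccccc1": [[0.25, 0.25, 0.25, 0.25], [0.3, 0.2, 0.25, 0.25], [0.25, 0.25, 0.3, 0.2]],
}

# Pick the two molecules whose calculated spectra are expected to help most.
chosen = select_for_labeling(pool, k=2)
print(chosen)
```

In a full pretraining loop, the selected molecules would be passed to the quantum chemistry calculation, added to the training set, and the ensemble retrained before the next selection round.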