(360ap) Developing Deep Learning Models to Predict Sigma Profiles of Lignin-Derived Organic Molecules | AIChE

Authors 

Abbas, U. - Presenter, University of Kentucky
Nguyen, M. T., University of Kentucky
Zhang, Y., University of Kentucky
Shi, J., University of Kentucky
Chen, J., University of Kentucky
Lignin-derived molecules are a rich pool of candidates for solvents and other functional materials. One unmet need is a fast way to evaluate which of these molecules possess the desired properties. The sigma profile of a molecule is the distribution of screening charge density over its molecular surface, obtained by numerically integrating the surface charge map from a quantum mechanical calculation. It has been recognized as an important feature for evaluating the potential function of an organic molecule. However, computing a sigma profile usually relies on time-consuming quantum mechanical calculations, which limits its application. Machine learning methods can bypass these calculations by directly correlating a representation of a molecule with its sigma profile. In this work, we develop machine learning methods to predict the sigma profiles of lignin-derived organic molecules from their SMILES strings. We first train an encoder-decoder model on unlabeled SMILES representations of molecules from ChEMBL; the encoder converts a SMILES string into a vector in a continuous latent molecular space. We then develop a neural network that predicts the sigma profile of a molecule from its latent vector, using the VT-2005 sigma profile database. Finally, we evaluate the performance of the developed model based on its ability to predict the sigma profiles of a group of lignin-derived organic molecules.
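
To make the prediction target concrete, the sketch below shows one way a sigma profile can be tabulated as a fixed-length vector, which is the kind of output a neural network would be asked to reproduce. It is an illustration only, not the authors' code. The function name `sigma_profile`, the 51-point grid from -0.025 to 0.025 e/Å^2, the linear splitting of each segment between neighboring grid points, and the example segment values are all assumptions made for this sketch; any charge-averaging step that a COSMO workflow applies before binning is omitted.

```python
# Illustrative sketch (not the authors' implementation): bin surface segments
# from a COSMO-type calculation into a fixed-length sigma profile vector.
# The grid layout and the example data below are assumptions.

def sigma_profile(segments, sigma_min=-0.025, sigma_max=0.025, n_bins=51):
    """Return (grid, profile), where profile[i] is the surface area (Å^2)
    assigned to the charge density grid[i] (e/Å^2).

    segments: iterable of (sigma, area) pairs, one per surface segment.
    """
    step = (sigma_max - sigma_min) / (n_bins - 1)
    grid = [sigma_min + i * step for i in range(n_bins)]
    profile = [0.0] * n_bins

    for sigma, area in segments:
        # Clamp out-of-range charge densities onto the edge of the grid.
        clamped = min(max(sigma, sigma_min), sigma_max)
        pos = (clamped - sigma_min) / step   # fractional grid position
        lo = min(int(pos), n_bins - 2)       # index of the left neighbor
        frac = pos - lo
        # Split the segment area linearly between the two nearest grid
        # points so that the total surface area is conserved.
        profile[lo] += area * (1.0 - frac)
        profile[lo + 1] += area * frac

    return grid, profile


# Hypothetical segments: (screening charge density in e/Å^2, area in Å^2).
example_segments = [(-0.0102, 2.0), (0.0, 1.5), (0.0075, 3.0)]
grid, profile = sigma_profile(example_segments)
print(len(profile), sum(profile))
```

With a representation like this, every molecule maps to a vector of the same length regardless of its size, so a regression model that takes a latent vector as input only needs a fixed number of outputs.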