(681c) Discriminative Neural Embeddings of Latent Variable Models for Molecular Property Prediction | AIChE

Authors 

Song, L. - Presenter, Georgia Institute of Technology
Dai, H., Georgia Institute of Technology
Dai, B., Georgia Institute of Technology
Kernel classifiers and regressors designed for structured data, such as sequences, trees and graphs, have significantly advanced a number of interdisciplinary areas such as computational biology and drug design. Typically, the kernel functions are designed beforehand for a data type, either by exploiting statistics of the structures or by making use of probabilistic generative models, and then a discriminative classifier is learned based on the kernels via convex optimization. However, such an elegant two-stage approach also limits kernel methods in terms of their ability to scale up to millions of data points and to exploit discriminative information when learning feature representations.

We propose an effective and scalable approach for structured data representation based on the idea of embedding latent variable models into feature spaces, and learning such feature spaces together with the final regressor/classifier using discriminative information. The algorithm runs a sequence of function mappings in a way similar to graphical model inference procedures, such as mean field and belief propagation. This can be implemented as a recurrent neural network, where the parameters for the feature spaces and classifiers are trained in an end-to-end fashion. We deployed our algorithm on several computational chemistry problems, including compound and protein classification tasks. We achieved state-of-the-art results on several commonly used benchmark datasets, including NCI1, NCI90, ENZYMES and D&D. We also applied our algorithm to the Harvard Clean Energy Project dataset, which contains millions of molecules, to predict power conversion efficiency and energy. We achieved a mean absolute error below 0.1 while using significantly fewer parameters than alternatives.
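
To make the idea of a "sequence of function mappings" resembling mean-field inference more concrete, the following is a minimal sketch of a forward pass only, not the authors' implementation. It assumes a molecule is given as an adjacency matrix plus per-atom feature vectors; the update rule, the ReLU nonlinearity, the sum pooling, and all names (`mean_field_embed`, `predict`, `W1`, `W2`, `w_out`) are illustrative assumptions. In an end-to-end setting the weights would be trained by backpropagating a regression or classification loss through these iterations, which is omitted here.

```python
import numpy as np

def mean_field_embed(node_feats, adj, W1, W2, num_iters=4):
    """Hypothetical mean-field-style embedding of one graph.

    Each node embedding is repeatedly recomputed from the node's own
    features and the sum of its neighbors' current embeddings, with the
    same weights reused at every iteration (recurrent-style sharing).
    """
    num_nodes = node_feats.shape[0]
    dim = W1.shape[1]
    mu = np.zeros((num_nodes, dim))          # initial node embeddings
    for _ in range(num_iters):
        neighbor_sum = adj @ mu              # aggregate neighbor embeddings
        mu = np.maximum(0.0, node_feats @ W1 + neighbor_sum @ W2)  # ReLU
    return mu.sum(axis=0)                    # sum-pool nodes into a graph embedding

def predict(graph_embedding, w_out):
    """Assumed linear readout producing a scalar property estimate."""
    return float(graph_embedding @ w_out)

# Toy example: a 4-node path graph with random (untrained) weights.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
node_feats = rng.normal(size=(4, 3))         # 3 made-up features per node
W1 = rng.normal(scale=0.1, size=(3, 8))
W2 = rng.normal(scale=0.1, size=(8, 8))
w_out = rng.normal(size=8)

embedding = mean_field_embed(node_feats, adj, W1, W2)
prediction = predict(embedding, w_out)
```

Because the weights are shared across nodes and the readout sums over nodes, the resulting embedding in this sketch does not depend on the order in which atoms are listed, which is a desirable property for molecular inputs.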