(681c) Discriminative Neural Embeddings of Latent Variable Models for Molecular Property Prediction

Conference

AIChE Annual Meeting

Year

2016

Proceeding

2016 AIChE Annual Meeting

Group

Computational Molecular Science and Engineering Forum

Session

Data-Driven Screening of Chemical and Materials Space

Time

Thursday, November 17, 2016 - 12:54pm to 1:06pm

Authors

Song, L. - Presenter, Georgia Institute of Technology

Dai, H., Georgia Institute of Technology

Dai, B., Georgia Institute of Technology

Kernel classifiers and regressors designed for structured data, such as sequences, trees and graphs, have significantly advanced a number of interdisciplinary areas such as computational biology and drug design. Typically, the kernel functions are designed beforehand for a given data type, either by exploiting statistics of the structures or by making use of probabilistic generative models, and a discriminative classifier is then learned from the kernels via convex optimization. However, this elegant two-stage approach also limits kernel methods in their ability to scale up to millions of data points and to exploit discriminative information when learning feature representations.

We propose an effective and scalable approach to structured data representation based on embedding latent variable models into feature spaces and learning those feature spaces jointly with the final regressor or classifier using discriminative information. The algorithm runs a sequence of function mappings in a way similar to graphical model inference procedures such as mean field and belief propagation. It can be implemented as a recurrent neural network in which the parameters of the feature spaces and the classifier are trained end to end. We applied the algorithm to several computational chemistry problems, including compound and protein classification, and achieved state-of-the-art results on several commonly used benchmark datasets, including NCI1, NCI90, ENZYMES and D&D. We also applied it to the Harvard Clean Energy Project dataset, which contains millions of molecules, to predict power conversion efficiency and energy. We achieved a mean absolute error below 0.1 while using significantly fewer parameters than alternatives.
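
To make the "sequence of function mappings" concrete, the following is a minimal sketch of a mean-field-style embedding pass over a molecular graph. It is an illustration only, not the authors' implementation: the specific update rule (ReLU of a linear map of node features plus summed neighbor embeddings), the sum pooling, the linear readout, and the names `embed_graph`, `predict`, `W1`, `W2` and `u` are all assumptions introduced here, and the weights are random rather than trained.

```python
import numpy as np

def embed_graph(node_features, adjacency, W1, W2, num_iterations=3):
    """Iteratively update node embeddings from each node's own features
    and the sum of its neighbors' embeddings, then sum-pool over nodes.
    The update form is an assumed stand-in for a mean-field-like step."""
    num_nodes = node_features.shape[0]
    mu = np.zeros((num_nodes, W1.shape[1]))  # initial node embeddings
    for _ in range(num_iterations):
        neighbor_sum = adjacency @ mu  # aggregate neighbor embeddings
        mu = np.maximum(0.0, node_features @ W1 + neighbor_sum @ W2)  # ReLU
    return mu.sum(axis=0)  # graph-level embedding

def predict(graph_embedding, u):
    """Hypothetical linear readout mapping the graph embedding to a scalar."""
    return float(graph_embedding @ u)

# Toy 3-atom chain with one-hot atom types (made-up data).
node_features = np.array([[1, 0], [0, 1], [1, 0]], dtype=float)
adjacency = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)

# Random, untrained parameters; in an end-to-end setup these would be
# learned together with the readout by backpropagation.
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(2, 4))
W2 = rng.normal(scale=0.1, size=(4, 4))
u = rng.normal(scale=0.1, size=4)

embedding = embed_graph(node_features, adjacency, W1, W2)
prediction = predict(embedding, u)
```

Because the same weights are reused at every iteration, the loop has the shape of a recurrent network unrolled for a fixed number of steps, and the sum pooling makes the graph embedding independent of the order in which atoms are listed.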

Topics

Computational Molecular Engineering

Alternative Energy
