(777g) Property Prediction of Crystalline Solids from Composition and Crystal Structure
2016 AIChE Annual Meeting
Computational Molecular Science and Engineering Forum
Data Mining and Machine Learning in Molecular Sciences II
Friday, November 18, 2016 - 2:00pm to 2:12pm
Bruno A. Calfa and John R. Kitchin
Predicting properties of compounds has received considerable attention across different
disciplines and has seen applications in diverse areas. The approach proposed
in this paper is influenced by the advances in Group Contribution Methods
(GCMs).
A key characteristic of GCMs in general is that the property descriptors (i.e.,
predictors) are explicitly represented by the molecular structure (and chemical
composition) of a compound. Because of this explicit representation, one can
pose an optimal inverse (design) problem, which seeks a molecule (structure and
composition) that attains a given target property value. This is the main goal
of the research area called Computer-Aided Molecular/Mixture Design (CAMD), as
exemplified by several works in the literature (Eljack et al., 2007; Samudra
and Sahinidis, 2013).
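As an illustration only, a generic first-order group contribution model and
the inverse problem built on it can be written as below. The notation (n_i,
c_i, f, the feasible set F, and P^target) is assumed here for the sketch and
is not taken from the cited works.

```latex
% Generic first-order group contribution model (assumed notation):
%   n_i = number of occurrences of group i in the molecule,
%   c_i = fitted contribution of group i,
%   f   = a known, invertible, property-specific transformation.
f(P) = \sum_{i=1}^{G} n_i \, c_i

% Inverse (design) problem: choose group counts n so that the predicted
% property matches a target, subject to structural feasibility n \in \mathcal{F}.
\min_{n \in \mathcal{F} \subseteq \mathbb{Z}_{\ge 0}^{G}}
  \left( f^{-1}\!\left( \sum_{i=1}^{G} n_i \, c_i \right) - P^{\mathrm{target}} \right)^{2}
```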
In the crystalline solids literature, several research groups have proposed
statistical (machine learning) approaches for property prediction. Saad et al.
(2012) investigated both unsupervised and supervised machine learning
techniques to predict structure and properties of crystals with chemical
formula AB. Ma et al. (2015) developed a machine-learning-augmented model based
on artificial neural networks (ANNs) that captures nonlinear
adsorbate-substrate interactions.
We propose using kernel regression (Li and Racine, 2007) as a data-driven and
rigorous nonparametric statistical technique to predict properties of atomic
crystals. A key feature of the proposed approach is the possibility of treating
predictors not only as continuous, but also as categorical data. The latter
specifically allows the predictive model to capture the discrete nature of
crystals with regard to composition (number of atoms in the chemical formula)
and spatial configuration (finite number of crystallographic space groups).
Another important aspect of using kernel regression is the direct access to its
explicit mathematical form, which can be directly embedded in optimal inverse
problems to design new crystalline materials with given target properties. The
property prediction approach is illustrated by training models to predict
electronic properties of 746 binary metal oxides and elastic properties of
1,173 crystals. As a first approach to solving the inverse problem, we describe
an exhaustive enumeration algorithm (Calfa and Kitchin, 2016).
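The two ideas in the paragraph above, kernel regression over mixed
continuous/categorical predictors and exhaustive enumeration for the inverse
problem, can be sketched as follows. This is a minimal illustration under
stated assumptions, not the authors' implementation. It assumes a
local-constant (Nadaraya-Watson) estimator, a Gaussian kernel for continuous
predictors, and a simple unordered categorical kernel that gives weight 1 on a
match and lambda on a mismatch. The function names, the toy data, and the
fixed bandwidths are all hypothetical. In practice the bandwidths would be
selected from data (e.g., by cross-validation).

```python
import itertools
import numpy as np


def mixed_kernel_weights(xc_train, xd_train, xc, xd, h, lam):
    """Product-kernel weights between one query point and every training row."""
    # Gaussian kernel on each continuous predictor, with bandwidth h[j].
    u = (xc_train - xc) / h
    w_cont = np.exp(-0.5 * u**2).prod(axis=1)
    # Unordered categorical kernel: 1 on a match, lam[j] on a mismatch.
    w_cat = np.where(xd_train == xd, 1.0, lam).prod(axis=1)
    return w_cont * w_cat


def predict(xc_train, xd_train, y_train, xc, xd, h, lam):
    """Local-constant (Nadaraya-Watson) estimate at a single query point.

    Note: if every weight is zero (e.g., lam = 0 and no training row shares
    the query's categories), the ratio below is undefined.
    """
    w = mixed_kernel_weights(xc_train, xd_train, xc, xd, h, lam)
    return float(w @ y_train / w.sum())


def enumerate_designs(xc_train, xd_train, y_train, cont_grid, cat_levels,
                      target, h, lam):
    """Score every candidate on the grid; return the one closest to target."""
    best = None
    for xc in itertools.product(*cont_grid):
        for xd in itertools.product(*cat_levels):
            y_hat = predict(xc_train, xd_train, y_train,
                            np.array(xc), np.array(xd), h, lam)
            err = abs(y_hat - target)
            if best is None or err < best[0]:
                best = (err, xc, xd, y_hat)
    return best


# Toy data (made up): one continuous predictor and one categorical predictor
# (think of an integer-encoded space group label).
xc_train = np.array([[0.0], [1.0], [2.0], [0.0], [1.0], [2.0]])
xd_train = np.array([[0], [0], [0], [1], [1], [1]])
y_train = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])

h = np.array([0.1])    # bandwidth for the continuous predictor (assumed)
lam = np.array([0.0])  # lam = 0 means only matching categories contribute

y_hat = predict(xc_train, xd_train, y_train,
                np.array([1.0]), np.array([1]), h, lam)

best = enumerate_designs(xc_train, xd_train, y_train,
                         cont_grid=[[0.0, 1.0, 2.0]], cat_levels=[[0, 1]],
                         target=11.0, h=h, lam=lam)
print(y_hat, best)
```

Because the predictor is an explicit weighted average, it can be evaluated for
any candidate without refitting, which is what makes brute-force enumeration
over a finite candidate set straightforward. The cost grows with the product
of the grid sizes, so this sketch only suits small design spaces.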
References
Calfa, B. A., and Kitchin, J. R. 2016. Property
Prediction of Crystalline Solids from Composition and Crystal Structure. AIChE
Journal. In press. DOI: 10.1002/aic.15251.
Eljack, F. T.; Eden, M. R.; Kazantzi, V.;
Qin, X.; and El-Halwagi, M. M. 2007. Simultaneous Process and Molecular
Design: A Property Based Approach. AIChE Journal. 53(5):1232–1239.
Li, Q., and Racine, J. S. 2007.
Nonparametric Econometrics: Theory and Practice. Themes in Modern Econometrics.
Princeton University Press. Princeton, NJ, USA.
Ma, X.; Li, Z.; Achenie, L. E. K.; and
Xin, H. 2015. Machine-Learning-Augmented Chemisorption Model for CO2
Electroreduction Catalyst Screening. Journal of Physical Chemistry Letters.
6(18):3528–3533.
Saad, Y.; Gao, D.; Ngo, T.; Bobbitt, S.;
Chelikowsky, J. R.; and Andreoni, W. 2012. Data Mining for Materials:
Computational Experiments with AB Compounds. Physical Review B. 85(10):104104.
Samudra, A. P., and Sahinidis, N. V. 2013.
Optimization-Based Framework for Computer-Aided Molecular Design. AIChE
Journal. 59(10):3686–3701.