(192c) Two-Dimensional Energy Histograms As Features for Machine Learning to Predict Adsorption in Metal-Organic Frameworks | AIChE

Authors 

Shi, K. - Presenter, Northwestern University
Li, Z., Northwestern University
Snurr, R., Northwestern University
Feature engineering is one of the most important steps in a machine learning (ML) workflow. Features that are constructed in a systematic and physically informative way can facilitate the convergence of ML algorithms and enhance the interpretability of an ML model. In this work, we propose novel features for predicting adsorption in metal-organic frameworks (MOFs) and other nanoporous materials. In particular, we use two-dimensional (2D) histogram features derived from the adsorbate-adsorbent energy and the energy gradient at grid points located throughout the adsorbent. Including the energy gradient encodes some spatial information about the adsorbent while keeping the computational expense relatively low. The training and testing data for the ML model are obtained from grand canonical Monte Carlo simulations for different adsorbates, including Kr, Xe, ethane, propane, n-butane, and n-hexane, at different pressures and temperatures. With these 2D histogram features, the ML model gives an overall improvement in prediction quality (coefficient of determination R² of approximately 0.95 to 0.99) over a model that uses 1D energy histogram features. We identify the limitations of the 2D histogram features when applied to n-butane and n-hexane with the help of dimensionality reduction methods. The possibility of combining 2D histogram features with other features, such as those derived from persistent homology, will also be discussed. The physical understanding of adsorption behavior decoded from the ML model may help in developing more advanced MOFs for gas storage and separation.
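
The feature construction described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the adsorbate-adsorbent energy is already tabulated on a regular 3D grid, and the function name, grid spacing, bin edges, and synthetic energy values are all hypothetical choices made for illustration. It also uses a plain finite-difference gradient and ignores periodic boundary conditions, which a real MOF unit cell would require. Values outside the bin range are clipped into the outermost bins, which is likewise an assumption.

```python
import numpy as np


def energy_gradient_histogram(energy, spacing, energy_edges, gradient_edges):
    """Return a flattened, normalized 2D histogram of (energy, |gradient|).

    Hypothetical helper: `energy` is a 3D array of adsorbate-adsorbent
    energies on a regular grid with uniform `spacing`.
    """
    # Finite-difference gradient along each grid axis (non-periodic; an
    # assumption made to keep the sketch short).
    gx, gy, gz = np.gradient(energy, spacing)
    grad_norm = np.sqrt(gx**2 + gy**2 + gz**2)

    # Clip outliers (e.g. strongly repulsive points) into the edge bins so
    # that every grid point is counted. This is an illustrative choice.
    e = np.clip(energy.ravel(), energy_edges[0], energy_edges[-1])
    g = np.clip(grad_norm.ravel(), gradient_edges[0], gradient_edges[-1])

    hist, _, _ = np.histogram2d(e, g, bins=[energy_edges, gradient_edges])

    # Normalize to fractions of grid points so that grids of different
    # sizes give comparable feature vectors.
    return (hist / hist.sum()).ravel()


# Synthetic stand-in for an energy grid; real values would come from a
# force-field evaluation over the framework.
rng = np.random.default_rng(0)
energy_grid = rng.normal(loc=-5.0, scale=2.0, size=(8, 8, 8))

energy_edges = np.linspace(-10.0, 0.0, 6)    # 5 energy bins (assumed)
gradient_edges = np.linspace(0.0, 5.0, 5)    # 4 gradient bins (assumed)

features = energy_gradient_histogram(
    energy_grid, spacing=1.0,
    energy_edges=energy_edges, gradient_edges=gradient_edges,
)
print(features.shape)
```

Each material then contributes one fixed-length vector (here 5 x 4 = 20 entries) that could be passed to any standard regression model together with conditions such as pressure and temperature.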