(192c) Two-Dimensional Energy Histograms As Features for Machine Learning to Predict Adsorption in Metal-Organic Frameworks
AIChE Annual Meeting
2021 Annual Meeting
Engineering Sciences and Fundamentals
Faculty Candidates in COMSEF/Area 1a, Session 2
Monday, November 8, 2021 - 3:54pm to 4:06pm
Feature engineering is one of the most important steps in a machine learning (ML) workflow. Features that are constructed in a systematic and physically informative way can speed the convergence of ML algorithms and make an ML model easier to interpret. In this work, we propose novel features for predicting adsorption in metal-organic frameworks (MOFs) and other nanoporous materials. Specifically, we use two-dimensional (2D) histogram features derived from the adsorbate-adsorbent energy and the energy gradient at grid points located throughout the adsorbent. Including the energy gradient encodes some spatial information about the adsorbent while keeping the computational expense relatively low. The training and testing data for the ML model are obtained from grand canonical Monte Carlo simulations for different adsorbates, including Kr, Xe, ethane, propane, n-butane, and n-hexane, at different pressures and temperatures. With these 2D histogram features, the ML model gives an overall improvement in prediction quality (coefficient of determination R² ≈ 0.95 to 0.99) over a model that uses 1D energy histogram features. We identify the limitations of the 2D histogram features when applied to n-butane and n-hexane with the help of dimensionality reduction methods. The possibility of combining 2D histogram features with other features, such as those derived from persistent homology, will also be discussed. The physical understanding of adsorption behavior extracted from the ML model may help in developing more advanced MOFs for gas storage and separation in the future.
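As an illustration of the kind of feature described above, the following minimal sketch bins grid-point energies and energy-gradient magnitudes into a normalized 2D histogram and flattens it into a feature vector. It is not the authors' implementation: the function name `energy_gradient_histogram`, the bin edges, the grid spacing, and the Gaussian-well toy energy field are all assumptions chosen only to make the example runnable. In practice the energy grid would come from an adsorbate-adsorbent interaction model evaluated inside the framework.

```python
import numpy as np


def energy_gradient_histogram(energy, grad_norm, e_edges, g_edges):
    """Return a flattened, normalized 2D histogram of (energy, |gradient|).

    Hypothetical helper for illustration; values outside the bin range are
    clipped so that they are counted in the first or last bin.
    """
    energy = np.asarray(energy, dtype=float).ravel()
    grad_norm = np.asarray(grad_norm, dtype=float).ravel()
    e = np.clip(energy, e_edges[0], e_edges[-1])
    g = np.clip(grad_norm, g_edges[0], g_edges[-1])
    counts, _, _ = np.histogram2d(e, g, bins=[e_edges, g_edges])
    # Normalize so the feature does not depend on the number of grid points
    return (counts / counts.sum()).ravel()


# Toy stand-in for an energy grid (assumed Gaussian well, NOT a real force field)
spacing = 1.0  # assumed grid spacing, arbitrary length units
x = np.arange(0.0, 10.0, spacing)
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
energy = -10.0 * np.exp(-((X - 5) ** 2 + (Y - 5) ** 2 + (Z - 5) ** 2) / 8.0)

# Finite-difference gradient of the energy field and its magnitude
gx, gy, gz = np.gradient(energy, spacing)
grad_norm = np.sqrt(gx**2 + gy**2 + gz**2)

# Assumed bin edges: 10 energy bins x 5 gradient bins
e_edges = np.linspace(-10.0, 0.0, 11)
g_edges = np.linspace(0.0, 5.0, 6)

features = energy_gradient_histogram(energy, grad_norm, e_edges, g_edges)
print(features.shape)
```

The resulting vector could be concatenated with state variables such as pressure and temperature and passed to any regression model. The choice of bin ranges and resolution is a modeling decision that the abstract does not specify.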