(698a) Assessment of Sampling Strategies for Surrogate Modeling and Active Learning in the Design and Optimization of PVSA Processes | AIChE

Authors 

Moosavi, S. M., University of Toronto
Xu, H., Svante Inc
In this study, we address the challenge of accurate and robust design and optimization of Pressure Vacuum Swing Adsorption (PVSA) processes. Simulating these processes is computationally intensive, and their design involves many interacting parameters. Recent advances in hybrid and surrogate modeling techniques offer a way to accelerate design and optimization. We investigate several aspects of surrogate modeling: the sampling technique, the number of decision variables (DVs), and the choice of underlying algorithm. A detailed computational model simulates a 4-step cycle with light product pressurization (LPP) in conjunction with CALF-20, a hydrophobic metal-organic framework. This model was previously validated experimentally for both dry and humid flue gas in point-source CO2 capture scenarios and is used here to generate the training and testing datasets.

Surrogate models such as Gaussian Process Regression (GPR) and Artificial Neural Networks (ANN) are considered in this study. We employ these surrogate models to predict Key Performance Indicators (KPIs) crucial for PVSA process performance, such as purity, recovery, productivity, and energy consumption. Using learning curves, which plot evaluation metrics such as R-squared (R2), Mean Squared Error (MSE), and maximum model error (MaxE) against training set size, we evaluate the impact of different sampling methods on surrogate predictions of the KPIs. Sampling techniques including grid-based, random, and quasi-random approaches are considered during the data generation stage. Additionally, we vary the number of input decision variables (DVs) from 6 to 12 to gain insight into the relationship between input dimensionality and model performance. The findings highlight that training data size and the choice of sampling technique must be weighed against each other to achieve comparable results across scenarios.
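
The learning-curve comparison described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: `toy_kpi` is a hypothetical two-variable stand-in for a simulated KPI, the quasi-random sampler is a plain Halton sequence, and the surrogate is a fixed-kernel GPR written directly in NumPy rather than a tuned model.

```python
import numpy as np

def toy_kpi(x):
    """Hypothetical smooth stand-in for a PVSA KPI (not the real simulator)."""
    return np.sin(3.0 * x[:, 0]) + x[:, 1] ** 2 + 0.5 * x[:, 0] * x[:, 1]

def random_samples(n, d, rng):
    return rng.random((n, d))

def halton_samples(n, d):
    """Quasi-random Halton points in [0, 1)^d, one prime base per dimension."""
    primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37][:d]
    out = np.empty((n, d))
    for j, base in enumerate(primes):
        for i in range(n):
            k, f, r = i + 1, 1.0, 0.0
            while k > 0:                      # radical inverse of i + 1
                f /= base
                r += f * (k % base)
                k //= base
            out[i, j] = r
    return out

def grid_samples(n, d):
    """Full-factorial grid with roughly n points in total."""
    m = max(2, int(round(n ** (1.0 / d))))
    axes = [np.linspace(0.0, 1.0, m)] * d
    return np.stack(np.meshgrid(*axes), axis=-1).reshape(-1, d)

def gpr_fit_predict(X, y, Xs, length=0.3, noise=1e-6):
    """GPR posterior mean with a fixed RBF kernel (no hyperparameter tuning)."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / length ** 2)
    mu = y.mean()
    K = rbf(X, X) + noise * np.eye(len(X))
    return mu + rbf(Xs, X) @ np.linalg.solve(K, y - mu)

def metrics(y_true, y_pred):
    err = y_true - y_pred
    r2 = 1.0 - float(np.sum(err ** 2) / np.sum((y_true - y_true.mean()) ** 2))
    return {"R2": r2, "MSE": float(np.mean(err ** 2)),
            "MaxE": float(np.max(np.abs(err)))}

def learning_curve(sampler, sizes, X_test, y_test):
    """Return (training size, metrics) pairs for one sampling strategy."""
    rows = []
    for n in sizes:
        X = sampler(n)
        y_pred = gpr_fit_predict(X, toy_kpi(X), X_test)
        rows.append((len(X), metrics(y_test, y_pred)))
    return rows

rng = np.random.default_rng(0)
d = 2                                         # toy dimension, not the 6 to 12 DVs
X_test = rng.random((500, d))
y_test = toy_kpi(X_test)
sizes = [16, 64, 144]
curves = {
    "grid": learning_curve(lambda n: grid_samples(n, d), sizes, X_test, y_test),
    "random": learning_curve(lambda n: random_samples(n, d, rng), sizes, X_test, y_test),
    "halton": learning_curve(lambda n: halton_samples(n, d), sizes, X_test, y_test),
}
for name, rows in curves.items():
    for n, m in rows:
        print(f"{name:7s} n={n:4d} R2={m['R2']:.4f} MSE={m['MSE']:.2e} MaxE={m['MaxE']:.2e}")
```

A full-factorial grid is only practical here because the toy problem has two inputs. With 6 to 12 decision variables the number of grid points grows exponentially, which is one reason random and quasi-random designs are commonly compared against it.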

Furthermore, we introduce a novel active learning technique and assess its effectiveness for different algorithms such as GPRs and ANNs. Our results indicate that active learning techniques can learn from different scales of complexity, leading to more efficient and effective search and optimization. This study not only enhances our understanding of PVSA surrogate models with respect to training set size, algorithm choice, and input variables but also emphasizes the critical role of machine learning algorithms and data quality in model performance.
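
The abstract does not describe how its active learning technique works, so the sketch below shows only a generic, widely used variant for orientation: pool-based uncertainty sampling, where the next simulation is run at the candidate with the largest GPR predictive standard deviation. The function `toy_kpi`, the fixed RBF kernel, the pool size, and the number of iterations are all hypothetical choices, not the authors' method.

```python
import numpy as np

def toy_kpi(x):
    """Hypothetical stand-in for an expensive PVSA simulation (not the real model)."""
    return np.sin(3.0 * x[:, 0]) + x[:, 1] ** 2 + 0.5 * x[:, 0] * x[:, 1]

def rbf(A, B, length=0.3):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length ** 2)

def gpr_predict(X, y, Xs, noise=1e-6):
    """Posterior mean and standard deviation of a fixed-kernel GPR."""
    mu = y.mean()
    K = rbf(X, X) + noise * np.eye(len(X))
    Ks = rbf(Xs, X)
    mean = mu + Ks @ np.linalg.solve(K, y - mu)
    # Prior variance is 1 for the RBF kernel; subtract the explained part.
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

rng = np.random.default_rng(0)
pool = rng.random((400, 2))                   # unlabeled candidate designs
X_test = rng.random((300, 2))                 # held-out points for monitoring
y_test = toy_kpi(X_test)

# Start from a small random labeled set, then grow it one query at a time.
idx = [int(i) for i in rng.choice(len(pool), size=5, replace=False)]
history = []
for step in range(25):
    X, y = pool[idx], toy_kpi(pool[idx])
    mean, _ = gpr_predict(X, y, X_test)
    history.append(float(np.mean((y_test - mean) ** 2)))
    _, std = gpr_predict(X, y, pool)
    std[idx] = -np.inf                        # never re-select a labeled point
    idx.append(int(np.argmax(std)))           # query the most uncertain candidate

print(f"labeled points: {len(idx)}, test MSE {history[0]:.3e} -> {history[-1]:.3e}")
```

Uncertainty sampling relies on the surrogate providing a predictive variance, which a GPR does natively. An ANN would need a substitute such as an ensemble or dropout-based estimate, so the same loop does not transfer to it unchanged.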