(698a) Assessment of Sampling Strategies for Surrogate Modeling and Active Learning in the Design and Optimization of PVSA Processes
AIChE Annual Meeting
2024
Separations Division
Adsorption Processes II
Thursday, October 31, 2024 - 12:30pm to 12:48pm
Different surrogate models, such as Gaussian Process Regression (GPR) and Artificial Neural Networks (ANNs), are also considered in this study. We employ surrogate models to predict the Key Performance Indicators (KPIs) that are crucial to PVSA process performance, such as purity, recovery, productivity, and energy consumption. Through the analysis of learning curves, which plot evaluation metrics such as R-squared (R2), Mean Squared Error (MSE), and maximum error (MaxE) against training set size, we evaluate the impact of different sampling methods on model performance. Various sampling techniques, including grid-based, random, and quasi-random approaches, are considered during the data generation stage. Additionally, we vary the number of input decision variables (DVs) from 6 to 12 to gain insight into the relationship between input dimensionality and model performance. The findings emphasize the need to weigh training data size against the choice of sampling technique in order to achieve comparable results across different scenarios.
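As a rough illustration of the sampling families and evaluation metrics named above, the following minimal Python sketch generates grid-based, random, and quasi-random points on a unit hypercube and defines the three learning-curve metrics. It is not the authors' implementation: the Halton sequence is assumed here as one possible quasi-random scheme, and all function names are hypothetical.

```python
import itertools
import random

# First 12 primes: one Halton base per decision variable (up to 12 DVs).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]


def grid_samples(n_per_dim, dim):
    """Full-factorial grid of cell midpoints; yields n_per_dim ** dim points."""
    levels = [(i + 0.5) / n_per_dim for i in range(n_per_dim)]
    return [list(p) for p in itertools.product(levels, repeat=dim)]


def random_samples(n, dim, seed=0):
    """Independent uniform pseudo-random points in [0, 1)^dim."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


def radical_inverse(index, base):
    """Mirror the base-`base` digits of `index` about the radix point."""
    result, frac = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * frac
        index //= base
        frac /= base
    return result


def halton_samples(n, dim):
    """Quasi-random Halton points (assumed scheme, not from the abstract)."""
    return [[radical_inverse(i, PRIMES[d]) for d in range(dim)]
            for i in range(1, n + 1)]


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def max_error(y_true, y_pred):
    """Largest absolute prediction error."""
    return max(abs(t - p) for t, p in zip(y_true, y_pred))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A learning curve would then be built by training a surrogate on the first n points of a given sample set, for increasing n, and recording these metrics on a held-out test set. Note that a full-factorial grid with only two levels per variable already needs 2^6 = 64 points for 6 DVs and 2^12 = 4096 points for 12 DVs, whereas random and Halton sets can be generated at any size.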
Furthermore, we introduce a novel active learning technique and assess its effectiveness for different algorithms, such as GPRs and ANNs. Our results indicate that active learning techniques are effective at learning across different scales of complexity, leading to more efficient and effective search and optimization. This study not only enhances our understanding of how PVSA surrogate models depend on training set size, algorithm choice, and input variables but also emphasizes the critical role of machine learning algorithms and data quality in model performance.
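The abstract does not describe how the novel active learning technique works, so the sketch below shows only a generic, well-known baseline for orientation: pool-based uncertainty sampling with a small Gaussian process written in NumPy. Everything in it is an assumption made for illustration, including the RBF kernel, its length scale, the two-variable input space, and the `toy_kpi` function, which stands in for an expensive PVSA simulation.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between row-vector sets a and b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale ** 2)


def gp_predict(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and variance of a zero-mean, unit-variance GP."""
    k = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_s = rbf_kernel(x_train, x_query)
    mean = k_s.T @ np.linalg.solve(k, y_train)
    var = 1.0 - np.einsum("ij,ij->j", k_s, np.linalg.solve(k, k_s))
    return mean, np.clip(var, 0.0, None)


def toy_kpi(x):
    """Hypothetical stand-in for a KPI returned by a process simulation."""
    return np.sin(3.0 * x[:, 0]) + x[:, 1] ** 2


def active_learning(n_init=5, n_queries=15, n_pool=200, seed=0):
    """Repeatedly label the pool point with the largest predictive variance."""
    rng = np.random.default_rng(seed)
    pool = rng.random((n_pool, 2))      # candidate decision-variable vectors
    chosen = list(range(n_init))        # indices of points labeled so far
    for _ in range(n_queries):
        x_train = pool[chosen]
        y_train = toy_kpi(x_train)
        _, var = gp_predict(x_train, y_train, pool)
        var[chosen] = -1.0              # exclude already-labeled points
        chosen.append(int(np.argmax(var)))
    return pool, chosen
```

Calling `active_learning()` returns the candidate pool and the indices of the 20 points that were labeled. Comparing a surrogate trained on those points against one trained on the same number of randomly drawn points is one way to gauge whether the acquisition rule improves sample efficiency.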