(674i) Accelerated Modeling of Various Chemical Processes Using Meta-Learning-Based Foundation Models: A Few-Shot Learning Approach Using Reptile

Authors 

Wang, W. - Presenter, National University of Singapore
Wang, Z., National University of Singapore
Wu, Z., University of California Los Angeles
Given the difficulty of obtaining first-principles models for intricate chemical reactions in real-world scenarios, machine learning-based models have been proposed as surrogates for capturing the system dynamics of chemical processes. This paradigm shift aims to facilitate applications such as model predictive control [1, 2]. However, neural network modeling of system dynamics often demands substantial datasets, rendering it impractical in data-sparse scenarios. To mitigate this challenge, transfer learning-based approaches have been proposed: the neural network is first trained on a large dataset from a task similar to the target task, and its weights are then fine-tuned using samples from the designated task [3]. However, transfer learning-based methods come with limitations. First, a sufficiently similar task with a sizable dataset must be identified for training. Second, this process of identifying an analogous task with adequate data, training on it, and transferring to the designated task must be repeated for each new task. Third, a considerable number of samples, typically in the hundreds or thousands, is still required from the designated task for effective adaptation.
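
For concreteness, below is a minimal PyTorch sketch of the pretrain-then-fine-tune workflow that transfer learning entails; the function name fine_tune, the loader target_loader, and the hyperparameter values are illustrative assumptions rather than part of the original work.

```python
import torch
import torch.nn as nn

def fine_tune(pretrained, target_loader, lr=1e-4, epochs=50):
    """Standard transfer learning: adapt a model pretrained on a similar
    source task using samples from the designated target task."""
    opt = torch.optim.Adam(pretrained.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x, y in target_loader:  # samples from the designated task
            opt.zero_grad()
            loss_fn(pretrained(x), y).backward()
            opt.step()
    return pretrained
```

Note that this loop must be rerun, with a freshly pretrained source model, for every new target task, which is precisely the limitation the meta-learning approach below avoids.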

To address the limitations of transfer learning, we develop one universal neural network capable of swiftly adapting to any new task, such as modeling system dynamics, following the idea of foundation models. We employ Reptile [4], a meta-learning technique proposed by OpenAI, to optimize the network weights so that the model adapts to new tasks from just a few shots. While meta-learning techniques are commonly used for few-shot image classification, their application to time-series regression is still in its infancy [5]. In this study, we consider neural network modeling of nonlinear chemical reactors and develop a universal neural network that encompasses various reactor types, including continuous stirred tank reactors (CSTRs), batch reactors (BRs), and plug flow reactors (PFRs). To validate the efficacy of neural network (NN)-based Reptile for few-shot learning across chemical reactors, we conducted two sets of simulations on nonlinear processes. First, we trained a recurrent neural network (RNN)-based Reptile on 1,000 CSTRs with different parameters (e.g., volume, reaction rates, and inlet flow rates) and demonstrated few-shot learning on unseen CSTRs: with as few as 10 shots, it achieved accuracy similar to training on a large number of samples. Second, we trained an NN-based Reptile on 1,000 CSTRs, BRs, and PFRs with different parameters and showed its effectiveness in few-shot learning on unseen chemical reactions; again, as few as 10 shots matched the accuracy of training on a sufficiently large dataset. Notably, the transfer learning-based approach failed to achieve satisfactory few-shot performance in both settings.
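The sketch below illustrates the Reptile outer-loop update described in [4]: an inner loop adapts a copy of the meta-parameters to one sampled task with plain SGD, after which the meta-parameters are interpolated toward the adapted weights. The interpolation rule follows the published algorithm, while names such as reptile_step, the network architecture, and the hyperparameter values are illustrative assumptions.

```python
import copy
import torch
import torch.nn as nn

def reptile_step(meta_model, task_batch, inner_lr=1e-3,
                 inner_steps=5, meta_lr=0.1):
    """One Reptile outer-loop update on a single sampled task
    (e.g., one reactor with its own parameters)."""
    # Inner loop: adapt a copy of the meta-parameters on the task.
    adapted = copy.deepcopy(meta_model)
    opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
    loss_fn = nn.MSELoss()
    x, y = task_batch  # a few (state/input, next-state) pairs from this reactor
    for _ in range(inner_steps):
        opt.zero_grad()
        loss_fn(adapted(x), y).backward()
        opt.step()
    # Reptile update: move the meta-parameters toward the adapted weights.
    with torch.no_grad():
        for p_meta, p_task in zip(meta_model.parameters(),
                                  adapted.parameters()):
            p_meta.add_(meta_lr * (p_task - p_meta))

# Meta-training over a collection of simulated reactor tasks (sketch):
# model = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
# for task_batch in reactor_tasks:  # each yields a few-shot batch from one reactor
#     reptile_step(model, task_batch)
```

After meta-training, adapting to an unseen reactor amounts to running only the inner SGD loop on its few-shot samples, with no task-specific pretraining dataset required.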

References:

[1] Z. Wu, A. Tran, D. Rincon, and P. D. Christofides, "Machine learning-based predictive control of nonlinear processes. Part I: Theory," AIChE J., vol. 65, no. 11, p. e16729, 2019.

[2] Z. Wu, A. Tran, D. Rincon, and P. D. Christofides, "Machine learning-based predictive control of nonlinear processes. Part II: Computational implementation," AIChE J., vol. 65, no. 11, p. e16734, 2019.

[3] F. Zhuang et al., “A comprehensive survey on transfer learning,” Proc. IEEE, vol. 109, no. 1, pp. 43–76, 2020.

[4] A. Nichol and J. Schulman, "Reptile: A scalable meta-learning algorithm," arXiv preprint arXiv:1803.02999, 2018.

[5] T. Hospedales, A. Antoniou, P. Micaelli, and A. Storkey, “Meta-learning in neural networks: A survey,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 44, no. 9, pp. 5149–5169, 2021.