(674c) Benchmarking Surrogate Embedding Strategies for Model Predictive Control
AIChE Annual Meeting
2024
Computing and Systems Technology Division
10B: AI/ML Modeling, Optimization and Control Applications I
Thursday, October 31, 2024 - 1:02pm to 1:18pm
Despite current efforts, determining the most computationally efficient way to embed neural network (NN) surrogates into nonlinear model predictive control (NMPC) formulations remains an open question. The Optimization and Machine Learning Toolkit (OMLT) offers a platform for translating NNs into explicit algebraic constraints in Pyomo [5]. These explicit transformations provide an intuitive means of embedding NNs, but they often introduce a significant number of auxiliary variables and constraints into the formulation. Alternatively, automatic differentiation (AD) in ML libraries such as PyTorch can efficiently compute the gradients and Hessians of NNs, so that the NNs can be treated as external functions by an algebraic modeling language like Pyomo (such functions are called grey-box functions in Pyomo [6]), which avoids these additional variables and constraints. To the authors' knowledge, no comparative analysis in the literature evaluates these two approaches for embedding NN surrogates, especially in the context of NMPC.
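To make the size trade-off concrete, the following is a minimal sketch that counts the variables and equality constraints each embedding adds for a dense feed-forward network. It assumes a full-space formulation that assigns one pre-activation and one post-activation variable to every non-input node, which is one common way to write such formulations and not necessarily the exact bookkeeping used by OMLT; the function names and layer sizes are illustrative.

```python
def full_space_size(layer_sizes):
    """Count variables/constraints for an explicit algebraic embedding.

    Assumption: every node after the input layer gets a pre-activation
    variable (tied by an affine constraint) and a post-activation
    variable (tied by an activation constraint).
    """
    n_inputs = layer_sizes[0]
    n_nodes = sum(layer_sizes[1:])
    n_vars = n_inputs + 2 * n_nodes
    n_cons = 2 * n_nodes
    return n_vars, n_cons


def grey_box_size(layer_sizes):
    """Count variables/constraints when the NN is an external function.

    Only the inputs and outputs are visible to the solver; the hidden
    layers are evaluated inside the ML library.
    """
    n_inputs, n_outputs = layer_sizes[0], layer_sizes[-1]
    return n_inputs + n_outputs, n_outputs


# Hypothetical network: 3 inputs, two hidden layers of 32 nodes, 2 outputs.
sizes = [3, 32, 32, 2]
print(full_space_size(sizes))  # (135, 132)
print(grey_box_size(sizes))    # (5, 2)
```

If the surrogate is evaluated once per step of the prediction horizon, these counts are replicated for each step, so the gap between the two embeddings grows with the horizon length.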
This study benchmarks these two embedding strategies for ML surrogates based on physics-informed neural network (PINN) and physics-informed convolutional neural network (PICNN) architectures in PDE-constrained NMPC problems that are solved with Ipopt using a simultaneous approach via direct transcription. Physics-informed ML methods have become prominent choices for surrogates because they reduce the reliance on labelled data while maintaining fidelity to the underlying physical laws [7, 8]. Because PDEs are infinite-dimensional (i.e., continuous) in space, CNNs are a natural choice, since they can capture the interactions of variables over spatial domains [9, 10]. For all of these NN surrogates, smooth activation functions (e.g., tanh) are selected instead of the more common rectified linear unit (ReLU), which avoids introducing binary variables when the NN is reformulated as algebraic constraints.
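The reason ReLU leads to binary variables can be sketched with the standard big-M encoding of z = max(0, zhat) for a pre-activation zhat bounded in [lo, hi] with lo < 0 < hi. The check below is illustrative only (the bounds, the candidate grid, and the helper names are assumptions); it confirms that, for a given zhat, the only z feasible under the encoding is the ReLU output, at the cost of one binary sigma per node. A smooth activation such as tanh is a single nonlinear equality and needs no such binary.

```python
def bigm_relu_feasible(zhat, z, sigma, lo, hi, tol=1e-9):
    """Check the big-M constraints for z = max(0, zhat), sigma in {0, 1}."""
    return (
        z >= zhat - tol                          # z >= zhat
        and z >= -tol                            # z >= 0
        and z <= zhat - lo * (1 - sigma) + tol   # z <= zhat - lo*(1 - sigma)
        and z <= hi * sigma + tol                # z <= hi*sigma
    )


def feasible_outputs(zhat, lo, hi, candidates):
    """Return the candidate z values that are feasible for some binary sigma."""
    return [
        z for z in candidates
        if any(bigm_relu_feasible(zhat, z, s, lo, hi) for s in (0, 1))
    ]


lo, hi = -2.0, 2.0                     # assumed bounds on the pre-activation
grid = [i / 4 for i in range(-8, 9)]   # candidate z values in [-2, 2]
for zhat in (-1.5, 0.0, 0.75):
    print(zhat, feasible_outputs(zhat, lo, hi, grid))
```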
Three PDE-constrained NMPC case studies are presented. They are based on plug flow reactors (PFRs) of increasing complexity: a simple isothermal 1D PFR, a non-isothermal PFR, and a methane steam-reforming PFR. For each case study, the performance of embedding PINNs and PICNNs with OMLT is compared against embedding them as external (grey-box) functions. The external-function approach uses PyNumero, which provides an API in Pyomo that accepts the Jacobian and Hessian evaluations provided by PyTorch [6].
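As a rough illustration of what the external-function route hands to the solver, the sketch below evaluates a small tanh network and its input-output Jacobian in plain Python. It is a stand-in under stated assumptions: the class and method names are hypothetical rather than PyNumero's API, the weights are arbitrary, and the chain rule is written out by hand where the study relies on PyTorch automatic differentiation.

```python
import math


class TanhSurrogate:
    """One-hidden-layer tanh network y = W2 tanh(W1 x + b1) + b2 (illustrative)."""

    def __init__(self, W1, b1, W2, b2):
        self.W1, self.b1, self.W2, self.b2 = W1, b1, W2, b2

    def _hidden(self, x):
        # Hidden activations h = tanh(W1 x + b1).
        return [
            math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(self.W1, self.b1)
        ]

    def evaluate_outputs(self, x):
        # Outputs y = W2 h + b2.
        h = self._hidden(x)
        return [
            sum(w * hj for w, hj in zip(row, h)) + b
            for row, b in zip(self.W2, self.b2)
        ]

    def evaluate_jacobian(self, x):
        # Chain rule: dy/dx = W2 diag(1 - h^2) W1.
        h = self._hidden(x)
        return [
            [
                sum(row[j] * (1.0 - h[j] ** 2) * self.W1[j][i]
                    for j in range(len(h)))
                for i in range(len(x))
            ]
            for row in self.W2
        ]


# Arbitrary illustrative weights: 2 inputs, 3 hidden nodes, 1 output.
net = TanhSurrogate(
    W1=[[0.5, -0.2], [0.1, 0.4], [-0.3, 0.3]],
    b1=[0.0, 0.1, -0.1],
    W2=[[1.0, -0.5, 0.25]],
    b2=[0.05],
)
x0 = [0.2, -0.4]
print(net.evaluate_outputs(x0))
print(net.evaluate_jacobian(x0))
```

With this arrangement the solver sees only the inputs and outputs, while the hidden layer stays inside the function; in the study, that role is played by PyNumero's grey-box interface fed with PyTorch derivatives [6].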
The results of this benchmarking study provide several insights. First, they demonstrate that replacing mechanistic models in NMPC problems with NN surrogates is not always computationally advantageous, even if the mechanistic models are highly nonlinear. This stands in stark contrast to the majority of studies in the literature, which often use ReLU activation functions that give rise to mixed-integer formulations. Here, the use of smooth nonlinear NN activation functions provides little to no advantage over the original mechanistic equations when a local NLP solver (i.e., Ipopt) is used with a simultaneous discretization approach. Moreover, when an NN surrogate is used, superior performance is often achieved with PyTorch and PyNumero rather than with the explicit algebraic reformulations behind OMLT. In part, this can be attributed to the difficulty of initializing the auxiliary variables and constraints introduced by the explicit algebraic reformulations.
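One way to see the initialization issue: in an explicit formulation every hidden pre-activation and activation is a decision variable, so a starting point that sets only the inputs leaves the layer constraints violated. The sketch below, which assumes a one-hidden-layer tanh network with arbitrary weights and hypothetical helper names, compares the constraint residuals at a naive all-zeros guess with those at a guess produced by a forward pass. It shows a generic remedy and does not describe how the authors initialized their models.

```python
import math

# Arbitrary illustrative weights: 2 inputs, 2 hidden nodes, 1 output.
W1 = [[0.5, -0.2], [0.1, 0.4]]
b1 = [0.3, -0.1]
W2 = [[1.0, -0.5]]
b2 = [0.05]


def affine(W, b, v):
    """Return W v + b as a list."""
    return [sum(w * vi for w, vi in zip(row, v)) + bi for row, bi in zip(W, b)]


def residuals(x, zhat, z, y):
    """Residuals of zhat = W1 x + b1, z = tanh(zhat), y = W2 z + b2."""
    r = [zh - a for zh, a in zip(zhat, affine(W1, b1, x))]
    r += [zj - math.tanh(zh) for zj, zh in zip(z, zhat)]
    r += [yk - a for yk, a in zip(y, affine(W2, b2, z))]
    return r


def forward_pass_init(x):
    """Propagate x through the network to get consistent auxiliary values."""
    zhat = affine(W1, b1, x)
    z = [math.tanh(v) for v in zhat]
    y = affine(W2, b2, z)
    return zhat, z, y


x0 = [0.2, -0.4]
naive = max(abs(r) for r in residuals(x0, [0.0, 0.0], [0.0, 0.0], [0.0]))
zhat0, z0, y0 = forward_pass_init(x0)
consistent = max(abs(r) for r in residuals(x0, zhat0, z0, y0))
print(naive, consistent)
```

The external-function embedding sidesteps this step because the hidden-layer quantities never appear as solver variables.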
References
[1] M. Tejeda-Iglesias, N. H. Lappas, C. E. Gounaris, and L. Ricardez-Sandoval, “Explicit model predictive controller under uncertainty: An adjustable robust optimization approach,” Journal of Process Control, vol. 84, pp. 115–132, Dec. 2019, doi: 10.1016/j.jprocont.2019.09.002.
[2] L. T. Biegler and D. M. Thierry, “Large-scale Optimization Formulations and Strategies for Nonlinear Model Predictive Control,” IFAC-PapersOnLine, vol. 51, no. 20, pp. 1–15, 2018, doi: 10.1016/j.ifacol.2018.10.167.
[3] P. Daoutidis et al., “Machine learning in process systems engineering: Challenges and opportunities,” Computers & Chemical Engineering, vol. 181, p. 108523, Feb. 2024, doi: 10.1016/j.compchemeng.2023.108523.
[4] R. Misener and L. Biegler, “Formulating data-driven surrogate models for process optimization,” Computers & Chemical Engineering, vol. 179, p. 108411, 2023.
[5] F. Ceccon et al., “OMLT: Optimization & Machine Learning Toolkit.” arXiv, Nov. 15, 2022. Accessed: Feb. 22, 2024. [Online]. Available: http://arxiv.org/abs/2202.02414
[6] J. S. Rodriguez, R. B. Parker, C. D. Laird, B. L. Nicholson, J. D. Siirola, and M. L. Bynum, “Scalable Parallel Nonlinear Optimization with PyNumero and Parapint,” INFORMS Journal on Computing, vol. 35, no. 2, pp. 509–517, Mar. 2023, doi: 10.1287/ijoc.2023.1272.
[7] Z. Hao et al., “Physics-Informed Machine Learning: A Survey on Problems, Methods and Applications.” arXiv, Mar. 06, 2023. Accessed: Mar. 07, 2024. [Online]. Available: http://arxiv.org/abs/2211.08064
[8] E. A. Antonelo, E. Camponogara, L. O. Seman, E. R. de Souza, J. P. Jordanou, and J. F. Hubner, “Physics-Informed Neural Nets for Control of Dynamical Systems.” arXiv, May 31, 2022. Accessed: Dec. 16, 2022. [Online]. Available: http://arxiv.org/abs/2104.02556
[9] Z. Zhang, “A physics-informed deep convolutional neural network for simulating and predicting transient Darcy flows in heterogeneous reservoirs without labeled data,” Journal of Petroleum Science and Engineering, vol. 211, p. 110179, Apr. 2022, doi: 10.1016/j.petrol.2022.110179.
[10] S. Jiang and V. M. Zavala, “Convolutional neural nets in chemical engineering: Foundations, computations, and applications,” AIChE Journal, vol. 67, no. 9, p. e17282, 2021.