(674c) Benchmarking Surrogate Embedding Strategies for Model Predictive Control | AIChE

Authors 

Pulsipher, J. - Presenter, University of Wisconsin-Madison
Elorza Casas, C., University of Waterloo
Ricardez-Sandoval, L., University of Waterloo
Nonlinear model predictive control (NMPC) has emerged as a widely adopted control algorithm for constrained multivariable problems [1]. Yet solving NMPC problems can pose significant computational challenges, primarily because of their inherent nonlinearity [2]. This is especially notable in systems modelled by highly nonlinear ordinary differential equations (ODEs) or nonlinear partial differential equations (PDEs), which frequently yield models with a large number of states. Consequently, there has been considerable interest in replacing nonlinear models with tractable machine learning (ML) surrogate models to alleviate the computational burden [3]. Surrogate models based on neural networks (NNs) are becoming a particularly popular choice [4].

Despite these efforts, determining the most computationally efficient way to embed NN surrogates into NMPC formulations remains an open question. The Optimization and Machine Learning Toolkit (OMLT) offers a platform that translates NNs into explicit algebraic constraints in Pyomo [5]. These explicit equation transformations provide an intuitive means to embed NNs, but they often introduce a significant number of auxiliary constraints and variables into the formulation. To avoid adding unnecessary variables and constraints, automatic differentiation (AD) in ML libraries such as PyTorch can efficiently compute the gradients and Hessians of NNs, so that the NNs can be treated as external functions by an algebraic modeling language like Pyomo (such functions are called greybox functions in Pyomo [6]). To the authors' knowledge, no comparative analysis in the literature has evaluated these two approaches for embedding NN surrogates, especially in the context of NMPC.
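To make the size difference between the two embedding strategies concrete, the following minimal sketch counts what each one adds to the optimization problem for a dense NN. It is an illustration under stated assumptions, not a description of OMLT's internals: it assumes the explicit formulation assigns one pre-activation and one post-activation variable (each with one defining equality) to every hidden and output neuron, and that one copy of the NN is embedded per point of the prediction horizon. The layer sizes and horizon length are hypothetical.

```python
def full_space_size(layer_sizes, n_copies=1):
    """Count variables and constraints added by an explicit algebraic embedding.

    Assumption (illustrative only): each hidden and output neuron contributes
    one pre-activation variable and one post-activation variable, and each of
    those variables is defined by one equality constraint.
    """
    neurons = sum(layer_sizes[1:])  # the input layer adds no auxiliary variables
    n_vars = 2 * neurons * n_copies
    n_cons = 2 * neurons * n_copies
    return n_vars, n_cons


def external_function_size(layer_sizes, n_copies=1):
    """Count what the solver sees when the NN is treated as an external function.

    Only the NN outputs appear as variables, each linked to the inputs by one
    equality whose derivatives are supplied from outside the modeling language.
    """
    n_out = layer_sizes[-1]
    return n_out * n_copies, n_out * n_copies


if __name__ == "__main__":
    layers = [4, 32, 32, 2]  # hypothetical: 4 inputs, two hidden layers, 2 outputs
    horizon = 20             # hypothetical number of NN copies in the horizon
    print("explicit :", full_space_size(layers, horizon))
    print("external :", external_function_size(layers, horizon))
```

Under these assumptions the explicit embedding grows with the total neuron count, whereas the external-function embedding grows only with the number of NN outputs; the cost of the latter moves into the derivative evaluations instead.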

This study benchmarks these two embedding strategies for ML surrogates based on physics-informed neural network (PINN) and physics-informed convolutional neural network (PICNN) architectures in PDE-constrained NMPC problems that are solved simultaneously with Ipopt via direct transcription. Physics-informed ML methods have become prominent choices for ML surrogates because they reduce the reliance on labelled data while maintaining fidelity to the fundamental physical laws [7, 8]. Because PDEs are infinite-dimensional (i.e., continuous) in space, convolutional neural networks (CNNs) are a natural choice, since they can capture the interactions of variables over spatial domains [9, 10]. For all these NN surrogates, smooth nonlinear activation functions (e.g., tanh) are selected instead of the more common rectified linear unit (ReLU) activation function to avoid introducing binary variables when the NN is reformulated as algebraic constraints.
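The reasoning behind the activation choice can be written out for a single neuron with pre-activation \hat{y} and output y. The block below is a sketch of one common big-M reformulation of ReLU, not necessarily the one used in any particular tool; the bounds L and U on \hat{y} and the binary variable \sigma are assumed symbols introduced here for illustration.

```latex
% ReLU neuron y = max(0, \hat{y}), assuming known bounds L \le \hat{y} \le U with L < 0 < U
\[
\begin{aligned}
y &\ge 0, & y &\ge \hat{y}, \\
y &\le U\sigma, & y &\le \hat{y} - L(1 - \sigma), \\
\sigma &\in \{0, 1\}.
\end{aligned}
\]
% tanh neuron: one smooth equality and no integer variable
\[
y = \tanh(\hat{y}), \qquad \frac{\mathrm{d}y}{\mathrm{d}\hat{y}} = 1 - y^{2}.
\]
```

With \sigma = 1 the ReLU constraints force y = \hat{y} \ge 0, and with \sigma = 0 they force y = 0 and \hat{y} \le 0, so every ReLU neuron contributes one binary variable. The tanh neuron is twice continuously differentiable, which keeps the problem a continuous NLP suitable for a solver such as Ipopt.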

Three PDE-constrained NMPC case studies are presented in this work. They are based on plug flow reactors (PFRs) of increasing complexity: a simple isothermal 1D PFR, a non-isothermal PFR, and a methane steam-reforming PFR. For each case study, the performance of embedding PINNs and PICNNs with OMLT is compared against that of embedding them as external functions. To embed them as external (greybox) functions, this study uses PyNumero, which provides an API in Pyomo that can accept the Jacobian and Hessian evaluations provided by PyTorch [6].
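As a rough illustration of what an external-function embedding has to supply to the solver, the sketch below evaluates a one-hidden-layer tanh network together with its input-output Jacobian and checks the result against central finite differences. It is deliberately dependency-free: the chain-rule Jacobian stands in for the derivatives that an AD library would return, and no PyNumero or PyTorch calls are shown. The weights and the function names are hypothetical.

```python
import math


def forward_and_jacobian(x, W1, b1, W2, b2):
    """Evaluate y = W2 tanh(W1 x + b1) + b2 and its Jacobian dy/dx.

    By the chain rule, dy/dx = W2 diag(1 - tanh(z)^2) W1 with z = W1 x + b1.
    """
    z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    h = [math.tanh(zi) for zi in z]
    y = [sum(w * hi for w, hi in zip(row, h)) + b for row, b in zip(W2, b2)]
    dh = [1.0 - hi * hi for hi in h]  # derivative of tanh at each hidden neuron
    jac = [
        [sum(W2[k][j] * dh[j] * W1[j][i] for j in range(len(h)))
         for i in range(len(x))]
        for k in range(len(y))
    ]
    return y, jac


def finite_difference_jacobian(x, W1, b1, W2, b2, eps=1e-6):
    """Approximate dy/dx with central differences, used only as a check."""
    jac = [[0.0] * len(x) for _ in range(len(b2))]
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        y_plus, _ = forward_and_jacobian(x_plus, W1, b1, W2, b2)
        y_minus, _ = forward_and_jacobian(x_minus, W1, b1, W2, b2)
        for k in range(len(b2)):
            jac[k][i] = (y_plus[k] - y_minus[k]) / (2.0 * eps)
    return jac


if __name__ == "__main__":
    # Hypothetical weights for a 2-input, 3-hidden-neuron, 1-output network.
    W1 = [[0.5, -0.3], [0.1, 0.8], [-0.7, 0.2]]
    b1 = [0.0, 0.1, -0.2]
    W2 = [[1.0, -0.5, 0.25]]
    b2 = [0.05]
    x = [0.3, -0.6]

    y, jac = forward_and_jacobian(x, W1, b1, W2, b2)
    jac_fd = finite_difference_jacobian(x, W1, b1, W2, b2)
    print("output          :", y)
    print("Jacobian        :", jac)
    print("finite diff.    :", jac_fd)
```

In this view the solver never sees the hidden-layer variables; it only requests the output and derivative values at each iterate, which is why no auxiliary variables need to be initialized.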

The results from this benchmarking study provide several insights. First, they demonstrate that it is not always computationally advantageous to replace mechanistic models in NMPC problems with NN surrogates, even if the mechanistic models are highly nonlinear. This stands in stark contrast to the majority of studies in the literature, which often use ReLU NN activation functions that give rise to mixed-integer formulations. Here, the use of smooth nonlinear NN activation functions provides little to no advantage over the original mechanistic equations when a local NLP solver (i.e., Ipopt) is used with a simultaneous discretization approach. Moreover, when an NN surrogate is used, superior performance is often achieved using PyTorch and PyNumero instead of the explicit algebraic reformulations behind OMLT. In part, this can be attributed to the difficulty of initializing the auxiliary variables and constraints introduced by the explicit algebraic reformulations.

References

[1] M. Tejeda-Iglesias, N. H. Lappas, C. E. Gounaris, and L. Ricardez-Sandoval, “Explicit model predictive controller under uncertainty: An adjustable robust optimization approach,” Journal of Process Control, vol. 84, pp. 115–132, Dec. 2019, doi: 10.1016/j.jprocont.2019.09.002.

[2] L. T. Biegler and D. M. Thierry, “Large-scale Optimization Formulations and Strategies for Nonlinear Model Predictive Control,” IFAC-PapersOnLine, vol. 51, no. 20, pp. 1–15, 2018, doi: 10.1016/j.ifacol.2018.10.167.

[3] P. Daoutidis et al., “Machine learning in process systems engineering: Challenges and opportunities,” Computers & Chemical Engineering, vol. 181, p. 108523, Feb. 2024, doi: 10.1016/j.compchemeng.2023.108523.

[4] R. Misener and L. Biegler, “Formulating data-driven surrogate models for process optimization,” Computers & Chemical Engineering, vol. 179, p. 108411, 2023.

[5] F. Ceccon et al., “OMLT: Optimization & Machine Learning Toolkit.” arXiv, Nov. 15, 2022. Accessed: Feb. 22, 2024. [Online]. Available: http://arxiv.org/abs/2202.02414

[6] J. S. Rodriguez, R. B. Parker, C. D. Laird, B. L. Nicholson, J. D. Siirola, and M. L. Bynum, “Scalable Parallel Nonlinear Optimization with PyNumero and Parapint,” INFORMS Journal on Computing, vol. 35, no. 2, pp. 509–517, Mar. 2023, doi: 10.1287/ijoc.2023.1272.

[7] Z. Hao et al., “Physics-Informed Machine Learning: A Survey on Problems, Methods and Applications.” arXiv, Mar. 06, 2023. Accessed: Mar. 07, 2024. [Online]. Available: http://arxiv.org/abs/2211.08064

[8] E. A. Antonelo, E. Camponogara, L. O. Seman, E. R. de Souza, J. P. Jordanou, and J. F. Hubner, “Physics-Informed Neural Nets for Control of Dynamical Systems.” arXiv, May 31, 2022. Accessed: Dec. 16, 2022. [Online]. Available: http://arxiv.org/abs/2104.02556

[9] Z. Zhang, “A physics-informed deep convolutional neural network for simulating and predicting transient Darcy flows in heterogeneous reservoirs without labeled data,” Journal of Petroleum Science and Engineering, vol. 211, p. 110179, Apr. 2022, doi: 10.1016/j.petrol.2022.110179.

[10] S. Jiang and V. M. Zavala, “Convolutional neural nets in chemical engineering: Foundations, computations, and applications,” AIChE Journal, vol. 67, no. 9, p. e17282, 2021.