
(373v) Input Convex LSTM for Fast Machine Learning-Based Optimization

Authors 

Wang, W. - Presenter, National University of Singapore
Wang, Z., National University of Singapore
Wu, Z., University of California Los Angeles
Traditional Model Predictive Control (MPC) relies heavily on first-principles models, whose development is resource-intensive. In the current era of big data, data-driven deep learning approaches have emerged as promising alternatives to these conventional models in MPC formulations [1, 2, 3, 4]. However, using a standard neural network to capture system dynamics introduces non-convexity into the MPC optimization problem, which often leads to suboptimal local solutions [9]. Input Convex Neural Networks (ICNNs) have been applied successfully to optimization problems such as MPC, preserving convexity of the optimization problem and thereby enabling globally optimal solutions [5, 6, 7]. Nevertheless, current ICNN architectures suffer from challenges such as exploding gradients, which limits their effectiveness as deep neural networks for complex tasks. Moreover, existing neural network-based optimization problems, whether built on conventional neural networks or on ICNNs, converge more slowly than optimization problems based on first-principles models.
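To make the convexity argument concrete, the sketch below shows one way an input convex network can be built in the spirit of Amos et al. [5]: the weights acting on hidden states are kept non-negative and the activation is convex and non-decreasing, so the scalar output is convex in the input. This is a minimal, illustrative PyTorch sketch; the class name, layer sizes, and the choice of softplus are assumptions, not the architecture used in this work.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """Minimal fully input convex network sketch, after Amos et al. [5].

    Convexity in the input y is preserved because the weights acting on the
    hidden state z are constrained to be non-negative and the activation
    (softplus) is convex and non-decreasing.
    """

    def __init__(self, input_dim, hidden_dim, num_layers=3):
        super().__init__()
        # W_y^{(k)}: unconstrained "passthrough" weights on the raw input y
        self.Wy = nn.ModuleList(
            [nn.Linear(input_dim, hidden_dim) for _ in range(num_layers)]
        )
        # W_z^{(k)}: weights on the previous hidden state; must stay >= 0
        self.Wz = nn.ModuleList(
            [nn.Linear(hidden_dim, hidden_dim, bias=False) for _ in range(num_layers - 1)]
        )
        self.out = nn.Linear(hidden_dim, 1)  # output weights also kept >= 0

    def clamp_weights(self):
        """Project the z-weights onto the non-negative orthant after each optimizer step."""
        for layer in self.Wz:
            layer.weight.data.clamp_(min=0.0)
        self.out.weight.data.clamp_(min=0.0)

    def forward(self, y):
        z = F.softplus(self.Wy[0](y))          # z_1 = sigma(W_y^{(0)} y + b_0)
        for Wy_k, Wz_k in zip(self.Wy[1:], self.Wz):
            z = F.softplus(Wz_k(z) + Wy_k(y))  # z_{k+1} = sigma(W_z^{(k)} z_k + W_y^{(k)} y + b_k)
        return self.out(z)                     # scalar output, convex in y
```

In training, `clamp_weights()` would be called after each optimizer step; a non-negative reparameterization of the raw weights (e.g., squaring or softplus) is a common alternative.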

In this study, we introduce a novel member of the ICNN family, termed the Input Convex Long Short-Term Memory (ICLSTM) network. The primary goal of this extension is to improve the overall performance of machine learning-based optimization (e.g., neural network-based MPC) by addressing the convergence runtime and exploding gradient issues observed in current ICNNs. The proposed ICLSTM design is first proven to be input convex. We then incorporate the ICLSTM into the MPC optimization problem, which is shown to remain a convex optimization problem. Through a simulation study of a nonlinear chemical reactor, we observed mitigation of the exploding gradient problem and a reduction in convergence time: relative to baseline plain RNN, plain LSTM, and Input Convex Recurrent Neural Network (ICRNN) models, convergence time decreased by 46.7%, 31.3%, and 20.2%, respectively. These results highlight the efficacy of the proposed Input Convex LSTM in overcoming challenges associated with current neural network-based MPC.
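As a rough illustration of how a trained input convex surrogate could be embedded in the MPC optimization problem, the sketch below optimizes a control sequence by projected gradient descent through the model. The model signature, horizon, cost, and solver choice are illustrative assumptions rather than the formulation used in this study; they only indicate where convexity of the surrogate pays off, since the resulting problem has no spurious local minima when the cost-model composition preserves convexity in the inputs.

```python
import torch

def mpc_step(model, x0, horizon=10, n_iters=200, lr=0.05, u_min=-1.0, u_max=1.0):
    """Illustrative MPC step: optimize a control sequence through a trained
    convex surrogate (e.g., an ICLSTM) with projected gradient descent.
    The surrogate is assumed to map (initial state, input sequence) to a
    predicted state trajectory; this signature is a placeholder.
    """
    for p in model.parameters():          # the surrogate is fixed during MPC
        p.requires_grad_(False)

    u_seq = torch.zeros(horizon, 1, requires_grad=True)   # decision variables
    opt = torch.optim.Adam([u_seq], lr=lr)

    for _ in range(n_iters):
        opt.zero_grad()
        # Roll the surrogate over the horizon: (1, nx), (1, horizon, nu) -> (1, horizon, nx)
        x_pred = model(x0.unsqueeze(0), u_seq.unsqueeze(0)).squeeze(0)
        # Convex stage cost (tracking the origin plus a small control penalty);
        # in a convex MPC formulation the cost/model composition must preserve convexity in u.
        cost = (x_pred ** 2).sum() + 0.01 * (u_seq ** 2).sum()
        cost.backward()
        opt.step()
        with torch.no_grad():
            u_seq.clamp_(u_min, u_max)    # project onto the input constraints

    return u_seq.detach()[0]              # apply only the first control action (receding horizon)
```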

References:

[1] N. Lanzetti, Y. Z. Lian, A. Cortinovis, L. Dominguez, M. Mercangöz, and C. Jones, “Recurrent neural network based MPC for process industries,” presented at the 2019 18th European Control Conference (ECC), IEEE, 2019, pp. 1005–1010.

[2] M. J. Ellis and V. Chinde, “An encoder–decoder LSTM-based EMPC framework applied to a building HVAC system,” Chem. Eng. Res. Des., vol. 160, pp. 508–520, 2020.

[3] N. Sitapure and J. S.-I. Kwon, “Neural network-based model predictive control for thin-film chemical deposition of quantum dots using data from a multiscale simulation,” Chem. Eng. Res. Des., vol. 183, pp. 595–607, 2022.

[4] Y. Zheng, X. Wang, and Z. Wu, “Machine learning modeling and predictive control of the batch crystallization process,” Ind. Eng. Chem. Res., vol. 61, no. 16, pp. 5578–5592, 2022.

[5] B. Amos, L. Xu, and J. Z. Kolter, “Input convex neural networks,” presented at the International Conference on Machine Learning, PMLR, 2017, pp. 146–155.

[6] Y. Chen, Y. Shi, and B. Zhang, “Input convex neural networks for optimal voltage regulation,” arXiv preprint arXiv:2002.08684, 2020.

[7] S. Yang and B. W. Bequette, “Optimization-based control using input convex neural networks,” Comput. Chem. Eng., vol. 144, p. 107143, 2021.