(15b) A Bayesian Optimization Framework for Interconnected Systems | AIChE

Authors 

Gonzalez, L. - Presenter, University of Wisconsin-Madison
Zavala, V., University of Wisconsin-Madison
Optimization of complex systems often relies on data-driven modeling tools (e.g., direct search, neural networks, reinforcement learning, and Bayesian optimization) that treat the entire system as a black-box function f(x) [1, 2, 3]. This treatment makes algorithms generalizable and simple to implement, as all the user needs to provide is input/output data, and it has dramatically increased the range of problems that can be solved with entirely data-driven methodologies [4]. However, it also oversimplifies the system: any knowledge of its internal structure (e.g., physical laws, constraints, and connectivity) is ignored. Previous work has shown that exploiting such structural knowledge can effectively constrain the search space and improve algorithm performance, resulting in lower sampling requirements and higher-quality solutions [5, 6]. However, determining the best way to exploit structural information can be challenging and can increase the complexity of the algorithm.

Many of the algorithms that make use of structural knowledge, such as physics-informed neural networks [8] and multi-fidelity Bayesian optimization (BO) [9], rely on a “grey-box” modeling paradigm, in which a known model structure is combined with a data-driven subcomponent to generate the system model [7]. This can be expressed as a composition of functions f(y(x)), where f is a known function and y(x) is an unknown, vector-valued function that describes the behavior of internal system subcomponents. Since f is known, the modeling task shifts to obtaining a data-driven model of y(x). As a result, f(y(x)) does not have to be re-estimated whenever y(x) changes and can be re-optimized using traditional methods. This decomposition also reduces sampling requirements by exploiting the structure of f to set targets for y(x) [11], and it allows for the consideration of constraints, which can be encapsulated in y(x) [12]. However, this modeling flexibility comes at a price: increasing the granularity of y(x) increases the complexity of the algorithm needed to drive the system toward an optimum of f(y(x)) [10].
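To make the grey-box idea concrete, the following minimal sketch models a hypothetical two-component system: instead of fitting a surrogate directly to observations of f(y(x)), one Gaussian process is fit to each component of y(x), and predictions are pushed through the known outer function f. The kernel, hyperparameters, the outer function f, and the toy system response y_true are illustrative assumptions, not the process model from the case study.

```python
import numpy as np

def rbf(A, B, ell=0.3, sf=1.0):
    # Squared-exponential kernel between the rows of A and B.
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return sf**2 * np.exp(-0.5 * d2 / ell**2)

class GP:
    """Minimal GP regressor (zero mean, RBF kernel, fixed hyperparameters)."""
    def __init__(self, X, y, noise=1e-6):
        self.X = X
        K = rbf(X, X) + noise * np.eye(len(X))
        self.L = np.linalg.cholesky(K)
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, y))

    def predict(self, Xs):
        Ks = rbf(Xs, self.X)
        mu = Ks @ self.alpha                      # posterior mean
        v = np.linalg.solve(self.L, Ks.T)
        var = np.clip(np.diag(rbf(Xs, Xs)) - np.sum(v**2, 0), 1e-12, None)
        return mu, var                            # posterior variance (diagonal)

# Known outer function (hypothetical): combines two subsystem outputs.
def f(y):                                         # y has shape (..., 2)
    return y[..., 0]**2 + 2.0 * y[..., 1]

# Unknown inner, vector-valued system response (only sampled, never modeled).
def y_true(x):
    return np.column_stack([np.sin(3 * x[:, 0]), np.cos(2 * x[:, 0])])

X = np.linspace(0.05, 0.95, 8)[:, None]           # sampled inputs
Y = y_true(X)                                     # observed subsystem outputs
gps = [GP(X, Y[:, j]) for j in range(2)]          # one GP per component of y(x)

Xs = np.linspace(0, 1, 201)[:, None]
mu = np.column_stack([g.predict(Xs)[0] for g in gps])
pred = f(mu)                                      # grey-box prediction f(mu_y(x))
```

Because f is evaluated exactly, only the (typically smoother) subsystem responses need to be learned from data, which is the source of the sampling savings discussed above.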

In this work, we propose a variant of the partition-based BO algorithm presented in [13] that enables the exploitation of the composite structure f(y(x)). This allows us to efficiently scale the BO algorithm to handle large, interconnected systems. Our approach differs from the methods presented in [11, 12] in that we obtain an analytical form of the acquisition function (AF). This is achieved via linearization of the composite function f(y(x)) in the neighborhood of point values of y(x). We use a Gaussian process (GP) to develop the data-driven model of y(x); the linearization then yields explicit formulas for the mean performance and uncertainty of the linearized model of f(y(x)), which are combined to obtain the AF. We show how this simple linearization idea allows us to obtain a data-driven model of y(x) that exploits the interconnected structure of complex processes. We test our algorithm on a case study involving the optimization of a complex process for the recovery of nutrients from wastewater using cyanobacteria, and we compare its performance against a variety of methods from the literature.
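The linearization step admits a short numerical sketch. If the GP posterior of y(x) at a candidate x has mean mu_y and covariance Sigma_y, linearizing f around mu_y makes f(y(x)) approximately Gaussian with mean f(mu_y) and variance g^T Sigma_y g, where g is the gradient of f at mu_y; these two quantities combine into a closed-form AF. The UCB-style AF, the toy quadratic f, and the posterior values below are illustrative assumptions, not the algorithm from the paper.

```python
import numpy as np

# Known outer function (toy objective to maximize, hypothetical).
def f(y):
    return -(y[0] - 0.5)**2 - (y[1] + 0.2)**2

def grad_f(y, h=1e-6):
    # Central finite differences; an analytical gradient works equally well.
    g = np.zeros_like(y)
    for i in range(len(y)):
        e = np.zeros_like(y)
        e[i] = h
        g[i] = (f(y + e) - f(y - e)) / (2 * h)
    return g

def linearized_ucb(mu_y, Sigma_y, kappa=2.0):
    """Analytical UCB for f(y(x)) after linearizing f around the GP mean.

    With f(y) ~ f(mu_y) + g^T (y - mu_y), the composite f(y(x)) is Gaussian
    with mean f(mu_y) and variance g^T Sigma_y g, so the AF is explicit.
    """
    g = grad_f(mu_y)
    mean = f(mu_y)
    var = float(g @ Sigma_y @ g)
    return mean + kappa * np.sqrt(max(var, 0.0))

# Hypothetical GP posterior at one candidate x (independent outputs assumed).
mu_y = np.array([0.4, -0.1])
Sigma_y = np.diag([0.02, 0.05])
af = linearized_ucb(mu_y, Sigma_y)   # af = -0.02 + 2*sqrt(0.0028) ≈ 0.0858
```

Because the AF is an explicit formula in mu_y and Sigma_y, it can be maximized over x with standard gradient-based solvers, avoiding the Monte Carlo sampling over the GP posterior used in [11, 12].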

[1] Robert Hooke and T. A. Jeeves. “Direct Search” solution of numerical and statistical problems. Journal of the ACM, 8(2):212-229, 1961.

[2] Jaron C. Thompson, Victor M. Zavala, and Ophelia S. Venturelli. Integrating a tailored recurrent neural network with Bayesian experimental design to optimize microbial community functions. bioRxiv preprint 2022.11.12.516271, 2022.

[3] Ayub I. Lakhani, Myisha A. Chowdhury, and Qiugang Lu. Stability-preserving automatic tuning of PID control with reinforcement learning. arXiv preprint arXiv:2112.15187, 2022.

[4] Andrew R. Conn, Katya Scheinberg, and Luis N. Vicente. Introduction to derivative-free optimization. SIAM, 2009.

[5] Qiugang Lu, Leonardo D. González, Ranjeet Kumar, and Victor M. Zavala. Bayesian optimization with reference models: A case study in MPC for HVAC central plants. Computers & Chemical Engineering, 154:107491, 2021.

[6] Jie Zhang, Søren D. Petersen, Tijana Radivojevic, Andrés Ramirez, Andrés Pérez-Manríquez, Eduardo Abeliuk, Benjamín J. Sánchez, Zak Costello, Yu Chen, Michael J. Fero, et al. Combining mechanistic and machine learning models for predictive engineering and optimization of tryptophan metabolism. Nature Communications, 11(1):1-13, 2020.

[7] Raul Astudillo and Peter I. Frazier. Thinking inside the box: A tutorial on grey-box Bayesian optimization. arXiv preprint arXiv:2201.00272, 2022.

[8] Maziar Raissi, Paris Perdikaris, and George E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019.

[9] Kirthevasan Kandasamy, Gautam Dasarathy, Jeff Schneider, and Barnabás Póczos. Multi-fidelity Bayesian optimisation with continuous approximations. In Proceedings of the 34th International Conference on Machine Learning, pages 1799-1808. PMLR, 2017.

[10] Dominik Goldstein, Mathis Heyer, Dion Jakobs, Eduardo S. Schultz, and Lorenz T. Biegler. Multilevel surrogate modeling of an amine scrubbing process for CO2 capture. AIChE Journal, 68(6):e17705, 2022.

[11] Raul Astudillo and Peter I. Frazier. Bayesian optimization of composite functions. In Proceedings of the 36th International Conference on Machine Learning, pages 354-363. PMLR, 2019.

[12] Joel A. Paulson and Congwen Lu. COBALT: COnstrained Bayesian optimizAtion of computationally expensive grey-box models exploiting derivaTive information. Computers & Chemical Engineering, 160:107700, 2022.

[13] Leonardo D. González and Victor M. Zavala. New paradigms for exploiting parallel experiments in Bayesian optimization. Computers & Chemical Engineering, 170:108110, 2023.