(676f) A Bilevel Optimization Framework for Solving Active Learning Problems Using Physics Information | AIChE

Authors 

Durkin, A. - Presenter, Imperial College London
Liu, T., Rensselaer Polytechnic Institute
Dong, L., Imperial College London

Introduction

Physics-informed machine learning (PIML) combines physical prior knowledge, which abstracts natural behavior, with data-driven models [1]. It has emerged as an effective way to mitigate the shortage of training data and to ensure the physical plausibility of predictions [2]. From a process optimization perspective, embedding data-driven models into optimization problems is quickly becoming a state-of-the-art approach [3]. Among the alternatives for surrogate models, such as polynomial regression, regression trees, and Gaussian processes (GPs), neural networks are gaining widespread adoption, and physics-informed neural networks (PINNs) have recently been used for optimization as well [4]. The attractiveness of PINNs lies in how practically they integrate physics information: a physics component is added to the overall loss function during training, which promotes satisfaction of the first-principles equations of interest. In the PINN framework, all inputs and outputs relevant to the modelled system can be treated simultaneously during training, which allows complex multi-variable relationships to be incorporated into the learning process. However, in an active learning setting, collecting enough data to train PINNs reliably poses a challenge.

In many safety-critical industrial systems, critical constraints must be enforced with high confidence to guarantee process safety during optimization. GP-based models are attractive for surrogate optimization under these conditions because they quantify the uncertainty of their predictions; this uncertainty can be used both to enforce constraints probabilistically and to drive exploration during optimization. Owing to this property, Bayesian optimization (BO) based on GPs is widely adopted in process optimization where experimental evaluation is expensive and where deviations from optimal conditions during exploration are costly [5].

Compared to PINNs, “physics-informed BO” is still under-studied for process systems engineering applications. Yang et al. [6] generate low-fidelity data from physics information and then model the discrepancy between these low-fidelity data and high-fidelity data obtained from actual observations over a spatial domain. This is achieved using parameterized GPs whose hyperparameters are identified via optimization. For active learning problems, evaluating discrepancies by sampling over the complete search space is not practical; instead, pointwise corrections can address discrepancies between the physics and the ML model during the optimization process. Inspired by this idea, we use correction factors to adjust the ML model predictions and optimize their values to minimize the discrepancy between the ML model predictions and the physics information.

Methods

The physics-informed real-time optimization (RTO) problem was formulated as:

min f(t, x, p(x)) (1)

s.t. g(t, x, p(x)) ≤ 0 (2)

F(x, p(x)) = 0 (3)

where the functional forms of f and g are known and (3) describes the physics of the system with q equations. We introduce a vector of correction factors c such that $\lVert F(x, p(x, c)) \rVert_2^2$ is minimal. For example, (4) represents additive correction factors.

p(x, c) = μ(x) + c (4)

Equations (1)-(4) form the subproblem within a bilevel RTO problem based on GP uncertainty intervals, as in the ARTEO framework [7]:

min f(t, x, p(x, c*)) - zσ(x) (5)

s.t. g(t, x, p(x, c*)) + βσ(x) ≤ 0 (6)

c* ∈ C(x) (7)

where C(x) is the set of solutions of the reconciliation problem, i.e. the minimizers of (8) at the given input x:

$h(x, c) = \sum_{i=1}^{q} F_i^2 + w^T \lVert c \rVert_2^2$ (8)

where w is a vector of constant weights. The reconciliation problem is unconstrained, but it may be nonconvex depending on the form of the system of equations describing the physics in (3). We thus cast the reconciliation of GP models with physics information as an optimization problem whose solution can be seen as part of a physics-reconciled inference scheme at a given input. The scheme applies to any supervised ML regression model that provides a mean µ and a standard deviation σ.
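
As a rough illustration of the bilevel structure in (5)-(8), the following Python sketch solves a toy instance. Everything in it is an assumption made for illustration and not taken from the study: the functions gp_mean and gp_std stand in for a trained GP, the physics is a single linear residual so that the inner problem has a closed-form minimizer, the regularization term in (8) is read as the weighted sum of squared corrections, the objective and constraint are hypothetical, and the outer problem is solved by a plain grid search.

```python
# Toy bilevel RTO sketch. All functions and numbers are hypothetical.

A = (1.0, -1.0)  # coefficients of the assumed linear residual F = A . p - x


def gp_mean(x):
    """Stand-in for the GP posterior mean of two outputs p = (p1, p2)."""
    return [2.0 * x + 0.3, 1.0 * x - 0.2]


def gp_std(x):
    """Stand-in for the GP posterior standard deviation."""
    return 0.05 + 0.02 * abs(x - 1.0)


def physics_residual(x, p):
    """Assumed physics in the form of (3): F(x, p) = p1 - p2 - x = 0."""
    return A[0] * p[0] + A[1] * p[1] - x


def reconcile(x, w):
    """Inner problem: minimise F(x, mu + c)^2 + sum_j w_j c_j^2 over c.

    For a single linear residual the stationarity conditions give
    c_j = -r0 * a_j / (w_j * (1 + K)) with K = sum_j a_j^2 / w_j,
    where r0 is the residual of the uncorrected mean.
    """
    r0 = physics_residual(x, gp_mean(x))
    k = sum(a * a / wj for a, wj in zip(A, w))
    return [-r0 * a / (wj * (1.0 + k)) for a, wj in zip(A, w)]


def corrected_prediction(x, w):
    """Additive correction as in (4): p(x, c*) = mu(x) + c*."""
    return [m + cj for m, cj in zip(gp_mean(x), reconcile(x, w))]


def outer_solve(w, z=1.0, beta=2.0, n_grid=201):
    """Outer problem in the form of (5)-(6), solved by grid search on [0, 2]."""
    best_x, best_val = None, float("inf")
    for i in range(n_grid):
        x = 2.0 * i / (n_grid - 1)
        p = corrected_prediction(x, w)
        sigma = gp_std(x)
        objective = (p[0] - 3.0) ** 2 - z * sigma    # hypothetical f, form of (5)
        constraint = (p[1] - 1.2) + beta * sigma     # hypothetical g, form of (6)
        if constraint <= 0.0 and objective < best_val:
            best_x, best_val = x, objective
    return best_x, best_val


if __name__ == "__main__":
    weights = (0.1, 0.1)
    print("corrections at x = 1:", reconcile(1.0, weights))
    print("outer solution:", outer_solve(weights))
```

For a nonlinear F the inner problem generally has no closed form and would need a numerical solver at every outer iterate, which is where the additional computational effort of the bilevel formulation comes from.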

Results

We implemented the proposed PI-CoF approach in an RTO setting for the simulated operation of a combined heat and power system composed of multiple parallel stacks of fuel cells (FCs) using hydrogen as a fuel. There are five FC units in operation, each with its own power control system. The FCs exhibit different performance characteristics for electric power generation and thermal load at a given hydrogen flow rate, and these characteristics are assumed to be unknown to the optimizer in this case study. The system receives a total electric power output reference to follow, and a total thermal load limit, based on the sum of the individual thermal loads of the stacks, must be respected. The objective of the optimizer is to deliver the desired total electric power output while minimizing fuel consumption and staying below the thermal load limit. The control system architecture is implemented in MATLAB/Simulink together with the underlying dynamic equations and controllers for the simulation of the FCs.

The degrees of freedom for the RTO problem are the individual electric power output setpoints provided to the FC power controllers. Therefore, to implement the RTO problem in the PI-CoF framework, we take the electric power output as the input variable and the resulting hydrogen consumption and thermal load as the outputs for each cell. The physics information used in this case study is the satisfaction of the energy balance based on the combined enthalpy of the reactions taking place in the FCs. The algorithm is executed with a sampling interval of 250 s and a constant total power output reference of 360 kWe for a duration of 10000 s. For comparison, the default ARTEO algorithm without the PI-CoF corrections is run with the same settings.
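
The physics information for this case study can be illustrated with a small Python sketch. It assumes, purely for illustration, that the energy balance of each stack takes the form ΔH · n_H2 = P_el + Q_th, with ΔH taken as the lower heating value of hydrogen (about 241.8 kJ/mol); the exact balance used in the study is not given here. The setpoints and GP mean predictions are hypothetical numbers, chosen only so that the setpoints sum to the 360 kWe reference, and the regularization term in (8) is again read as a weighted sum of squared corrections.

```python
# Hypothetical energy-balance residuals for five fuel cell stacks.

DELTA_H_KJ_PER_MOL = 241.8  # assumed: lower heating value of hydrogen

# Sampling interval and duration from the case study: 10000 s / 250 s.
N_RTO_STEPS = 10000 // 250

# Hypothetical setpoints (kWe) and GP mean predictions for each stack.
POWER_KW = [80.0, 70.0, 75.0, 65.0, 70.0]
MU_H2 = [0.66, 0.60, 0.63, 0.55, 0.58]        # hydrogen flow, mol/s
MU_THERMAL = [82.0, 71.0, 74.0, 66.0, 72.0]   # thermal load, kW


def energy_balance_residuals(power_kw, h2_mol_s, thermal_kw):
    """Assumed physics F_i = dH * n_H2,i - P_el,i - Q_th,i for each stack (kW)."""
    return [
        DELTA_H_KJ_PER_MOL * h - p - q
        for p, h, q in zip(power_kw, h2_mol_s, thermal_kw)
    ]


def reconciliation_objective(power_kw, mu_h2, mu_thermal, c, w):
    """Objective in the form of (8), with c stacked as [c_h2 (5), c_thermal (5)]."""
    n = len(power_kw)
    h2 = [m + cj for m, cj in zip(mu_h2, c[:n])]
    thermal = [m + cj for m, cj in zip(mu_thermal, c[n:])]
    residuals = energy_balance_residuals(power_kw, h2, thermal)
    physics_term = sum(r * r for r in residuals)
    penalty_term = sum(wj * cj * cj for wj, cj in zip(w, c))
    return physics_term + penalty_term


if __name__ == "__main__":
    zero_corrections = [0.0] * 10
    unit_weights = [1.0] * 10
    print("RTO executions:", N_RTO_STEPS)
    print("residuals (kW):",
          energy_balance_residuals(POWER_KW, MU_H2, MU_THERMAL))
    print("h at c = 0:",
          reconciliation_objective(POWER_KW, MU_H2, MU_THERMAL,
                                   zero_corrections, unit_weights))
```

With two corrected outputs per stack, the sketch has ten correction factors, which matches the ten inner-problem variables of this case study. Because hydrogen flow and thermal load differ in units and magnitude, the weights would likely need problem-specific scaling in practice.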

The results of the simulations are shown in Fig. 1A-C, which display the evolution of the total power output, the total thermal load, and the cumulative hydrogen consumption, respectively. The results indicate improved performance with the PI-CoF formulation: the PI-CoF version of the algorithm provides a more consistent total power output and consumes less hydrogen. Although both algorithms converge to the same hydrogen consumption rate, as indicated by the slope in Fig. 1C, while maintaining the same total power output, the PI-CoF version reaches that solution earlier and with minimal exploration. The algorithm without access to physics information carries out additional exploration at the cost of additional fuel consumption. The presented scenarios took around 180 s to simulate for the PI-CoF version and around 72 s for the base version of the algorithm, a factor of 2.5 increase in computational effort that is incurred by solving the inner optimization problem. It is difficult to generalize from this case study, which has only 5 decision variables for the outer problem and 10 for the inner problem. However, RTO problems are generally solved over longer periods, and the increase in computational load observed here is expected to be tolerable in exchange for increased safety and efficiency.

Conclusion

The regularization approach we introduced is powerful in controlling the size of the corrections and is intuitive when used with GP models. However, the physics may be satisfied through physically implausible corrections. This can be avoided by using problem-specific insights to select the weights for the correction factors or by introducing constraints to the inner problem, although the latter will further increase the computational burden of the bilevel optimization formulation. In the constrained BO context, or more specifically the safe BO context, the impact of the correction factors on the predicted uncertainties used for constraint satisfaction is also an open problem requiring further investigation. In our implementations, we treated the correction factors as shifts of the GP mean predictions and assumed that the uncertainty quantification around the mean remains valid.

References

[1] G. E. Karniadakis, I. G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, and L. Yang, “Physics-informed machine learning,” Nature Reviews Physics, vol. 3, no. 6, pp. 422–440, 2021.

[2] S. Cuomo, V. S. Di Cola, F. Giampaolo, G. Rozza, M. Raissi, and F. Piccialli, “Scientific machine learning through physics–informed neural networks: Where we are and what’s next,” Journal of Scientific Computing, vol. 92, no. 3, p. 88, 2022.

[3] R. Misener and L. Biegler, “Formulating data-driven surrogate models for process optimization,” Computers & Chemical Engineering, vol. 179, p. 108411, 2023.

[4] E. S. Koksal and E. Aydin, “Physics informed piecewise linear neural networks for process optimization,” Computers & Chemical Engineering, vol. 174, p. 108244, 2023.

[5] D. Krishnamoorthy and F. J. Doyle, “Model-free real-time optimization of process systems using safe Bayesian optimization,” AIChE Journal, vol. 69, no. 4, p. e17993, 2023.

[6] X. Yang, D. Barajas-Solano, G. Tartakovsky, and A. M. Tartakovsky, “Physics-informed CoKriging: A Gaussian-process-regression-based multifidelity method for data-model convergence,” Journal of Computational Physics, vol. 395, pp. 410–431, 2019.

[7] B. S. Korkmaz, M. Zagorowska, and M. Mercangoz, “Safe and adaptive decision-making for optimization of safety-critical systems: The ARTEO algorithm,” arXiv preprint arXiv:2211.05495, 2022.