(469b) Approximate Dynamic Programming Based Strategy for Optimal Blending of Linear Model Predictive Controllers

Authors 

Lee, J. M. - Presenter, Georgia Institute of Technology
Lee, J. H., Korea Advanced Institute of Science and Technology (KAIST)


Many chemical process control problems involve nonlinear dynamics. Conventional model-based control strategies require a global nonlinear model, which can be difficult to identify and computationally expensive to use on-line. An alternative is to design several linear controllers and combine them into a nonlinear control scheme. Linear controllers can be designed using different linear models around multiple operating points and/or different tuning parameters, such as penalties on the input moves. Although it has recently been proposed to build a piecewise affine (PWA) model and then apply, at each sample time, the state feedback gain designed for the regime the current state lies in, this approach can be limiting because it ignores the future evolution of the state and the dependence of the model matrices on it. Furthermore, tuning of linear model predictive controllers (MPC) has been largely heuristic and laborious. To properly account for the dynamic nature of a nonlinear system, one must add logical constraints to the optimal control formulation, resulting in a mixed-integer quadratic program (MIQP) that must be solved on-line.

In this presentation, we propose an approximate dynamic programming (ADP) based strategy to build a switching rule among multiple linear controllers. For example, if we use a model linearized around some operating point to design a linear MPC for a nonlinear plant, the fidelity of the model will depend on the state, and the input weighting needed to obtain optimal performance will also depend on the state. One can start with a set of linear MPCs designed for different operating regimes, and the suggested method 'schedules' them optimally by solving a dynamic program. The 'cost-to-go' function, which is the solution of the Bellman equation in DP, estimates the benefit of using a particular control policy from a given state of the system in terms of a discounted infinite sum of future costs. The proposed approach takes the following steps: 1) Perform closed-loop simulations (or collect operating data) using different linear MPCs for all operating regimes. 2) Record the states and the corresponding control laws for the data. 3) Iterate on the Bellman equation, where the minimization of the cost-to-go is performed with respect to all the available control policies in each iteration. In the iteration step, we will use two different approaches. The first assumes that a global nonlinear model is available and solves the Bellman equation with it; the converged cost-to-go function then maps a state to an optimal choice among the control policies. The second assumes that no global model is available at all. In this more realistic case, the cost-to-go function is constructed by mapping each state-action pair to a cost-to-go value. This allows the system's state transition rule to be identified implicitly and can provide an effective switching rule when a global model is difficult to identify but large amounts of closed-loop data collected with several different linear controllers are available.
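As a minimal sketch of the second (model-free) approach, the loop below iterates a sampled Bellman equation over recorded closed-loop data to approximate a state-action cost-to-go Q(x, i), where i indexes the linear MPC that was active, and then derives a switching rule by minimizing over the candidate policies. The data layout, synthetic data, stage cost, discount factor, and nearest-neighbour approximator are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Hypothetical closed-loop data, one record per sample time, gathered while
# running each candidate linear MPC (policy) on the plant.
# states[k]      : state x_k
# actions[k]     : index i of the linear MPC active at time k
# costs[k]       : observed stage cost (e.g. tracking error + input penalty)
# next_states[k] : successor state x_{k+1}
rng = np.random.default_rng(0)
n_samples, n_x, n_policies = 500, 2, 3
states = rng.normal(size=(n_samples, n_x))
actions = rng.integers(0, n_policies, size=n_samples)
costs = np.sum(states**2, axis=1)                      # placeholder stage cost
next_states = 0.9 * states + rng.normal(scale=0.1, size=(n_samples, n_x))

gamma = 0.95              # discount factor (assumed)
Q = np.zeros(n_samples)   # cost-to-go estimate at each stored (state, action) pair


def policy_values(x, Q):
    """Estimate Q(x, i) for each policy i by nearest-neighbour lookup in its data."""
    values = np.full(n_policies, np.inf)
    for i in range(n_policies):
        idx = np.where(actions == i)[0]
        if idx.size == 0:
            continue
        dists = np.linalg.norm(states[idx] - x, axis=1)
        values[i] = Q[idx[np.argmin(dists)]]
    return values


# Value iteration on the sampled Bellman equation:
#   Q(x_k, i_k) <- cost_k + gamma * min_i Q(x_{k+1}, i)
for _ in range(50):
    Q_new = np.array([costs[k] + gamma * policy_values(next_states[k], Q).min()
                      for k in range(n_samples)])
    if np.max(np.abs(Q_new - Q)) < 1e-6:
        Q = Q_new
        break
    Q = Q_new


def switch_policy(x, Q):
    """Switching rule: pick the linear MPC with the lowest estimated cost-to-go at x."""
    return int(np.argmin(policy_values(x, Q)))


print(switch_policy(np.array([0.5, -0.2]), Q))
```

In this sketch the minimization over policies is performed only on states that appear in the data, so richer closed-loop experiments (covering all regimes with each controller) directly improve the quality of the learned switching rule.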