(299f) Stability-Preserving Automatic Tuning of PID Control with Reinforcement Learning

Authors 

Chowdhury, M. A. (Presenter), Texas Tech University
Lu, Q. J., Texas Tech University
Lakhani, A., Texas Tech University
Proportional-integral-derivative (PID) controllers are widely used in the process industry, accounting for more than 80% of the market, owing to their simple design and effectiveness in controlling practical systems [1]. However, the control performance of a PID controller depends heavily on the tuning of its parameters. Conventional PID tuning techniques, such as Ziegler-Nichols, Cohen-Coon, and internal model control (IMC), are limited by system complexity and may not yield optimal PID values [2]. Moreover, the PID parameters often remain fixed after tuning and thus cannot adapt to time-varying dynamics. Adaptive PID methods based on, e.g., evolutionary optimization [3] and neural networks [4] can mitigate this non-adaptivity; however, they either demand prohibitive computation for online adaptation or lack a teaching signal for learning. As an online learning method, reinforcement learning (RL) has shown unique advantages in addressing these issues and has therefore attracted attention in the community for enabling online adaptive PID tuning [5-7]. Despite these rapid advancements, maintaining closed-loop stability during the RL search, a critical issue for the safe operation of practical systems, has not been thoroughly studied.

In this work, we present a stability-preserving automatic PID tuning approach based on RL [8], in particular the deep deterministic policy gradient (DDPG) algorithm [9]. Specifically, the PID parameters to be explored are treated as the RL action and the corresponding control performance as the reward. The RL agent learns the optimal PID parameters (actions) from its past action-reward interactions with the environment (the closed-loop system). In our DDPG scheme, the parameters of the actor and critic networks are updated once after every episode (i.e., one complete closed-loop step test). To ensure closed-loop stability within each episode, a baseline PID controller that yields stable responses is designed in advance. While the RL agent explores a set of PID parameters, the running reward, e.g., the tracking error accumulated so far, is monitored at every time step of the episode. As soon as it exceeds a threshold, the PID parameters are switched back to the baseline values as an early correction to prevent instability. The developed method is validated through setpoint tracking experiments on a second-order plus dead-time system. Simulation results show that with the stability-preserving strategy, closed-loop stability is maintained throughout the RL search and the optimal PID parameters are discovered efficiently. Moreover, the RL-based PID tuning adapts to changes in the process model automatically, without user intervention.
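To make the switching mechanism concrete, the Python sketch below runs a single episode of a discrete PID loop. It is a minimal illustration under stated assumptions, not the implementation from [8]: the toy first-order plant stands in for the second-order plus dead-time system used in the experiments, and the baseline gains and threshold are placeholder values.

def run_episode(kp, ki, kd, baseline=(1.0, 0.5, 0.05),
                threshold=5.0, setpoint=1.0, dt=0.1, steps=300):
    """One closed-loop step test with a stability-preserving fallback.

    The RL-proposed gains (kp, ki, kd) drive the loop until the running
    reward (here, the accumulated absolute tracking error) crosses
    `threshold`, at which point the gains are switched back to the stable
    baseline as an early correction. All numbers are illustrative.
    """
    y, integ, prev_err = 0.0, 0.0, 0.0
    running_cost, switched = 0.0, False
    tau = 2.0  # time constant of the toy first-order plant
    for _ in range(steps):
        err = setpoint - y
        integ += err * dt
        deriv = (err - prev_err) / dt
        u = kp * err + ki * integ + kd * deriv  # PID control law
        prev_err = err
        # Toy plant y' = (-y + u) / tau, integrated by forward Euler
        # (a stand-in for the paper's second-order plus dead-time system).
        y += dt * (-y + u) / tau
        running_cost += abs(err) * dt
        # Early correction: abandon the explored gains once the running
        # reward signals impending instability.
        if not switched and running_cost > threshold:
            kp, ki, kd = baseline
            switched = True
    # Episode reward for the RL agent: negative accumulated tracking error.
    return -running_cost, switched

An outer DDPG training loop (not shown) would call run_episode with the gains proposed by the actor network and use the returned reward to update the actor and critic once per episode, as described above.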

References

[1] H. Boubertakh, M. Tadjine, P.Y. Glorennec, and S. Labiod. Tuning fuzzy PD and PI controllers using reinforcement learning. ISA Transactions, 49:543–551, 2010.

[2] D.E. Seborg, T.F. Edgar, and D.A. Mellichamp. PID controller design, tuning, and troubleshooting. Process Dynamics and Control, 2:300–323, 2004.

[3] K. Zhou and L. Zhen. Optimal design of PID parameters by evolution algorithm. Journal of Huaqiao University (Natural Science), 26:85–88, 2005.

[4] J. Chen and T.C. Huang. Applying neural networks to on-line updated PID controllers for nonlinear process control. Journal of Process Control, 14:211–230, 2004.

[5] Y. Qin, W. Zhang, J. Shi, and J. Liu. Improve PID controller through reinforcement learning. In: 2018 IEEE CSAA Guidance, Navigation and Control Conference (CGNCC), pp. 1–6, 2018.

[6] W.J. Shipman and L.C. Coetzee. Reinforcement learning and deep neural networks for PI controller tuning. IFAC-PapersOnLine, 52:111–116, 2019.

[7] X.-S. Wang, Y.-H. Cheng, and S. Wei. A proposal of adaptive PID controller based on reinforcement learning. Journal of China University of Mining and Technology, 17:40–44, 2004.

[8] A. Lakhani, M. Chowdhury, and Q. Lu. Stability-preserving automatic tuning of PID control with reinforcement learning. Complex Engineering Systems, 2:3, 2022.

[9] D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, et al. Deterministic policy gradient algorithms. In: International Conference on Machine Learning, pp. 387–395, 2014.