(299f) Stability-Preserving Automatic Tuning of PID Control with Reinforcement Learning
2022 AIChE Annual Meeting
Computing and Systems Technology Division
Advances in Process Control II
Tuesday, November 15, 2022 - 2:05pm to 2:24pm
In this work, we present a stability-preserving automatic PID tuning approach based on reinforcement learning (RL) [8], in particular the deep deterministic policy gradient (DDPG) algorithm [9]. Specifically, the PID parameters to be explored are treated as the RL action and the corresponding control performance as the reward. The RL agent learns the optimal PID parameters (actions) from its past action-reward interactions with the environment (the closed-loop system). In our DDPG scheme, the parameters of the actor and critic networks are updated once after every episode (i.e., one complete closed-loop step test). To preserve closed-loop stability in each episode, a baseline PID controller that yields stable responses is designed in advance. While the RL agent explores a set of PID parameters, the running reward, e.g., the tracking error accumulated so far, is monitored at every time step of the episode. As soon as it exceeds a threshold, the PID parameters are switched back to the baseline values as an early correction to prevent instability. The developed method is validated through setpoint tracking experiments on a second-order plus dead-time (SOPDT) system. Simulation results show that, with the stability-preserving strategy, closed-loop stability is maintained throughout the RL search and the optimal PID parameters are discovered efficiently. Moreover, the RL-based PID tuning adapts to changes in the process model automatically, without user intervention.
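As a minimal illustration of the stability-preserving mechanism described above, the Python sketch below simulates one closed-loop step-test episode of an assumed SOPDT plant under PID control and switches back to the baseline gains once the running tracking error crosses a threshold. All plant parameters, the threshold value, and the names (run_episode, etc.) are illustrative assumptions, not taken from the paper, and the DDPG actor/critic update that the paper applies once per episode is omitted.

# Minimal sketch of one stability-preserving tuning episode.
# All plant parameters, the threshold, and the names below are
# illustrative assumptions; the per-episode DDPG update is omitted.

K, TAU1, TAU2 = 1.0, 5.0, 2.0   # assumed SOPDT gain and time constants
THETA, DT = 1.0, 0.1            # assumed dead time and step size [s]
DELAY_STEPS = int(THETA / DT)

def run_episode(gains, baseline_gains, setpoint=1.0,
                horizon=600, error_threshold=80.0):
    """Run one closed-loop step test with candidate PID gains.

    The running reward (accumulated |tracking error|) is checked at
    every time step; once it crosses error_threshold, the PID gains
    are switched back to the stabilizing baseline values for the rest
    of the episode, as an early correction against instability.
    """
    kp, ki, kd = gains
    x1 = x2 = 0.0                       # plant states; output y = x2
    u_buffer = [0.0] * DELAY_STEPS      # dead-time buffer on the input
    integral = 0.0
    prev_error = setpoint - x2
    accumulated_error, switched = 0.0, False

    for _ in range(horizon):
        error = setpoint - x2
        integral += error * DT
        derivative = (error - prev_error) / DT
        prev_error = error
        u = kp * error + ki * integral + kd * derivative

        # dead time: the plant sees the input from THETA seconds ago
        u_buffer.append(u)
        u_delayed = u_buffer.pop(0)

        # forward-Euler step of the second-order dynamics
        x1 += DT * (K * u_delayed - x1) / TAU1
        x2 += DT * (x1 - x2) / TAU2

        # stability-preserving switch on the running reward
        accumulated_error += abs(error) * DT
        if not switched and accumulated_error > error_threshold:
            kp, ki, kd = baseline_gains  # fall back to safe tuning
            switched = True

    return -accumulated_error, switched  # reward seen by the RL agent

# Example: evaluate gains proposed by the DDPG actor against a
# conservative, known-stable baseline tuning (values hypothetical).
baseline = (0.5, 0.05, 0.0)
candidate = (4.0, 1.5, 0.2)
reward, fell_back = run_episode(candidate, baseline)
print(f"episode reward: {reward:.2f}, baseline fallback: {fell_back}")

In the paper's scheme, the returned reward would be stored with the explored gains and used to update the actor and critic networks once per episode; the sketch only reproduces the in-episode monitoring and fallback logic.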
References
[1] H. Boubertakh, M. Tadjine, P.Y. Glorennec, and S. Labiod. Tuning fuzzy PD and PI controllers using reinforcement learning. ISA Transactions, 49:543–551, 2010.
[2] D.E. Seborg, T.F. Edgar, and D.A. Mellichamp. PID controller design, tuning, and troubleshooting. Process Dynamics and Control, 2:300–323, 2004.
[3] K. Zhou and L. Zhen. Optimal design of PID parameters by evolution algorithm. Journal of Huaqiao University (Natural Science), 26:85–88, 2005.
[4] J. Chen and T.C. Huang. Applying neural networks to on-line updated PID controllers for nonlinear process control. Journal of Process Control, 14:211–230, 2004.
[5] Y. Qin, W. Zhang, J. Shi, and J. Liu. Improve PID controller through reinforcement learning. In: 2018 IEEE CSAA Guidance, Navigation and Control Conference (CGNCC), pp. 1–6, 2018.
[6] W.J. Shipman and L.C. Coetzee. Reinforcement learning and deep neural networks for PI controller tuning. IFAC-PapersOnLine, 52:111–116, 2019.
[7] X.-S. Wang, Y.-H. Cheng, and S. Wei. A proposal of adaptive PID controller based on reinforcement learning. Journal of China University of Mining and Technology, 17:40–44, 2004.
[8] A. Lakhani, M. Chowdhury, and Q. Lu. Stability-preserving automatic tuning of PID control with reinforcement learning. Complex Engineering Systems, 2:3, 2022.
[9] D. Silver, G. Lever, N. Heess, T. Degris, D. Wierstra, et al. Deterministic policy gradient algorithms. In: International Conference on Machine Learning, pp. 387–395, 2014.