(522h) Constrained Reinforcement Learning for Process Optimization and Control

Authors 

del Rio Chanona, A. - Presenter, Imperial College London
Petsagkourakis, P., University College London
Bradford, E., NTNU
Sandoval Cardenas, I. O., Imperial College London
Galvanin, F., University College London
The optimization of chemical processes presents distinctive challenges to the stochastic systems community, since these processes suffer from three conditions:

  • There is no precisely known model for most industrial-scale processes (plant-model mismatch), leading to inaccurate predictions and convergence to suboptimal solutions.
  • The process is affected by disturbances (i.e. it is stochastic).
  • State constraints must be satisfied due to operational and safety concerns; constraint violations can therefore be detrimental or even dangerous.

To solve the above problems, we propose a Reinforcement Learning (RL) Policy Gradient method, which satisfies chance constraints with probabilistic guarantees.
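As a point of reference (the notation below is introduced here for illustration and is not taken from the abstract), the setting can be summarized as a chance-constrained policy optimization problem of the form

\max_{\theta} \; \mathbb{E}_{\tau \sim \pi_\theta}\!\left[ \sum_{t=0}^{T} R(x_t, u_t) \right] \quad \text{s.t.} \quad \mathbb{P}\!\left[\, g_j(x_t) \le 0, \;\; \forall j \in \{1,\dots,n_g\}, \; \forall t \in \{0,\dots,T\} \,\right] \ge 1 - \alpha,

where \pi_\theta is the parametrized feedback policy, x_t and u_t are the state and control at time t, the g_j are the state constraints, and 1 - \alpha is the required joint satisfaction probability.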

Machine learning is helping to address complex problems in the chemical and process industries, such as optimal process control [1,2] and estimation and online monitoring [3,4]. However, fewer studies have investigated the applicability and efficiency of RL in process engineering, and none include the efficient handling of constraints.

RL is a natural choice for nonlinear, uncertain, and stochastic process control problems, as it effectively handles stochastic environments [5]. Unfortunately, present RL algorithms fail to reliably satisfy state constraints even when initialized with feasible initial policies [6]. Various approaches have been proposed in the literature, usually applying penalties for constraint violations. Such approaches can be very problematic, easily losing optimality or feasibility [7], especially in the case of a fixed penalty. The main approaches to incorporating constraints in this way make use of trust regions and fixed penalties [8,9], as well as the cross-entropy method [6]. As observed in [8], when penalty methods are applied in policy optimization, the behaviour of the policy may change depending on the value of the penalty parameter.
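For illustration only (again, our notation rather than the abstract's), a fixed-penalty approach replaces the chance-constrained objective with a surrogate such as

\tilde{J}_\kappa(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\!\left[ \sum_{t=0}^{T} R(x_t, u_t) - \kappa \sum_{t=0}^{T} \sum_{j} \max\{0,\, g_j(x_t)\} \right],

where a small penalty weight \kappa may leave the constraints violated, while a large one can sacrifice optimality or destabilize learning; this is the sensitivity to the penalty parameter noted in [8].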

We propose a constrained RL algorithm that guarantees the satisfaction of joint chance constraints. To accomplish this, we introduce backoffs, which are computed simultaneously with the feedback policy. The backoffs are adjusted with Bayesian optimization using the empirical cumulative distribution function, thereby guaranteeing the satisfaction of the joint chance constraints.
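A simplified, self-contained sketch of the backoff idea is given below; the toy system, the proportional controller, and all names are illustrative assumptions, not the authors' implementation. Constraints are tightened by a backoff b, joint constraint satisfaction is estimated via the empirical cumulative distribution function over Monte Carlo rollouts, and b is adjusted until the target probability 1 - alpha is met. A simple bisection stands in for the Bayesian optimization, and a fixed feedback law stands in for the policy-gradient training used in the actual method.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy stochastic system: x_{t+1} = x_t + u_t + w_t,  w_t ~ N(0, 0.05^2).
    # Original joint constraint: x_t <= 1 for all t over the whole trajectory.
    # Target: P[ all_t x_t <= 1 ] >= 1 - ALPHA.
    ALPHA, HORIZON, N_MC = 0.05, 20, 2000

    def rollout(backoff):
        """Simulate one closed-loop trajectory under a simple proportional
        policy tracking the tightened limit (1 - backoff); return max_t x_t."""
        x, x_max = 0.0, 0.0
        target = 1.0 - backoff          # constraint tightened by the backoff
        for _ in range(HORIZON):
            u = 0.5 * (target - x)      # stand-in feedback law (not an RL policy)
            x = x + u + rng.normal(0.0, 0.05)
            x_max = max(x_max, x)
        return x_max

    def joint_satisfaction(backoff):
        """Empirical probability (empirical CDF evaluated at 1.0) that the
        joint constraint max_t x_t <= 1 holds over N_MC Monte Carlo rollouts."""
        maxima = np.array([rollout(backoff) for _ in range(N_MC)])
        return np.mean(maxima <= 1.0)

    # Adjust the backoff until the joint chance constraint is met.
    # Bisection is a simple stand-in for the Bayesian optimization step.
    lo, hi = 0.0, 0.5
    for _ in range(15):
        mid = 0.5 * (lo + hi)
        if joint_satisfaction(mid) >= 1.0 - ALPHA:
            hi = mid                    # constraint met: try a smaller backoff
        else:
            lo = mid                    # constraint violated: tighten further
    print(f"backoff = {hi:.3f}, empirical satisfaction = {joint_satisfaction(hi):.3f}")

Because the empirical satisfaction probability increases monotonically with the backoff in this toy example, the search converges to the smallest backoff that meets the target probability, which is the same trade-off the Bayesian optimization step resolves in the proposed method.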

[1] Bradford, E.; Schweidtmann, A. M.; Zhang, D.; Jing, K. and del Rio-Chanona, E. A., Dynamic modeling and optimization of sustainable algal production with uncertainty using multivariate Gaussian processes, 118, 143-158, 2018

[2] del Rio-Chanona, E. A.; Fiorelli, F.; Zhang, D.; Rashid Ahmed, N.; Jing, K. and Shah, N., An efficient model construction strategy to simulate microalgal lutein photo-production dynamic process, 114(11), 2518-2527, 2017

[3] do Carmo Nicoletti, M. and Jain, L. C., Computational Intelligence Techniques for Bioprocess Modelling, Supervision and Control, Volume 218 of Studies in Computational Intelligence, Springer Science & Business Media, 29 Jun 2009

[4] Xiong, Z. and Zhang, J., Modelling and optimal control of fed-batch processes using a novel control affine feedforward neural network, Neurocomputing, 61, 317-337, 2004

[5] Petsagkourakis, P.; Sandoval, I. O.; Bradford, E.; Zhang, D. and del Rio-Chanona, E. A., Reinforcement Learning for Batch Bioprocess Optimization, 133, 2020

[6] Wen, M., Constrained Cross-Entropy Method for Safe Reinforcement Learning, Neural Information Processing Systems (NeurIPS), 2018

[7] Ray, A.; Achiam, J. and Amodei, D., Benchmarking Safe Exploration in Deep Reinforcement Learning, Deep RL Workshop NeurIPS 2019, arXiv:1910.01708, 2019

[8] Achiam, J; Held, D.; Tamar, A. and Abbeel, P., Constrained Policy Optimization, International Conference on Machine Learning (ICML) 2017

[9] Tessler, C.; Mankowitz, D. J. and Mannor, S., Reward Constrained Policy Optimization, International Conference on Learning Representations (ICLR), 2019