(373z) Two-Stage Reinforcement Learning for Batch Bioprocess Optimization Under Uncertainty
AIChE Annual Meeting
2019
2019 AIChE Annual Meeting
Computing and Systems Technology Division
Interactive Session: Systems and Process Operations
Tuesday, November 12, 2019 - 3:30pm to 5:00pm
We therefore seek a strategy that can optimize a process while handling both the system's stochastic behaviour (e.g. process disturbances) and plant-model mismatch. In this work we adopt reinforcement learning, and more specifically policy gradients [1], as an alternative to existing methods.
The chemical engineering community has long dealt with stochastic biosystems. Nonlinear dynamic optimization, and in particular nonlinear model predictive control (MPC), is a powerful methodology for uncertain dynamic systems; however, several properties make its application less attractive. Most MPC approaches require a detailed model of the system's dynamics, and stochastic MPC additionally requires assumptions about uncertainty quantification and propagation. Furthermore, conventional MPC assumes open-loop control actions at future time points in the prediction, which can lead to overly conservative control actions.
In contrast, reinforcement learning (RL) directly accounts for the effect of future uncertainty and its feedback in a proper 'closed-loop' manner [2]. In addition, policy gradient methods can learn a policy in a model-free fashion and are fast on-line: the on-line computation reduces to evaluating the policy, since the computational cost is shifted off-line.
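To make the off-line/on-line split concrete, the following is a minimal sketch of a REINFORCE-style policy-gradient update against a toy surrogate model; the model, dimensions, reward, and hyperparameters are illustrative placeholders, not those used in this work. Once the policy is trained, the on-line step is only a forward pass through the network.

```python
import torch

def step_model(x, u):
    """Toy surrogate process model with additive noise (illustrative placeholder)."""
    x_next = 0.9 * x + 0.1 * u.sum() + 0.01 * torch.randn_like(x)
    reward = -(x_next ** 2).sum()          # e.g. drive the state towards the origin
    return x_next, reward

# Simple feedforward stochastic policy: state -> (mean, log-std) of a Gaussian action.
simple_policy = torch.nn.Sequential(
    torch.nn.Linear(3, 32), torch.nn.Tanh(), torch.nn.Linear(32, 2))
optimizer = torch.optim.Adam(simple_policy.parameters(), lr=1e-3)

def run_episode(horizon=20):
    """Roll out the stochastic policy on the surrogate model; return log-probs and return."""
    x, log_probs, ret = torch.zeros(3), [], 0.0
    for _ in range(horizon):
        mean, log_std = simple_policy(x).chunk(2, dim=-1)
        dist = torch.distributions.Normal(mean, log_std.exp())
        u = dist.sample()                  # draw the control action
        log_probs.append(dist.log_prob(u).sum())
        x, r = step_model(x, u)
        ret = ret + r
    return torch.stack(log_probs), ret

# Off-line training: many episodes on the process model, REINFORCE gradient ascent.
for epoch in range(1000):
    log_probs, ret = run_episode()
    loss = -(log_probs.sum() * ret)        # maximise the expected return
    optimizer.zero_grad(); loss.backward(); optimizer.step()

# On-line, applying the policy is one cheap forward pass per sampling time.
```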
In this work we propose a two-stage reinforcement learning strategy. We assume that a process model is available, which is exploited to obtain a preliminary optimal control policy. Reinforcement learning is used to train this policy off-line over a large number of epochs and episodes, shifting most of the computational effort off-line. The policy is parameterized as a recurrent neural network that receives a window of past states and control actions together with the current state, and outputs a stochastic policy from which the control action is drawn.
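As an illustration of such an architecture, a recurrent policy could look as follows; the layer sizes, window length, state/action dimensions, and the Gaussian output are our assumptions, not necessarily the exact configuration used in this work.

```python
import torch
import torch.nn as nn

class RecurrentPolicy(nn.Module):
    """Illustrative recurrent policy: encodes a window of past (state, action) pairs
    together with the current state and returns a Gaussian from which the control
    action is sampled. Dimensions and layer sizes are placeholders."""
    def __init__(self, state_dim=3, action_dim=1, hidden_dim=64):
        super().__init__()
        self.rnn = nn.GRU(state_dim + action_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim + state_dim, 2 * action_dim)  # mean and log-std

    def forward(self, past_window, current_state):
        # past_window: (batch, window_length, state_dim + action_dim)
        _, h = self.rnn(past_window)                       # summarize the history
        features = torch.cat([h[-1], current_state], dim=-1)
        mean, log_std = self.head(features).chunk(2, dim=-1)
        return torch.distributions.Normal(mean, log_std.exp())

policy = RecurrentPolicy()
# Example: a window of 5 past (state, action) pairs plus the current state.
dist = policy(torch.zeros(1, 5, 4), torch.zeros(1, 3))
u = dist.sample()                                          # control action to apply
```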
Subsequently, during the on-line optimization stage, elements of transfer learning [3] are used so that the policy network adapts to and optimizes the true system (the plant). The approach is verified in a series of case studies, including stochastic differential equation systems with complex dynamics.
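A hedged sketch of this on-line adaptation stage, reusing the recurrent policy above: the off-line learned encoder is frozen and only the output head is fine-tuned on plant roll-outs with a small learning rate. The toy plant, the freezing choice, and the hyperparameters are illustrative assumptions; the exact transfer-learning scheme used in this work may differ.

```python
import torch

def plant_step(x, u):
    """Toy stand-in for the true plant; in practice these would be plant measurements."""
    x_next = 0.85 * x + 0.12 * u.sum() + 0.02 * torch.randn_like(x)
    return x_next, -(x_next ** 2).sum()

# Freeze the off-line learned recurrent encoder; fine-tune only the output head.
for p in policy.rnn.parameters():
    p.requires_grad = False
optimizer = torch.optim.Adam(policy.head.parameters(), lr=1e-4)  # small on-line step size

for episode in range(50):                    # a modest number of plant episodes
    x, window = torch.zeros(1, 3), torch.zeros(1, 5, 4)
    log_probs, ret = [], 0.0
    for _ in range(20):
        dist = policy(window, x)
        u = dist.sample()
        log_probs.append(dist.log_prob(u).sum())
        # Shift the window: drop the oldest (state, action) pair, append the newest.
        window = torch.cat([window[:, 1:], torch.cat([x, u], dim=-1).unsqueeze(1)], dim=1)
        x, r = plant_step(x, u)
        ret = ret + r
    loss = -(torch.stack(log_probs).sum() * ret)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
```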
[1] R. Sutton, A. Barto, 2018. Reinforcement Learning: An Introduction, Second Edition. MIT Press.
[2] J. H. Lee, J. M. Lee, 2006. Approximate dynamic programming based approach to process control and scheduling. Computers & Chemical Engineering 30 (10-12), 1603-1618.
[3] A. Krizhevsky, I. Sutskever, G. E. Hinton, 2012. ImageNet Classification with Deep Convolutional Neural Networks. In: F. Pereira, C. J. C. Burges, L. Bottou, K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 25. Curran Associates, Inc., pp. 1097-1105.