(138b) Applying Reinforcement Learning for Batch Trajectory Optimization in an Industrial Chemical Process

Authors 

Rendall, R. - Presenter, University of Coimbra
Ma, Y., Louisiana State University
Castillo, I., Dow Inc.
Wang, Z., Dow Inc.
Chiang, L., Dow Inc.
Bentley, D., Dow Inc.
Peng, Y., The Dow Chemical Co.
Reinforcement Learning (RL) is one of the three basic machine learning paradigms, alongside supervised and unsupervised learning. RL focuses on training an agent to learn an optimal policy that maximizes the cumulative reward obtained from the environment of interest [1]. Recent developments in model-free RL have achieved remarkable success in various process optimization and control tasks, and multiple applications have been reported in the literature, including parameter tuning of existing PID control loops [2], supply chain management [3], and robotics operations [4].
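
To make the agent-environment interaction concrete, the sketch below shows a generic RL rollout loop in Python: the agent observes a state, applies an action, and accumulates the discounted reward returned by the environment. The env and agent objects and their interfaces are placeholders for illustration only, not components of this work.

# Minimal sketch of the RL interaction loop: an agent repeatedly observes a
# state, takes an action, and receives a reward, with the goal of maximizing
# the cumulative (discounted) reward. "env" and "agent" are placeholders.

def run_episode(env, agent, gamma=0.99):
    """Roll out one episode and return the discounted return."""
    state = env.reset()
    done = False
    total_return, discount = 0.0, 1.0
    while not done:
        action = agent.act(state)                # policy: state -> action
        state, reward, done = env.step(action)   # environment transition
        total_return += discount * reward
        discount *= gamma
    return total_return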

There are multiple challenges when applying RL in an industrial setting, but the main one concerns the training of the agent. In the learning phase, the agent estimates and improves its policy and value functions through a large number of trial-and-error iterations. Many input-output experiments are required, which is not feasible in an industrial chemical plant. As an alternative, a model of the plant can be used to train the agent and provide the input-output data. Both first principles and data-driven models are suitable, and both options are explored in this work.
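
As an illustration of this idea, the following sketch wraps a generic plant simulator as a Gymnasium environment so that an agent can be trained entirely offline. The function plant_model_step, the state and action dimensions, and the reward placeholders are assumptions made for the example; they do not reflect the actual industrial model or case study.

# A minimal sketch of how a plant model (first principles or data-driven)
# can be exposed as a Gymnasium environment so that an RL agent can be
# trained offline, without experimenting on the real process.
import numpy as np
import gymnasium as gym
from gymnasium import spaces

class BatchReactorEnv(gym.Env):
    """Wraps a batch-reactor simulator as an RL environment (illustrative)."""

    def __init__(self, plant_model_step, batch_duration=100):
        super().__init__()
        self.plant_model_step = plant_model_step   # simulator: (state, action) -> next state
        self.batch_duration = batch_duration       # control intervals per batch
        # Example spaces: 5 measured process variables, 2 manipulated variables.
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(5,), dtype=np.float32)
        self.action_space = spaces.Box(-1.0, 1.0, shape=(2,), dtype=np.float32)

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self.t = 0
        self.state = np.zeros(5, dtype=np.float32)  # initial batch conditions (placeholder)
        return self.state, {}

    def step(self, action):
        self.state = self.plant_model_step(self.state, action)
        self.t += 1
        terminated = self.t >= self.batch_duration
        # Reward sketch: economic objective minus a penalty for constraint violation.
        reward = float(self._profit(self.state) - self._constraint_penalty(self.state))
        return self.state, reward, terminated, False, {}

    def _profit(self, state):
        return 0.0   # placeholder for the economic objective

    def _constraint_penalty(self, state):
        return 0.0   # placeholder for process and safety constraints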

In this work, we test three state-of-the-art RL approaches to optimize an industrial batch case study: Proximal Policy Optimization (PPO), Soft Actor-Critic (SAC), and Advantage Actor-Critic (A2C). These RL methods optimize the batch process by controlling the reaction conditions and maximizing the total reward, defined as the profit margin subject to certain process and safety constraints. The optimal batch trajectories are compared in two scenarios. The first scenario uses a first principles model as the environment for training the agent. In the second scenario, a surrogate Long Short-Term Memory (LSTM) model is utilized, which combines historical data from the reactor's operation with the first principles model estimates. The LSTM is used because it helps mitigate accuracy issues of the first principles model by relying on the relationships found in the plant data. The optimized trajectories were compared to the current operating trajectories, and the RL-optimal batch profiles show a 3% increase in product profit.
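
For reference, a training-and-evaluation loop of the kind described above could be sketched with the Stable-Baselines3 implementations of PPO, SAC, and A2C, as shown below. The environment, timestep budget, and hyperparameters are illustrative placeholders rather than the settings used in this study; the same loop applies whether the environment is backed by the first principles model or by the LSTM surrogate.

# Sketch: train the three agents against the same environment and compare the
# return (total reward) of one batch rolled out with each trained policy.
# BatchReactorEnv refers to the illustrative wrapper above; the timestep
# budget and default hyperparameters are placeholders.
from stable_baselines3 import PPO, SAC, A2C

def train_and_evaluate(env, total_timesteps=200_000):
    results = {}
    for name, algo in (("PPO", PPO), ("SAC", SAC), ("A2C", A2C)):
        model = algo("MlpPolicy", env, verbose=0)
        model.learn(total_timesteps=total_timesteps)

        # Roll out one batch with the trained policy and record its return.
        obs, _ = env.reset()
        done, batch_return = False, 0.0
        while not done:
            action, _ = model.predict(obs, deterministic=True)
            obs, reward, terminated, truncated, _ = env.step(action)
            batch_return += reward
            done = terminated or truncated
        results[name] = batch_return
    return results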

References

  • [1] Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., & Riedmiller, M. (2013). Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602.
  • [2] Badgwell, T. A., Liu, K. H., Subrahmanya, N. A., & Kovalski, M. H. (2019). U.S. Patent Application No. 16/218,650.
  • [3] Gokhale, A., Trasikar, C., Shah, A., Hegde, A., & Naik, S. R. (2021). A reinforcement learning approach to inventory management. In Advances in Artificial Intelligence and Data Engineering (pp. 281–297). Springer, Singapore.
  • [4] Haarnoja, T., Zhou, A., Abbeel, P., & Levine, S. (2018). Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. arXiv preprint arXiv:1801.01290.