(519a) Reinforcement Learning Based Optimization of Non-Linear Processes: Industrial Applications
2024 AIChE Annual Meeting
Computing and Systems Technology Division
Industrial applications in Intelligent Operations
Wednesday, October 30, 2024 - 12:30pm to 12:51pm
The described applications implement a modified extension of the systematic methodology previously proposed by the author in Patel (2023). The published method addresses the need for RL implementations to be safe, fast-learning, and explainable when applied to linear industrial control and optimization problems. It follows a data-centric rather than a technology-centric approach. Rather than adding to the extensive research on augmenting existing RL algorithms, the method uses an existing model-free RL algorithm with specially curated data that leverages the available process domain-specific knowledge. The RL agent is designed to discover new knowledge, not previously known, rather than re-learn the already known process domain-specific knowledge. This results in a significantly smaller agent, a significantly reduced training-data requirement, no need for special hardware for training and deployment, and improved explainability of the implementation. The published method is modified here to address non-linear processes, with an emphasis on capturing non-monotonic non-linearities. Although the exact nature of the modifications is not described in the paper for intellectual-property reasons, the results of the actual implementation on NGL processes are presented and analyzed. The RL agent is programmed as a Python package and interfaced with a high-fidelity simulation model of the process. The agent trains by interacting with the simulation model, running it several hundred times. The trained agent is then deployed either as an advisory agent or as an online agent in the process control network, providing optimal operating targets. These targets are typically implemented through an existing Advanced Process Control (APC) layer.
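As a rough illustration of the agent-simulator workflow described above (not the author's actual package, whose details are proprietary), a minimal Python sketch might look like the following. All class and method names (ProcessSimulator, run_case, propose, update, train) are hypothetical, and the simple action-value agent over a curated set of candidate targets merely stands in for whatever model-free algorithm is actually used.

```python
import random

class ProcessSimulator:
    """Stand-in for the high-fidelity simulation interface (hypothetical)."""
    def run_case(self, targets):
        # In practice this would drive the simulator for the given operating
        # targets and return the value of the optimization objective.
        raise NotImplementedError

class AdvisoryAgent:
    """Toy action-value agent over a small, domain-curated set of candidate targets."""
    def __init__(self, candidate_targets):
        self.candidates = candidate_targets              # curated using process knowledge
        self.value = [0.0] * len(candidate_targets)      # running objective estimates
        self.count = [0] * len(candidate_targets)

    def propose(self, eps=0.2):
        if random.random() < eps:                        # explore
            return random.randrange(len(self.candidates))
        return max(range(len(self.candidates)), key=self.value.__getitem__)  # exploit

    def update(self, idx, reward):
        self.count[idx] += 1
        self.value[idx] += (reward - self.value[idx]) / self.count[idx]

def train(agent, sim, episodes=300):
    """Run the simulator a few hundred times, as described in the abstract."""
    for _ in range(episodes):
        idx = agent.propose()
        agent.update(idx, sim.run_case(agent.candidates[idx]))
    best = max(range(len(agent.candidates)), key=agent.value.__getitem__)
    return agent.candidates[best]                        # operating targets handed to the APC layer
```

In this framing, the curated candidate targets are where the process domain knowledge enters, which is what keeps the agent small and the training budget down to a few hundred simulator runs.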
The NGL recovery process cools the sweet gas feed before it enters a de-methanizer column, where the feed is separated into an overhead C1 stream and a bottoms C2+ NGL product. The optimization objective is to maximize C2 recovery and thereby maximize NGL recovery. To cool the gas feed, it is split multiple times and routed through a complex train of heat exchangers and turbo-expanders. The effect of changes in the flow splits on C2 recovery is non-linear and generally not well understood. RL-based optimization is used to extract this knowledge and recommend optimal values of the flow splits. In this application, two flow splits are taken as optimization variables, with C2 recovery as the objective function to be maximized. The agent successfully discovered the non-monotonic non-linear relationship between one of the flow splits and C2 recovery, as shown in Fig. 1. The optimal value of the flow split is at the top of the peak in Fig. 1, where C2 recovery is maximum.
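For illustration only, the NGL recovery case could be framed for the loop sketched above roughly as follows. The response surface is a made-up stand-in with a single interior peak, not plant data, and the split ranges and function name (c2_recovery) are hypothetical; in the actual application the simulator supplies this relationship.

```python
import math

def c2_recovery(split_1, split_2):
    """Illustrative non-monotonic response: recovery peaks at an interior split value."""
    peak = math.exp(-((split_1 - 0.35) ** 2) / 0.02)     # interior peak near split_1 = 0.35
    return 80.0 + 15.0 * peak + 2.0 * split_2            # percent C2 recovered (made up)

# Candidate flow-split pairs over assumed ranges; in this toy setting the
# simulator stand-in would simply evaluate c2_recovery(**targets) for each case.
candidates = [{"split_1": s1 / 100.0, "split_2": s2 / 100.0}
              for s1 in range(20, 51, 5) for s2 in range(10, 31, 5)]
```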
The NGL recovered in the NGL recovery process is then fed to an NGL fractionation process consisting of successive distillation columns that separate the C2, C3, C4, and Natural Gasoline (NG) products. The optimization objective is to maximize profit, calculated as product revenue minus utility cost with the feed assumed constant. Profit is a non-linear function of product purity (or impurity) and of product and utility prices. In this application, two product impurities are taken as optimization variables, with profit as the objective function to be maximized in the presence of variations in five prices and four constraints. The two product-impurity optimization variables are C4 in the C3 product and C5 in the C4 product. The five prices considered are the C3-to-C4 price differential, the C4-to-NG price differential, the power price, the steam price, and the CO2 abatement cost. The four constraints are the reboiler and condenser duty limits of the de-propanizer and de-butanizer columns. Not only did the agent successfully discover the non-monotonic non-linear relationship between both product impurities and profit at a given pricing, as shown in Fig. 2, it also discovered how the relationship changes when prices change. Fig. 3 shows the effect of a steam-price change on the relationship for the C4-in-C3 impurity. As the steam price increases, the agent recommends making the product more impure to save on steam cost, even at the cost of losing some product revenue. This behavior is due to the trade-off between product revenue and utility cost: utility cost increases exponentially as product purity is increased.
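Similarly, a hypothetical sketch of the fractionation reward is shown below. The coefficients, duty correlations, and constraint handling are illustrative assumptions only; in the actual application these relationships come from the high-fidelity simulation.

```python
import math

def fractionation_profit(x_c4_in_c3, x_c5_in_c4, prices, duty_limits):
    """Illustrative reward: product revenue minus utility cost at the current prices."""
    # Reboiler duties rise steeply as the products are driven purer (impurity lowered).
    depro_duty = 5.0 * math.exp(-40.0 * x_c4_in_c3)      # de-propanizer reboiler duty (made up)
    debut_duty = 4.0 * math.exp(-40.0 * x_c5_in_c4)      # de-butanizer reboiler duty (made up)

    # Two of the four duty constraints shown for brevity (reboiler limits);
    # the condenser limits would be handled the same way.
    if depro_duty > duty_limits["depro_reboiler"] or debut_duty > duty_limits["debut_reboiler"]:
        return -1.0e6                                    # infeasible operating point penalized

    total_duty = depro_duty + debut_duty
    utility_cost = (prices["steam"] * total_duty
                    + prices["power"] * 0.3 * total_duty
                    + prices["co2_abatement"] * 0.1 * total_duty)

    # Leaving C4 in the C3 product trades value across the C3/C4 price differential;
    # likewise C5 in the C4 product across the C4/NG differential.
    revenue = (100.0
               - prices["c3_c4_diff"] * 50.0 * x_c4_in_c3
               - prices["c4_ng_diff"] * 50.0 * x_c5_in_c4)

    return revenue - utility_cost
```

Because the utility terms in such a formulation grow much faster than the revenue terms as the impurity is driven down, the maximum shifts toward higher impurity as the steam price rises, which is the qualitative behavior reported for Fig. 3.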
These successful industrial applications demonstrate the capability of RL to discover new knowledge and optimize non-linear industrial processes. The key is to incorporate available domain-specific knowledge, resulting in smaller agents that require less training data and no special hardware for training. The future direction of the work is to investigate further training of the RL agent on real-time process data once it is deployed in the process control network.
Reference:
Patel, K. M. (2023). A practical reinforcement learning implementation approach for continuous process control. Computers & Chemical Engineering, 174. https://doi.org/10.1016/j.compchemeng.2023.108232