(109a) Use of Dimensionality Reduction and Transfer Learning in Deep Reinforcement Learning Controller for Hydraulic Fracturing

Authors 

Bangi, M. S. F. - Presenter, Texas A&M University
Kwon, J., Texas A&M University
Reinforcement Learning (RL) is an area of machine learning in which an agent learns optimal control actions by directly interacting with its environment so as to maximize the cumulative sum of rewards. However, the application of RL to process control is recent and limited [1]. Multiple approaches to solving the RL problem exist in the literature, but recent successes in RL have come from combining concepts from deep learning with RL algorithms. The idea of leveraging deep neural networks (DNNs) as function approximators in RL has delivered tremendous success in video games such as the Atari games [2]. In process control, a Deep RL (DRL) controller was recently proposed to control discrete-time nonlinear processes [3]. The DRL controller is a model-free, off-policy actor-critic algorithm in which learning is based on the temporal difference (TD) error [1] and the Deterministic Policy Gradient (DPG) theorem [4]. The actor and the critic are represented using DNNs so that the policy and value functions generalize to continuous state and action spaces. Ideas such as replay memory, target networks, and gradient clipping were used in the DRL controller to make learning suitable for process control applications. The DRL controller was able to track setpoints for single-input-single-output (SISO) and multi-input-multi-output (MIMO) systems, as well as a nonlinear system with external disturbances [3].
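For concreteness, the following minimal sketch (Python with PyTorch; the network sizes, learning rates, and clipping threshold are illustrative assumptions, not the settings used in [3]) shows one update of such an off-policy actor-critic: a TD-error-based critic step and a DPG actor step on a minibatch drawn from replay memory, with target networks and gradient clipping:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Illustrative placeholder dimensions and hyperparameters.
    STATE_DIM, ACTION_DIM = 8, 2
    GAMMA, TAU, CLIP = 0.99, 0.005, 1.0

    def mlp(in_dim, out_dim):
        """Small DNN used as a function approximator."""
        return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

    actor = mlp(STATE_DIM, ACTION_DIM)
    critic = mlp(STATE_DIM + ACTION_DIM, 1)
    actor_tgt = mlp(STATE_DIM, ACTION_DIM)
    critic_tgt = mlp(STATE_DIM + ACTION_DIM, 1)
    actor_tgt.load_state_dict(actor.state_dict())    # target networks start as copies
    critic_tgt.load_state_dict(critic.state_dict())
    actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
    critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

    def update(s, a, r, s2):
        """One learning step on a minibatch (s, a, r, s') from replay memory."""
        # Critic: minimize the TD error against the frozen target networks.
        with torch.no_grad():
            y = r + GAMMA * critic_tgt(torch.cat([s2, actor_tgt(s2)], dim=1))
        critic_loss = F.mse_loss(critic(torch.cat([s, a], dim=1)), y)
        critic_opt.zero_grad()
        critic_loss.backward()
        torch.nn.utils.clip_grad_norm_(critic.parameters(), CLIP)  # gradient clipping
        critic_opt.step()
        # Actor: deterministic policy gradient, i.e., ascend Q(s, pi(s)).
        actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
        actor_opt.zero_grad()
        actor_loss.backward()
        torch.nn.utils.clip_grad_norm_(actor.parameters(), CLIP)
        actor_opt.step()
        # Polyak averaging: target networks slowly track the online networks.
        for net, tgt in ((actor, actor_tgt), (critic, critic_tgt)):
            for p, pt in zip(net.parameters(), tgt.parameters()):
                pt.data.mul_(1.0 - TAU).add_(TAU * p.data)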

Hydraulic fracturing is a technique for extracting oil and gas from rocks with low porosity and permeability. To extract oil and gas from such rocks, an artificial medium of proppant (sand) is created within the fractures using controlled explosions, followed by the pumping of fracturing fluids at high pressures. The efficiency of the extraction process depends on the final fracture geometry and the proppant concentration along the fracture. To achieve the desired objectives with respect to these process variables, model predictive controllers (MPC) have recently been designed [5]. However, these controllers (a) require an accurate model, which is difficult to obtain for hydraulic fracturing given its complexity, (b) involve exorbitant computational costs, and (c) require regular re-tuning of controller parameters. Therefore, in this work, we propose a model-free DRL controller for hydraulic fracturing. Despite its success, the DRL controller has a few limitations, including long training times and the need for careful initialization of hyperparameters to achieve fast convergence [6]. To overcome these challenges, we implement transfer learning, wherein the controller learns a suboptimal policy offline using a data-based reduced-order model (ROM) before learning online from the actual process. In this work, we used a high-fidelity model of the hydraulic fracturing process as a virtual numerical experiment standing in for the actual process. Additionally, we used principal component analysis (PCA) to reduce the dimension of the RL state before using it for learning; consequently, the actor and the critic were trained in the reduced-PCA space. With these proposed steps included in its training, the DRL controller converges to an optimal policy that achieves uniform proppant concentration along the fracture.
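As a schematic illustration of this preprocessing step, a minimal sketch is given below (Python with scikit-learn; the state dimension, the number of retained components, and the random snapshot data are placeholders for the ROM-generated states used in this work). The raw state is projected onto its leading principal components, and only the reduced state is seen by the actor and the critic:

    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    snapshots = rng.random((5000, 200))   # placeholder for states collected from the ROM

    pca = PCA(n_components=10)            # reduced dimension chosen for illustration
    pca.fit(snapshots)
    print("retained variance:", pca.explained_variance_ratio_.sum())

    def reduce_state(x):
        """Project one raw RL state into the reduced-PCA space."""
        return pca.transform(x.reshape(1, -1)).ravel()

    z = reduce_state(rng.random(200))     # e.g., pass z to the actor and critic

Under the transfer-learning scheme, the actor and critic would first be trained offline on such reduced ROM states, and the resulting weights would then initialize the online learning phase on the actual process.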

Literature cited:

[1] Sutton, R.S., Barto, A.G. Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press, 1998.

[2] Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., Riedmiller, M. Playing Atari with Deep Reinforcement Learning. arXiv preprint arXiv:1312.5602, 2013.

[3] Spielberg, S., Tulsyan, A., Lawrence, N.P., Loewen, P.D., Bhushan Gopaluni, R. Toward self-driving processes: A deep reinforcement learning approach to control. AIChE J., 65(10), 2019.

[4] Silver, D., Lever, G., Heess, N., Degris, T., Wierstra, D., Riedmiller, M. Deterministic policy gradient algorithms. In ICML, 2014.

[5] Siddhamshetty, P., Yang, S., Kwon, J. S.-I. Modeling of hydraulic fracturing and designing of online pumping schedules to achieve uniform proppant concentration in conventional oil reservoirs. Comput. Chem. Eng., 114:306–317, 2018.

[6] Shin, J., Badgwell, T.A., Liu, K.-H., Lee, J.H. Reinforcement Learning – Overview of recent progress and implications for process control. Computer Aided Chemical Engineering, 44:71–85, 2018.