(375r) Offline RL for Optimal Bioprocess Production Scheduling
AIChE Annual Meeting
2024
Computing and Systems Technology Division
Interactive Session: Data and Information Systems
Tuesday, October 29, 2024 - 3:30pm to 5:00pm
In this work we apply a novel offline RL framework to optimal bioprocess production scheduling. In contrast to traditional RL algorithms such as temporal difference (TD) learning, we use a transformer model to treat the sequential decision-making problem as a sequence modelling task. We train the transformer to predict the next best action conditioned on a desired future reward, taking as input the current state, the scheduling actions, and the cumulative reward-to-go from the current state. On the one hand, this approach allows us to learn the optimal scheduling strategy purely from historical data, without interaction with the bioprocess operation. On the other hand, the transformer architecture lets us draw on associated advances in language modelling such as GPT-x and BERT (Chen et al., 2021).
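The return-conditioned input described above can be sketched as follows. The token layout (return-to-go, state, action) follows the decision transformer of Chen et al. (2021); the function names and the toy trajectory are illustrative, not taken from our implementation.

```python
import numpy as np

def returns_to_go(rewards):
    """Cumulative future reward R_t = sum over t' >= t of r_{t'}.

    The model is conditioned on this quantity so that, at inference time,
    a *desired* future reward can be supplied instead.
    """
    rewards = np.asarray(rewards, dtype=float)
    return np.cumsum(rewards[::-1])[::-1]

def build_token_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) triples into one sequence.

    A causal transformer trained on such sequences predicts the action
    token from the preceding return-to-go and state tokens.
    """
    rtg = returns_to_go(rewards)
    tokens = []
    for R, s, a in zip(rtg, states, actions):
        tokens.append(("rtg", R))
        tokens.append(("state", s))
        tokens.append(("action", a))
    return tokens

# Toy 3-step trajectory: states, scheduling actions, per-step rewards
seq = build_token_sequence(states=[0, 1, 2], actions=[1, 0, 1], rewards=[1.0, 2.0, 3.0])
```

At deployment, the historical return-to-go is replaced by a target return, and the transformer autoregressively emits the next scheduling action.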
Here we showcase the capacity of the framework on a continuous biomanufacturing process with a single stage and a single production unit operating under stochastic demand over a planning horizon. Transition losses caused by product type changeovers are also considered and minimised in this case study (Hubbs et al., 2020). Our results show that the offline RL method can provide a near-optimal policy for bioprocess scheduling without interacting with the bioprocess environment.
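The single-unit environment with changeover penalties can be sketched as a minimal reward model; all parameters (production rate, price, transition loss) are hypothetical placeholders, not values from the case study.

```python
import random

class SingleUnitScheduler:
    """Toy single-stage, single-unit process under stochastic demand.

    Each step, the agent schedules one product type; revenue depends on a
    random demand draw, and a fixed transition loss is charged whenever
    the scheduled product differs from the previous one.
    """

    def __init__(self, n_products=3, rate=10.0, price=1.0,
                 transition_loss=5.0, seed=0):
        self.n_products = n_products
        self.rate = rate                    # production capacity per period
        self.price = price                  # revenue per unit sold
        self.transition_loss = transition_loss  # changeover penalty
        self.rng = random.Random(seed)
        self.current_product = None

    def step(self, action):
        """Schedule `action` (a product index) and return the period reward."""
        demand = self.rng.uniform(0.0, self.rate)   # stochastic demand
        revenue = self.price * min(self.rate, demand)
        switched = (self.current_product is not None
                    and action != self.current_product)
        penalty = self.transition_loss if switched else 0.0
        self.current_product = action
        return revenue - penalty
```

With identical demand draws, two otherwise equal schedules differ in reward by exactly the transition loss whenever one of them changes product type, which is the effect the learned policy trades off against demand.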
References
CHEN, L., LU, K., RAJESWARAN, A., LEE, K., GROVER, A., LASKIN, M., ABBEEL, P., SRINIVAS, A. & MORDATCH, I. 2021. Decision transformer: Reinforcement learning via sequence modeling. Advances in neural information processing systems, 34, 15084-15097.
HUBBS, C. D., LI, C., SAHINIDIS, N. V., GROSSMANN, I. E. & WASSICK, J. M. 2020. A deep reinforcement learning approach for chemical production scheduling. Computers & Chemical Engineering, 141, 106982.
MOWBRAY, M., ZHANG, D. & CHANONA, E. A. D. R. 2022. Distributional reinforcement learning for scheduling of chemical production processes. arXiv preprint arXiv:2203.00636.
VIEIRA, M., PINTO-VARELA, T., MONIZ, S., BARBOSA-PÓVOA, A. P. & PAPAGEORGIOU, L. G. 2016. Optimal planning and campaign scheduling of biopharmaceutical processes using a continuous-time formulation. Computers & Chemical Engineering, 91, 422-444.