(433f) Next-Generation Hybrid Models: Combining Attention Mechanisms and LSTM for Improved Predictions and Process Control in the Chemical Industry
AIChE Annual Meeting
2023
Computing and Systems Technology Division
Data-driven Modeling, Estimation and Optimization for Control II
Thursday, November 9, 2023 - 9:30am to 9:48am
Recently, attention-based ML models have been in the spotlight due to their remarkable ability to establish strong correlations between inputs and outputs, even in the presence of system noise or uncertainties. These models adeptly focus on short- and long-term dependencies in the evolution of system states [7,8]. In essence, the attention mechanism performs a scaled dot-product calculation between various input vectors, enabling it to selectively attend to significant long-term (e.g., concentration evolution) and short-term (e.g., a sudden change in temperature due to control actions) process alterations by assigning higher attention scores to such instances. As a result, the attention mechanism serves as a filter that dynamically handles process uncertainties and data noise by dampening weak correlations and amplifying strong interactions between the system states. On the other hand, LSTM-based sequential time-series models have shown superior predictive performance compared to DNNs due to their ability to explicitly consider the time evolution of system states (e.g., battery dynamics, stock market estimates, energy forecasting). This is because an LSTM utilizes several internal gates to dynamically update, forget, and store relevant changes in the state dynamics, whereas DNNs tend to assign roughly equal weight to all input channels [9]. Thus, a combination of the attention mechanism and LSTM poses a highly effective solution that (a) accounts for process uncertainties through selective attention to changes in system dynamics, and (b) accurately predicts time-varying parameters.
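For illustration, the scaled dot-product attention described above can be sketched in a few lines of NumPy; the window length and state dimension here are arbitrary toy values, not taken from the actual model:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V [8]."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise similarity between time steps
    # Row-wise softmax turns similarities into attention scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy example: self-attention over a window of 5 past time steps,
# each a 4-dimensional (lifted) state measurement.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
context, attn = scaled_dot_product_attention(X, X, X)
```

Each row of `attn` is a probability distribution over the past time steps, so strongly correlated state changes receive high scores while weak correlations are dampened toward zero.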
To this end, we propose a novel attention-LSTM-based hybrid model for a complex, non-trivial fed-batch fermentation process. Specifically, the input to the data-driven module of the hybrid model consists of state measurements for the previous time steps. This input is passed through an encoder module that lifts the states into a higher-dimensional space, and then an attention mechanism with a subsequent LSTM layer is applied to obtain time-series predictions of uncertain parameters for the next steps. The uncertain parameters are a lumped representation of different process variations, such as varying bacterial kinetics and feed and temperature fluctuations, and are represented by the most sensitive kinetic parameters determined through global sensitivity analysis [6]. The predicted uncertain parameters are then fed to the first-principles model, which includes mass and energy balance equations, concentration dynamics, and kinetic equations, to obtain state predictions for the next time steps. The training and validation dataset is generated by simulating a high-fidelity (HF) model of a fermenter system for over 100 different arbitrarily initialized operating conditions, such as temperature, substrate flow rate, and catalyst rate. Additionally, the prediction results (i.e., biomass, substrate, oxygen, and product concentrations) are compared against an existing DNN-based hybrid model to highlight the superior performance of the proposed attention-LSTM-based hybrid model. Finally, the developed hybrid model is incorporated within a model predictive controller (MPC) to achieve set-point targets for product amount and operating cost by determining optimal input profiles for feed flow rate and temperature. In a nutshell, the combined benefits of the attention mechanism and LSTM-based sequential modeling give rise to the next generation of hybrid models that can regulate process uncertainties while providing accurate process predictions.
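The hybrid structure, in which a data-driven estimate of an uncertain kinetic parameter drives a first-principles rollout, can be sketched as follows. This is a minimal illustration only: it uses generic Monod-type biomass/substrate balances as a stand-in for the full mass and energy balance model, and the parameter values, function names, and the fixed `mu_max_pred` (which in the actual model would come from the attention-LSTM module) are all hypothetical:

```python
import numpy as np

def monod_step(x, s, mu_max, dt=0.1, Y_xs=0.5, K_s=0.2):
    """One explicit-Euler step of simple biomass (x) and substrate (s)
    mass balances with Monod kinetics; a generic stand-in for the
    full first-principles fermenter model."""
    mu = mu_max * s / (K_s + s)  # specific growth rate
    dx = mu * x                  # biomass balance
    ds = -mu * x / Y_xs          # substrate balance
    return x + dt * dx, s + dt * ds

def hybrid_predict(x0, s0, mu_max_pred, n_steps=10):
    """First-principles rollout driven by a data-driven parameter estimate."""
    x, s = x0, s0
    traj = [(x, s)]
    for _ in range(n_steps):
        x, s = monod_step(x, s, mu_max_pred)
        traj.append((x, s))
    return np.array(traj)

# mu_max_pred would be supplied by the attention-LSTM module at each
# prediction horizon; a fixed value stands in here.
traj = hybrid_predict(x0=0.1, s0=10.0, mu_max_pred=0.4)
```

Updating `mu_max_pred` online from recent measurements is what lets the physics-based balances track time-varying process behavior instead of relying on a single nominal parameter set.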
Moreover, the current work lays the groundwork for developing attention-based hybrid models for more complex chemical processes and further gaining insights regarding unknown process uncertainties, leading to more accurate predictions and intelligent control.
References:
1. Sansana, J., Joswiak, M. N., Castillo, I., Wang, Z., Rendall, R., Chiang, L. H., & Reis, M. S. (2021). Recent trends on hybrid modeling for Industry 4.0. Computers & Chemical Engineering, 151, 107365.
2. Chen, Y., & Ierapetritou, M. (2020). A framework of hybrid model development with identification of plant–model mismatch. AIChE Journal, 66(10), e16996.
3. Thompson, M. L., & Kramer, M. A. (1994). Modeling chemical processes using prior knowledge and neural networks. AIChE Journal, 40(8), 1328-1340.
4. Sharma, N., & Liu, Y. A. (2022). A hybrid science-guided machine learning approach for modeling chemical processes: A review. AIChE Journal, 68(5), e17609.
5. Shah, P., Sheriff, M. Z., Bangi, M. S. F., Kravaris, C., Kwon, J. S. I., Botre, C., & Hirota, J. (2022). Deep neural network-based hybrid modeling and experimental validation for an industry-scale fermentation process: Identification of time-varying dependencies among parameters. Chemical Engineering Journal, 135643.
6. Bangi, M. S. F., & Kwon, J. S. I. (2020). Deep hybrid modeling of a chemical process: Application to hydraulic fracturing. Computers & Chemical Engineering, 134, 106696.
7. Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
8. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30.
9. Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.