(375ag) A Pre-Train and Fine-Tune Paradigm of Fault Prognosis for Chemical Process

Authors 

Zhao, J., Responsible Production and APELL Center (UNEP), Department of Chemical Engineering, Tsinghua University, Beijing, 100084, China
Abstract

With the digitization and automation of industrial production, process monitoring has become an indispensable technique for realizing safe and efficient chemical production. Within the process monitoring loop, fault detection identifies the current system state, while fault prognosis anticipates potential and forthcoming system faults [1]. Deep learning methods have had a significant impact on chemical process systems.

Conventional fault detection in the industrial domain relies primarily on established statistical techniques such as principal component analysis, partial least squares, and independent component analysis. The advent of machine learning has ushered in a rapid proliferation of sophisticated models such as deep belief networks, convolutional neural networks, and variational recurrent autoencoders, which demonstrate superior efficacy on intricate problems with heightened accuracy [2]. The evolution of fault prognosis methodologies has been closely tied to the evolving characteristics of industrial data. As data complexity increases, many data-driven methods have been flexibly applied in nonlinear scenarios, including artificial neural networks, autoencoders, support vector machines, and radial basis function networks. Recurrent neural networks, long short-term memory networks, and Transformer models have shown advantages in learning nonlinear features of sequences and extracting long-range dependencies [3].
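To make the conventional statistical baseline concrete, the following is a minimal sketch of PCA-based fault detection via the Hotelling T² statistic; the synthetic data, 90% retained variance, and 95% control limit are illustrative assumptions, not settings from this work.

```python
# Minimal sketch: PCA fault detection with a Hotelling T^2 control limit.
# Data, variance threshold, and confidence level are illustrative assumptions.
import numpy as np
from sklearn.decomposition import PCA
from scipy.stats import f as f_dist

rng = np.random.default_rng(0)
X_normal = rng.normal(size=(500, 52))        # 52 variables, as in TEP
X_test = rng.normal(size=(100, 52)) + 0.5    # mean shift mimicking a fault

# Standardize on normal operating data, then fit PCA retaining 90% variance.
mean, std = X_normal.mean(0), X_normal.std(0)
pca = PCA(n_components=0.90).fit((X_normal - mean) / std)

def t2(X):
    """Hotelling T^2 in the retained principal subspace."""
    scores = pca.transform((X - mean) / std)
    return np.sum(scores**2 / pca.explained_variance_, axis=1)

# F-distribution control limit at the 95% confidence level.
n, a = X_normal.shape[0], pca.n_components_
limit = a * (n - 1) * (n + 1) / (n * (n - a)) * f_dist.ppf(0.95, a, n - a)
print("fraction of test samples flagged:", np.mean(t2(X_test) > limit))
```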

However, a dedicated deep learning model must be trained for each process fault or process variable, which consumes considerable computing resources and time. To address this challenge, a prevalent approach is to first train large models on unsupervised or weakly supervised objectives, and then fine-tune them, or assess their zero-shot generalization, on downstream tasks [4]. This practice finds widespread application in domains such as natural language processing and computer vision, and has inspired pre-trained models for time series data [5].
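The pre-training stage of such a paradigm can be illustrated with a masked-reconstruction objective on unlabeled process windows. Below is a minimal PyTorch sketch; the window length, mask ratio, and model sizes are illustrative assumptions rather than the settings used in this work.

```python
# Minimal sketch: self-supervised masked pre-training on multivariate
# process data. All hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

class MaskedTSModel(nn.Module):
    def __init__(self, n_vars=52, d_model=64, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(n_vars, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_vars)   # reconstruction head

    def forward(self, x):
        return self.head(self.encoder(self.embed(x)))

model = MaskedTSModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.randn(32, 100, 52)           # (batch, time steps, variables)
mask = torch.rand(32, 100, 1) < 0.15   # hide 15% of time steps

# Objective: reconstruct the masked steps from the surrounding context.
recon = model(x.masked_fill(mask, 0.0))
loss = ((recon - x) ** 2 * mask).sum() / (mask.sum() * x.size(-1))
loss.backward(); opt.step()
print(float(loss))
```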

In this work, we propose a pre-train and fine-tune paradigm of fault prognosis for chemical processes, inspired by the pre-trained models in natural language processing and computer vision. A deep masked attention model is pretrained on the raw datasets and then fine-tuned for downstream process tasks. The Tennessee Eastman process (TEP) was used to demonstrate the validity of the method. The results show that the proposed method achieves strong performance on the chemical process fault detection and prognosis tasks. Regarding fault detection time, the model fine-tuned with only one batch of data generally outperformed an LSTM-VAE trained with eight batches in early fault detection; for slowly developing faults in particular, the proposed method substantially reduces detection time. For process variable prediction, the MSE and MAE losses of the model fine-tuned with only one batch of data are close to those of Transformer and LSTM models trained with eight batches.
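The fine-tuning stage can be sketched as reusing the pretrained encoder and attaching a small task head, which is one way a single batch of labeled data can suffice. The sketch below continues the pre-training example above (so `model` is the `MaskedTSModel` instance defined there); the frozen encoder and the 10-step prediction horizon are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch: fine-tuning the pretrained encoder for multi-step
# process variable prediction. Continues the pre-training sketch above.
import torch
import torch.nn as nn

horizon = 10                            # predict 10 future steps (assumed)
head = nn.Linear(64, 52 * horizon)      # maps encoder features to forecasts

# Freeze the pretrained backbone and train only the new head.
for p in model.parameters():
    p.requires_grad = False
opt = torch.optim.Adam(head.parameters(), lr=1e-3)

x = torch.randn(32, 100, 52)            # one batch of process windows
y = torch.randn(32, horizon, 52)        # future values to predict

feats = model.encoder(model.embed(x))[:, -1]   # last-step representation
pred = head(feats).view(32, horizon, 52)
loss = nn.functional.mse_loss(pred, y)         # MSE, as reported above
loss.backward(); opt.step()
print(float(loss))
```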

In conclusion, we propose a pre-train and fine-tune paradigm of fault prognosis for chemical processes. The proposed method exhibits substantial advantages for fault prognosis and process safety, underscoring its value for industrial applications.

References

[1] Y. Bai, J. Zhao, A novel transformer-based multi-variable multi-step prediction method for chemical process fault prognosis, Process Safety and Environmental Protection 169 (2023) 937–947. https://doi.org/10.1016/j.psep.2022.11.062.

[2] X. Bi, J. Zhao, A novel orthogonal self-attentive variational autoencoder method for interpretable chemical process fault detection and identification, Process Safety and Environmental Protection 156 (2021) 581–597. https://doi.org/10.1016/j.psep.2021.10.036.

[3] Q. Wen, T. Zhou, C. Zhang, W. Chen, Z. Ma, J. Yan, L. Sun, Transformers in Time Series: A Survey, arXiv preprint (2022). https://doi.org/10.48550/arXiv.2202.07125.

[4] T. Brown, B. Mann, N. Ryder, M. Subbiah, J.D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, D. Amodei, Language Models are Few-Shot Learners, Advances in Neural Information Processing Systems 33 (2020) 1877–1901.

[5] Q. Ma, Z. Liu, Z. Zheng, Z. Huang, S. Zhu, Z. Yu, J.T. Kwok, A Survey on Time-Series Pre-Trained Models, arXiv preprint (2023). https://doi.org/10.48550/arXiv.2305.10716.