(635c) Regularized Bayesian Fusion for Toxin Concentration Estimation in an Industrial Wastewater Treatment Plant
AIChE Annual Meeting
2021
Computing and Systems Technology Division
Data Science/Analytics for Process Applications
Thursday, November 11, 2021 - 4:00pm to 4:15pm
The proposed fusion scheme considers several single-source models, where a source is a particular origin of information in the process (usually a unit or an analytical device). These models are combined flexibly, depending on which sources of information are available at a given time. The quality of each model is also taken into account, and the smoothness of successive estimates of the toxin level is controlled using a Bayesian regularization approach. Several regression methods from different corners of the data analytics landscape were considered for building the single-source models, namely penalized, latent variable and tree-based ensemble regression methods. The models were developed and tuned using a nested double cross-validation strategy [5], [6] together with the repeated prequential method [4], in order to handle the time series nature of the data.
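To illustrate how a prequential evaluation respects the time ordering of the data, the following is a minimal sketch of a growing-window split in blocks. It is an assumption about the general idea, not the exact procedure used in this work; the function name `prequential_splits` and its parameters are hypothetical.

```python
from typing import Iterator, List, Tuple


def prequential_splits(
    n_samples: int, n_blocks: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for a growing-window prequential scheme.

    Hypothetical helper: the series is cut into consecutive blocks; at step k
    the model is trained on the first k blocks and tested on block k + 1, so
    test samples always come after every training sample.
    """
    if n_blocks < 2 or n_samples < n_blocks:
        raise ValueError("need at least 2 blocks and one sample per block")
    block = n_samples // n_blocks
    for k in range(1, n_blocks):
        train_end = k * block
        # The last test block absorbs any remainder samples.
        test_end = n_samples if k == n_blocks - 1 else (k + 1) * block
        yield list(range(train_end)), list(range(train_end, test_end))


if __name__ == "__main__":
    for train_idx, test_idx in prequential_splits(n_samples=10, n_blocks=5):
        print(len(train_idx), test_idx)
```

In a nested setting, hyperparameters would be tuned with a further split of this kind inside each training window, so that the outer test block is never used for tuning.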
Our Regularized Bayesian Fusion strategy provided more frequent estimates of the toxin concentration, based on the most up-to-date information available. The single-source estimates are combined through Bayesian fusion, which penalizes excessive variation in the fused estimate caused by unreliable or noisy single-source estimates when less information is available. This methodology should facilitate the management and operation of the wastewater treatment plant (WWTP) and improve the ability to keep the toxin concentration below the compliance level.
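The abstract does not give the fusion equations, so the following is only a sketch of one common way to realize the idea, under stated assumptions: each available source provides a Gaussian estimate with a known error variance, sources are independent, and smoothness is imposed through a random-walk prior centered on the previous fused estimate. The names `fuse_step` and `drift_var` are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def fuse_step(
    prev_mean: float,
    prev_var: float,
    estimates: Sequence[Optional[float]],
    variances: Sequence[float],
    drift_var: float,
) -> Tuple[float, float]:
    """Fuse the currently available single-source estimates (illustrative only).

    Assumptions (not from the source): Gaussian errors, independent sources,
    and a random-walk prior whose variance is prev_var + drift_var. A small
    drift_var keeps the new estimate close to the previous one, which
    penalizes abrupt changes; None marks a source that is unavailable.
    """
    prior_var = prev_var + drift_var
    precision = 1.0 / prior_var          # precision contributed by the prior
    weighted_sum = prev_mean / prior_var
    for est, var in zip(estimates, variances):
        if est is None:                  # skip sources with no new data
            continue
        precision += 1.0 / var           # more reliable sources weigh more
        weighted_sum += est / var
    post_var = 1.0 / precision
    return weighted_sum * post_var, post_var


if __name__ == "__main__":
    # One source available, one missing.
    mean, var = fuse_step(10.0, 1.0, [12.0, None], [1.0, 1.0], drift_var=1.0)
    print(mean, var)
```

When no source is available, the fused estimate stays at its previous value and its variance grows by `drift_var`, which reflects the loss of information over time.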
References
[1] F. Castanedo, "A Review of Data Fusion Techniques", The Scientific World Journal, vol. 2013, pp. 1–19, 2013, doi: 10/gb7x39.
[2] A. Diez-Olivan, J. Del Ser, D. Galar, and B. Sierra, "Data fusion and machine learning for industrial prognosis: Trends and perspectives towards Industry 4.0", Information Fusion, vol. 50, pp. 92–111, 2019, doi: 10/gf6wkf.
[3] M. S. Reis and R. Kenett, "Assessing the value of information of data-centric activities in the chemical processing industry 4.0", AIChE J, vol. 64, no. 11, pp. 3868–3881, 2018, doi: 10/gff327.
[4] V. Cerqueira, L. Torgo, and I. Mozetic, "Evaluating time series forecasting models: An empirical study on performance estimation methods", arXiv:1905.11744 [cs, stat], 2019. Accessed: May 06, 2020. [Online]. Available: http://arxiv.org/abs/1905.11744.
[5] R. Rendall and M. S. Reis, "Which regression method to use? Making informed decisions in 'data-rich/knowledge poor' scenarios – The Predictive Analytics Comparison framework (PAC)", Chemometrics and Intelligent Laboratory Systems, vol. 181, pp. 52–63, 2018, doi: 10/gfc6c8.
[6] T. J. Rato and M. S. Reis, "SS-DAC: A systematic framework for selecting the best modeling approach and pre-processing for spectroscopic data", Computers & Chemical Engineering, vol. 128, pp. 437–449, 2019, doi: 10/ghj9ks.