(646f) Dynamic Latent Variable Regression for Data Modeling and Monitoring
AIChE Annual Meeting
2017 Annual Meeting
Computing and Systems Technology Division
Big Data in Process Modeling, Estimation and Control
Thursday, November 2, 2017 - 9:25am to 9:42am
For supervised learning or regression-type modeling, PLS is a dimensionality reduction algorithm that extracts latent variables (i.e., the outer model projection) based on the maximum covariance criterion. However, the ultimate objective is to perform regression, which is properly reflected only in the inner modeling objective. This discrepancy between the outer model and inner model objectives leads to several drawbacks. One of them is that the extracted PLS scores may contain variations orthogonal to the output, which are irrelevant for predicting or monitoring the quality variables. Because of its covariance objective, PLS usually requires multiple latent variables even to predict a single output variable [7].
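The maximum covariance criterion can be illustrated with a minimal NumPy sketch. It assumes the standard formulation in which the first pair of unit-norm weights is the leading singular pair of the cross-product matrix of the centered data; the function name `first_pls_weights` and the synthetic data are illustrative assumptions, not part of the original work.

```python
import numpy as np

def first_pls_weights(X, Y):
    """Return unit-norm weights (w, q) that maximize (Xw)^T (Yq) for centered data.

    Assumption: the first PLS outer weights are the leading singular
    pair of X^T Y, which is the usual maximum covariance formulation.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    return U[:, 0], Vt[0, :], s[0]

# Synthetic data (hypothetical): one output driven by two of five inputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Y = X[:, :2] @ np.array([[1.0], [0.5]]) + 0.1 * rng.normal(size=(200, 1))

w, q, sigma = first_pls_weights(X, Y)
t = (X - X.mean(axis=0)) @ w   # input scores
u = (Y - Y.mean(axis=0)) @ q   # output scores
```

The inner product of the two score vectors, `t @ u`, equals the leading singular value `sigma`. The criterion rewards input directions with large variance as well as those correlated with the output, which is one way to see why the scores can carry output-irrelevant variation.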
Several subsequent methods were proposed to overcome this issue, including orthogonalized PLS [8] and concurrent PLS [9]. An alternative is to use canonical correlation analysis (CCA), proposed by Hotelling [10]. CCA focuses only on extracting the multidimensional correlation between X and Y, and it performs better in prediction than PLS. One issue with CCA is that it pays no attention to the input variances and therefore cannot exploit the input variance structure. For this reason, a concurrent CCA (CCCA) that combines CCA and PCA was proposed to exploit the variances and correlations in the process-specific and quality-specific spaces [11].
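The point that CCA ignores input variances can be checked numerically. The sketch below computes the first canonical pair by whitening both blocks and taking the leading singular pair of the whitened cross-covariance, a textbook construction; the helper names, the `eps` floor, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def first_canonical_pair(X, Y, eps=1e-10):
    """Return weights (a, b) and the first canonical correlation rho."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)

    def inv_sqrt(S):
        # Symmetric inverse square root; eps guards against tiny eigenvalues.
        vals, vecs = np.linalg.eigh(S)
        vals = np.maximum(vals, eps)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    Kx, Ky = inv_sqrt(Sxx), inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(Kx @ Sxy @ Ky, full_matrices=False)
    return Kx @ U[:, 0], Ky @ Vt[0, :], s[0]

# Synthetic data (hypothetical): two outputs, each driven by one input.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
Y = np.column_stack([X[:, 0], X[:, 1]]) + 0.5 * rng.normal(size=(300, 2))

a, b, rho = first_canonical_pair(X, Y)

# Inflate the variance of one input column by a factor of 100^2.
X_scaled = X.copy()
X_scaled[:, 3] *= 100.0
_, _, rho_scaled = first_canonical_pair(X_scaled, Y)
```

Here `rho` and `rho_scaled` agree up to rounding: rescaling an input changes its variance but not the canonical correlation, so the variance structure of X plays no role in the CCA solution.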
Both PLS and CCA need additional processing to achieve good performance. Recently, Zhu and Qin [12] proposed a latent variable least squares (LVLS) method as an alternative way to exploit latent structured relations between X and Y. LVLS minimizes the prediction error between the input scores and the output scores, and it accounts for both the input variance structure and the prediction efficiency, which overcomes the drawbacks of PLS and CCA.
LVLS considers only static relations between X and Y. When the relationship between X and Y is dynamic, static LVLS is not suitable for dynamic system modeling or for the subsequent process and quality monitoring. Several modified dynamic algorithms have been proposed for PLS and CCA to deal with dynamic processes [13-17]. Dong and Qin [17] developed a dynamic inner PLS (DiPLS) for dynamic system modeling, which provides an explicit description of both the dynamic inner model and the outer model.
In this paper, a dynamic inner LVLS (DiLVLS) algorithm is proposed to capture the dynamic relation between X and Y with a weighted combination of lagged process variables X. The objective of DiLVLS is to minimize the prediction error between the output scores and the weighted combination of lagged input scores, which builds the outer relation of the DiLVLS model. A regularization term is also included in the objective to overcome collinearity problems. A consistent inner model is then developed to describe the dynamic relations. After the auto-correlation is extracted, the static LVLS model is employed to exploit the static cross-correlations between the residuals of X and Y. A corresponding monitoring scheme is also developed for the DiLVLS model. A synthetic case study and the Tennessee Eastman process are used to demonstrate the prediction and monitoring effectiveness of the proposed algorithm.
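The idea of predicting an output score from a weighted combination of lagged input scores can be sketched as follows. This is not the DiLVLS algorithm itself, whose exact objective is not given here: the sketch assumes the score sequences `t` and `u` are already available, uses a generic ridge penalty `gamma` as a stand-in for the regularization term, and all function names are hypothetical.

```python
import numpy as np

def lagged_scores(t, s):
    """Rows are [t_k, t_{k-1}, ..., t_{k-s}] for k = s, ..., N-1."""
    N = len(t)
    return np.column_stack([t[s - i : N - i] for i in range(s + 1)])

def fit_lag_weights(t, u, s, gamma=1e-3):
    """Ridge least squares for beta in u_k ~ sum_i beta_i * t_{k-i}.

    Assumption: a plain ridge penalty on beta; the regularization used
    in the original work may take a different form.
    """
    T = lagged_scores(t, s)
    u_k = u[s:]
    A = T.T @ T + gamma * np.eye(s + 1)
    beta = np.linalg.solve(A, T.T @ u_k)
    residual = u_k - T @ beta
    return beta, residual

# Synthetic scores (hypothetical): u depends on the current and two past values of t.
rng = np.random.default_rng(2)
N = 500
t = rng.normal(size=N)
u = np.zeros(N)
u[2:] = 0.8 * t[2:] + 0.4 * t[1:-1] - 0.2 * t[:-2]
u += 0.01 * rng.normal(size=N)

beta, residual = fit_lag_weights(t, u, s=2)
```

With these synthetic scores, `beta` recovers values close to the generating weights (0.8, 0.4, -0.2). The residual left after removing the lagged prediction is the kind of quantity on which a subsequent static model could be built.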
References
[1] Wold, S., Ruhe, A., Wold, H. and Dunn, III, W.J., 1984. The collinearity problem in linear regression. The partial least squares (PLS) approach to generalized inverses. SIAM Journal on Scientific and Statistical Computing, 5(3), pp.735-743.
[2] Geladi, P. and Kowalski, B.R., 1986. Partial least-squares regression: a tutorial. Analytica Chimica Acta, 185, pp.1-17.
[3] Höskuldsson, A., 1988. PLS regression methods. Journal of Chemometrics, 2(3), pp.211-228.
[4] Nomikos, P. and MacGregor, J.F., 1995. Multivariate SPC charts for monitoring batch processes. Technometrics, 37(1), pp.41-59.
[5] Wise, B.M. and Gallagher, N.B., 1996. The process chemometrics approach to process monitoring and fault detection. Journal of Process Control, 6(6), pp.329-348.
[6] Qin, S.J., 2003. Statistical process monitoring: basics and beyond. Journal of Chemometrics, 17(8-9), pp.480-502.
[7] Qin, S.J. and Dong, Y., Data distillation, analytics, and machine learning.
[8] Sun, L., Ji, S., Yu, S. and Ye, J., 2009, July. On the equivalence between canonical correlation analysis and orthonormalized partial least squares. In IJCAI (Vol. 9, pp. 1230-1235).
[9] Qin, S.J. and Zheng, Y., 2013. Quality-relevant and process-relevant fault monitoring with concurrent projection to latent structures. AIChE Journal, 59(2), pp.496-504.
[10] Hotelling, H., 1936. Relations between two sets of variates. Biometrika, 28(3/4), pp.321-377.
[11] Zhu, Q., Liu, Q. and Qin, S.J., 2016. Concurrent canonical correlation analysis modeling for quality-relevant monitoring. IFAC-PapersOnLine, 49(7), pp.1044-1049.
[12] Zhu, Q. and Qin, S.J., 2017. Latent variable least squares for process and quality modeling. Submitted to 56th IEEE Conference on Decision and Control.
[13] Kaspar, M.H. and Ray, W.H., 1993. Dynamic PLS modelling for process control. Chemical Engineering Science, 48(20), pp.3447-3461.
[14] Lakshminarayanan, S., Shah, S.L. and Nandakumar, K., 1997. Modeling and control of multivariable processes: Dynamic PLS approach. AIChE Journal, 43(9), pp.2307-2322.
[15] Li, G., Liu, B., Qin, S.J. and Zhou, D., 2011. Quality relevant data-driven modeling and monitoring of multivariate dynamic processes: The dynamic T-PLS approach. IEEE Transactions on Neural Networks, 22(12), pp.2262-2271.
[16] Liu, Q., Zhu, Q., Qin, S.J. and Chai, T., 2017. Dynamic concurrent kernel CCA for strip-thickness relevant fault diagnosis of continuous annealing processes. Accepted by Journal of Process Control.
[17] Dong, Y. and Qin, S.J., 2015. Dynamic-Inner Partial Least Squares for Dynamic Data Modeling. IFAC-PapersOnLine, 48(8), pp.117-122.