(172a) Linear Dynamic Model Identification and Data Reconciliation Using Dynamic Iterative PCA (DIPCA)
AIChE Spring Meeting and Global Congress on Process Safety
2017
2017 Spring Meeting and 13th Global Congress on Process Safety
Process Development Division
Modeling Tools and Techniques for Process R&D II
Wednesday, March 29, 2017 - 10:15am to 10:30am
Identification of systems from input-output data without any prior knowledge about the process is of utmost relevance in the process industries. The input-output data used in identification exercises often contain measurement errors with unequal error variances across both input and output variables, i.e., the errors are heteroskedastic. Model identification under these conditions translates to solving an errors-in-variables (EIV) problem, which is difficult to handle with classical system identification techniques. Several techniques exist for identifying the underlying process in the errors-in-variables framework, but the total least squares (TLS) formulation gives the most efficient solution. One such technique is Dynamic PCA (DPCA), proposed by Ku et al. in 1995, which applies PCA to a matrix of lagged variables. However, DPCA requires the process order to be known and gives biased estimates under heteroskedastic error conditions. Another technique, Dynamic Iterative PCA (DIPCA), recently proposed by Maurya et al. in 2016, addresses these issues: it correctly identifies the true order and delay and gives unbiased estimates of the model parameters, along with the input-output error variances, in the large-sample case. In this work, we study this method further for finite-sample cases, which are of relevance from an industrial point of view, and address some of the shortcomings that arise under finite-sample conditions.
The DIPCA method derives the true order of the system from the number of constraints that the data matrix satisfies. The true number of constraints is given by the number of unity eigenvalues obtained after a singular value decomposition (SVD) of the scaled data matrix, which consists of lagged input and output variables. We show through simulations that, for finite sample sizes, the eigenvalues deviate from unity. The deviations become more pronounced as the sample size decreases and the process order increases, leading to ambiguity in order determination. In this paper, we propose an approach that sharply identifies the true order even for finite-sample cases by making use of a metric we call the ‘eigenvalue ratio’. This is demonstrated through figures corresponding to a 5th-order process with a stacking lag of 15. According to DIPCA, the true number of constraints in this case is 11, and therefore there should be 11 unity eigenvalues. The figure on the left shows a clear deviation of the eigenvalues from 1, while the figure on the right demonstrates the efficacy of the proposed metric, which gives a clear breaking point when the number of constraints equals 11.
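The abstract does not give the precise definition of the eigenvalue-ratio metric, so the following is only a minimal sketch of the idea: build a lagged input-output data matrix, take the eigenvalues of its covariance in descending order, and look for a sharp break in the ratio of successive eigenvalues. The function names, the SISO setting, the toy 2nd-order process, and the assumption that the noise is homoskedastic (so that no DIPCA-style scaling is needed) are all illustrative assumptions.

```python
import numpy as np

def lagged_matrix(y, u, lag):
    """Row k holds [y_k, ..., y_{k-lag}, u_k, ..., u_{k-lag}] (SISO case)."""
    rows = []
    for k in range(lag, len(y)):
        rows.append(np.concatenate([y[k - lag:k + 1][::-1],
                                    u[k - lag:k + 1][::-1]]))
    return np.asarray(rows)

def eigenvalue_ratios(Z):
    """Eigenvalues of the sample covariance of Z (descending) and the ratios
    of successive eigenvalues.  With properly scaled errors, the smallest d
    eigenvalues are close to unity (d = number of constraints), so the largest
    successive ratio marks the break between the model (signal) subspace and
    the constraint (noise) subspace."""
    Zc = Z - Z.mean(axis=0)
    _, s, _ = np.linalg.svd(Zc, full_matrices=False)
    lam = s ** 2 / Z.shape[0]            # eigenvalues of Zc^T Zc / N
    ratios = lam[:-1] / lam[1:]
    return lam, ratios

if __name__ == "__main__":
    # Toy 2nd-order process; with lag = 15 the expected number of constraints
    # is 15 - 2 + 1 = 14 (the abstract's example: 5th order, lag 15, 11).
    rng = np.random.default_rng(0)
    u = rng.standard_normal(2000)
    y = np.zeros_like(u)
    for k in range(2, len(u)):
        y[k] = 0.6 * y[k - 1] - 0.2 * y[k - 2] + 0.5 * u[k - 1]
    # Homoskedastic measurement noise on both input and output.
    Z = lagged_matrix(y + 0.05 * rng.standard_normal(len(y)),
                      u + 0.05 * rng.standard_normal(len(u)), lag=15)
    lam, ratios = eigenvalue_ratios(Z)
    d = len(lam) - (np.argmax(ratios) + 1)   # estimated number of constraints
    print("estimated number of constraints:", d)
```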
The second limitation of DIPCA in its current form is the lack of knowledge about an appropriate stacking lag. We show that as the stacking lag increases, the estimates improve; however, increasing the lag to a very large value eventually degrades performance because the effective degrees of freedom decrease. We show the existence of an optimal stacking lag and propose an approach to determine its optimal value from data, as sketched below.
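The abstract does not state the criterion used to select the stacking lag, so the sketch below only shows the sweep-and-score structure such a selection could take. Here `fit_and_score` is a hypothetical user-supplied callback (for instance, a cross-validated one-step-ahead prediction error), not the criterion from the paper.

```python
def choose_stacking_lag(y, u, candidate_lags, fit_and_score):
    """Sweep candidate stacking lags and keep the one with the lowest score.

    `fit_and_score(y, u, lag)` is a hypothetical callback that runs the
    identification at the given lag and returns a scalar quality measure,
    lower being better.  Very large lags shrink the number of rows
    (len(y) - lag) and hence the effective degrees of freedom, so the score
    is expected to pass through a minimum at some intermediate lag.
    """
    scores = {lag: fit_and_score(y, u, lag) for lag in candidate_lags}
    best_lag = min(scores, key=scores.get)
    return best_lag, scores
```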
Therefore, this work focuses on extending the capabilities of DIPCA to small-sample cases and on automatically determining the optimal stacking lag, so as to minimize user intervention in DIPCA. Subsequently, we use the information derived about the underlying process to solve a data reconciliation problem and obtain noise-free estimates of the input-output variables. The data reconciliation problem has applications in process industries where measurements are inaccurate due to low sensor accuracy or external disturbances during on-line measurement. We address this issue by implementing a Kalman-filter-based approach for errors-in-variables (EIV) cases.
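As a rough illustration of how a Kalman filter can reconcile data when both inputs and outputs are measured with error, the sketch below filters an augmented state that carries the unknown true input as a random walk. The state-space matrices, the noise covariances, and the random-walk input model are illustrative assumptions; the EIV formulation used in the paper may differ.

```python
import numpy as np

def eiv_kalman_reconcile(y_meas, u_meas, A, B, C, Q, Qu, Ry, Ru):
    """Kalman-filter reconciliation when both y and u are measured with error.

    The true input is appended to the state and modelled as a random walk
    (an illustrative choice), so the stacked measurement [y_meas; u_meas]
    observes the augmented state directly.  Returns the filtered (reconciled)
    output and input trajectories.
    """
    nx, nu = B.shape
    ny = C.shape[0]
    # Augmented dynamics: [x; u]_{k+1} = [[A, B], [0, I]] [x; u]_k + noise
    Aa = np.block([[A, B],
                   [np.zeros((nu, nx)), np.eye(nu)]])
    Ca = np.block([[C, np.zeros((ny, nu))],
                   [np.zeros((nu, nx)), np.eye(nu)]])
    Qa = np.block([[Q, np.zeros((nx, nu))],
                   [np.zeros((nu, nx)), Qu]])
    R = np.block([[Ry, np.zeros((ny, nu))],
                  [np.zeros((nu, ny)), Ru]])

    xa = np.zeros(nx + nu)
    P = np.eye(nx + nu)
    y_hat, u_hat = [], []
    for yk, uk in zip(y_meas, u_meas):
        # Predict
        xa = Aa @ xa
        P = Aa @ P @ Aa.T + Qa
        # Update with the stacked noisy measurement [y; u]
        z = np.concatenate([np.atleast_1d(yk), np.atleast_1d(uk)])
        S = Ca @ P @ Ca.T + R
        K = P @ Ca.T @ np.linalg.inv(S)
        xa = xa + K @ (z - Ca @ xa)
        P = (np.eye(nx + nu) - K @ Ca) @ P
        y_hat.append(C @ xa[:nx])
        u_hat.append(xa[nx:].copy())
    return np.array(y_hat), np.array(u_hat)
```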