(374a) NucID - System Identification with Missing Data Via Nuclear Norm Regularization | AIChE

Identifying a dynamic system from an incomplete data set is a common task in industry [1]. Missing entries arise for several reasons: sensor failures, outliers, and plant shutdowns generate missing entries at random, whereas multi-rate sampling and periodic disturbances create structured patterns of missing data. Over the last three decades, researchers from various fields have recognized the need for systematic methods that exploit incomplete data sets for system identification.

The goal of this paper is to present a recently developed method for system identification from noise-corrupted data with missing entries in the outputs. The proposed method, called NucID, is applicable to SISO and MIMO systems and identifies a non-parametric linear model. It incorporates the minimization of the order of the identified system in a natural and transparent way: the order is approximated by the nuclear norm, i.e., the sum of the singular values, of the Hankel matrix built from finite impulse response (FIR) coefficients. Nuclear norm regularization of the rank of a matrix is the analogue of L1 regularization of vector cardinality, a well-known heuristic that produces sparse solutions. These regularization methods have been studied in detail by a number of researchers and form the foundation of the recently developed compressed sensing frameworks for measurement, coding, and signal estimation [2-4].
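The link between model order and the Hankel matrix can be illustrated with a minimal sketch. It assumes a SISO system and NumPy; the helper names `hankel_from_fir` and `nuclear_norm`, the example impulse responses, and the matrix dimensions are hypothetical choices made for illustration and are not taken from the NucID implementation.

```python
import numpy as np

def hankel_from_fir(g, rows):
    """Build the Hankel matrix H[i, j] = g[i + j] from FIR coefficients g.

    Hypothetical helper for illustration (SISO case only).
    """
    g = np.asarray(g, dtype=float)
    cols = len(g) - rows + 1
    if rows < 1 or cols < 1:
        raise ValueError("rows must be between 1 and len(g)")
    # Entry (i, j) picks coefficient g[i + j], so anti-diagonals are constant.
    idx = np.arange(rows)[:, None] + np.arange(cols)[None, :]
    return g[idx]

def nuclear_norm(M):
    """Sum of the singular values of M."""
    return np.linalg.svd(M, compute_uv=False).sum()

k = np.arange(10)

# Assumed example: impulse response of a first-order system.
g1 = 0.8 ** k
H1 = hankel_from_fir(g1, rows=5)
rank1 = np.linalg.matrix_rank(H1)

# Assumed example: sum of two exponential modes (second-order system).
g2 = 0.8 ** k + (-0.5) ** k
H2 = hankel_from_fir(g2, rows=5)
rank2 = np.linalg.matrix_rank(H2)

print("rank:", rank1, "nuclear norm:", nuclear_norm(H1))
print("rank:", rank2, "nuclear norm:", nuclear_norm(H2))
```

In this sketch the rank of the Hankel matrix matches the number of modes in the impulse response, which is why a convex surrogate for the rank, such as the nuclear norm, can serve as a proxy for the model order.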

The proposed technique minimizes the nuclear norm of the Hankel matrix of FIR coefficients while constraining the fitting error between model and data to a desired level of accuracy. This method allows one to directly choose a desired accuracy and then poses a convex optimization problem to find the lowest order model that achieves it, rather than iteratively tuning the order of the model, as is common practice.
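One way to write such a problem for the SISO case is sketched below. The notation is an assumption made here for illustration: $g_k$ are the FIR coefficients, $\mathcal{H}(g)$ is the Hankel matrix built from them, $u$ and $y$ are the input and output, $\mathcal{T}$ is the set of time indices at which the output was actually measured, and $\varepsilon$ is the chosen error tolerance. The abstract does not state which norm measures the fitting error; a sum of squares is assumed.

```latex
\begin{aligned}
\min_{g_0,\dots,g_n} \quad & \|\mathcal{H}(g)\|_* = \sum_i \sigma_i\bigl(\mathcal{H}(g)\bigr) \\
\text{subject to} \quad & \sum_{t \in \mathcal{T}} \Bigl( y(t) - \sum_{k=0}^{n} g_k\, u(t-k) \Bigr)^{2} \le \varepsilon^{2}
\end{aligned}
```

Because the constraint sums only over $t \in \mathcal{T}$, missing output samples simply drop out of the problem, and both the objective and the constraint remain convex in $g$.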

Nuclear norm regularization has recently been suggested [2,5] as a way to promote the identification of low-order models from complete data sets. This work shows that nuclear norm regularization is especially attractive when the data sets have missing entries, i.e., for the missing data problem.

A sensitivity analysis of the identification algorithm is performed on two structures of missing data in the outputs: structured missing data and randomly distributed missing data. Randomly distributed missing data is relevant when the measurements are affected by unforeseeable failures and breakdowns of certain sensors during acquisition, whereas structured missing data covers the case in which inputs and outputs are collected at different sampling rates or asynchronously, i.e., multi-rate sampled-data systems. Under these scenarios, the proposed method is compared to commonly used methods for identification with missing data, and several case studies are performed on experimental data sets taken from the DaISy database [6]. NucID is also applied to a multicolumn chromatographic process called simulated moving bed, and the identified reduced-order model is compared to a linearized first-principles model.
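The two missing-data structures can be sketched as boolean masks over the output samples. This is an illustrative sketch only: the function names, the missing fraction, and the sampling period are hypothetical and do not reproduce the settings used in the study. `True` marks an observed sample.

```python
import numpy as np

def random_missing_mask(n_samples, missing_fraction, seed=0):
    """Mask with a given fraction of samples missing at random positions."""
    rng = np.random.default_rng(seed)
    n_missing = int(round(missing_fraction * n_samples))
    mask = np.ones(n_samples, dtype=bool)
    # Draw distinct positions so exactly n_missing samples are dropped.
    missing_idx = rng.choice(n_samples, size=n_missing, replace=False)
    mask[missing_idx] = False
    return mask

def multirate_mask(n_samples, output_period):
    """Mask for an output sampled only every `output_period` input samples."""
    mask = np.zeros(n_samples, dtype=bool)
    mask[::output_period] = True
    return mask

# Assumed example settings: 30 % missing at random, output sampled 4x slower.
random_mask = random_missing_mask(100, 0.3)
structured_mask = multirate_mask(100, 4)

print("observed (random):", random_mask.sum())
print("observed (multi-rate):", structured_mask.sum())
```

A fitting-error constraint would then be evaluated only on the samples where the mask is `True`.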

NucID is found to consistently identify systems from complete data sets or from data missing at random within the imposed error tolerance, a task at which the standard methods sometimes fail. In the case of structured missing data, NucID is shown to be particularly effective and to clearly outperform existing procedures. This makes NucID an attractive tool for identification from data sets with missing entries and for multi-rate sampled-data systems.

----------------------

[1] P. Kadlec, B. Gabrys, and S. Strandt, 'Data-driven Soft Sensors in the process industry', Computers & Chemical Engineering, vol. 33, no. 4, pp. 795-814, 2009.

[2] B. Recht, M. Fazel, and P.A. Parrilo, 'Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization', arXiv:0706.4138v1, 2007.

[3] E. Candes, J. Romberg, and T. Tao, 'Stable signal recovery from incomplete and inaccurate measurements', Communications on Pure and Applied Mathematics, vol. 59, no. 8, pp. 1207-1223, 2006.

[4] D. Donoho, 'Compressed sensing', IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289-1306, 2006.

[5] Z. Liu and L. Vandenberghe, 'Interior-point method for nuclear norm approximation with application to system identification', submitted to Mathematical Programming, 2008.

[6] B. De Moor, 'DaISy: Database for the identification of systems', Department of Electrical Engineering, ESAT/SISTA, K.U.Leuven, Belgium. URL: http://homes.esat.kuleuven.be/~smc/daisy/.