(204d) Applying Statistical Methodology and Multivariate Analysis on Big Data to Improve the Data Quality and Process Operation | AIChE

Authors 

McClung, A., The Dow Chemical Company
Mengel, M., The Dow Chemical Company
Johnson, E., The Dow Chemical Company

At Dow Chemical, multivariate analysis is routinely applied to process data for purposes such as process troubleshooting, quality improvement, and the design of inferential sensors. This case study involves multiple batch operation units with one quality measurement per batch. The goal is to find the key process variables (x variables) that are highly correlated with the quality measurement (y variable), with the ultimate purpose of reducing variability in y. The historical data available for analysis was sampled every minute over the previous two years (a data matrix of 4.7 million columns and 200 rows), but no statistical conclusions could be drawn from it. A big dataset does not guarantee that useful information can be extracted from it. Here the cause was a lack of structure in the data due to sampling and process variability, which is a common problem with big datasets. To overcome this challenge, a statistically designed experiment was performed in the plant, with multiple measurements associated with each batch. The combination of a statistical linear mixed model and multivariate batch analysis revealed the major contributors to the variation in y, which eventually helped the plant achieve its goal.
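To illustrate why taking multiple measurements per batch matters, the sketch below splits the variance of y into a batch-to-batch component and a within-batch (sampling and measurement) component. It is not the authors' model: it assumes the simplest possible linear mixed model, a balanced one-way random-effects layout with batch as the only random factor, estimated by the ANOVA method of moments. The function name `variance_components` and the numbers are hypothetical and made up purely for illustration.

```python
from statistics import mean


def variance_components(batches):
    """Estimate (batch-to-batch variance, within-batch variance).

    Assumes a balanced one-way random-effects model:
        y_ij = mu + b_i + e_ij
    where b_i is the random effect of batch i and e_ij is the
    within-batch error. `batches` is a list of equal-length lists,
    one inner list of repeated y measurements per batch.
    """
    k = len(batches)          # number of batches
    n = len(batches[0])       # measurements per batch
    if k < 2 or n < 2 or any(len(b) != n for b in batches):
        raise ValueError("need at least 2 batches, each with the same n >= 2 measurements")

    batch_means = [mean(b) for b in batches]
    grand_mean = mean(batch_means)  # equals the overall mean when balanced

    # Mean square between batches and mean square within batches.
    ms_between = n * sum((m - grand_mean) ** 2 for m in batch_means) / (k - 1)
    ms_within = sum(
        (y - m) ** 2 for b, m in zip(batches, batch_means) for y in b
    ) / (k * (n - 1))

    # Method-of-moments estimates; the batch component is truncated at zero.
    var_within = ms_within
    var_batch = max(0.0, (ms_between - ms_within) / n)
    return var_batch, var_within


# Made-up example: three batches, two quality measurements each.
var_batch, var_within = variance_components(
    [[10.1, 9.9], [12.0, 12.2], [8.0, 7.8]]
)
print(f"batch-to-batch variance: {var_batch:.3f}")
print(f"within-batch variance:   {var_within:.3f}")
```

With only one measurement per batch, the two components cannot be separated, which is one reason a designed experiment with repeated measurements can be informative where a large historical dataset is not.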