(193a) Smart Process Data Analytics: Automated and Robust Data Analytics for Manufacturing Processes | AIChE

Authors 

Braatz, R. D., Massachusetts Institute of Technology

Process data analytics is the application of statistics and related mathematical tools to data in order to understand, develop, and improve manufacturing processes (e.g., [1]). It can be applied at any scale, from individual unit operations to the entire manufacturing process, and has the potential to inform decision making in many ways. One area of application is the construction of predictive models of quality variables that are suitable for quality assurance and for process modification, optimization, and control. Another is real-time anomaly detection and diagnosis (e.g., [2]).

Despite its usefulness, process data analytics can be challenging to apply. First, the available data may be insufficient in quantity or quality to fully characterize the manufacturing process, especially for systems with high intrinsic complexity. Second, direct real-time measurement of some key process variables is often too costly to be practical. Such situations often arise in organic synthesis, in which undesired reaction byproducts have spectra nearly identical to that of the desired product, and the concentration of each byproduct is below the minimum detection limit of the spectroscopic device. Third, dozens of data analytics methods are available, including chemometrics methods such as principal component analysis and partial least squares [3], time series methods such as autocorrelation and spectral analysis [4], and machine learning methods such as Bayesian networks, support vector machines, and elastic nets [5]. Substantial expertise is required to select the best method for a specific application and to ensure that the resulting model is reliable and accurate. Because these methods come from diverse branches of mathematics, science, and engineering, it is difficult for any one person to become knowledgeable in all of them. In practice, practitioners typically either default to familiar methods, which can be highly suboptimal, or simply try every method they can think of, which can result in overfitting. Neither approach is satisfactory.
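
The overfitting risk of trying many methods and keeping the best fit can be illustrated with a minimal sketch. It is not taken from the presentation. The function names, sample sizes, and the use of random predictors as stand-ins for "one more method tried" are all illustrative assumptions. The response is pure noise, so any apparent fit is spurious.

```python
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def best_of_many(n_train=20, n_test=200, n_candidates=500, seed=0):
    """Keep the candidate that best fits pure-noise training data.

    Hypothetical helper: returns (|r| on training data, |r| on held-out
    data) for the selected candidate.
    """
    rng = random.Random(seed)
    n = n_train + n_test
    y = [rng.gauss(0, 1) for _ in range(n)]  # response: pure noise
    best_r, best_x = 0.0, None
    for _ in range(n_candidates):
        x = [rng.gauss(0, 1) for _ in range(n)]  # candidate unrelated to y
        r = abs(pearson(x[:n_train], y[:n_train]))
        if r > best_r:
            best_r, best_x = r, x
    # Score the winner on data that played no part in the selection.
    test_r = abs(pearson(best_x[n_train:], y[n_train:]))
    return best_r, test_r


train_r, test_r = best_of_many()
print(f"training |r| = {train_r:.2f}, held-out |r| = {test_r:.2f}")
```

The selected candidate looks informative on the data used to choose it and much less so on held-out data, which is why a selection procedure should be scored on data that were not used for the selection.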

This presentation describes a robust and automated approach for process data analytics model selection, which allows the user to focus on the modelling objectives rather than spending extensive time and effort in learning and selecting among the large number of potential methods. Building on Severson et al. [6], the model selection takes a bottom-up approach based on the specific data characteristics and available domain knowledge. The model selection and fitting procedure incorporates artificial intelligence and expertise in process analytics and can be applied for various objectives including model prediction, classification, and process monitoring.
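
A bottom-up selection of this kind can be sketched as a set of simple data checks that narrow the list of candidate methods before any model is fitted. The sketch below is an assumption-laden illustration, not the procedure of the presentation or of [6]. The two checks, the thresholds `collinear_tol` and `dynamic_tol`, the function names, and the mapping from checks to methods are all hypothetical, and ordinary least squares is added here only as a fallback.

```python
from itertools import combinations


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def lag1_autocorrelation(x):
    """Lag-1 sample autocorrelation, a crude indicator of dynamics."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, n))
    den = sum((v - m) ** 2 for v in x)
    return num / den


def suggest_methods(columns, y, collinear_tol=0.9, dynamic_tol=0.5):
    """Map data characteristics to candidate method families.

    `columns` is a list of predictor columns; `y` is the response.
    Thresholds and the mapping are illustrative assumptions.
    """
    collinear = any(
        abs(pearson(a, b)) > collinear_tol for a, b in combinations(columns, 2)
    )
    dynamic = abs(lag1_autocorrelation(y)) > dynamic_tol
    candidates = []
    if collinear:  # correlated predictors: latent-variable or regularized fits
        candidates += ["partial least squares", "elastic net"]
    if dynamic:  # serially correlated response: time series methods
        candidates += ["time series analysis"]
    if not candidates:  # fallback chosen for this sketch only
        candidates = ["ordinary least squares"]
    return candidates


x1 = [0, 1, 2, 3, 4, 5, 6, 7]
x2 = [2 * v + 1 for v in x1]          # exactly collinear with x1
y_static = [1, 2, -1, -2, 1, 2, -1, -2]
print(suggest_methods([x1, x2], y_static))
```

The point of the design is that the data, rather than the analyst's familiarity with a method, determine which methods are considered. A practical version would need more checks (for example, for nonlinearity) and would incorporate domain knowledge.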

[1] Qin, S. J. (2014). Process data analytics in the era of big data. AIChE Journal, 60, 3092-3100.

[2] Chiang, L. H., Russell, E. L., and Braatz, R. D. (2000). Fault Detection and Diagnosis in Industrial Systems. Springer-Verlag, London.

[3] Brereton, R. G. (2003). Chemometrics: Data Analysis for the Laboratory and Chemical Plant. John Wiley & Sons, Chichester, United Kingdom.

[4] Brillinger, D. R. (1981). Time Series: Data Analysis and Theory. SIAM Press, New York.

[5] Monostori, L. (2003). AI and machine learning techniques for managing complexity, changes and uncertainties in manufacturing. Engineering Applications of Artificial Intelligence, 16(4), 277-291.

[6] Severson, K. A., VanAntwerp, J. G., Natarajan, V., Antoniou, C., Thömmes, J., and Braatz, R. D. (2018). A systematic approach to process data analytics in pharmaceutical manufacturing: The data analytics triangle and its application to the manufacturing of a monoclonal antibody. In Multivariate Analysis in the Pharmaceutical Industry, edited by A. P. Ferreira, J. C. Menezes, and M. Tobyn, Elsevier, Chapter 12, 295-312.