(108a) Powerful and Novel Nonlinear Modeling and Multivariate Statistical Approaches in Big Data Sets in Chemical Processes and in Data Mining with Applications to Bio-, Medical-, and Material-Informatics
AIChE Spring Meeting and Global Congress on Process Safety
2019 Spring Meeting and 15th Global Congress on Process Safety
Industry 4.0 Topical Conference
Invited Tutorial Session - Approaches in Big Data Analytics
Tuesday, April 2, 2019 - 1:30pm to 2:15pm
Advanced statistical methodologies have key roles to play in modeling chemical process data, in data mining, and in informatics for large data sets. Over the years, our research has developed a number of statistical techniques for modeling plant data and has exploited multivariate analysis in many applications. These include bioinformatics, specifically micro-array data sets; medical informatics, including disease diagnosis and discovery; and material informatics, including the development and evaluation of material properties and testing techniques. In this talk we present the tools and methodologies that we have developed and discuss their attributes and strengths. The two primary multivariate statistical methodologies that we have exploited are principal component analysis (PCA) and cluster analysis (CA). Our approach for modeling plant data has particular strengths when inputs are highly correlated and the signal-to-noise ratio is low. This talk will break these techniques down for the non-expert and then demonstrate their strengths in handling large data sets to extract critical information that can be exploited in modeling, control, analysis, inference, diagnosis, and discovery.
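For readers new to PCA, the following is a minimal illustrative sketch of how it condenses highly correlated inputs into a few components. It is not the authors' methodology: the `pca` helper, the synthetic data, the two hidden factors, and the noise level are all assumptions chosen purely for illustration.

```python
import numpy as np

def pca(X, n_components):
    """Textbook PCA via SVD (hypothetical helper, rows of X are samples)."""
    Xc = X - X.mean(axis=0)                       # center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)               # variance fraction per component
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T                # columns are principal directions
    return scores, loadings, explained[:n_components]

# Synthetic stand-in for plant data (assumed, not from the talk):
# five measured inputs driven by two hidden factors plus measurement noise,
# so the inputs are strongly correlated with one another.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 5))

scores, loadings, ratio = pca(X, n_components=2)
print("fraction of variance kept by 2 components:", ratio.sum())
```

Because the five inputs share only two underlying sources of variation, two components retain almost all of the variance. The resulting `scores` could then be passed to a clustering routine, which is one common way PCA and cluster analysis are combined.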