(108a) Powerful and Novel Nonlinear Modeling and Multivariate Statistical Approaches in Big Data Sets in Chemical Processes and in Data Mining with Applications to Bio-, Medical-, and Material-Informatics | AIChE

Authors 

Rollins, D. - Presenter, Iowa State University
Advanced statistical methodologies have key roles to play in modeling chemical process data, data mining, and informatics for large data sets. Over the years, our research has developed a number of statistical techniques for modeling plant data and has exploited multivariate analysis in many applications. These include bioinformatics, specifically micro-array data sets; medical informatics, including disease diagnosis and discovery; and material informatics, including the development and evaluation of material properties and testing techniques. In this talk we present the tools and methodologies that we have developed and discuss their attributes and strengths. The two primary multivariate statistical methodologies that we have exploited are principal component analysis (PCA) and cluster analysis (CA). Our approach for modeling plant data has particular strengths when the inputs are highly correlated and the signal-to-noise ratio is low. This talk will break these techniques down for the non-expert and then demonstrate their strengths in handling large data sets to extract critical information that can be exploited in modeling, control, analysis, inference, diagnosis, and discovery.
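
For readers new to PCA, the following is a minimal sketch of why it helps when inputs are highly correlated. It is not the authors' method or code. The data are synthetic, and the three inputs, the single hidden factor, the noise level, and all function names are assumptions made purely for illustration. The leading principal component is found by power iteration on the sample covariance matrix, using only the Python standard library.

```python
import math
import random


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length observation rows."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def leading_component(cov, iterations=200):
    """Power iteration: largest eigenvalue of cov and its unit eigenvector."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p  # arbitrary unit-length starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' C v gives the eigenvalue for the converged vector.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    return eigenvalue, v


def make_correlated_inputs(n=500, noise=0.3, seed=0):
    """Hypothetical plant inputs: three noisy readings of one hidden factor."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        t = rng.gauss(0.0, 1.0)  # hidden factor driving all three inputs
        rows.append([
            t + rng.gauss(0.0, noise),
            0.8 * t + rng.gauss(0.0, noise),
            -0.6 * t + rng.gauss(0.0, noise),
        ])
    return rows


rows = make_correlated_inputs()
cov = covariance_matrix(rows)
eigenvalue, direction = leading_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = eigenvalue / total_variance
print(f"first component explains {explained:.0%} of the total variance")
print("loadings:", [round(x, 2) for x in direction])
```

Because the three synthetic inputs are driven by one hidden factor, a single component captures most of the variance, so a model can be built on one well-conditioned score instead of three nearly collinear inputs.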