(196b) Powerful and Novel Multivariate Statistical Approaches in Big Data Sets and in Data Mining with Applications to Bio-, Medical-, and Material-Informatics | AIChE

Authors 

Rollins, D. Sr. - Presenter, Iowa State University
Advanced statistical methodologies have key roles to play in data mining and informatics for large data sets. Over the years, our research has developed a number of statistical techniques that exploit multivariate analysis in many applications: bioinformatics, specifically microarray data sets; medical informatics, including disease diagnosis and discovery; and material informatics, including the development and evaluation of material properties and testing techniques. In this talk we present the tools and methodologies that we have developed and discuss their attributes and strengths. The two primary multivariate statistical methodologies that we have exploited are principal component analysis (PCA) and cluster analysis (CA). This talk will break these techniques down for the non-expert and then demonstrate their strengths in handling large data sets to extract critical information that can be exploited in analysis, inference, diagnosis, and discovery.
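
For readers new to the two techniques named above, the following is a minimal sketch of how PCA and cluster analysis can be chained. It is not the authors' method. It assumes a tiny, made-up data matrix (`samples`), uses power iteration to approximate only the first principal component, and substitutes a simple two-cluster k-means on the resulting scores as one possible form of cluster analysis. All function and variable names are hypothetical and chosen for illustration.

```python
import math

def center(rows):
    """Subtract each column mean so PCA describes variation, not location."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]

def covariance(centered):
    """Sample covariance matrix (p x p) of already-centered data."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]

def leading_component(cov, iters=200):
    """Power iteration: approximates the eigenvector with the largest eigenvalue."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def two_means_1d(scores, iters=20):
    """k-means with k=2 on one-dimensional scores, seeded at the min and max."""
    centers = [min(scores), max(scores)]
    labels = [0] * len(scores)
    for _ in range(iters):
        labels = [0 if abs(s - centers[0]) <= abs(s - centers[1]) else 1
                  for s in scores]
        for k in (0, 1):
            members = [s for s, lab in zip(scores, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels

# Hypothetical data: 6 samples x 3 variables, built to contain two groups.
samples = [
    [1.0, 2.1, 0.9],
    [1.2, 1.9, 1.1],
    [0.9, 2.0, 1.0],
    [5.1, 6.0, 4.9],
    [4.9, 6.2, 5.2],
    [5.0, 5.9, 5.0],
]

centered = center(samples)
pc1 = leading_component(covariance(centered))
# Score of each sample = its projection onto the first principal component.
scores = [sum(r[j] * pc1[j] for j in range(len(pc1))) for r in centered]
labels = two_means_1d(scores)
print(labels)
```

In this sketch PCA reduces three correlated variables to a single score per sample, and the clustering step then groups samples on that score. For real data sets with many variables, an established library implementation would be the more practical choice.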