(101a) Dataset Considerations for Rapid Product Development Applications
AIChE Spring Meeting and Global Congress on Process Safety
2022
2022 Spring Meeting and 18th Global Congress on Process Safety Proceedings
Industry 4.0 Topical Conference
Data-Driven and Hybrid Approaches to Development of New Products I
Tuesday, April 12, 2022 - 1:30pm to 2:00pm
This presentation will discuss how to identify and overcome common pitfalls in formulation datasets, and will draw examples from various industries including polymers, specialty chemicals, and foods. ProSensus’ FormuSense software will be used to illustrate the typical steps required to optimally preprocess a raw formulation dataset for latent variable modeling and numerical optimization. Topics covered will include:
- structuring the raw data (such as identifying ingredient classes)
- detecting and resolving data anomalies (such as misspellings and missing ingredients)
- handling categorical variables (such as subject-matter expert knowledge)
- calculating ingredient class ratios and mixture properties to evaluate the impact of new ingredients
- meaningful statistics and visualizations to evaluate data suitability for modeling
- modeling approaches in the presence of missing data (such as raw material properties)
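
As a rough illustration of three of the topics above (assigning ingredients to classes, resolving misspelled ingredient names, and calculating ingredient class ratios), the following is a minimal Python sketch. It is not FormuSense's method or API. The ingredient names, the class lookup, the amounts, the 0.8 similarity cutoff, and the helper functions `resolve_name` and `class_fractions` are all hypothetical and chosen only for the example.

```python
from difflib import get_close_matches

# Hypothetical ingredient-to-class lookup (illustrative names only).
INGREDIENT_CLASS = {
    "polyol_a": "resin",
    "polyol_b": "resin",
    "isocyanate": "crosslinker",
    "silica": "filler",
}


def resolve_name(name, known, cutoff=0.8):
    """Return the known ingredient name closest to `name`, or None."""
    key = name.strip().lower()
    if key in known:
        return key
    # Fuzzy match to catch likely misspellings; cutoff is an assumed value.
    matches = get_close_matches(key, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def class_fractions(formulation):
    """Sum ingredient amounts by class and normalise them to fractions.

    Returns the class fractions and the raw names that could not be
    matched to any known ingredient.
    """
    known = list(INGREDIENT_CLASS)
    totals = {}
    unresolved = []
    for raw_name, amount in formulation.items():
        name = resolve_name(raw_name, known)
        if name is None:
            unresolved.append(raw_name)
            continue
        cls = INGREDIENT_CLASS[name]
        totals[cls] = totals.get(cls, 0.0) + amount
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}, unresolved
    fractions = {cls: amt / grand_total for cls, amt in totals.items()}
    return fractions, unresolved


# One made-up formulation row: inconsistent casing, one misspelling
# ("isocyanat"), and one ingredient missing from the lookup.
raw = {
    "Polyol_A": 40.0,
    "polyol_b": 20.0,
    "isocyanat": 30.0,
    "silica": 10.0,
    "mystery_x": 5.0,
}

fractions, unresolved = class_fractions(raw)
resin_to_crosslinker = fractions["resin"] / fractions["crosslinker"]
```

In this sketch the unmatched ingredient is reported rather than silently dropped, so a subject-matter expert can decide how to classify it. Fuzzy matches deserve the same review, since two legitimately different ingredients can have very similar names.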