(575b) Knowledge-Informed Data-Driven Modeling for Robust Prediction of Microbial Inactivation in Food
AIChE Annual Meeting
2022
Topical Conference: Next-Gen Manufacturing
Next-Gen Manufacturing in Pharma, Food, and Bioprocessing II
Wednesday, November 16, 2022 - 3:51pm to 4:12pm
Preventing the growth of harmful pathogens and spoilage organisms in food products is essential for ensuring food safety and quality. Predictive mathematical models are a useful tool for optimizing microbial inactivation processes in food. A common goal is to identify the functional dependency of the D-value (i.e., the time for a microbial population to shrink to 10% of its initial level) on a variety of potential factors, including temperature, pH, and water activity. While polynomial functions are typically used for this purpose, the empirical choice of terms often results in underfitted or overfitted models that lack accuracy and robustness in prediction. Underfitting occurs when the model includes fewer terms than required, and overfitting when it includes more terms than needed; avoiding both requires a systematic approach that determines the optimal model structure by accounting for the trade-off between model accuracy and complexity. In this work, we apply an advanced data-driven approach, termed Sparse Identification of Nonlinear Dynamics (SINDy), to address this issue. SINDy enables automatic discovery of model equations through a parsimonious selection of appropriate terms from a pre-built library. Through case studies using various literature data, we demonstrate how to systematically develop microbial inactivation models using the data-driven modeling approach and how to improve model accuracy by choosing appropriate functional forms of the basic input and output variables by leveraging expert knowledge (here, based on the analysis of associated mechanistic models). Beyond an improved quantitative fit to the data, we generalize the workflow to enable systematic identification of models without assuming an a priori model structure, together with robust stepwise model tuning through a balanced compromise between accuracy and complexity as gauged by the Akaike information criterion.
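As an illustration of the term-selection idea described above, the following is a minimal sketch, not the authors' implementation. It assumes a polynomial candidate library in temperature, pH, and water activity (each scaled to [-1, 1]), a log10-transformed D-value as the output, the sequentially thresholded least-squares step commonly associated with SINDy, and the least-squares form of the Akaike information criterion, AIC = n ln(RSS/n) + 2k. The data are synthetic, and all function names, coefficients, and threshold values are hypothetical.

```python
import itertools
import numpy as np

def build_library(X, names, degree=2):
    """Candidate library: constant plus all monomials up to `degree`."""
    cols, labels = [np.ones(len(X))], ["1"]
    for d in range(1, degree + 1):
        for combo in itertools.combinations_with_replacement(range(X.shape[1]), d):
            cols.append(np.prod(X[:, list(combo)], axis=1))
            labels.append("*".join(names[i] for i in combo))
    return np.column_stack(cols), labels

def stlsq(Theta, y, threshold, n_iter=10):
    """Sequentially thresholded least squares: zero small terms, refit the rest."""
    coef = np.linalg.lstsq(Theta, y, rcond=None)[0]
    for _ in range(n_iter):
        active = np.abs(coef) >= threshold
        coef[~active] = 0.0
        if not active.any():
            break
        coef[active] = np.linalg.lstsq(Theta[:, active], y, rcond=None)[0]
    return coef

def aic(Theta, y, coef):
    """Least-squares AIC: n*ln(RSS/n) + 2k, with k = number of retained terms."""
    n = len(y)
    rss = float(np.sum((y - Theta @ coef) ** 2))
    k = int(np.count_nonzero(coef))
    return n * np.log(rss / n) + 2 * k

# Synthetic data (hypothetical): inputs scaled to [-1, 1]; aw has no true effect.
rng = np.random.default_rng(0)
names = ["T", "pH", "aw"]
X = rng.uniform(-1.0, 1.0, size=(200, 3))
T, pH = X[:, 0], X[:, 1]
log10_D = 2.0 - 1.5 * T + 0.8 * pH - 0.5 * pH**2 + rng.normal(0.0, 0.02, 200)

# Fit one sparse model per threshold and keep the one with the lowest AIC.
Theta, labels = build_library(X, names)
candidates = [stlsq(Theta, log10_D, thr) for thr in (0.05, 0.1, 0.3, 1.0, 2.5)]
best_coef = min(candidates, key=lambda c: aic(Theta, log10_D, c))
selected = {lab: round(float(c), 3) for lab, c in zip(labels, best_coef) if c != 0.0}
print(selected)
```

Because thresholding acts on raw coefficient magnitudes, the sketch relies on the inputs being scaled to comparable ranges; with unscaled temperature and pH the same threshold would not treat the terms evenly.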
We also integrate global sensitivity analysis with the optimized model to evaluate the impacts of individual factors and their interactions on microbial inactivation. Beyond significantly facilitating model building for process optimization, the proposed workflow enables systematic identification of the individual and combined impacts of key variables and terms on the target variable (i.e., the D-value).
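The abstract does not state which global sensitivity method is used. As one possible illustration, the sketch below estimates variance-based (Sobol) first-order and total-order indices with a pick-freeze Monte Carlo scheme. The polynomial `model`, its coefficients, and the uniform [-1, 1] input ranges are hypothetical stand-ins for an identified model, not results from this work.

```python
import numpy as np

def model(X):
    """Hypothetical identified model for log10(D); aw (column 2) has no effect."""
    T, pH = X[:, 0], X[:, 1]
    return 2.0 - 1.5 * T + 0.8 * pH - 0.5 * pH**2 + 0.6 * T * pH

def sobol_indices(f, n_inputs, n_samples, rng):
    """First-order and total-order Sobol indices for independent U(-1, 1) inputs."""
    A = rng.uniform(-1.0, 1.0, size=(n_samples, n_inputs))
    B = rng.uniform(-1.0, 1.0, size=(n_samples, n_inputs))
    fA, fB = f(A), f(B)
    var = np.var(np.concatenate([fA, fB]))
    first, total = np.empty(n_inputs), np.empty(n_inputs)
    for i in range(n_inputs):
        AB = A.copy()
        AB[:, i] = B[:, i]  # matrix A with only column i taken from B
        fAB = f(AB)
        first[i] = np.mean(fB * (fAB - fA)) / var        # Saltelli-type estimator
        total[i] = 0.5 * np.mean((fA - fAB) ** 2) / var  # Jansen estimator
    return first, total

rng = np.random.default_rng(0)
S1, ST = sobol_indices(model, n_inputs=3, n_samples=50_000, rng=rng)
for name, s1, st in zip(["T", "pH", "aw"], S1, ST):
    print(f"{name}: first-order={s1:.3f}, total-order={st:.3f}")
```

In this kind of analysis, a total-order index noticeably larger than the first-order index for the same factor suggests that the factor acts partly through interactions, here the hypothetical T*pH term.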