(185g) Optimization Methods for Exploring Accuracy Versus Robustness of a Regression Prediction in Process Analytical Technology
AIChE Annual Meeting
2022
Computing and Systems Technology Division
Data Science/Analytics for Process Applications
Monday, November 14, 2022 - 5:24pm to 5:43pm
The development of the regression model is data-dependent, and thus the trade-off between the two objectives, accuracy and robustness, is not known a priori. This trade-off can be posed as an optimization problem. We explore both multi-objective optimization and hierarchical programming as ways of representing it. Multi-objective formulations illustrate the trade-off by obtaining Pareto-optimal solutions. Yet, when the objectives conflict, Pareto-optimal solutions are not necessarily bilevel optimal.
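The distinction between Pareto optimality and hierarchical optimality can be illustrated with a small sketch. The candidate models and their metric values below are hypothetical numbers chosen for illustration only, not results from this work, and the hierarchical step is a simplified lexicographic stand-in for a bilevel program (lower level: minimize the accuracy error; upper level: pick the most robust of those minimizers).

```python
# Hypothetical candidates: (label, accuracy_error, robustness_penalty).
# Lower is better for both objectives. Values are illustrative only.
candidates = [
    ("A", 0.30, 0.10),
    ("B", 0.20, 0.20),
    ("C", 0.10, 0.40),
    ("D", 0.10, 0.60),
    ("E", 0.25, 0.30),
]


def dominates(p, q):
    """True if p is no worse than q in both objectives and strictly better in one."""
    return p[1] <= q[1] and p[2] <= q[2] and (p[1] < q[1] or p[2] < q[2])


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def hierarchical_choice(points):
    """Simplified hierarchy (an assumption of this sketch):
    lower level minimizes accuracy error, upper level then minimizes
    the robustness penalty over the lower-level minimizers."""
    best_error = min(p[1] for p in points)
    lower_optimal = [p for p in points if p[1] == best_error]
    return min(lower_optimal, key=lambda p: p[2])


front = pareto_front(candidates)
choice = hierarchical_choice(candidates)

print("Pareto front:", [p[0] for p in front])
print("Hierarchical choice:", choice[0])
```

In this toy setting several candidates are Pareto-optimal, but only one of them also satisfies the hierarchical condition, which is the sense in which a Pareto-optimal solution need not be bilevel optimal.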
We use examples from multivariate calibration to compare the performance of the different methods. The optimization problems are solved in a Bayesian optimization framework, i.e., an asymptotically complete method [6], using ENTMOOT [7]. We observe that the robustness metric generally leads to simpler models, i.e., models with fewer latent variables, but gives weaker predictions on the training data. The accuracy metric predicts better on the known data, but may lead to overparametrized models whose performance can deteriorate on new samples. This accuracy/robustness trade-off is in accordance with what previous literature [2,8] describes as a critical topic of investigation. The combination of the two metrics shows high potential for generating regression models that have high predictive ability without being overparametrized, and that are therefore less prone to degradation over time.
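As a rough illustration of how the choice of metric can shift the selected model complexity, the sketch below scalarizes the two objectives with a weighted sum and picks the number of latent variables that minimizes the combined score. The weighted sum is an assumption of this sketch (the abstract does not state how the metrics are combined), and the metric values are hypothetical, chosen only so that training error falls and the robustness penalty rises with model complexity.

```python
# Hypothetical metrics per number of latent variables (illustrative only).
n_latent = [1, 2, 3, 4, 5, 6]
training_error = [0.50, 0.30, 0.20, 0.15, 0.12, 0.10]      # falls with complexity
robustness_penalty = [0.05, 0.10, 0.18, 0.30, 0.45, 0.65]  # rises with complexity


def select_latent_variables(weight):
    """Return the number of latent variables minimizing
    weight * training_error + (1 - weight) * robustness_penalty.

    weight = 1.0 uses accuracy only; weight = 0.0 uses robustness only.
    """
    scores = [
        weight * err + (1.0 - weight) * pen
        for err, pen in zip(training_error, robustness_penalty)
    ]
    best_index = min(range(len(scores)), key=scores.__getitem__)
    return n_latent[best_index]


for w in (0.0, 0.5, 1.0):
    print(f"weight={w:.1f} -> {select_latent_variables(w)} latent variables")
```

With these toy numbers, an accuracy-only weight selects the most complex model, a robustness-only weight selects the simplest, and an intermediate weight selects a model in between.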
References:
[1] Igne, B., Shi, Z., Drennen III, J. K., & Anderson, C. A. (2014). Effects and detection of raw material variability on the performance of near-infrared calibration models for pharmaceutical products. Journal of Pharmaceutical Sciences, 103(2), 545-556.
[2] Li, Y., Anderson, C. A., Drennen III, J. K., Airiau, C., & Igne, B. (2019). Development of an in-line near-infrared method for blend content uniformity assessment in a tablet feed frame. Applied Spectroscopy, 73(9), 1028-1040.
[3] De Leersnyder, F., Peeters, E., Djalabi, H., Vanhoorne, V., Van Snick, B., Hong, K., Hammond, S., Liu, A. Y., Ziemons, E., Vervaet, C., & De Beer, T. (2018). Development and validation of an in-line NIR spectroscopic method for continuous blend potency determination in the feed frame of a tablet press. Journal of Pharmaceutical and Biomedical Analysis, 151, 274-283.
[4] Rato, T. J., & Reis, M. S. (2019). SS-DAC: a systematic framework for selecting the best modeling approach and pre-processing for spectroscopic data. Computers & Chemical Engineering, 128, 437-449.
[5] Alam, M. A., Liu, Y. A., Dolph, S., Pawliczek, M., Peeters, E., & Palm, A. (2021). Benchtop NIR method development for continuous manufacturing scale to enable efficient PAT application for solid oral dosage form. International Journal of Pharmaceutics, 601, 120581.
[6] Neumaier, A. (2004). Complete search in continuous global optimization and constraint satisfaction. Acta Numerica, 13, 271-369.
[7] Thebelt, A., Kronqvist, J., Mistry, M., Lee, R. M., Sudermann-Merx, N., & Misener, R. (2021). ENTMOOT: a framework for optimization over ensemble tree models. Computers & Chemical Engineering, 151, 107343.
[8] Wise, B. M., & Roginski, R. T. (2015). A calibration model maintenance roadmap. IFAC-PapersOnLine, 48(8), 260-265.