(185g) Optimization Methods for Exploring Accuracy Versus Robustness of a Regression Prediction in Process Analytical Technology | AIChE

In process analytical technology applications, the accuracy of a regression prediction conflicts with the robustness of that prediction to sources of variability, e.g., the size of the plant [1,2]. Yet both properties are crucial for developing useful regression models [3-5]. We explore the trade-off between accuracy, measured as the root mean squared error of the prediction, and robustness, measured by matching the first and second sample moments of the predictions across all realizations of a known variability source. Our mathematical definition of robustness is based on the method of moments from statistics and promotes insensitivity of the prediction to variations of a known source.
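The two metrics can be made concrete with a minimal sketch. The abstract does not give the exact formulas, so the moment-matching penalty below is an assumption: it sums the squared deviations of the per-group sample means and sample variances of the predictions from their averages across groups. The names `rmse`, `moment_mismatch`, and `groups` are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Accuracy metric: root mean squared error of the prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def moment_mismatch(y_pred, groups):
    """Assumed robustness penalty (not the authors' stated formula).

    `groups` labels the realization of the known variability source for
    each sample. The penalty is zero when every group shares the same
    first and second sample moments of the predictions.
    Each group needs at least two samples for the sample variance.
    """
    y_pred = np.asarray(y_pred, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    means = np.array([y_pred[groups == g].mean() for g in labels])
    variances = np.array([y_pred[groups == g].var(ddof=1) for g in labels])
    mean_term = np.sum((means - means.mean()) ** 2)
    var_term = np.sum((variances - variances.mean()) ** 2)
    return float(mean_term + var_term)
```

A model that scores low on `moment_mismatch` produces predictions whose mean and spread do not shift between realizations of the variability source, which is one way to read the insensitivity described above.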

The development of the regression model is data-dependent, so the trade-off between the two objectives is not known a priori. The trade-off between accuracy and robustness can be posed as an optimization problem. We explore both multi-objective optimization and hierarchical programming as ways of representing it. Multi-objective formulations illustrate the trade-off through Pareto-optimal solutions. Yet, when the objectives conflict, Pareto-optimal solutions are not necessarily bilevel optimal.
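The difference between the two formulations can be illustrated on a small finite grid. The sketch below is not the authors' formulation: it assumes a hypothetical leader decision (rows) and follower decision (columns), with one cost table per objective, and the numbers are toy values chosen only for illustration. Which objective plays the leader role is likewise an assumption.

```python
import numpy as np

def pareto_mask(costs):
    """Mark the non-dominated rows of an (n_points, n_objectives) cost array."""
    costs = np.asarray(costs, dtype=float)
    mask = np.ones(costs.shape[0], dtype=bool)
    for i in range(costs.shape[0]):
        # A point dominates i if it is no worse in every objective
        # and strictly better in at least one.
        no_worse = np.all(costs <= costs[i], axis=1)
        better = np.any(costs < costs[i], axis=1)
        if np.any(no_worse & better):
            mask[i] = False
    return mask

def bilevel_choice(leader_cost, follower_cost):
    """Solve a tiny bilevel problem by enumeration.

    For each leader decision (row) the follower picks the column that
    minimizes its own cost; the leader then picks the row with the
    lowest leader cost under that response. Ties go to the first index.
    """
    leader_cost = np.asarray(leader_cost, dtype=float)
    follower_cost = np.asarray(follower_cost, dtype=float)
    best_response = np.argmin(follower_cost, axis=1)
    rows = np.arange(leader_cost.shape[0])
    induced = leader_cost[rows, best_response]
    x = int(np.argmin(induced))
    return x, int(best_response[x])

# Toy cost tables (illustrative values only, not results from the study).
leader_cost = np.array([[1.0, 3.0], [2.0, 0.5]])
follower_cost = np.array([[2.0, 1.0], [1.0, 3.0]])

points = np.column_stack([leader_cost.ravel(), follower_cost.ravel()])
mask = pareto_mask(points).reshape(leader_cost.shape)
choice = bilevel_choice(leader_cost, follower_cost)
```

In this toy grid three of the four decision pairs are Pareto-optimal, but only one of them, the pair returned by `bilevel_choice`, is bilevel optimal. The other Pareto-optimal pairs require the follower to accept a decision that is not its best response.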

We use examples from multivariate calibration to compare the performance of the different methods. The optimization problems are solved in a Bayesian optimization framework, i.e., an asymptotically complete method [6], using ENTMOOT [7]. We observe that the robustness metric generally leads to simpler models, i.e., models with fewer latent variables, but gives weaker predictions on the training data. The accuracy metric predicts better on the known data but may lead to overparametrized models whose performance can deteriorate on new samples. This accuracy/robustness trade-off is consistent with what previous literature [2,8] describes as a critical topic of investigation. The combination of the two metrics shows high potential for generating regression models that have high predictive ability, are not overparametrized, and are therefore less prone to degradation over time.

References:

[1] Igne, B., Shi, Z., Drennen III, J. K., & Anderson, C. A. (2014). Effects and detection of raw material variability on the performance of near-infrared calibration models for pharmaceutical products. Journal of Pharmaceutical Sciences, 103(2), 545-556.

[2] Li, Y., Anderson, C. A., Drennen III, J. K., Airiau, C., & Igne, B. (2019). Development of an in-line near-infrared method for blend content uniformity assessment in a tablet feed frame. Applied Spectroscopy, 73(9), 1028-1040.

[3] De Leersnyder, F., Peeters, E., Djalabi, H., Vanhoorne, V., Van Snick, B., Hong, K., Hammond, S., Liu, A. Y., Ziemons, E., Vervaet, C., & De Beer, T. (2018). Development and validation of an in-line NIR spectroscopic method for continuous blend potency determination in the feed frame of a tablet press. Journal of Pharmaceutical and Biomedical Analysis, 151, 274-283.

[4] Rato, T. J., & Reis, M. S. (2019). SS-DAC: a systematic framework for selecting the best modeling approach and pre-processing for spectroscopic data. Computers & Chemical Engineering, 128, 437-449.

[5] Alam, M. A., Liu, Y. A., Dolph, S., Pawliczek, M., Peeters, E., & Palm, A. (2021). Benchtop NIR method development for continuous manufacturing scale to enable efficient PAT application for solid oral dosage form. International Journal of Pharmaceutics, 601, 120581.

[6] Neumaier, A. (2004). Complete search in continuous global optimization and constraint satisfaction. Acta Numerica, 13, 271-369.

[7] Thebelt, A., Kronqvist, J., Mistry, M., Lee, R. M., Sudermann-Merx, N., & Misener, R. (2021). ENTMOOT: a framework for optimization over ensemble tree models. Computers & Chemical Engineering, 151, 107343.

[8] Wise, B. M., & Roginski, R. T. (2015). A calibration model maintenance roadmap. IFAC-PapersOnLine, 48(8), 260-265.