(189e) Variable Selection in Multivariate Modeling of Drug Product Formula and Process | AIChE

Authors 

Cui, Y. - Presenter, Genentech, Inc.
Song, X., Genentech, Inc.
Chuang, K., Genentech, Inc.
Lee, S., Genentech, Inc.
Venkatramani, C., Genentech, Inc.
Gallegos, G., Genentech, Inc.
Venkateshwaran, T., Genentech, Inc.
Xie, M., Genentech, Inc.


Multivariate data analysis methods such as partial least squares (PLS) modeling have been increasingly applied to product development, particularly in the quality-by-design paradigm. This study applied PLS modeling to a product development dataset that combined a design-of-experiments study with historical batch data. Particular attention was paid to assessing the importance of predictor variables and, subsequently, to variable selection in the PLS model. The results indicate that irrelevant and collinear predictors can be extensively present in the initial PLS model, so variable selection is an important step in model optimization. The variable importance in projection (VIP) and coefficient values can be used to rank the importance of predictors and to help remove irrelevant ones. Reducing collinear predictors, on the other hand, may require multiple rounds of PLS modeling on different combinations of predictors. To this end, stepwise reduction of predictors based on their VIP/coefficient ranking was introduced and appeared to be an effective approach for identifying and removing redundant collinear predictors.
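
As a rough illustration of VIP-based ranking and stepwise predictor reduction, the following is a minimal sketch and not the authors' implementation. The abstract does not state the algorithm or software used, so several things here are assumptions: a single-response PLS (PLS1) fitted by NIPALS on autoscaled data, the hypothetical helper names pls1_fit and stepwise_reduce, a synthetic dataset with one deliberately collinear column, and training R² as a stand-in for whatever cross-validated metric a real study would track.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS on autoscaled data; return (coef, vip, r2)."""
    Xr = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    yr = (y - y.mean()) / y.std(ddof=1)
    ss_total = yr @ yr
    W, P, q, ss = [], [], [], []
    for _ in range(min(n_components, X.shape[1])):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)        # unit-length weight vector
        t = Xr @ w                    # scores for this component
        tt = t @ t
        p = Xr.T @ t / tt             # X loadings
        qa = (yr @ t) / tt            # y loading
        Xr = Xr - np.outer(t, p)      # deflate X
        yr = yr - qa * t              # deflate y
        W.append(w); P.append(p); q.append(qa)
        ss.append(qa**2 * tt)         # y variance explained by this component
    W, P = np.array(W).T, np.array(P).T
    q, ss = np.array(q), np.array(ss)
    # Regression coefficients in the autoscaled variable space.
    coef = W @ np.linalg.solve(P.T @ W, q)
    # VIP: squared weights averaged over components, weighted by explained y variance.
    vip = np.sqrt(X.shape[1] * (W**2 @ ss) / ss.sum())
    r2 = 1.0 - (yr @ yr) / ss_total   # training R^2 (assumed stand-in metric)
    return coef, vip, r2

def stepwise_reduce(X, y, n_components=2, min_predictors=2):
    """Refit repeatedly, dropping the lowest-VIP predictor each round."""
    keep = list(range(X.shape[1]))
    history = []
    while True:
        coef, vip, r2 = pls1_fit(X[:, keep], y, n_components)
        history.append((list(keep), r2))
        if len(keep) <= min_predictors:
            return history
        keep.pop(int(np.argmin(vip)))

# Synthetic data (assumption): column 2 is nearly a copy of column 0,
# columns 3 and 4 are unrelated to the response.
rng = np.random.default_rng(0)
n = 40
X = rng.normal(size=(n, 5))
X[:, 2] = X[:, 0] + 0.05 * rng.normal(size=n)
y = 2.0 * X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=n)

coef, vip, r2 = pls1_fit(X, y, n_components=2)
history = stepwise_reduce(X, y)
for cols, r2_step in history:
    print(cols, round(r2_step, 3))
```

Because each weight vector has unit length, the squared VIP values sum to the number of predictors, which is why a VIP near or above 1 is commonly read as "more important than average". Ranking by VIP alone cannot distinguish two collinear predictors from each other, so watching how the fit metric changes as predictors are removed one at a time is one way to spot the redundant ones.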