(193d) Machine Learning through a Parametric Programming Lens
AIChE Annual Meeting
2019
Computing and Systems Technology Division
Advances in Machine Learning Methods for Process Systems Engineering
Monday, November 11, 2019 - 4:27pm to 4:46pm
Including a regularization term in the loss function is the most widely used approach for fitting a machine learning model that balances the bias-variance tradeoff. The regularization term is weighted by an exogenous hyperparameter that is set prior to training the model. This hyperparameter controls the influence of the regularization term and thus the solution of the underlying optimization problem. Some algorithms have additional hyperparameters that must be prespecified.
Finding the optimal model therefore amounts to tuning these hyperparameters correctly. Typical tuning strategies discretize the hyperparameter space and apply an iterative grid or random search to approximate the optimal hyperparameter, and thereby the optimal model [3]. This entails solving a potentially large, a priori unknown number of optimization problems during the validation or cross-validation step.
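As a minimal sketch of such search-based tuning (the toy data and the closed-form univariate LASSO solution below are illustrative assumptions, not taken from this work), one can discretize the regularization weight and select the grid value with the lowest validation error. For a single-feature problem, the LASSO coefficient has the closed form beta(lam) = sign(b_ols) * max(|b_ols| - lam, 0), so each grid point costs one evaluation rather than a full re-fit:

```python
def soft_threshold(b, lam):
    """Soft-thresholding operator: the closed-form 1-D LASSO solution."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0

def validation_mse(beta, xs, ys):
    """Mean squared error of the linear model y = beta * x on held-out data."""
    return sum((y - beta * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grid_search(b_ols, grid, xs_val, ys_val):
    """Return the grid value of the hyperparameter with the lowest validation error."""
    return min(grid,
               key=lambda lam: validation_mse(soft_threshold(b_ols, lam),
                                              xs_val, ys_val))

# Hypothetical toy data: a training-set OLS coefficient and a small validation set.
b_ols = 1.8
xs_val = [0.5, 1.0, 1.5, 2.0]
ys_val = [0.8, 1.6, 2.3, 3.1]
grid = [0.0, 0.1, 0.2, 0.4, 0.8]   # discretized hyperparameter values

best_lam = grid_search(b_ols, grid, xs_val, ys_val)
```

Note that the search can only return one of the prespecified grid points; refining the answer requires a finer grid and correspondingly more solves.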
In this work, by contrast, hyperparameter tuning is viewed as a parametric programming problem, in which each optimization variable of the machine learning algorithm is derived as a piecewise affine function of the hyperparameter. Optimizing the hyperparameter then reduces to finding the global minimum of these piecewise affine expressions. Unlike search-based tuning, the optimal hyperparameter obtained from parametric programming is exact rather than approximate.
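A hedged sketch of this idea on an assumed univariate LASSO toy problem (not an example from this work): the coefficient beta(lam) = b_ols - lam is affine on [0, b_ols] for b_ols > 0 and zero afterwards, so the validation error is piecewise quadratic in the hyperparameter and its global minimum on each piece can be located in closed form, with no grid:

```python
def exact_lambda(b_ols, xs_val, ys_val):
    """Exact optimal regularization weight for a univariate LASSO, b_ols > 0.

    On the affine piece beta(lam) = b_ols - lam, the validation MSE is a
    quadratic in lam whose unconstrained minimizer is b_ols - beta_star,
    where beta_star is the best-fitting validation coefficient; the result
    is clamped to the interval [0, b_ols] on which the piece is valid.
    """
    sxx = sum(x * x for x in xs_val)
    sxy = sum(x * y for x, y in zip(xs_val, ys_val))
    beta_star = sxy / sxx            # unconstrained minimizer of validation MSE
    return min(max(b_ols - beta_star, 0.0), b_ols)

# Hypothetical toy data: training-set OLS coefficient and a validation set.
b_ols = 1.8
xs_val = [0.5, 1.0, 1.5, 2.0]
ys_val = [0.8, 1.6, 2.3, 3.1]

lam_exact = exact_lambda(b_ols, xs_val, ys_val)
```

Here the exact optimum generally falls strictly between the points of any coarse grid, illustrating why enumerating the pieces of a piecewise affine solution dominates discretized search.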
We discuss a parametric programming approach to LASSO regression [4] and linear support vector machines [5], two popular algorithms for regression and classification, respectively. We demonstrate the effectiveness of this strategy through a set of computational studies.
[1] Abu-Mostafa, Yaser S., Malik Magdon-Ismail, and Hsuan-Tien Lin. Learning From Data. Vol. 4. New York, NY: AMLBook, 2012.
[2] James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. An Introduction to Statistical Learning. Vol. 112. New York, NY: Springer, 2013.
[3] Claesen, Marc, and Bart De Moor. "Hyperparameter Search in Machine Learning." arXiv preprint arXiv:1502.02127 (2015).
[4] Tibshirani, Robert. "Regression Shrinkage and Selection via the Lasso." Journal of the Royal Statistical Society: Series B (Methodological) 58, no. 1 (1996): 267-288.
[5] Cortes, Corinna, and Vladimir Vapnik. "Support-Vector Networks." Machine Learning 20, no. 3 (1995): 273-297.