(373r) Data-Driven Bi-Level Optimization of Hyperparameters for Machine Learning Models
AIChE Annual Meeting
2024
Computing and Systems Technology Division
10C: Interactive Session: Systems and Process Operations
Tuesday, October 29, 2024 - 3:30pm to 5:00pm
Motivated by this, we formulate the cross-validated hyperparameter optimization problem for nonlinear ML models as a bi-level program and solve it using data-driven optimization. The upper level selects the hyperparameters to minimize the mean squared validation error, whereas the lower level estimates the model parameters by minimizing the training error for each fold. We address this bi-level multi-follower optimization problem using the DOMINO framework and approximate it as a single-level optimization problem using data-driven techniques [8]. We test the performance of various local and global derivative-free optimizers within DOMINO and evaluate their hyperparameter tuning performance on four different chemical processes [9, 10, 11, 12] for regression and classification tasks. Our results show that local optimizers such as NOMAD can accurately tune linear ML models with one hyperparameter at a computational cost significantly lower than that of global optimizers. Conversely, global optimization methods such as DIRECT, Particle Swarm Optimization (PSO), and Evolutionary Algorithms (EA) perform better for models with more hyperparameters and nonlinear characteristics, such as support vector machines with a radial basis function kernel for classification. Furthermore, our results show that EA and PSO are the most computationally expensive of the methods used in this work. However, they still outperform conventional grid search and Bayesian methods in both tuning precision and computational cost.
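The two-level structure described above can be illustrated with a minimal sketch. This is not the DOMINO framework and not any of the optimizers named in the abstract; it is a toy example under stated assumptions: ridge regression stands in for the ML model (one hyperparameter, the penalty lam), the lower level is solved in closed form for each fold, and a plain golden-section search stands in for a derivative-free local optimizer. All function names, the synthetic data, and the search bounds are hypothetical.

```python
import numpy as np

def make_folds(n_samples, k, rng):
    """Shuffle sample indices and split them into k roughly equal folds."""
    return np.array_split(rng.permutation(n_samples), k)

def fit_ridge(X, y, lam):
    """Lower level: closed-form minimizer of ||Xw - y||^2 + lam * ||w||^2."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

def cv_validation_mse(log10_lam, X, y, folds):
    """Upper-level objective: mean squared validation error over the folds."""
    lam = 10.0 ** log10_lam
    errors = []
    for i, val_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # One lower-level (follower) problem per fold.
        w = fit_ridge(X[train_idx], y[train_idx], lam)
        errors.append(np.mean((X[val_idx] @ w - y[val_idx]) ** 2))
    return float(np.mean(errors))

def golden_section(f, lo, hi, tol=1e-3):
    """Derivative-free 1-D local search that treats f as a black box."""
    inv_phi = (np.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            # Minimum is bracketed in [lo, d]; reuse c as the new right probe.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:
            # Minimum is bracketed in [c, hi]; reuse d as the new left probe.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return (c, fc) if fc < fd else (d, fd)

# Synthetic regression data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))
w_true = np.array([1.0, -2.0, 0.0, 0.5, 0.0])
y = X @ w_true + 0.3 * rng.normal(size=60)
folds = make_folds(len(y), 5, rng)

def objective(log10_lam):
    """Black-box view of the bi-level problem seen by the upper level."""
    return cv_validation_mse(log10_lam, X, y, folds)

# Search over log10(lam) in the assumed range [-4, 3].
best_log10_lam, best_mse = golden_section(objective, -4.0, 3.0)
print(f"log10(lam) = {best_log10_lam:.3f}, validation MSE = {best_mse:.4f}")
```

The upper-level search only ever sees input-output pairs of `objective`, which is the sense in which the problem is treated as a black box here. For models with several hyperparameters or without a closed-form lower level, `fit_ridge` would be replaced by a numerical training routine and the 1-D search by a multivariate derivative-free method.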
References:
[1] Luo, Gang. "A review of automatic selection methods for machine learning algorithms and hyper-parameter values." Network Modeling Analysis in Health Informatics and Bioinformatics 5 (2016): 1-16.
[2] Yu, Tong, and Hong Zhu. "Hyper-parameter optimization: A review of algorithms and applications." arXiv preprint arXiv:2003.05689 (2020).
[3] Anguita, Davide, Luca Ghelardoni, Alessandro Ghio, Luca Oneto, and Sandro Ridella. "The 'K' in K-fold Cross Validation." In ESANN, vol. 102, pp. 441-446. 2012.
[4] Wu, Jia, Xiu-Yun Chen, Hao Zhang, Li-Dong Xiong, Hang Lei, and Si-Hao Deng. "Hyperparameter optimization for machine learning models based on Bayesian optimization." Journal of Electronic Science and Technology 17, no. 1 (2019): 26-40.
[5] Tso, William W., Baris Burnak, and Efstratios N. Pistikopoulos. "HY-POP: Hyperparameter optimization of machine learning models through parametric programming." Computers & Chemical Engineering 139 (2020): 106902.
[6] Sinha, Ankur, Pekka Malo, Peng Xu, and Kalyanmoy Deb. "A bilevel optimization approach to automated parameter tuning." In Proceedings of the 2014 Annual Conference on Genetic and Evolutionary Computation, pp. 847-854. 2014.
[7] Alibrahim, Hussain, and Simone A. Ludwig. "Hyperparameter optimization: Comparing genetic algorithm against grid search and bayesian optimization." In 2021 IEEE Congress on Evolutionary Computation (CEC), pp. 1551-1559. IEEE, 2021.
[8] Beykal, Burcu, Styliani Avraamidou, Ioannis PE Pistikopoulos, Melis Onel, and Efstratios N. Pistikopoulos. "DOMINO: Data-driven optimization of bi-level mixed-integer nonlinear problems." Journal of Global Optimization 78 (2020): 1-36.
[9] Ghalavand, Younes, Hasan Nikkhah, and Ali Nikkhah. "Heat pump assisted divided wall column for ethanol azeotropic purification." Journal of the Taiwan Institute of Chemical Engineers 123 (2021): 206-218.
[10] Nikkhah, Hasan, and Burcu Beykal. "Process design and technoeconomic analysis for zero liquid discharge desalination via LiBr absorption chiller integrated HDH-MEE-MVR system." Desalination 558 (2023): 116643.
[11] Beykal, Burcu, Melis Onel, Onur Onel, and Efstratios N. Pistikopoulos. "A data-driven optimization algorithm for differential algebraic equations with numerical infeasibilities." AIChE Journal 66, no. 10 (2020): e16657.
[12] Aghayev, Z., A. T. Szafran, A. Tran, H. S. Ganesh, F. Stossi, L. Zhou, M. A. Mancini, E. N. Pistikopoulos, and B. Beykal. "Machine learning methods for endocrine disrupting potential identification based on single-cell data." Chemical Engineering Science 281 (2023): 119086.