(328a) Variable and Term Selection of Approximations for Data-Driven Optimization
2017 AIChE Annual Meeting
Computing and Systems Technology Division
Advances in Data Analysis, Information Management, and Intelligent Systems II
Tuesday, October 31, 2017 - 12:30pm to 12:51pm
In this work, several challenges of surrogate-based optimization are studied through a comprehensive comparison of the performance of various types of approximation methods on problems of increasing dimensionality, in both the absence and presence of noise. First, most existing data-driven optimization methods [3] (also known as black-/grey-box optimization or derivative-free optimization) require a priori knowledge of the important degrees of freedom, which is often unavailable in real systems and industrial applications. Second, as the dimensionality of data-driven systems increases, the curse of dimensionality limits the performance and applicability of surrogate-based optimization methods. Lastly, input-output data often contain noise that can significantly degrade surrogate model fitting and prediction accuracy.
The above challenges are studied in this work via the simulation of synthetic data sets from a large set of benchmark optimization problems, ranging from tens to hundreds of variables, with and without noise. First, we investigate simultaneous variable selection and surrogate term selection using support vector regression (SVR) [4,5]. Support vector machines (SVM) have been successfully used for high-dimensional feature selection, dimensionality reduction, and, through the SVR extension, function estimation [6]. SVM-based methods rely on an ε-insensitive loss function that reduces the risk of overfitting and, via regularization, the complexity of the surrogate function. Here, we exploit the effects of ε-insensitivity and regularization to develop surrogate functions that are specifically targeted for optimization. More specifically, we develop (a) functions that are more accurate in regions where minimum/maximum values are observed, and (b) functions with a low number of terms and reduced complexity. The performance of the proposed SVR approach is compared to several popular methods for surrogate model identification for optimization, such as kriging or Gaussian process modeling, quadratic approximations, polynomial approximations, and radial basis functions.
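As a minimal sketch of the kind of term selection described above, the following example fits a linear-kernel, ε-insensitive SVR over an explicit monomial basis and inspects which terms survive regularization; it also up-weights samples near the observed minimum, in the spirit of goal (a). The synthetic benchmark function, hyperparameter values, and the weight threshold are illustrative assumptions, not settings from this work.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic benchmark (assumed for illustration): only x0 and x1 matter;
# x2..x4 are spurious degrees of freedom, and the output carries noise.
n, d = 200, 5
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = 3.0 * X[:, 0] ** 2 - 2.0 * X[:, 0] * X[:, 1] + 0.05 * rng.normal(size=n)

# Candidate surrogate terms: all monomials up to degree 2.
poly = PolynomialFeatures(degree=2, include_bias=False)
Phi = poly.fit_transform(X)

# Linear-kernel SVR on the term basis: epsilon sets the insensitivity band
# that tolerates noise, C controls the regularization of the surrogate.
svr = SVR(kernel="linear", epsilon=0.1, C=10.0).fit(Phi, y)

# Terms whose primal weights survive regularization form a reduced surrogate.
names = poly.get_feature_names_out([f"x{i}" for i in range(d)])
for name, w in zip(names, svr.coef_.ravel()):
    if abs(w) > 0.1:  # ad hoc threshold, for illustration only
        print(f"{name}: {w:+.3f}")

# Toward goal (a): up-weight samples in the lowest output decile so the
# surrogate is more accurate near the observed minimum.
w_samples = np.where(y <= np.quantile(y, 0.1), 5.0, 1.0)
svr_opt = SVR(kernel="linear", epsilon=0.1, C=10.0).fit(Phi, y, sample_weight=w_samples)
```

In this sketch, spurious variables and terms receive weights near zero, so thresholding the coefficients performs variable and term selection simultaneously; the actual study compares such reduced surrogates against kriging, polynomial, and radial basis function models.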
References:
1. Forrester, A.I.J., A. Sóbester, and A.J. Keane, Engineering Design via Surrogate Modelling - A Practical Guide. 2008: John Wiley & Sons.
2. Cozad, A., N.V. Sahinidis, and D.C. Miller, Learning surrogate models for simulation-based optimization. AIChE Journal, 2014. 60(6): p. 2211-2227.
3. Boukouvala, F., R. Misener, and C.A. Floudas, Global optimization advances in Mixed-Integer Nonlinear Programming, MINLP, and Constrained Derivative-Free Optimization, CDFO. European Journal of Operational Research, 2016. 252(3): p. 701-727.
4. Vapnik, V., S.E. Golowich, and A.J. Smola, Support vector method for function approximation, regression estimation and signal processing. In Advances in Neural Information Processing Systems, 1997. 9: p. 281-287.
5. Drucker, H., C.J.C. Burges, L. Kaufman, A.J. Smola, and V. Vapnik, Support vector regression machines. In Advances in Neural Information Processing Systems, 1997. 9: p. 155-161.
6. Vapnik, V.N., The Nature of Statistical Learning Theory. 1995: Springer-Verlag New York.