(328a) Variable and Term Selection of Approximations for Data-Driven Optimization
2017 AIChE Annual Meeting
Computing and Systems Technology Division
Advances in Data Analysis, Information Management, and Intelligent Systems II
Tuesday, October 31, 2017 - 12:30pm to 12:51pm
In this work, several challenges of surrogate-based optimization are studied through a comprehensive comparison of the performance of various types of approximation methods on problems of increasing dimensionality, in both the absence and presence of noise. First, most existing data-driven optimization methods [3] (also known as black-/grey-box optimization or derivative-free optimization) require a priori knowledge of the important degrees of freedom, which is often unavailable in real systems and industrial applications. Second, as the dimensionality of data-driven systems increases, the curse of dimensionality limits the performance and applicability of surrogate-based optimization methods. Lastly, input-output data often contain noise that can significantly degrade surrogate model fitting and prediction accuracy.
The above challenges are studied in this work via the simulation of synthetic data sets from a large set of benchmark optimization problems, ranging from tens to hundreds of variables, with and without noise. First, we investigate simultaneous variable selection and surrogate term selection using support vector regression (SVR) [4,5]. Support vector machines (SVM) have been successfully used for high-dimensional feature selection, dimensionality reduction, and, through the SVR extension, function estimation [6]. SVM-based methods rely on an ε-insensitive loss function that reduces the risk of overfitting and, via regularization, the complexity of the surrogate function. Here, we exploit the effects of ε-insensitivity and regularization to develop surrogate functions that are specifically targeted for optimization. More specifically, we develop (a) functions that are more accurate in regions where minimum/maximum values are observed, and (b) functions with a low number of terms and reduced complexity. The performance of the proposed SVR approach is compared to several popular methods for surrogate model identification for optimization, such as kriging or Gaussian process modeling, quadratic approximations, polynomial approximations, and radial basis functions.
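As a minimal sketch of the kind of term selection described above, the following example fits a linear-kernel, ε-insensitive SVR over an explicit monomial basis and inspects which terms survive regularization; it also up-weights samples near the observed minimum, in the spirit of goal (a). The synthetic benchmark function, hyperparameter values, and the weight threshold are illustrative assumptions, not settings from this work.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic benchmark (assumed for illustration): only x0 and x1 matter;
# x2..x4 are spurious degrees of freedom, and the output carries noise.
n, d = 200, 5
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = 3.0 * X[:, 0] ** 2 - 2.0 * X[:, 0] * X[:, 1] + 0.05 * rng.normal(size=n)

# Candidate surrogate terms: all monomials up to degree 2.
poly = PolynomialFeatures(degree=2, include_bias=False)
Phi = poly.fit_transform(X)

# Linear-kernel SVR on the term basis: epsilon sets the insensitivity band
# that tolerates noise, C controls the regularization of the surrogate.
svr = SVR(kernel="linear", epsilon=0.1, C=10.0).fit(Phi, y)

# Terms whose primal weights survive regularization form a reduced surrogate.
names = poly.get_feature_names_out([f"x{i}" for i in range(d)])
for name, w in zip(names, svr.coef_.ravel()):
    if abs(w) > 0.1:  # ad hoc threshold, for illustration only
        print(f"{name}: {w:+.3f}")

# Toward goal (a): up-weight samples in the lowest output decile so the
# surrogate is more accurate near the observed minimum.
w_samples = np.where(y <= np.quantile(y, 0.1), 5.0, 1.0)
svr_opt = SVR(kernel="linear", epsilon=0.1, C=10.0).fit(Phi, y, sample_weight=w_samples)
```

In this sketch, spurious variables and terms receive weights near zero, so thresholding the coefficients performs variable and term selection simultaneously; the actual study compares such reduced surrogates against kriging, polynomial, and radial basis function models.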
References:
1. Forrester, A.I.J., A. Sóbester, and A.J. Keane, Engineering Design via Surrogate Modelling - A Practical Guide. 2008: John Wiley & Sons.
2. Cozad, A., N.V. Sahinidis, and D.C. Miller, Learning surrogate models for simulation-based optimization. AIChE Journal, 2014. 60(6): p. 2211-2227.
3. Boukouvala, F., R. Misener, and C.A. Floudas, Global optimization advances in Mixed-Integer Nonlinear Programming, MINLP, and Constrained Derivative-Free Optimization, CDFO. European Journal of Operational Research, 2016. 252(3): p. 701-727.
4. Vapnik, V., S.E. Golowich, and A.J. Smola, Support vector method for function approximation, regression estimation and signal processing. In Advances in Neural Information Processing Systems, 1997. 9: p. 281-287.
5. Drucker, H., C.J.C. Burges, L. Kaufman, A.J. Smola, and V. Vapnik, Support vector regression machines. In Advances in Neural Information Processing Systems, 1997. 9: p. 155-161.
6. Vapnik, V.N., The Nature of Statistical Learning Theory. 1995: Springer-Verlag New York.