(40f) Pyddsbb: A Python Package for Simulation-Optimization Using Data-Driven Branch-and-Bound Techniques | AIChE

Authors 

Ravutla, S. - Presenter, Georgia Institute of Technology
Boukouvala, F. - Presenter, Georgia Institute of Technology
Zhai, J., Georgia Institute of Technology
High-fidelity computer simulations are essential for quantitative analysis and decision-making in chemical engineering and can ideally be used for process design, synthesis, and control tasks [1]. Simulation-based optimization problems are often solved with black-box solvers, where the simulation is treated as an input-output data generator because the simulation equations are complex or inaccessible [2-4]. Machine-learned or regression surrogate models are frequently used to expedite the search for optima in simulation-based optimization [4, 5]. However, because derivative information of the original simulation is absent and sampling is limited, existing simulation-optimization approaches have several shortcomings: they are often highly dependent on a single type of surrogate model that may be suboptimal for a given problem; optimal solutions vary with different reinitializations of the algorithm; convergence is not guaranteed; and no information on the quality of the incumbent solution is provided [3, 6, 7]. Moreover, training an accurate surrogate model is challenging, and it has been reported that slight changes in the available data lead to different surrogate model parametrizations and hence different optimal solutions [8]. Finally, the use of complex nonparametric surrogate models (e.g., Gaussian Process Models, Neural Networks) leads to nonconvex formulations that require custom global optimization algorithms [9].

To tackle these challenges, we have previously presented a data-driven equivalent of the spatial branch-and-bound algorithm (DDSBB). A key component of the algorithm is a linear programming formulation that builds convex underestimators of the simulation data. These underestimators define easier “relaxed” subproblems and are embedded within a branch-and-bound algorithm that adaptively samples in non-pruned subspaces. Through a large set of benchmark problems, we have shown that the validity of these underestimators improves when data generated from fitted surrogate models are added to the “high-fidelity” simulation data (multi-fidelity convex underestimators) [8]. Using this approach, we avoid direct optimization of nonconvex surrogate models, dependence on a single surrogate model, and convergence to different local solutions after different initializations. More importantly, at any intermediate stopping point the algorithm provides an upper bound and a lower bound on the incumbent solution; the lower bound is approximate because of sampling limitations, and the gap between the two bounds improves as more data is collected.
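The linear programming step described above can be illustrated with a minimal sketch. It is not the PyDDSBB implementation: the separable quadratic form, the sum-of-gaps objective, the function names, and the toy `simulate` function are all assumptions made for illustration. The sketch fits a convex quadratic that lies on or below every sample, then minimizes it over the box to obtain an approximate lower bound, while the best sampled value serves as the upper bound.

```python
import numpy as np
from scipy.optimize import linprog


def fit_convex_quadratic_underestimator(X, y):
    """Fit q(x) = sum_i a_i*x_i**2 + b @ x + c with a_i >= 0 and q(X[k]) <= y[k]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_samples, n_dim = X.shape
    # Design matrix with columns [x_1^2 .. x_n^2, x_1 .. x_n, 1].
    A = np.hstack([X**2, X, np.ones((n_samples, 1))])
    # Minimizing sum_k (y_k - q(x_k)) equals minimizing -sum_k q(x_k),
    # which is linear in the unknowns (a, b, c).
    cost = -A.sum(axis=0)
    # a_i >= 0 keeps q convex; b and c are unrestricted.
    bounds = [(0, None)] * n_dim + [(None, None)] * (n_dim + 1)
    res = linprog(cost, A_ub=A, b_ub=y, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n_dim], res.x[n_dim:2 * n_dim], res.x[-1]


def minimize_separable_quadratic(a, b, c, lower, upper):
    """Minimize the separable convex quadratic over the box [lower, upper]."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    # Linear coordinates (a_i == 0): the minimum sits at a box corner.
    x = np.where(b >= 0, lower, upper).astype(float)
    # Curved coordinates: clip the unconstrained minimizer to the box.
    curved = a > 1e-12
    x[curved] = np.clip(-b[curved] / (2.0 * a[curved]), lower[curved], upper[curved])
    return x, float(a @ x**2 + b @ x + c)


def simulate(X):
    """Toy stand-in for a black-box simulation (assumption, not from the source)."""
    return np.sin(3.0 * X[:, 0]) + X[:, 0] ** 2


lower = np.array([-2.0])
upper = np.array([2.0])
X = np.linspace(lower[0], upper[0], 21).reshape(-1, 1)
y = simulate(X)

a, b, c = fit_convex_quadratic_underestimator(X, y)
q_vals = (X**2) @ a + X @ b + c           # underestimator at the samples
x_lb, lower_bound = minimize_separable_quadratic(a, b, c, lower, upper)
upper_bound = float(y.min())              # best sampled value (incumbent)
print(f"approximate lower bound: {lower_bound:.4f}, upper bound: {upper_bound:.4f}")
```

The lower bound is only approximate because the underestimator is guaranteed to lie below the sampled points, not below the unsampled parts of the function. In a branch-and-bound setting, a subregion would be pruned when its lower bound exceeds the best upper bound found so far.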

In this work, we showcase the development of an open-source Python package for the DDSBB algorithm (PyDDSBB) and demonstrate its capabilities through a series of benchmark and simulation-based problems. PyDDSBB can employ a variety of surrogate models for low-fidelity data generation (i.e., linear, quadratic, Support Vector Regression, Gaussian Process Models, and Neural Networks) and allows users to add new surrogate models. The library also offers different types of underestimators (i.e., linear and convex quadratic models). While the objective function is assumed to be simulation-based (black-box), PyDDSBB allows the user to include both simulation-based (unknown or black-box) constraints and equation-based (known) constraints in the formulation. A variety of machine-learning-based branching techniques and branch-and-bound heuristics are considered, and default options are recommended for box-constrained and general constrained problems. The aim of this software is to provide a user-friendly simulation-based optimization framework for both expert and non-expert users, benefiting from the high-level features of Python. PyDDSBB follows the object-oriented programming paradigm and is designed so that users and developers can easily extend its core functionality.
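To make the idea of pluggable surrogate models for low-fidelity data generation concrete, the following is a hypothetical sketch. The class and function names (`Surrogate`, `QuadraticSurrogate`, `generate_low_fidelity_data`) and the toy `simulate` function are assumptions for illustration only and do not reflect the actual PyDDSBB classes or API.

```python
from abc import ABC, abstractmethod

import numpy as np


class Surrogate(ABC):
    """Hypothetical plug-in interface; not the actual PyDDSBB class hierarchy."""

    @abstractmethod
    def fit(self, X, y):
        """Train the surrogate on high-fidelity samples."""

    @abstractmethod
    def predict(self, X):
        """Return surrogate predictions at new points."""


class QuadraticSurrogate(Surrogate):
    """Least-squares fit of y ~ w0 + sum_i w_i*x_i + sum_i v_i*x_i**2."""

    def _features(self, X):
        X = np.asarray(X, dtype=float)
        return np.hstack([np.ones((X.shape[0], 1)), X, X**2])

    def fit(self, X, y):
        target = np.asarray(y, dtype=float)
        self.coef_, *_ = np.linalg.lstsq(self._features(X), target, rcond=None)
        return self

    def predict(self, X):
        return self._features(X) @ self.coef_


def generate_low_fidelity_data(surrogate, X_hi, y_hi, lower, upper, n_new, seed=0):
    """Fit the surrogate on simulation data, then sample it cheaply in the box."""
    rng = np.random.default_rng(seed)
    surrogate.fit(X_hi, y_hi)
    X_lo = rng.uniform(lower, upper, size=(n_new, len(lower)))
    return X_lo, surrogate.predict(X_lo)


def simulate(X):
    """Toy stand-in for an expensive simulation (assumption, not from the source)."""
    return (X[:, 0] - 1.0) ** 2 + 2.0 * X[:, 1] ** 2


lower, upper = [-2.0, -2.0], [2.0, 2.0]
rng = np.random.default_rng(1)
X_hi = rng.uniform(lower, upper, size=(10, 2))   # few expensive samples
y_hi = simulate(X_hi)

X_lo, y_lo = generate_low_fidelity_data(
    QuadraticSurrogate(), X_hi, y_hi, lower, upper, n_new=50
)
```

Under this design, a new surrogate type only needs to implement `fit` and `predict`; the data-generation routine stays unchanged. The combined high- and low-fidelity samples could then be passed to the underestimator-fitting step.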

References:

  1. McBride, K. and K. Sundmacher, Overview of Surrogate Modeling in Chemical Process Engineering. Chemie Ingenieur Technik, 2019. 91(3): p. 228-239.
  2. Rios, L.M. and N.V. Sahinidis, Derivative-free optimization: a review of algorithms and comparison of software implementations. Journal of Global Optimization, 2013. 56(3): p. 1247-1293.
  3. Boukouvala, F., R. Misener, and C.A. Floudas, Global optimization advances in Mixed-Integer Nonlinear Programming, MINLP, and Constrained Derivative-Free Optimization, CDFO. European Journal of Operational Research, 2016. 252(3): p. 701-727.
  4. Bhosekar, A. and M. Ierapetritou, Advances in surrogate based modeling, feasibility analysis, and optimization: A review. Computers & Chemical Engineering, 2018. 108: p. 250-267.
  5. Thebelt, A., et al., ENTMOOT: A Framework for Optimization over Ensemble Tree Models. arXiv:2003.04774v2, 2020.
  6. Amaran, S., et al., Simulation optimization: a review of algorithms and applications. 4OR, 2014. 12(4): p. 301-333.
  7. Larson, J., M. Menickelly, and S.M. Wild, Derivative-free optimization methods. Acta Numerica, 2019. 28: p. 287-404.
  8. Zhai, J. and F. Boukouvala, Data-driven Spatial Branch-and-bound Algorithm for Box-constrained Simulation-based Optimization. 2020, Living Archive for Process Systems Engineering.
  9. Schweidtmann, A.M. and A. Mitsos, Deterministic Global Optimization with Artificial Neural Networks Embedded. Journal of Optimization Theory and Applications, 2019. 180(3): p. 925-948.