(119e) Source Code Transformation for GPU-Enhanced Deterministic Global Optimization | AIChE

(119e) Source Code Transformation for GPU-Enhanced Deterministic Global Optimization

Authors 

Gottlieb, R. - Presenter, University of Connecticut
Stuber, M., University of Connecticut
Deterministic global optimization is rife with computationally challenging problems. Visually simple nonlinear programs may take multiple hours to solve, and problems with added complexity, such as global dynamic optimization problems, can be even more taxing or even impossible to solve with current state-of-the-art approaches. All mature software packages that can address these classes of problems rely on some variation of spatial branchand- bound (B&B), and yet despite the high computational expense, these algorithms are predominantly designed to run serially on CPUs.

While there are successful efforts to parallelize B&B on multiple CPU cores and CPU clusters (e.g., MPI-based parallelizations in MAiNGO [1] and Octeract [2]), other areas of interest to process systems engineers, such as simulation and control, were able to achieve improved performance by utilizing alternative computing architectures such as graphics processing units (GPUs) [3], which have great potential to accelerate parallel processes. Unfortunately, to date and to the best of the authors’ knowledge, little progress has been made adapting deterministic global optimization to such alternative hardware. Over the last year, we unveiled preliminary results for a source code transformation (SCT) approach to generate and evaluate McCormick-based relaxations that is compatible with GPU computing [4, 5]. These results were then formalized in a submitted paper (currently under review) along with a novel parallel B&B implementation designed to operate with a GPU.

The SCT method and parallel B&B algorithm represented the first step towards an effective way to achieve performant global optimization on alternative computing architectures. However, this first implementation lacked several important features that would make it even more competitive with best-in-class serial algorithms. Particularly, no consideration was given for how to handle nontrivial problem constraints, and because only pointwise evaluations of convex/concave relaxations were considered, and not subgradients, many modern lower-bounding methods were inaccessible. Thus, although the parallelized method can process orders-of-magnitude more B&B nodes than its serial competitor, there was only a modest improvement in overall convergence times.

In this presentation, we discuss our work to improve upon the SCT approach to include subgradients and our efforts to advance the parallelized B&B method to make use of the subgradients to accelerate convergence with a tighter lower-bounding method. Additionally, we discuss the incorporation of nontrivial constraints into the parallel algorithm to be able to fully address a wider range of optimization problems. Examples are shown to compare the results of these improvements to the preliminary results showcased last year, and also to compare the parallel B&B algorithm with state-of-the-art methods for serial B&B.

References

[1] Dominik Bongartz, Jaromił Najman, Susanne Sass, and Alexander Mitsos. MAiNGO - McCormick-based Algorithm for mixed-integer Nonlinear Global Optimization. Technical report, RWTH-Aachen, 2018. URL https://www.avt.rwth-aachen.de/global/show_document.asp?id=aaaaaaaaabclahw.

[2] Nikos Kazazakis and Gabriel Lau. Octeract engine, 2017.

[3] David E. Bernal, Carl D. Laird, Stuart M. Harwood, Dimitar Trenev, and Davide Venturelli. Impact of emerging computing architectures and opportunities for process systems engineering applications. In FOCAPO-CPC 2023, San Antonio, TX, January 2023.

[4] Robert X. Gottlieb and Matthew D. Stuber. Global dynamic optimization using hardware-accelerated programming. In AIChE Annual Meeting 2022, Phoenix, AZ, November 2022.

[5] Robert X. Gottlieb, Pengfei Xu, and Matthew D. Stuber. Automatic source code generation of complicated models for deterministic global optimization with parallel architectures. In FOCAPO-CPC 2023, San Antonio, TX, January 2023.