(373g) Improvements to GPU-Enhanced Deterministic Global Optimization | AIChE

(373g) Improvements to GPU-Enhanced Deterministic Global Optimization

Authors 

Stuber, M. - Presenter, University of Connecticut
Gottlieb, R., University of Connecticut
Deterministic global optimization is an invaluable tool for parameter estimation, model validation [1], and robust design applications [2], among many others. However, a longstanding problem faced by deterministic global optimization solvers is the high computational effort required to furnish solutions of nonconvex programs and, particularly, the long wall clock time needed to obtain results of these NP-hard problems [3]. While there have been innumerable advances in algorithmic efficiency in the past several decades, a major bottleneck for this class of software has been the computing hardware on which it runs. To the best of the authors’ knowledge, all implementations of such software packages rely on a computer’s CPU to perform relaxation calculations, and while there exist some examples of algorithms that can exploit multi-CPU systems (e.g., MPI-based parallelization in MAiNGO [4]), surprisingly few implementations make use of the comparatively more powerful and higher-throughput graphics processing units (GPUs). If one is able to make use of GPUs to run deterministic global optimization routines, there is the promise of potentially orders-of-magnitude greater calculation throughput and a corresponding decrease in wall clock time needed to solve a majority of global optimization problems.

At AIChE 2022 [5] and AIChE 2023 [6], we presented on several advances that aim to make GPU-based deterministic global optimization possible and accessible. We began by showcasing the open-source SourceCodeMcCormick.jl package [5] that enabled the evaluation of McCormick relaxations on GPU hardware by applying source code transformation techniques to automatically generate CUDA-compatible evaluation functions. These results were formalized in a submitted paper (currently under review). Last year, we discussed the expanded capabilities of this tool that further enabled the evaluation of subgradients of McCormick-based relaxations [6], which are critical for many existing CPU-based algorithms.

In this presentation, we continue forward by detailing a novel GPU-based linear program (LP) solver—one designed to solve a large number of small LPs in parallel, as opposed to most other GPU-based LP solvers that typically aim to solve individual large-scale LPs [7, 8]—built as an extension of the EAGO.jl software package [9], which can make use of GPU-calculated subgradient evaluations to obtain lower bounds of branch-and-bound nodes without the need to first transfer results to CPU memory. This approach shifts nearly all of the calculation burden in the branch-and-bound algorithm to the GPU and thereby limits the communication overhead required to make use of subgradient information, which is frequently a limiting factor for the speed of GPU-based algorithms [10]. This talk will cover our implementation of a GPU-parallel LP solver, its integration within a branch-and-bound algorithm, and a straightforward use-case example to compare this algorithm against state-of-the-art CPU-based algorithms. The development of algorithms such as this, which can make use of more efficient GPU hardware to achieve faster solutions, will serve to greatly expand the range of possible global optimization applications in the future.


References
[1] Adam B. Singer, James W. Taylor, Paul I. Barton, and William H. Green. Global dynamic optimization for parameter estimation in chemical kinetics. The Journal of Physical Chemistry A, 110(3):971–976, dec 2006. doi: 10.1021/jp0548873.
[2] Matthew D. Stuber, Achim Wechsung, Arul Sundaramoorthy, and Paul I. Barton. Worst-case design of subsea production facilities using semi-infinite programming. AIChE Journal, 60(7):2513–2524, apr 2014. doi: 10.1002/aic.14447.
[3] Reiner Horst and Hoang Tuy. Global Optimization: Deterministic Approaches. Springer Berlin Heidelberg, November 2013. ISBN 9783662025987. URL https://books.google.com/books?id=Pe 1CAAAQBAJ.
[4] Dominik Bongartz, Jaromił Najman, Susanne Sass, and Alexander Mitsos. MAiNGO - McCormick-based Algorithm for mixed-integer Nonlinear Global Optimization. Technical report, RWTH-Aachen, 2018. URL https://www.avt.rwth-aachen.de/global/show document.asp?id=aaaaaaaaabclahw.
[5] Robert X. Gottlieb and Matthew D. Stuber. Global dynamic optimization using hardware-accelerated programming. In AIChE Annual Meeting 2022, Phoenix, AZ, November 2022.
[6] Robert X. Gottlieb, Pengfei Xu, and Matthew D. Stuber. Automatic source code generation of complicated models for deterministic global optimization with parallel architectures. In FOCAPO-CPC 2023, San Antonio, TX, January 2023.
[7] Mohamed Esseghir Lalami, Didier El-Baz, and Vincent Boyer. Multi GPU implementation of the simplex algorithm. In 2011 IEEE International Conference on High Performance Computing and Communications. IEEE, sep 2011. doi: 10.1109/hpcc.2011.32.
[8] François Pacaud, Michel Schanen, Sungho Shin, Daniel Adrian Maldonado, and Mihai Anitescu. Parallel interior-point solver for block-structured nonlinear programs on SIMD/GPU architectures, arXiv preprint arXiv:2301.04869, 2023. doi: 10.48550/ARXIV.2301.048692023.
[9] Matthew E. Wilhelm and Matthew D. Stuber. EAGO.jl: easy advanced global optimization in Julia. Optimization Methods and Software, 37(2):425–450, aug 2022. doi: 10.1080/10556788.2020.1786566.
[10] Jianbin Fang, Chun Huang, Tao Tang, and Zheng Wang. Parallel programming models for heterogeneous many-cores: a comprehensive survey. CCF Transactions on High Performance Computing, 2(4):382–400, July 2020. ISSN 2524-4930. doi: 10.1007/s42514-020-00039-4.