(207c) Necessary Optimality-Constrained Bayesian Optimization (NOBO) for Efficiently Learning Complex Control Policies from Closed-Loop Data

Authors 

Paulson, J., The Ohio State University
Mesbah, A., University of California, Berkeley
The control of complex systems involves several challenges due to the unknown (black-box) relationship between control policy parameters and the reward function that can only be observed through expensive and noisy simulations or experiments. Bayesian optimization (BO) has recently gained popularity for globally optimizing expensive black-box functions, which is a problem frequently encountered in learning-based control applications, thanks to its data efficiency [1,2]. Traditional BO methods construct a probabilistic surrogate model of the performance function and employ an acquisition function that approximates the value of information for future sample points, effectively balancing the exploration-exploitation tradeoff inherent in searching the design space [3]. Recent studies have shown that BO's convergence performance can be improved by incorporating additional information, such as derivative observations [4].

First-order BO methods mainly rely on standard acquisition functions and incorporate derivative measurements only indirectly, through the probabilistic surrogate model, to enhance local predictions [5]. These methods can nevertheless exhibit drawbacks: the added model complexity can significantly increase training and optimization costs, and they may fail when gradient observations are heavily corrupted by noise [6]. In this talk, we propose a computationally efficient approach that simultaneously utilizes performance (zeroth-order) and derivative (first-order) data within a single acquisition optimization subproblem. Our core idea is to impose, at each iteration, a set of black-box constraints that mimic the necessary optimality conditions of the original global optimization problem. The proposed necessary-optimality BO (NOBO) method [7] employs Gaussian process surrogates for the objective's partial derivatives to approximately enforce first-order optimality conditions as black-box constraints in the acquisition function. These constraints define a feasible set that explicitly accounts for the uncertainty in estimating partial gradients from data and that is updated as new data are observed. Consequently, the feasible set narrows the design-space search to regions that are jointly informative with respect to both zeroth- and first-order information.
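
The sketch below illustrates this idea schematically (it is not the exact acquisition subproblem of [7]): separate GP surrogates are fit to objective and partial-derivative observations, and the acquisition is maximized only over candidates where the derivative's confidence band still contains zero, i.e., where first-order stationarity cannot yet be ruled out. The one-dimensional setting, kernel choices, confidence parameter beta, and grid search are assumptions made purely for illustration.

```python
# Schematic sketch of an acquisition restricted by a necessary-optimality constraint.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def constrained_acquisition(X, y, dy, candidates, beta=2.0):
    """Return the candidate maximizing a UCB acquisition over the set where
    the derivative GP's credible interval still contains zero."""
    gp_f = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-3, normalize_y=True).fit(X, y)
    gp_df = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-2, normalize_y=True).fit(X, dy)

    mu, sigma = gp_f.predict(candidates, return_std=True)     # surrogate of the objective
    dmu, dsigma = gp_df.predict(candidates, return_std=True)  # surrogate of its derivative

    # Approximate first-order optimality constraint: |E[f'(x)]| <= beta * std[f'(x)].
    feasible = np.abs(dmu) <= beta * dsigma
    if not feasible.any():          # fall back to unconstrained BO if the set is empty
        feasible = np.ones_like(feasible, dtype=bool)

    ucb = mu + beta * sigma
    ucb[~feasible] = -np.inf        # search only the jointly informative region
    return candidates[[np.argmax(ucb)]]
```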

We examine the theoretical performance and regret bounds of the proposed algorithm and demonstrate in practice that incorporating these constraints, which restrict the allowable search space, leads to faster convergence than conventional BO. We further validate these performance gains on a reinforcement learning (RL) benchmark based on the linear quadratic regulator (LQR) [8], in which the reward function's derivatives can be estimated directly from closed-loop data using the policy gradient theorem.
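
As an illustration of how such derivative information could be gathered, the following sketch estimates the finite-horizon LQR cost of a static feedback gain together with a smoothed (REINFORCE-style) policy-gradient estimate of its derivative from closed-loop rollouts. The system matrices, horizon, smoothing radius, and rollout count are illustrative assumptions and are not tied to the specific benchmark used in the talk.

```python
# Sketch: zeroth- and first-order closed-loop data for an LQR policy u = -K x.
import numpy as np

rng = np.random.default_rng(1)
A, B = np.array([[1.0, 0.1], [0.0, 1.0]]), np.array([[0.0], [0.1]])  # double integrator
Q, R = np.eye(2), 0.1 * np.eye(1)

def rollout_cost(K, x0=np.array([1.0, 0.0]), horizon=50):
    """Finite-horizon LQR cost of the policy u = -K x from one noisy closed-loop rollout."""
    x, cost = x0.copy(), 0.0
    for _ in range(horizon):
        u = -K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u + 0.01 * rng.standard_normal(2)   # process noise
    return cost

def cost_and_gradient(K, radius=0.05, n_rollouts=100):
    """Zeroth-order cost estimate plus a smoothed policy-gradient estimate of dJ/dK."""
    grad, cost = np.zeros_like(K), 0.0
    for _ in range(n_rollouts):
        U = rng.standard_normal(K.shape)                     # random perturbation direction
        J = rollout_cost(K + radius * U)
        grad += (J / (radius * n_rollouts)) * U
        cost += J / n_rollouts
    return cost, grad

K0 = np.array([[0.5, 0.5]])
J0, dJ0 = cost_and_gradient(K0)
print("estimated cost:", J0, "estimated gradient:", dJ0)
```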

References:

[1] Shahriari, Bobak, et al. "Taking the human out of the loop: A review of Bayesian optimization." Proceedings of the IEEE 104.1 (2016): 148-175.

[2] Paulson, Joel A., Georgios Makrygiorgos, and Ali Mesbah. "Adversarially robust Bayesian optimization for efficient auto‐tuning of generic control structures under uncertainty." AIChE Journal 68.6 (2022): e17591.

[3] Frazier, Peter I. "A tutorial on Bayesian optimization." arXiv preprint arXiv:1807.02811 (2018).

[4] Shekhar, Shubhanshu, and Tara Javidi. "Significance of gradient information in Bayesian optimization." International Conference on Artificial Intelligence and Statistics. PMLR, 2021.

[5] Wu, Jian, et al. "Bayesian optimization with gradients." Advances in neural information processing systems 30 (2017).

[6] Penubothula, Santosh, Chandramouli Kamanchi, and Shalabh Bhatnagar. "Novel first order Bayesian optimization with an application to reinforcement learning." Applied Intelligence 51 (2021): 1565-1579.

[7] Makrygiorgos, Georgios, Joel A. Paulson, and Ali Mesbah. "No-Regret Bayesian Optimization with Gradients using Local Optimality-based Constraints: Application to Closed-loop Policy Search." 62nd IEEE Conference on Decision and Control (CDC). IEEE, 2023.

[8] Recht, Benjamin. "A tour of reinforcement learning: The view from continuous control." Annual Review of Control, Robotics, and Autonomous Systems 2 (2019): 253-279.