(16g) Stochastic Optimal Control of Polynomial Jump-Diffusion Processes Via Local Occupation Measures
AIChE Annual Meeting
2021
Computing and Systems Technology Division
Estimation and Control under Uncertainty
Sunday, November 7, 2021 - 5:24pm to 5:43pm
In this work, we address this limitation by introducing the concept of local occupation measures. Local occupation measures are obtained by restricting the state-action occupation measure associated with a stochastic control system to a subset of the time domain. This straightforward generalization makes it possible to bridge the gap between the weak and strong formulations of the associated optimal control problem: the time domain is discretized, and constraints on the system trajectories are imposed in weak form over each of the resulting time intervals rather than over the entire time horizon as a whole (as is done traditionally). As a consequence, the generalized moment problems and associated SDP relaxations generated this way are not only tighter but also explicitly reflect the causal temporal structure inherent to optimal control problems, a feature that is absent from the traditional formulation. From a practical perspective, this structure can be exploited by distributed optimization algorithms, offering the potential to improve scalability. Moreover, the use of local occupation measures provides a new mechanism to tighten the generated SDP relaxations: refining the time discretization, at the cost of only a linear increase in problem size. This is in stark contrast to the traditional tightening mechanism, which relies on increasing the truncation order of the moment sequences associated with multivariate measures and hence suffers from combinatorial scaling. As an aside, we note that the proposed approach inherits key properties of the original occupation measure framework. Most notably, convergence of the optimal value of the SDP relaxations to the true optimal value of the stochastic control problem can be established under mild regularity conditions, and the dual SDPs furnish piecewise polynomial subsolutions to the Hamilton-Jacobi-Bellman equations, providing useful information for controller design [4,8].
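As a sketch of the construction described above, in notation that is assumed here rather than taken from the abstract: let $(x_t, u_t)$ denote the controlled state and action processes on the horizon $[0, T]$, let $\mathcal{A}$ be the (extended) generator of the jump-diffusion, let $0 = t_0 < t_1 < \dots < t_N = T$ be a partition of the time domain, and let $\nu_k$ be the distribution of $x_{t_k}$. The local occupation measure $\xi_k$ on the $k$-th interval, and the weak form of the dynamics over that interval (Dynkin's formula), could then be written as:

```latex
% Assumed notation (not from the abstract): x_t state, u_t control action,
% \mathcal{A} generator, \nu_k distribution of x_{t_k},
% w a sufficiently smooth test function.
\begin{align*}
  \xi_k(A \times B \times C)
    &= \mathbb{E}\left[ \int_{[t_{k-1},\, t_k] \cap A}
         \mathbf{1}_{B \times C}(x_t, u_t) \, dt \right],
    \qquad k = 1, \dots, N, \\
  \int w(t_k, x) \, d\nu_k(x) - \int w(t_{k-1}, x) \, d\nu_{k-1}(x)
    &= \int \left( \frac{\partial w}{\partial t} + \mathcal{A} w \right)(t, x, u)
       \, d\xi_k(t, x, u).
\end{align*}
```

Under these assumptions, $N = 1$ recovers a single weak-form constraint over the whole horizon, and the measures $\xi_1, \dots, \xi_N$ sum to the full occupation measure. For polynomial data, restricting $w$ to polynomials turns the second line into linear constraints on the moments of $\xi_k$ and $\nu_k$, which is the setting in which generalized moment problems and SDP relaxations arise.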
We demonstrate the effectiveness and versatility of the proposed framework with examples from systems biology and population control.
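The scaling contrast drawn in the abstract (linear growth under time refinement versus combinatorial growth under an increased truncation order) can be illustrated with a simple count. The sketch below is not the authors' implementation; it only uses the standard fact that there are C(n + d, d) monomials in n variables of total degree at most d, and it assumes, as a simplification, one truncated moment sequence per time interval. The function names and the choice n = 4 are hypothetical.

```python
from math import comb


def moment_count(num_vars: int, order: int) -> int:
    """Number of monomials (hence moments) in `num_vars` variables
    with total degree at most `order`: C(num_vars + order, order)."""
    return comb(num_vars + order, order)


def total_moments(num_vars: int, order: int, num_intervals: int) -> int:
    """Moments tracked when one truncated moment sequence is kept per
    time interval (a simplifying assumption made for illustration)."""
    return num_intervals * moment_count(num_vars, order)


if __name__ == "__main__":
    n = 4  # hypothetical: time, two states, one control input

    # Tightening by raising the truncation order: combinatorial growth.
    for order in (2, 4, 6, 8):
        print(f"order={order}: {moment_count(n, order)} moments")

    # Tightening by refining the time discretization: linear growth.
    for intervals in (1, 2, 4, 8):
        print(f"intervals={intervals}: {total_moments(n, 4, intervals)} moments")
```

In this count, doubling the number of intervals doubles the number of moments, whereas raising the truncation order from 4 to 8 at n = 4 increases it roughly sevenfold (70 to 495). The sizes of the positive semidefinite constraints follow a similar pattern, since they are also determined by the truncation order.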
[1] Wendell H. Fleming and Domokos Vermes. Convex duality approach to the optimal control of diffusions. SIAM Journal on Control and Optimization, 27(5):1136–1155, 1989.
[2] Abhay G. Bhatt and Vivek S. Borkar. Occupation Measures for Controlled Markov Processes: Characterization and Optimality. The Annals of Probability, 24(3):1531–1562, 1996.
[3] Thomas G. Kurtz and Richard H. Stockbridge. Existence of Markov Controls and Characterization of Optimal Markov Controls. SIAM Journal on Control and Optimization, 36(2):609–653, 1998.
[4] Jean B. Lasserre, Didier Henrion, Christophe Prieur, and Emmanuel Trélat. Nonlinear Optimal Control via Occupation Measures and LMI-Relaxations. SIAM Journal on Control and Optimization, 47(4):1643–1666, 2008.
[5] Carlo Savorgnan, Jean B. Lasserre, and Moritz Diehl. Discrete-time stochastic optimal control via occupation measures and moment relaxations. Proceedings of the IEEE Conference on Decision and Control, pages 519–524, 2009.
[6] Milan Korda, Didier Henrion, and Jean B. Lasserre. Moments and Convex Optimization for Analysis and Control of Nonlinear Partial Differential Equations. arXiv preprint arXiv:1804.07565, 2018.
[7] Jean B. Lasserre. Moments, Positive Polynomials and Their Applications, volume 1. World Scientific, 2010.
[8] Milan Korda, Didier Henrion, and Colin N. Jones. Controller design and value function approximation for nonlinear dynamical systems. Automatica, 67:54–66, 2016.