Data-Driven Sample Average Approximation with Covariate Information

Type

Conference Presentation

Conference Type

AIChE Annual Meeting

Presentation Date

November 10, 2021

Duration

19 minutes

Skill Level

Intermediate

PDHs

0.50

Stochastic programming is a powerful modeling framework for decision-making under uncertainty that finds applications in engineering, operations research, and economics [1]. Because the distributions of the underlying random vectors in stochastic programming models are typically unknown, popular data-driven approaches such as sample average approximation (SAA) only assume access to a finite sample. In many real-world applications, the random vector in a stochastic programming formulation (e.g., demand for water and energy) can be predicted using available covariate information (e.g., weather).

We study optimization for data-driven decision-making when we have observations of the uncertain parameters within the optimization model together with concurrent observations of covariates. Given a new covariate observation, the goal is to choose a decision that minimizes the expected system cost conditioned on this observation. Applications of this framework include (i) shipment planning under uncertainty [2], where historical demands, weather forecasts, and web search results can be used to predict products’ demands before making production and inventory decisions, (ii) grid scheduling under uncertainty [3], where seasonality, weather, and historical demand data can be used to predict the load and wind energy availability before creating generator schedules, and (iii) portfolio optimization under market uncertainty [4], where stock prices can be predicted using economic indicators and historical stock data before making investment decisions.
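
The goal stated above can be written compactly as a conditional stochastic program. The notation here is an assumption introduced for illustration and does not appear in the abstract: z is the decision, Z the feasible set, Y the random vector of uncertain parameters, X the covariates, x the new covariate observation, and c the system cost.

```latex
\min_{z \in \mathcal{Z}} \; \mathbb{E}\bigl[\, c(z, Y) \;\big|\; X = x \,\bigr],
\qquad \text{given data } \mathcal{D}_n = \{(y^i, x^i)\}_{i=1}^{n}
\text{ of joint observations of } (Y, X).
```

The conditional distribution of Y given X = x is unknown, so the expectation cannot be evaluated directly and must be approximated from the n joint observations.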

We investigate three data-driven frameworks that integrate a machine learning prediction model within a sample average approximation for approximating the solution to this conditional stochastic program [5,6]. Two of the SAA frameworks are new and use out-of-sample residuals of leave-one-out prediction models for scenario generation. The frameworks we investigate are flexible and accommodate parametric, nonparametric, and semiparametric regression techniques. The generality of our framework enables decision-makers to choose the modeling approach that works best for their application. We derive conditions on the data generation process, the prediction model, and the stochastic program under which solutions of these data-driven SAAs are consistent and asymptotically optimal, and also derive convergence rates and finite sample guarantees. Computational experiments on a resource allocation model validate our theoretical results, demonstrate the potential advantages of our data-driven formulations over existing approaches (even when the prediction model is misspecified), and illustrate the benefits of our new data-driven formulations in the limited data regime. Our approach provides a modular framework for using covariate information in stochastic optimization and can be readily generalized to the multi-stage and distributionally robust optimization settings [7].
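
A minimal sketch of residuals-based scenario generation may make the idea concrete. It is an illustration under stated assumptions, not the authors' implementation: the prediction model is ordinary least squares, the stochastic program is a single-item newsvendor problem standing in for the resource allocation model, and all function names are hypothetical. The first function adds in-sample residuals to the point prediction at the new covariate. The second uses leave-one-out residuals, with a `plus` flag that also swaps in the leave-one-out prediction; that the two new frameworks take exactly these forms is our reading of [5], not something the abstract states.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit of y on [1, X]; X must be 2-D with shape (n, d)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Evaluate the fitted linear model at one or more covariate rows."""
    X = np.atleast_2d(X)
    return np.column_stack([np.ones(len(X)), X]) @ coef


def er_saa_scenarios(X, y, x_new):
    """Scenarios = prediction at x_new + in-sample (empirical) residuals."""
    coef = fit_ols(X, y)
    residuals = y - predict(coef, X)
    return predict(coef, x_new)[0] + residuals


def jackknife_saa_scenarios(X, y, x_new, plus=False):
    """Scenarios built from out-of-sample leave-one-out residuals.

    plus=False: full-data prediction at x_new + leave-one-out residual.
    plus=True:  leave-one-out prediction at x_new + leave-one-out residual.
    (Both variants are assumptions based on our reading of [5].)
    """
    n = len(y)
    full_pred = predict(fit_ols(X, y), x_new)[0]
    scenarios = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef_i = fit_ols(X[keep], y[keep])  # model trained without point i
        loo_residual = y[i] - predict(coef_i, X[i])[0]
        center = predict(coef_i, x_new)[0] if plus else full_pred
        scenarios[i] = center + loo_residual
    return scenarios


def solve_newsvendor_saa(scenarios, backorder_cost, holding_cost):
    """Minimize the sample-average newsvendor cost over the scenarios.

    The SAA objective is convex and piecewise linear with breakpoints at
    the scenarios, so a minimizer is attained at one of them.
    """
    scenarios = np.asarray(scenarios, dtype=float)

    def saa_cost(z):
        short = np.maximum(scenarios - z, 0.0)
        excess = np.maximum(z - scenarios, 0.0)
        return np.mean(backorder_cost * short + holding_cost * excess)

    costs = [saa_cost(z) for z in scenarios]
    return scenarios[int(np.argmin(costs))]
```

To use the sketch, pass the historical covariates as an (n, d) array, the historical demands as a length-n array, and the new covariate as a length-d array, then hand the resulting scenarios to `solve_newsvendor_saa`. Leave-one-out residuals are never smaller in magnitude than in-sample residuals for least squares, which is one intuition for why out-of-sample residuals can help when data are limited and the fitted model would otherwise understate the noise.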

1. Shapiro, Alexander, Darinka Dentcheva, and Andrzej Ruszczynski. Lectures on stochastic programming: modeling and theory. Society for Industrial and Applied Mathematics, 2014.

2. Bertsimas, Dimitris, and Nathan Kallus. From predictive to prescriptive analytics. Management Science 66.3 (2020): 1025-1044.

3. Donti, Priya L., Brandon Amos, and J. Zico Kolter. Task-based end-to-end model learning in stochastic optimization. arXiv preprint arXiv:1703.04529 (2017).

4. Dou, Xialiang, and Mihai Anitescu. Distributionally robust optimization with correlated data from vector autoregressive processes. Operations Research Letters 47.4 (2019): 294-299.

5. Kannan, Rohit, Güzin Bayraksan, and James R. Luedtke. Data-driven sample average approximation with covariate information. Available on Optimization Online (July 2020).

6. Kannan, Rohit, Güzin Bayraksan, and James R. Luedtke. Heteroscedasticity-aware residuals-based contextual stochastic optimization. arXiv preprint arXiv:2101.03139 (2021).

7. Kannan, Rohit, Güzin Bayraksan, and James R. Luedtke. Residuals-based distributionally robust optimization with covariate information. arXiv preprint arXiv:2012.01088 (2020).

Presenter(s)

Güzin Bayraksan

Rohit Kannan

James Luedtke

Language

English
