(417b) Rare-Event Sampling for Efficient Scenario Generation for Stochastic Programs
AIChE Annual Meeting
2021 Annual Meeting
Computing and Systems Technology Division
Design and Operations Under Uncertainty
Wednesday, November 10, 2021 - 8:19am to 8:38am
In many stochastic programs, the most probable scenarios are sufficient to represent the underlying distribution and obtain a high-quality solution. There are, however, problems, such as those in safety and reliability engineering, supply chain resilience, and health services, in which the least likely events have the greatest impact on the optimal decisions. These events occur in the tail(s) of the distribution and are unlikely to be accurately represented within the scenario set. To fully describe the tail(s), it is necessary either to generate a very large number of samples or to use a rare-event sampling technique. In these problems, the need to properly represent the tail(s) of the distribution while maintaining a computationally tractable model becomes even more pressing.
In this work, we evaluate the efficacy of two different implementations of importance sampling (Papavasiliou and Oren, 2013; Parpas et al., 2014) and two clustering approaches for generating scenarios for stochastic programming (SP) models that efficiently emphasize the rare-event space. Importance sampling is a variance reduction technique in which samples are drawn from an auxiliary distribution, the importance distribution (ID), which emphasizes the region of the original distribution that has the greatest impact on a measure of interest. The samples are then tied back to the original distribution through an assigned weight, the likelihood ratio, to maintain an unbiased estimate of that measure. Determining an ID is a non-trivial task and is where the two tested implementations differ. The first implementation (Parpas et al., 2014) uses a Markov chain Monte Carlo approach to generate samples from a theoretical ID whose normalization constant is unknown; the ID is then approximated using kernel density estimation, and this approximation is used to generate samples with known probabilities. The second approach (Papavasiliou and Oren, 2013) draws a large number of samples from the original distribution to accurately estimate the expectation of the measure of interest, and a subset of these samples is then selected using an approximate ID constructed from that estimated expectation.
Clustering, originally introduced as a classification technique (Driver and Kroeber, 1932), is a data analysis method that groups a large dataset into subsets, or clusters, of points that share similarities. It can be used for scenario reduction by clustering a large dataset and generating a representative sample from each cluster as a scenario for the optimization problem (Li et al., 2020). The clustering techniques used here are the k-means and affinity propagation algorithms. K-means is a general-purpose algorithm that partitions the dataset into k clusters based on the distance of each point to the cluster mean vectors (Yadav and Sharma, 2013). Affinity propagation uses a similarity measure to tie data points to a representative point, known as an exemplar, to form the clusters (Dueck, 2009). We also include crude Monte Carlo sampling as a baseline in the framework.
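As a concrete illustration of these two building blocks, the sketch below shows (i) likelihood-ratio weighting for a toy rare-event probability and (ii) k-means-based scenario reduction in which each cluster centroid becomes a representative scenario weighted by its cluster's relative frequency. This is a minimal sketch under illustrative assumptions (a Gaussian nominal distribution, a shifted Gaussian ID, and a simple threshold event), not the distributions or implementations used in this work.

```python
# Minimal sketch of likelihood-ratio weighting and k-means scenario reduction.
# The nominal density, importance distribution, and event threshold are
# illustrative assumptions, not the problem data used in this work.
import numpy as np
from scipy.stats import norm
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Importance sampling: estimate P(X > 3) for X ~ N(0, 1).
# Draw from a shifted importance distribution (ID) centered near the rare-event
# region, then re-weight each sample by the likelihood ratio p(x)/q(x).
n = 2_000
x_is = rng.normal(loc=3.0, scale=1.0, size=n)              # samples from the ID q
w = norm.pdf(x_is, 0.0, 1.0) / norm.pdf(x_is, 3.0, 1.0)    # likelihood ratios p/q
is_estimate = np.mean((x_is > 3.0) * w)                    # unbiased estimate under p
print(f"IS estimate: {is_estimate:.2e}  (exact: {1 - norm.cdf(3.0):.2e})")

# Clustering-based scenario reduction: compress a large Monte Carlo sample set
# to k representative scenarios; each scenario's probability is the fraction
# of samples assigned to its cluster.
samples = rng.normal(size=(10_000, 2))                     # placeholder uncertainty data
k = 20
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(samples)
scenarios = km.cluster_centers_                            # representative scenarios
probs = np.bincount(km.labels_, minlength=k) / len(samples)
print(scenarios.shape, probs.sum())                        # (20, 2) 1.0
```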
We test these scenario generation and reduction techniques on a healthcare problem in which the objective is to maximize the expected life-years gained from screening for colorectal cancer (CRC). The solution identifies the ages at which screening tests should occur within a population to maximize the benefit. Only a small percentage of the population, 4-4.5% (American Cancer Society, 2020), develops CRC within their lifetime; hence, for the majority of the population, screening for CRC is a burden. To properly maximize the expected benefit of screening, it is necessary to accurately portray the impact of the low-probability scenarios on the objective function. After the scenario sets are generated, we solve the optimization problem and compare the resulting objective value to a rigorous simulation to determine how effective the approximation from each technique is. This presentation will discuss the effectiveness of each scenario generation method in estimating the true objective value, the quality of the solutions obtained, and the computational budget required.
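To make the evaluation step concrete, the sketch below shows how a candidate decision's expected objective could be estimated from a reduced, weighted scenario set and then compared against a large crude Monte Carlo run standing in for the rigorous simulation. The decision vector, the scenario objective function, and all distributions here are hypothetical placeholders, not the CRC screening model.

```python
# Minimal sketch of the evaluation step: weighted scenario-set estimate of an
# expected objective vs. a large crude Monte Carlo baseline. All quantities are
# hypothetical placeholders, not the CRC screening model used in this work.
import numpy as np

rng = np.random.default_rng(1)

def scenario_objective(decision, scenario):
    """Placeholder second-stage value for one scenario (illustrative only)."""
    return -np.sum((decision - scenario) ** 2)

decision = np.array([0.5, 0.5])                 # candidate first-stage decision

# Reduced scenario set with probability weights (e.g., normalized likelihood
# ratios or cluster frequencies produced by the methods above).
scenarios = rng.normal(size=(50, 2))
weights = np.full(50, 1 / 50)
approx = sum(w * scenario_objective(decision, s) for w, s in zip(weights, scenarios))

# Rigorous baseline: large crude Monte Carlo evaluation of the same decision.
mc = rng.normal(size=(100_000, 2))
baseline = np.mean([scenario_objective(decision, s) for s in mc])

print(f"reduced-set estimate: {approx:.4f}, rigorous MC: {baseline:.4f}")
```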
References
American Cancer Society, 2020. Colorectal Cancer Facts & Figures [WWW Document]. URL https://www.cancer.org/research/cancer-facts-statistics/colorectal-cance... (accessed 2.27.19).
Birge, J.R., Louveaux, F., 2011. Introduction to stochastic programming. Springer Science & Business Media.
Driver, H.E., Kroeber, A.L., 1932. Quantitative expression of cultural relationships. Berkeley: University of California Press.
Dueck, D., 2009. Affinity propagation: clustering data by passing messages. (Thesis) Citeseer.
Henrion, R., Römisch, W., 2018. Problem-based optimal scenario generation and reduction in stochastic programming. Math. Program. 1–23.
Li, B., Sedzro, K., Fang, X., Hodge, B.M., Zhang, J., 2020. A clustering-based scenario generation framework for power market simulation with wind integration. J. Renew. Sustain. Energy 12. https://doi.org/10.1063/5.0006480
Papavasiliou, A., Oren, S.S., 2013. Multiarea stochastic unit commitment for high wind penetration in a transmission constrained network. Oper. Res. 61, 578–592. https://doi.org/10.1287/opre.2013.1174
Park, S., Xu, Q., Hobbs, B.F., 2019. Comparing scenario reduction methods for stochastic transmission planning. IET Gener. Transm. Distrib. 13, 1005–1013.
Parpas, P., Ustun, B., Webster, M., Tran, Q.K., 2014. Importance sampling in stochastic programming: A Markov chain Monte Carlo approach. INFORMS J. Comput. 27, 358–377. https://doi.org/10.1287/ijoc.2014.0630
Shapiro, A., Dentcheva, D., Ruszczyński, A., 2014. Lectures on Stochastic Programming: Modeling and Theory, Second Edition. SIAM. https://doi.org/10.1137/1.9781611973433
Yadav, J., Sharma, M., 2013. A review of K-mean algorithm. Int. J. Eng. Trends Technol. 4, 2972–2976.