(373p) Partial Optimal Transport for Data-Driven Distributionally Robust Optimization with Consideration of Outliers
AIChE Annual Meeting
2024
Computing and Systems Technology Division
10C: Interactive Session: Systems and Process Operations
Tuesday, October 29, 2024 - 3:30pm to 5:00pm
In data-driven distributionally robust optimization (DRO), handling outliers is essential for a reliable and effective optimization process. Outliers that are handled improperly can distort the analysis and lead to suboptimal solutions. However, very little work in the literature addresses outliers in data-driven DRO [3, 4].
Wasserstein DRO is a popular DRO approach that has been studied extensively in the literature. It is based on the optimal transport (OT) concept, which finds the transport plan that minimizes the total transportation cost across all data points [5]. OT facilitates the construction of the Wasserstein ambiguity set, which allows a tractable formulation of Wasserstein DRO. However, like other DRO approaches, Wasserstein DRO is sensitive to outliers, because outliers can distort the optimal transport plan and thereby skew the OT cost and the Wasserstein ambiguity set [6].
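The outlier sensitivity described above can be seen in a minimal sketch. It assumes one-dimensional samples of equal size with uniform weights, in which case matching the sorted samples is an optimal transport plan for the absolute-distance cost. The function name `wasserstein1_1d` and the sample values are hypothetical and chosen only for illustration; they are not taken from this work.

```python
def wasserstein1_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D samples (uniform weights)."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # In 1-D with equal uniform weights, pairing the sorted samples is optimal.
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(x - y) for x, y in pairs) / len(xs)


# Hypothetical data: the contaminated sample replaces one point with an outlier.
reference = [0.0, 1.0, 2.0, 3.0, 4.0]
clean = [0.5, 1.5, 2.5, 3.5, 4.5]
contaminated = [0.5, 1.5, 2.5, 3.5, 50.0]

d_clean = wasserstein1_1d(reference, clean)
d_contaminated = wasserstein1_1d(reference, contaminated)
print(d_clean, d_contaminated)
```

In this toy case a single contaminated point accounts for almost all of the transport cost, so an ambiguity set whose radius is calibrated to such a distance would be driven mainly by the outlier.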
Partial optimal transport (partial OT) is a promising way to make DRO more resilient to outliers. Unlike traditional OT, which transports all of the mass, partial OT optimizes the transportation of only a selected subset of it [7, 8]. This selective optimization enables more efficient resource allocation and makes partial OT a powerful tool when complete mass transport is impractical or unnecessary. When partial OT is applied to a dataset contaminated with outliers, the outliers can be excluded from the regular mass transport, which prevents them from influencing the OT process. In this work, we propose using partial OT in data-driven DRO to address the potential outlier issue, particularly in the construction of the ambiguity set. By optimizing the transportation of only a subset of the data points in a given dataset, partial OT yields ambiguity sets that capture the most relevant and representative sources of uncertainty without being affected by outliers. We derive tractable formulations of the data-driven DRO problem based on the ambiguity set constructed with partial OT. In numerical case studies, we show that the proposed approach outperforms traditional Wasserstein DRO in solution quality for problems with outlier-contaminated data.
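The exclusion of outliers by partial OT can be illustrated with a small brute-force sketch. It is not the formulation derived in this work. It assumes two one-dimensional samples of equal size n with mass 1/n per point, and it transports only k of the n unit masses, so the transported fraction is k/n. Under these assumptions an optimal partial plan moves whole points, so enumerating subsets is exact, although only practical for tiny samples. The function name `partial_w1_1d` and the data are hypothetical.

```python
from itertools import combinations


def partial_w1_1d(xs, ys, k):
    """Minimum cost of moving k of n unit masses (mass 1/n each) between 1-D samples."""
    n = len(xs)
    if len(ys) != n or not 0 < k <= n:
        raise ValueError("this sketch assumes equal sizes and 0 < k <= n")
    best_cost, best_targets = float("inf"), None
    # combinations() preserves input order, so each subset stays sorted and
    # zip() gives the sorted matching, which is optimal in 1-D for fixed subsets.
    for src in combinations(sorted(xs), k):
        for dst in combinations(sorted(ys), k):
            cost = sum(abs(x - y) for x, y in zip(src, dst)) / n
            if cost < best_cost:
                best_cost, best_targets = cost, dst
    return best_cost, best_targets


# Hypothetical data: the last target point is an outlier.
reference = [0.0, 1.0, 2.0, 3.0, 4.0]
contaminated = [0.5, 1.5, 2.5, 3.5, 50.0]

full_cost, _ = partial_w1_1d(reference, contaminated, k=5)  # full OT
partial_cost, kept = partial_w1_1d(reference, contaminated, k=4)  # 80% of the mass
print(full_cost, partial_cost, kept)
```

With k equal to n the sketch reduces to ordinary OT and the outlier dominates the cost. With one unit of mass left untransported, the cheapest plan leaves the outlier out. The transported fraction is a modeling choice; here it is fixed by hand.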
References
[1] S. B. Yang and Z. Li, "Distributionally robust chance-constrained optimization with Sinkhorn ambiguity set," AIChE Journal, vol. 69, no. 10, p. e18177, 2023.
[2] C. Shang, X. Huang, and F. You, "Data-driven robust optimization based on kernel learning," Computers & Chemical Engineering, vol. 106, pp. 464-479, 2017.
[3] S. Nietert, Z. Goldfeld, and S. Shafiee, "Outlier-Robust Wasserstein DRO," Advances in Neural Information Processing Systems, vol. 36, 2024.
[4] A. Esteban-Pérez and J. M. Morales, "Distributionally robust stochastic programs with side information based on trimmings," Mathematical Programming, vol. 195, no. 1, pp. 1069-1105, 2022.
[5] C. Villani, Optimal transport: old and new. Springer, 2009.
[6] Z. Zhang, P. Tang, and T. Corpetti, "Time adaptive optimal transport: A framework of time series similarity measure," IEEE Access, vol. 8, pp. 149764-149774, 2020.
[7] L. Chapel, M. Z. Alaya, and G. Gasso, "Partial optimal transport with applications on positive-unlabeled learning," Advances in Neural Information Processing Systems, vol. 33, pp. 2903-2913, 2020.
[8] E. Del Barrio and C. Matrán, "Rates of convergence for partial mass problems," Probability Theory and Related Fields, vol. 155, no. 3, pp. 521-542, 2013.