(11g) Decision-Making Optimization of Hybrid Energy Management System for Curtailed Renewable Energy through Deep Reinforcement Learning
AIChE Annual Meeting
2022 Annual Meeting
Computing and Systems Technology Division
Data-Driven and Hybrid Modeling for Decision Making
Sunday, November 13, 2022 - 5:12pm to 5:29pm
The energy management system (EMS) consists of photovoltaic and wind power sources [5], a battery energy storage system (BESS), and an alkaline water electrolyzer (AWE). A reinforcement learning (RL) agent is trained with the Proximal Policy Optimization (PPO) algorithm to decide how much energy to distribute to the BESS or the AWE. In the RL architecture, the environment represents the EMS: an action changes the state (e.g., the state of charge), and a numerical reward is calculated from the resulting state. The agent receives this reward as the result of its action and updates the policy network in the direction that maximizes the reward. The trained agent achieves more than 93% of the EMS operating profit given by the mixed-integer linear programming (MILP) optimal solution, even without a prediction model for the curtailed energy. The policy was also evaluated on other years and achieved about 90% of the optimal performance, which indicates that it is applicable to other scenarios. Comparing against the optimal solution from mathematical programming (MP) allows the performance of RL to be assessed quantitatively, unlike previous papers that stopped at implementing RL. The trained policy also performed well under uncertainty in the curtailed energy and outperformed the solution obtained by stochastic optimization (SO). We ran Monte Carlo simulations of SO and RL and found that RL is better at rejecting parameter uncertainty. Finally, visualizing the mapping between the states and actions of the trained policy makes the correlations and logic of the EMS understandable, in contrast to the order in which a person would make the decisions.
These results show that RL can implement real-time, multi-period optimal planning in the EMS. We confirmed that RL does not require much training time or computing power, unlike MP, whose requirements depend on the length of the planning period and the size of the uncertainty. RL is therefore a promising decision-making model for hybrid EMS problems that involve uncertainty.
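The state, action, and reward loop described above, together with a Monte Carlo evaluation under uncertain curtailed energy, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the `ToyEMS` class, its capacities and prices, the hand-written `rule_policy` (a stand-in for a trained PPO policy, not PPO itself), and the Gaussian perturbation of the curtailed-energy profile.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyEMS:
    """Hypothetical toy EMS: curtailed energy is split between a BESS and an AWE."""
    capacity: float = 10.0       # assumed BESS capacity (MWh)
    discharge_rate: float = 2.0  # assumed BESS energy sold per step (MWh)
    awe_max: float = 5.0         # assumed AWE intake limit per step (MWh)
    power_price: float = 50.0    # assumed revenue per MWh discharged
    h2_value: float = 30.0       # assumed revenue per MWh sent to the AWE
    soc: float = 0.0             # state of charge (MWh), the observed state

    def step(self, curtailed, to_bess_fraction):
        """Apply one action and return (new state of charge, reward)."""
        # Sell stored energy first, limited by the discharge rate.
        sold = min(self.soc, self.discharge_rate)
        self.soc -= sold
        # Split the curtailed energy; anything above a limit is lost.
        to_bess = min(to_bess_fraction * curtailed, self.capacity - self.soc)
        to_awe = min((1.0 - to_bess_fraction) * curtailed, self.awe_max)
        self.soc += to_bess
        reward = self.power_price * sold + self.h2_value * to_awe
        return self.soc, reward


def rule_policy(soc, capacity):
    """Stand-in for a trained policy: charge less as the battery fills up."""
    return 1.0 - soc / capacity


def rollout(profile):
    """Total profit of rule_policy over one curtailed-energy profile."""
    env = ToyEMS()
    total = 0.0
    for curtailed in profile:
        action = rule_policy(env.soc, env.capacity)
        _, reward = env.step(curtailed, action)
        total += reward
    return total


def monte_carlo(base_profile, n_samples=100, noise=0.2, seed=0):
    """Profits over randomly perturbed copies of the base profile."""
    rng = random.Random(seed)
    profits = []
    for _ in range(n_samples):
        # Scale each step by a non-negative random factor (assumed noise model).
        sample = [c * max(0.0, 1.0 + rng.gauss(0.0, noise)) for c in base_profile]
        profits.append(rollout(sample))
    return profits
```

Dividing the mean of `monte_carlo(...)` by the profit of a reference solution for the same profiles would give a performance ratio of the kind quoted above. The reference itself (the MILP or SO optimum) is not computed in this sketch.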
[1] I. Dincer, Renewable energy and sustainable development: a crucial review, Renewable and Sustainable Energy Reviews 4 (2) (2000) 157–175.
[2] H. Saber, M. Moeini-Aghtaie, M. Ehsan, M. Fotuhi-Firuzabad, A scenario-based planning framework for energy storage systems with the main goal of mitigating wind curtailment issue, International Journal of Electrical Power & Energy Systems 104 (2019) 414–422.
[3] X. Dui, G. Zhu, L. Yao, Two-stage optimization of battery energy storage capacity to decrease wind power curtailment in grid-connected wind farms, IEEE Transactions on Power Systems 33 (3) (2018) 3296–3305.
[4] C. Li, H. Shi, Y. Cao, J. Wang, Y. Kuang, Y. Tan, J. Wei, Comprehensive review of renewable energy curtailment and avoidance: A specific example in China, Renewable and Sustainable Energy Reviews 41 (2015) 1067–1079.
[5] CAISO curtailed energy dataset, https://www.caiso.com/informed/Pages/ManagingOversupply.aspx.