(373ai) Multi-Agent Reinforcement Learning and Graph Neural Networks for Inventory Management
2024 AIChE Annual Meeting
Computing and Systems Technology Division
10C: Interactive Session: Systems and Process Operations
Tuesday, October 29, 2024 - 3:30pm to 5:00pm
Multi-agent reinforcement learning (MARL) offers a promising way to address the limitations of traditional RL and linear programming (LP) methods. MARL does not require global online information and can operate as a distributed decision-making framework, allowing individual agents to act on local observations and interactions. This decentralized approach enhances adaptability and coordination in large-scale supply chains, making it a viable alternative for inventory control. The proposed framework follows the centralized training, decentralized execution (CTDE) paradigm: central state information is shared offline during training, while only local state information is required at execution time. This is shown in Figure 1.
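The sketch below illustrates the CTDE split in a minimal PyTorch form; it is an assumed, illustrative architecture rather than the authors' exact networks. Each agent's actor consumes only its local observation, while a centralized critic scores the joint state-action during offline training.

```python
# Minimal CTDE sketch (illustrative assumption, not the paper's exact design):
# decentralized actors act on local observations; a centralized critic sees
# the joint state-action, but is only used offline during training.
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Decentralized actor: maps a local observation to an action."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, local_obs: torch.Tensor) -> torch.Tensor:
        return self.net(local_obs)

class CentralCritic(nn.Module):
    """Centralized critic: scores the joint state-action (training only)."""
    def __init__(self, joint_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(joint_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, joint: torch.Tensor) -> torch.Tensor:
        return self.net(joint)

# Execution needs only local observations; the critic needs everything.
n_agents, obs_dim, act_dim = 3, 8, 2
actors = [Actor(obs_dim, act_dim) for _ in range(n_agents)]
obs = [torch.randn(obs_dim) for _ in range(n_agents)]
acts = [a(o) for a, o in zip(actors, obs)]            # decentralized execution
critic = CentralCritic(n_agents * (obs_dim + act_dim))
value = critic(torch.cat([torch.cat([o, u]) for o, u in zip(obs, acts)]))
```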
In this work, we develop a multi-agent RL framework with Graph Neural Networks (GNNs) for multi-echelon inventory management, leveraging the inherent graph structure of supply chains to learn hidden interdependencies. Unlike prior RL studies, we redefine the action space to parametrize a heuristic (s, S) inventory policy, enhancing adaptability, practicality, and explainability for real-world implementation. Our first framework uses the aggregated vector from the GNN to train the critic, guiding the learning process toward more effective inventory strategies. To address the computational complexity that grows with the number of agents, our second framework, illustrated in Figure 2, employs global mean pooling to aggregate the node vectors, reducing dimensionality and computational cost without compromising the critic's effectiveness. Both frameworks leverage the supply chain's structure to learn hidden interdependencies, enhancing communication and coordination between entities for improved decision-making in the multi-agent system.
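A minimal sketch of the two ideas in this paragraph, assuming PyTorch Geometric: the actor outputs (s, S) thresholds that a simple heuristic converts into order quantities, and a GNN encodes the supply-chain graph with global mean pooling compressing node embeddings into a fixed-size critic input. Layer sizes and names are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of (i) an (s, S) policy head and (ii) a pooled GNN critic encoder,
# assuming PyTorch Geometric; dimensions are illustrative.
import torch
from torch_geometric.nn import GCNConv, global_mean_pool

def ss_policy_order(inventory: float, s: float, S: float) -> float:
    """(s, S) heuristic: when stock falls below s, order up to level S."""
    return max(S - inventory, 0.0) if inventory < s else 0.0

class GNNCriticEncoder(torch.nn.Module):
    """Embeds supply-chain nodes, then mean-pools to one graph vector,
    keeping the critic's input size fixed as the number of agents grows."""
    def __init__(self, node_feat_dim: int, hidden: int = 32):
        super().__init__()
        self.conv1 = GCNConv(node_feat_dim, hidden)
        self.conv2 = GCNConv(hidden, hidden)

    def forward(self, x, edge_index, batch):
        h = self.conv1(x, edge_index).relu()
        h = self.conv2(h, edge_index)
        return global_mean_pool(h, batch)  # [num_graphs, hidden]

# Toy 3-echelon chain: supplier(0) -> warehouse(1) -> retailer(2)
x = torch.randn(3, 4)                           # node features (e.g., stock, backlog)
edge_index = torch.tensor([[0, 1],              # source nodes
                           [1, 2]])             # target nodes
batch = torch.zeros(3, dtype=torch.long)        # all nodes belong to one graph
graph_vec = GNNCriticEncoder(node_feat_dim=4)(x, edge_index, batch)
order = ss_policy_order(inventory=12.0, s=20.0, S=50.0)  # -> 38.0
```

Because the pooled vector's size is independent of the number of nodes, the critic does not need to be rebuilt as the supply chain grows, which is the dimensionality-reduction argument made above.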
The effectiveness of our collaborative approach is demonstrated by testing the trained policies against a series of disruptions, such as the bullwhip effect and fluctuations in costs. Our framework shifts computational cost from online to offline, enabling faster decision-making than the traditional optimization methods used in inventory control. As a result, the proposed methodology shows promising scalability with the number of agents for a decentralized, online decision-making framework while maintaining collaboration between entities. In summary, the contribution of this work is two-fold: the parametrization of a heuristic policy enables explainability and early industrial adoption of state-of-the-art methods, and the synergy between multi-agent RL and GNNs highlights the importance of leveraging the inherent graph structure of supply chains. This approach paves the way for more efficient, adaptable, and resilient supply chain operations.