Reinforcement Learning for Optimal Replenishment in Stochastic Assembly Systems
Abstract
This study presents a reinforcement learning–based approach to optimizing replenishment policies under uncertainty, with the objective of minimizing total cost, comprising inventory holding, shortage, and ordering costs. The focus is on single-level assembly systems in which both component delivery lead times and finished-product demand are random. The problem is formulated as a Markov Decision Process (MDP) in which an agent determines the order quantity for each component while accounting for stochastic lead times and demand variability. A Deep Q-Network (DQN) algorithm is adapted to learn replenishment policies over a fixed planning horizon. To support learning, we develop a tailored simulation environment that captures multi-component interactions, random lead times, and variable demand, together with a modular and realistic cost structure. The environment provides dynamic state transitions, lead-time sampling, and flexible order-reception modeling, giving the agent a high-fidelity training ground. To further improve convergence and policy quality, we incorporate local search mechanisms and multiple action-space discretizations per component. Experimental results show that the proposed method significantly reduces stockouts and overall costs while improving the system's adaptability to uncertainty. These findings highlight the potential of deep reinforcement learning as a data-driven, dynamic approach to inventory management in complex and uncertain supply chain environments.
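To make the cost structure concrete, one plausible form of the objective is sketched below. The notation is our own illustrative choice (the abstract does not give the paper's symbols): over a horizon of T periods, the policy π trades off per-unit holding costs, shortage penalties, and fixed ordering charges.

```latex
% Illustrative notation (not from the paper): I_{i,t} is on-hand inventory of
% component i in period t, B_t is unmet finished-product demand, q_{i,t} is the
% order quantity, h_i the holding cost, b the shortage cost, k_i the fixed
% ordering cost, and T the planning horizon.
\min_{\pi} \; \mathbb{E}\!\left[ \sum_{t=1}^{T} \Big(
    \sum_{i} h_i\, I_{i,t} \;+\; b\, B_t \;+\; \sum_{i} k_i\, \mathbf{1}\{q_{i,t} > 0\}
\Big) \right]
```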
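A minimal sketch of how such a simulation environment might be organized is shown next, assuming a gym-style step interface, Poisson-distributed demand and lead times, and one unit of each component per finished product. All names and parameter values (AssemblyEnv, holding_cost, and so on) are illustrative assumptions, not the authors' implementation.

```python
# Sketch of a single-level assembly replenishment environment with
# stochastic lead times and demand; reward is the negative period cost.
import numpy as np

class AssemblyEnv:
    def __init__(self, n_components=3, horizon=52, seed=0):
        self.rng = np.random.default_rng(seed)
        self.n = n_components
        self.horizon = horizon
        self.holding_cost = np.full(self.n, 1.0)   # per unit, per period (assumed)
        self.order_cost = np.full(self.n, 10.0)    # fixed cost per order placed
        self.shortage_cost = 50.0                  # per unit of unmet demand
        self.reset()

    def reset(self):
        self.t = 0
        self.inventory = np.full(self.n, 20.0)     # on-hand stock per component
        self.pipeline = []                          # (arrival_period, component, qty)
        return self._state()

    def _state(self):
        # State: current period, on-hand stock, and in-transit quantity per component.
        in_transit = np.zeros(self.n)
        for _, i, qty in self.pipeline:
            in_transit[i] += qty
        return np.concatenate(([self.t], self.inventory, in_transit))

    def step(self, order_qty):
        cost = 0.0
        # 1. Place orders; each order draws a random lead time (lead-time sampling).
        for i, q in enumerate(order_qty):
            if q > 0:
                lead = 1 + self.rng.poisson(2)
                self.pipeline.append((self.t + lead, i, float(q)))
                cost += self.order_cost[i]
        # 2. Receive orders whose lead time has elapsed (flexible order reception).
        arrived = [o for o in self.pipeline if o[0] <= self.t]
        self.pipeline = [o for o in self.pipeline if o[0] > self.t]
        for _, i, q in arrived:
            self.inventory[i] += q
        # 3. Random finished-product demand; assembly consumes one unit of every
        #    component per finished unit, so output is capped by the scarcest one.
        demand = self.rng.poisson(5)
        feasible = int(min(demand, self.inventory.min()))
        self.inventory -= feasible
        cost += self.shortage_cost * (demand - feasible)
        cost += float(self.holding_cost @ self.inventory)
        self.t += 1
        return self._state(), -cost, self.t >= self.horizon
```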
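One way the per-component action-space discretization could be wired to a standard DQN, which outputs a single discrete action index, is to enumerate the Cartesian product of candidate order sizes; the grids below are hypothetical, not taken from the paper.

```python
# Map a flat DQN action index to per-component order quantities.
import itertools

grids = [[0, 10, 20], [0, 5, 15], [0, 25]]   # candidate order sizes per component
actions = list(itertools.product(*grids))    # 3 * 3 * 2 = 18 joint actions
order_qty = actions[7]                       # e.g. DQN picks index 7 -> (10, 5, 0)
```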