Comparative Analysis of Reinforcement Learning Approaches for Dynamic Pricing of Perishable Goods


Abstract

Dynamic pricing of perishable products is a challenging optimization problem involving limited shelf life, stochastic demand, and inventory capacity constraints. Fixed or rule-based pricing policies fail to adapt to market movements and therefore leave revenue unrealized. In this research, we study the use of reinforcement learning (RL) techniques for learning adaptive pricing policies that maximize profitability and inventory utilization. We train and compare four leading RL methods: Deep Q-Networks (DQN), Double DQN (DDQN), Proximal Policy Optimization (PPO), and Quantile Regression DQN (QR-DQN), in a simulated retail setting with price- and age-sensitive demand. We benchmark the RL agents against fixed-price policies on revenue, inventory loss, and pricing behavior. Our findings show that PPO attains the highest revenue with minimal waste, outperforming both the baselines and the other learning-based methods.
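The simulated retail setting described above could be sketched as a simple episodic environment. The following is a minimal, illustrative implementation, not the article's actual simulator: the linear price sensitivity, multiplicative age decay, Gaussian demand noise, and all parameter values are assumptions chosen for clarity.

```python
import random

class PerishablePricingEnv:
    """Illustrative perishable-goods pricing environment (all parameters hypothetical).

    One batch of stock with a fixed shelf life; expected demand falls
    linearly in price and decays multiplicatively as the stock ages.
    """

    def __init__(self, shelf_life=10, initial_stock=100, base_demand=20.0,
                 price_sensitivity=2.0, age_sensitivity=0.05, seed=0):
        self.shelf_life = shelf_life
        self.initial_stock = initial_stock
        self.base_demand = base_demand
        self.price_sensitivity = price_sensitivity
        self.age_sensitivity = age_sensitivity
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        self.stock = self.initial_stock
        self.age = 0
        return (self.stock, self.age)

    def step(self, price):
        # Expected demand: linear in price, decaying with product age.
        mean = max(0.0, self.base_demand - self.price_sensitivity * price)
        mean *= max(0.0, 1.0 - self.age_sensitivity * self.age)
        demand = max(0, round(self.rng.gauss(mean, 2.0)))
        sold = min(self.stock, demand)
        self.stock -= sold
        self.age += 1
        done = self.age >= self.shelf_life or self.stock == 0
        waste = self.stock if done else 0  # unsold units at expiry are discarded
        return (self.stock, self.age), sold * price, done, waste


# Rolling out a fixed-price baseline of the kind the RL agents are compared to:
env = PerishablePricingEnv()
state, total_revenue, done = env.reset(), 0.0, False
while not done:
    state, reward, done, waste = env.step(price=5.0)
    total_revenue += reward
```

An RL agent replaces the constant `price=5.0` with an action chosen from the state `(stock, age)`, trading off higher margins early in the shelf life against clearing inventory before it expires.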
