Deep Reinforcement Learning for Optimum Order Execution: Mitigating Risk and Maximizing Returns

Khabbab Zakaria

Abstract

Optimal order execution is a well-established problem in finance: executing a trade (buy or sell) of a given volume within a specified time frame. The problem involves maximizing returns while minimizing risk, yet recent research predominantly addresses only one of these two objectives. In this paper, we introduce an approach to optimal order execution in the US market that uses Deep Reinforcement Learning (DRL) to address both objectives jointly. We compare our model against two widely used execution strategies: Volume Weighted Average Price (VWAP) and Time Weighted Average Price (TWAP). Our experiments show that the DRL-based approach outperforms both VWAP and TWAP in terms of return on investment and risk management. The model's ability to adapt dynamically to market conditions, even during periods of market stress, underscores its promise as a robust solution.
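For readers unfamiliar with the two baselines named in the abstract, the following is a minimal sketch of how TWAP and VWAP benchmark prices are conventionally computed, plus an even TWAP-style slicing of a parent order. It is not the paper's code; the function names and the sample numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def twap(prices: Sequence[float]) -> float:
    """Time-weighted average price: plain mean of prices sampled at equal intervals."""
    if not prices:
        raise ValueError("prices must be non-empty")
    return sum(prices) / len(prices)


def vwap(prices: Sequence[float], volumes: Sequence[float]) -> float:
    """Volume-weighted average price: each price weighted by the volume traded at it."""
    if len(prices) != len(volumes):
        raise ValueError("prices and volumes must have the same length")
    total_volume = sum(volumes)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(p * v for p, v in zip(prices, volumes)) / total_volume


def twap_schedule(total_shares: int, n_slices: int) -> List[int]:
    """Split a parent order into near-equal child orders, one per time slice."""
    if n_slices <= 0:
        raise ValueError("n_slices must be positive")
    base, remainder = divmod(total_shares, n_slices)
    # Earlier slices absorb the remainder so the slices sum to total_shares.
    return [base + (1 if i < remainder else 0) for i in range(n_slices)]


# Illustrative (made-up) interval prices and traded volumes.
prices = [10.0, 11.0, 12.0]
volumes = [100.0, 300.0, 100.0]
twap_price = twap(prices)
vwap_price = vwap(prices, volumes)
child_orders = twap_schedule(10, 3)
```

A DRL execution agent of the kind the abstract describes would typically be judged by comparing its average fill price against benchmarks like these over the same window; the exact evaluation protocol is in the full article, not in this sketch.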

Version published to 10.20944/preprints202511.1391.v1
Nov 19, 2025

Related articles

Learning Utility Models for Dynamic Inventory Control: A Reinforcement Learning Framework

This article has 1 author:
1. Milon
This article has no evaluations
Latest version Jan 23, 2026

Do Classical Methods Still Win? Revisiting Forecasting Strategies for Curtailment Mitigation in Brazil

This article has 5 authors:
1. Ricardo Accorsi Casonatto
2. Eugênia Cornils Monteiro da Silva
3. Sanderson César Macedo Barbalho
4. Marcelo Carneiro Gonçalves
5. Maria Gabriela Mendonça Peixoto
This article has no evaluations
Latest version Dec 10, 2025

AI-Enhanced Portfolio Management in China’s A-Share Market: A Dynamic Factor Integration Framework

This article has 1 author:
1. Ng Zhe Xin
This article has no evaluations
Latest version Jan 16, 2026
