Energy-Aware Autonomous UAV Navigation via Deep Reinforcement Learning: DQN, PPO, and SAC with Battery-Constrained Reward
Abstract
Battery endurance limits commercial quadcopter UAVs to 15–25 minutes per charge. Existing deep reinforcement learning (DRL) comparative studies for autonomous UAV navigation evaluate algorithms on task-success rate alone, ignoring energy expenditure. This paper proposes an energy-aware multi-objective reward function with a per-step energy penalty (w_e = −0.20) and a battery-scaled goal bonus (+200·(1+0.5·b/100)), creating a 43% reward differential between energy-efficient and energy-wasteful arrivals. Three algorithms are implemented in pure NumPy and compared across five random seeds over 200,000 training steps: Deep Q-Network (DQN), Proximal Policy Optimisation with generalised advantage estimation (PPO with GAE), and Soft Actor-Critic with the reparameterisation trick and twin critics (SAC). SAC achieves Pareto-optimality: 82.2±2.7% success with 24.2±1.8% battery use; PPO: 71.7±3.1% / 29.2±1.8%; DQN: 57.8±2.6% / 36.1±2.2%; A*+PID: 43.5±5.2% / 48.9±4.7% (with full obstacle knowledge). ANOVA yields F = 93.96 (p < 0.001); all pairwise comparisons remain significant after Bonferroni correction, with Cohen's d ≥ 3.6. Ablation confirms that each reward component contributes independently. SAC maintains success above 68.7% under combined sensor noise and wind disturbance without retraining. All code is available in Appendix A.
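The reward terms stated above (per-step energy penalty w_e = −0.20 and battery-scaled goal bonus +200·(1+0.5·b/100)) can be sketched as a single reward function. This is a minimal illustration, not the paper's implementation; the function name, the `step_energy` normalisation argument, and its default value are assumptions made here for clarity.

```python
W_ENERGY = -0.20   # per-step energy penalty weight (from the abstract)
GOAL_BASE = 200.0  # base goal-arrival bonus (from the abstract)

def energy_aware_reward(reached_goal: bool, battery_pct: float,
                        step_energy: float = 1.0) -> float:
    """Sketch of the energy-aware multi-objective reward.

    battery_pct : remaining battery in [0, 100] (the abstract's b).
    step_energy : normalised energy drawn this step (hypothetical scaling).
    """
    # Energy penalty applied on every step, scaled by energy drawn.
    r = W_ENERGY * step_energy
    if reached_goal:
        # Battery-scaled goal bonus: +200 * (1 + 0.5 * b / 100),
        # i.e. +300 with a full battery, +200 with an empty one.
        r += GOAL_BASE * (1.0 + 0.5 * battery_pct / 100.0)
    return r
```

With a full battery the arrival bonus is 200·1.5 = 300, versus 200 with an empty battery, so efficient trajectories are rewarded both per step and at the goal.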