Transforming Opportunistic Routing: A Deep Reinforcement Learning Framework for Reliable and Energy-Efficient Communication in Mobile Cognitive Radio Sensor Networks
Abstract
The Mobile Reliable Opportunistic Routing (MROR) protocol improves data-forwarding reliability in Cognitive Radio Sensor Networks (CRSNs) through mobility-aware virtual contention groups and handover zoning. However, its heuristic decision logic is difficult to optimize under highly dynamic spectrum access and random node mobility. To address this limitation, we present DRL-MROR, a refined routing framework that incorporates deep reinforcement learning (DRL) to enable intelligent and adaptive forwarding decisions. In DRL-MROR, the secondary users (SUs) act as autonomous agents that observe local state information, including primary-user activity, link quality, residual energy, and neighbor-mobility patterns. Each agent learns a forwarding policy through a Deep Q-Network (DQN) optimized for long-term network utility in terms of throughput, delay, and energy efficiency. We formulate routing as a Markov Decision Process (MDP) and use experience replay with prioritized sampling to improve learning stability and convergence. The DQN used at each node is intentionally lightweight, requiring 5514 trainable parameters, about 21.5 kB of weight storage in 32-bit precision, and approximately 5.4k multiply-accumulate operations per inference, which supports practical deployment on edge-capable CRSN nodes. Extensive simulations show that DRL-MROR outperforms the original MROR protocol and representative AI-based routing baselines such as AIRoute under diverse operating conditions. The results indicate gains of up to 38% in throughput, 42% in goodput, a 29% reduction in energy consumed per packet, and an approximately 18% improvement in network lifetime, while maintaining high route stability and fairness. DRL-MROR also reduces control overhead by about 30% and average end-to-end delay by up to 32%, maintaining strong performance even under elevated PU activity and higher node mobility. 
These results show that augmenting opportunistic routing with lightweight DRL can substantially improve adaptability and efficiency in next-generation IoT-oriented CRSNs.
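The per-node footprint quoted above can be checked with simple arithmetic. The sketch below is illustrative only: it takes the parameter count and 32-bit precision from the abstract, assumes every trainable parameter is stored as one 4-byte value with no compression or metadata, and uses hypothetical helper names (`weight_storage_bytes`, `to_kib`) that do not come from the paper.

```python
# Figures taken from the abstract.
TRAINABLE_PARAMS = 5514
BYTES_PER_PARAM = 4  # 32-bit precision -> 4 bytes per parameter (assumed dense storage)


def weight_storage_bytes(n_params: int, bytes_per_param: int = BYTES_PER_PARAM) -> int:
    """Raw weight storage in bytes, ignoring any file-format overhead."""
    return n_params * bytes_per_param


def to_kib(n_bytes: int) -> float:
    """Convert bytes to binary kilobytes (1 KiB = 1024 bytes)."""
    return n_bytes / 1024


storage_bytes = weight_storage_bytes(TRAINABLE_PARAMS)
storage_kib = to_kib(storage_bytes)

print(f"{storage_bytes} bytes = {storage_kib:.1f} KiB")
```

Under these assumptions the total is 22056 bytes, which matches the quoted 21.5 kB when a kilobyte is read as 1024 bytes. With 1000-byte kilobytes the same total would round to about 22.1 kB.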