Adaptive Reinforcement Learning with Temporal Prediction for Routing in Congestion-Aware Dynamic IoT Network
Abstract
Despite significant progress in wireless networks, especially Mobile Ad Hoc Networks (MANETs) and Internet of Things (IoT) networks, current approaches still face challenges in providing reliable, low-latency, and energy-efficient communication in highly dynamic environments. Existing routing protocols, such as Ad hoc On-Demand Distance Vector (AODV), suffer from frequent route interruptions, high end-to-end delay, and high energy consumption, making them unsuitable for large-scale and real-time IoT applications. Reinforcement learning (RL) and Q-learning based protocols remain largely reactive, responding only to present or historical conditions without predictive foresight, which leads to higher packet loss and delay in highly dynamic environments. Similarly, LSTM and other ML-based models excel at temporal prediction but are rarely integrated into real-time routing frameworks. Moreover, most existing protocols focus on optimizing a single performance metric, such as delay or reliability, while neglecting multi-metric optimization that simultaneously balances latency, Packet Delivery Ratio, and energy efficiency. To address these shortcomings, this research introduces a new LSTM-based Q-learning routing protocol (LSTM-Q) that combines reinforcement learning with long short-term memory (LSTM) networks to enable adaptive and predictive route selection. The LSTM subsystem allows the protocol to capture temporal dependencies in network behavior so that link stability and congestion patterns can be predicted more accurately, while Q-learning facilitates efficient decision-making through reward-based adaptation. Extensive simulations across varying node densities demonstrate that the designed protocol outperforms AODV and baseline Q-learning methods.
In particular, LSTM-Q achieves a Packet Delivery Ratio (PDR) of 98.32%, reduces end-to-end delay to 14.42 ms, increases throughput to 19.15 Kbps, and decreases average energy consumption to 0.57 J, representing improvements of more than 50% in reliability and 80% in energy efficiency over AODV. These results substantiate that LSTM-Q offers an efficient, scalable, and energy-aware routing methodology, making it well suited for dynamic wireless networks such as IoT, vehicular networks, and mission-critical communication networks.
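The core idea described above, using a temporal predictor to score link stability and folding that score into a multi-metric Q-learning reward, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function and variable names (`predict_stability`, `reward`, `update_q`), the reward weights, and the simple moving-average predictor standing in for a trained LSTM are all assumptions for demonstration.

```python
# Illustrative sketch of an LSTM-Q-style decision step (names and weights
# are hypothetical, not taken from the paper).

ALPHA, GAMMA = 0.5, 0.9   # Q-learning rate and discount factor (assumed values)

def predict_stability(history):
    """Stand-in for the LSTM: score a link by its recent delivery history.
    A trained LSTM would instead map a window of link observations to a score."""
    return sum(history) / len(history) if history else 0.0

def reward(delay_ms, delivered, stability, energy_j):
    """Multi-metric reward balancing reliability, predicted stability,
    latency, and energy (weights are illustrative)."""
    return 2.0 * delivered + stability - 0.05 * delay_ms - energy_j

def update_q(q, node, nxt, r, next_best):
    """Standard Q-learning update for the (node, next-hop) state-action pair."""
    old = q.get((node, nxt), 0.0)
    q[(node, nxt)] = old + ALPHA * (r + GAMMA * next_best - old)

# Example: node "A" evaluates candidate next hops "B" and "C".
q = {}
link_history = {"B": [1, 1, 0, 1], "C": [1, 0, 0, 0]}      # recent deliveries
observations = {"B": (12.0, 1, 0.4), "C": (30.0, 0, 0.9)}  # (delay ms, ok, J)

for nxt, (delay, ok, energy) in observations.items():
    r = reward(delay, ok, predict_stability(link_history[nxt]), energy)
    update_q(q, "A", nxt, r, next_best=0.0)

best = max(q, key=q.get)   # greedy next-hop selection
```

Here the stable, low-delay, low-energy link "B" receives the higher Q-value and is selected; in the full protocol, the LSTM's forward-looking stability estimate is what lets the route change before a link actually degrades.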