BeamCraft: Deep Reinforcement Learning-Driven Multi-Objective Beamforming for ISAC


Abstract

Integrated sensing and communication (ISAC) is central to the vision of 5G-Advanced and 6G systems. Dynamic resource allocation remains one of the most challenging topics in ISAC, yet existing optimization methods struggle with fast-varying environments and the inherent non-convexity of ISAC problems. This paper presents a deep reinforcement learning (DRL) framework that enables robust, real-time adaptive beamforming by jointly optimizing communication and sensing performance in dynamic environments. The proposed approach employs a Twin Delayed Deep Deterministic Policy Gradient with Prioritized Experience Replay (TD3-PER) algorithm to efficiently learn the optimal resource allocation strategy under continuous state and action spaces. A multi-objective reward formulation allows flexible adjustment of the ISAC performance weights, enabling the system to prioritize either communication or sensing performance depending on operational requirements. The training process consists of an offline phase, where the agent learns from extensive interactions with a simulated environment until convergence, and an online phase, where the trained model performs real-time inference and adapts to time-varying channel and target conditions. Simulation results demonstrate that the proposed DRL-based framework achieves fast online execution (millisecond-level), strong robustness, and superior adaptability compared to conventional deterministic and other learning-based methods, making it a promising candidate for practical ISAC deployment in next-generation wireless networks.
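The weighted multi-objective reward the abstract describes can be sketched in a few lines. This is an illustrative assumption, not the paper's formulation: the function name `isac_reward`, the inputs `comm_rate` and `sensing_metric`, and the weight `w_comm` are all hypothetical placeholders for whatever communication metric (e.g. sum rate) and sensing metric (e.g. beampattern gain or CRLB-derived quantity) the paper actually uses.

```python
# Hypothetical sketch of a weighted ISAC reward: blend a communication
# term and a sensing term with a single tunable weight, as the abstract's
# "flexible adjustment of the ISAC performance weights" suggests.

def isac_reward(comm_rate: float, sensing_metric: float, w_comm: float = 0.5) -> float:
    """Return a convex combination of communication and sensing performance.

    w_comm = 1.0 rewards communication only; w_comm = 0.0 rewards sensing only.
    Both inputs are assumed to be normalized to comparable scales.
    """
    if not 0.0 <= w_comm <= 1.0:
        raise ValueError("w_comm must lie in [0, 1]")
    return w_comm * comm_rate + (1.0 - w_comm) * sensing_metric

# Shifting the weight changes which objective the agent is trained to favor:
r_comm_heavy = isac_reward(5.0, 2.0, w_comm=0.9)   # close to the comm term
r_sense_heavy = isac_reward(5.0, 2.0, w_comm=0.1)  # close to the sensing term
```

In a TD3-PER training loop this scalar would serve as the per-step reward, so re-weighting the objectives only changes the reward signal, not the agent architecture.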
