Empowering Aerial Maneuver Games Through Model-Based Constrained Reinforcement Learning

Abstract

Achieving full autonomy in Within-Visual-Range air combat with a single, end-to-end learned policy is a formidable challenge: agents must navigate stochastic dynamics and sparse rewards to master the delicate trade-off between aggression and survival. We introduce a Model-Based Reinforcement Learning agent that combines the Dreamer framework with safety-aware objectives to tackle this challenge. To enhance learning stability and foresight in this demanding domain, we augment Dreamer's world model with an Information Noise-Contrastive Estimation (InfoNCE) loss for long-range dependencies, categorical predictors to robustly model outcomes, Dyna-style actor-critic updates to ground the policy, and a Lipschitz regularizer to constrain value error. Furthermore, our framework integrates a population-based self-play pipeline with curriculum initialization, enabling rapid strategic discovery without expert priors. We validated our approach in a high-fidelity 6-Degree-of-Freedom simulation, where our agent demonstrated superior zero-shot performance, significantly higher sample efficiency than model-free baselines, and rapid fine-tuning against novel opponents, highlighting a viable path toward deployable autonomous agents.
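To make the InfoNCE component concrete, the sketch below shows one common way such a contrastive loss is computed between world-model latents and future embeddings, with in-batch negatives. This is an illustrative assumption, not the authors' implementation: the function name, tensor shapes, and temperature value are hypothetical, and the paper's actual formulation may pair latents differently.

```python
# Minimal InfoNCE sketch (illustrative only, not the paper's code).
# Row i of `latents` and `future_embeds` form a positive pair; all other
# rows in the batch act as negatives.
import torch
import torch.nn.functional as F

def infonce_loss(latents: torch.Tensor,
                 future_embeds: torch.Tensor,
                 temperature: float = 0.1) -> torch.Tensor:
    # Normalize so the dot product is a cosine similarity.
    z = F.normalize(latents, dim=-1)
    f = F.normalize(future_embeds, dim=-1)
    logits = z @ f.t() / temperature                  # (batch, batch) similarities
    targets = torch.arange(z.shape[0], device=z.device)
    return F.cross_entropy(logits, targets)           # positives lie on the diagonal

# Example usage with random tensors standing in for world-model outputs.
if __name__ == "__main__":
    batch, dim = 32, 256
    loss = infonce_loss(torch.randn(batch, dim), torch.randn(batch, dim))
    print(float(loss))
```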
