Characterization of a Fixed Reinforcement Learning Policy for Aerial Robot with Suspended Payload under Variable Flight Conditions

Ali Tahir Karasahin
Ziniu Wu
Basaran Bahadir Kocer

Abstract

Flights with suspended payloads are particularly challenging because the coupled vehicle-payload dynamics lead to instability and increased sensitivity to disturbances. Although reinforcement learning (RL) has achieved strong control performance, the generalization and robustness of a single policy remain significant areas of investigation. In this study, we characterized the performance and robustness of a single RL policy for an aerial robot with a suspended payload across different trajectory profiles, including a smooth, feasible lemniscate curve and a sharp-turning, infeasible pentagram, under varying velocity references (0.5 m/s and 1.0 m/s) and crosswind disturbances (1.0 m/s). We trained the policy using Proximal Policy Optimization (PPO) with collective thrust and body-rate (CTBR) control in a high-fidelity physics simulator based on the SimpleFlight framework. Real-world experiments on the Crazyflie 2.1 platform show that the single RL policy generalizes to different trajectory profiles and velocity references and maintains stability under a crosswind disturbance of up to 1.0 m/s. This is a substantial challenge for such a small-class platform and the even smaller payload beneath it, because the aerodynamic forces are significant relative to the available control authority and system mass. The policy was systematically evaluated using the mean Euclidean distance (MED) error, cable length transitions, and swing angle distributions. Although it maintained robust control performance, the experimental results indicated performance degradation at higher velocities owing to increased dynamic challenges such as nonlinear aerodynamic drag and actuator saturation. This study provides a detailed performance characterization that highlights both the generalization capability of a single payload-aware RL policy in real-world applications and the limitations arising from the hybrid dynamics of the system.
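
The evaluation quantities named in the abstract can be illustrated with a small Python sketch. This is not the authors' code. It assumes a lemniscate of Gerono as the figure-eight reference, a z-up coordinate frame, and positions sampled at matching time steps. All function names and parameters (`lemniscate_reference`, `scale`, `period`, and so on) are hypothetical and chosen only for illustration.

```python
import math


def lemniscate_reference(t, scale=1.0, period=10.0):
    """Reference position on a figure-eight at time t.

    Assumption: a lemniscate of Gerono in the horizontal plane; the
    abstract does not say which lemniscate parameterisation was used.
    """
    phase = 2.0 * math.pi * t / period
    x = scale * math.sin(phase)
    y = scale * math.sin(phase) * math.cos(phase)
    return (x, y, 0.0)


def mean_euclidean_distance(reference, actual):
    """Mean Euclidean distance (MED) between paired 3-D positions."""
    if not reference or len(reference) != len(actual):
        raise ValueError("trajectories must be non-empty and equally long")
    total = sum(math.dist(r, a) for r, a in zip(reference, actual))
    return total / len(reference)


def swing_angle_deg(vehicle_pos, payload_pos):
    """Angle in degrees between the cable and the downward vertical.

    Assumption: z points up, so a payload hanging straight below the
    vehicle gives an angle of zero.
    """
    dx, dy, dz = (p - v for p, v in zip(payload_pos, vehicle_pos))
    return math.degrees(math.atan2(math.hypot(dx, dy), -dz))


if __name__ == "__main__":
    # Synthetic data: the "flown" path is the reference shifted by a
    # constant offset, so the MED equals the length of that offset.
    ref = [lemniscate_reference(0.1 * k) for k in range(100)]
    flown = [(x + 0.03, y + 0.04, z) for x, y, z in ref]
    print(f"MED: {mean_euclidean_distance(ref, flown):.3f} m")
    print(f"swing: {swing_angle_deg((0, 0, 1.0), (0.1, 0, 0.8)):.1f} deg")
```

Collecting `swing_angle_deg` over a whole flight and plotting a histogram of the values would give a swing angle distribution of the kind the abstract refers to.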

Version published to 10.21203/rs.3.rs-8603104/v1 on Research Square
Mar 4, 2026

Related articles

Learning-based Path Planning Techniques for Autonomous Unmanned Aerial Vehicles

This article has 4 authors:
1. Ahmad Bani Younes
2. Khaled Hatamleh
3. Ahmad M. Al-Shorman
4. Hasan Khalil Al-Asa'd
This article has no evaluations. Latest version: Mar 9, 2026

Enhanced geometry control Powered by AI for UAVS with a robotic arm for compensating for disturbances

This article has 4 authors:
1. Khaled Oqda
2. Eman M. El-Gendy
3. Hanaa Salem Marie
4. Mohamed Akalla
This article has no evaluations. Latest version: Mar 24, 2026

Robust Quadrupedal Locomotion on Complex Terrains via Adaptive Entropy Learning

This article has 4 authors:
1. Jiale Chen
2. Lingyun Kong
3. ZhenYao Zhang
4. Zhipeng Xue
This article has no evaluations. Latest version: Apr 14, 2026
