Safe Reinforcement Learning for Vision-Based Robotic Manipulation in Human-Centered Environments
Abstract
Autonomous systems performing object manipulation in human-robot collaboration scenarios face fundamental challenges in balancing adaptability with safety constraints. We present an RL framework that addresses these challenges through safety-aware policy learning. Building upon OpenAI Safety Gym, we extend its capabilities by implementing a robotic arm model for object manipulation tasks. Our approach employs end-to-end policy learning, comparing a constrained Lagrangian variant of Proximal Policy Optimization (cPPO) against standard PPO and Soft Actor-Critic (SAC) baselines. To handle high-dimensional visual inputs, we develop a structured representation learning method that captures multiple skills, objects, and their interactions. The framework enables goal-conditioned manipulation across object configurations and demonstrates strong compositional generalization: an agent trained on simple scenarios with two cube-shaped objects successfully generalized to tasks with three distinct objects in more cluttered settings. Due to the computational cost of simulating high-mass objects, testing was limited to scenarios with up to three objects. Experimental results show that cPPO achieves superior safety performance, with an average episode cost of 15.26 compared to 18.03 for PPO and 19.48 for SAC. While cPPO’s task performance (average episode reward ∼30) is slightly lower than PPO’s (∼35), it significantly outperforms SAC (∼12). All algorithms converge by 200,000 environment steps, with cPPO achieving rapid safety compliance and steady improvement in task performance. These findings demonstrate the effectiveness of integrating safety constraints into RL for autonomous manipulation, advancing the practical deployment of collaborative robotic systems.
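The constrained Lagrangian variant of PPO referenced above is generally built on a reward objective penalized by a dual variable on expected episode cost. The sketch below illustrates that idea under simple assumptions (a fixed cost limit and a dual-ascent learning rate); the names `lagrangian_penalty`, `cost_limit`, and `lambda_lr` are hypothetical and are not taken from the paper's implementation.

```python
# Illustrative sketch (not the authors' code) of the Lagrangian relaxation used
# by constrained PPO methods: the policy maximizes reward while a dual variable
# lambda penalizes expected episode cost above a threshold d.

def lagrangian_penalty(avg_episode_cost: float,
                       lam: float,
                       cost_limit: float = 15.0,   # assumed cost threshold d
                       lambda_lr: float = 0.05      # assumed dual-ascent step size
                       ) -> tuple[float, float]:
    """Return the penalty added to the PPO loss and the updated dual variable.

    Penalized objective:  L(theta, lam) = J_reward(theta) - lam * (J_cost(theta) - d)
    Dual ascent:          lam <- max(0, lam + lr * (J_cost(theta) - d))
    """
    violation = avg_episode_cost - cost_limit
    penalty = lam * violation                          # subtracted from the reward objective
    new_lam = max(0.0, lam + lambda_lr * violation)    # projected gradient ascent on lambda
    return penalty, new_lam


# Example: a cost above the limit increases lambda, tightening the constraint.
penalty, lam = lagrangian_penalty(avg_episode_cost=18.0, lam=1.0)
print(penalty, lam)  # 3.0 1.15
```

In this formulation, episodes that exceed the cost limit raise lambda and shift the optimization toward safety, which is consistent with the rapid safety compliance reported for cPPO.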