Speeding up hierarchical reinforcement learning using state-independent temporal skills

Abstract

Hierarchical reinforcement learning (HRL) has the potential to expedite long-horizon decision-making by abstracting policies into multiple levels. Skills, defined as sequences of primitive actions, have yielded encouraging results in environments with challenging reward structures. While existing methods based on offline data have shown promise, the resulting lower-level policies can be unreliable due to limited demonstration coverage or distribution shift. To address this limitation, we propose a novel approach that autonomously identifies state-independent temporal skills (SITS) by extracting the most repetitive action sequences from a trained agent. These skills, acquired in a simpler source task, can then be transferred to a more complex target task to improve training efficiency, particularly during exploration. Our method is independent of other HRL techniques and can be used in conjunction with them. Experimental results demonstrate the efficacy of incorporating SITS when tackling complex reinforcement learning problems.
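The abstract does not give implementation details, but the core idea of extracting the most repetitive action sequences from a trained agent can be sketched as frequent n-gram mining over source-task trajectories. The sketch below is illustrative only: the function name `extract_sits` and the parameters `skill_len` and `top_k` are hypothetical, and the paper's actual skill-discovery procedure may differ.

```python
from collections import Counter
from typing import List, Tuple

def extract_sits(trajectories: List[List[int]],
                 skill_len: int = 4,
                 top_k: int = 8) -> List[Tuple[int, ...]]:
    """Return the top_k most frequent fixed-length action
    sub-sequences observed in source-task trajectories.

    trajectories: per-episode lists of discrete action IDs
    skill_len:    length of each candidate skill (assumed fixed here)
    top_k:        number of candidate skills to keep
    """
    counts = Counter()
    for actions in trajectories:
        # Slide a window over each episode and count action n-grams.
        for i in range(len(actions) - skill_len + 1):
            counts[tuple(actions[i:i + skill_len])] += 1
    return [seq for seq, _ in counts.most_common(top_k)]

if __name__ == "__main__":
    # Toy rollouts from a trained source-task agent; the repeated
    # pattern (0, 1, 1) surfaces as the most frequent candidate skill.
    demo = [[0, 1, 1, 2, 0, 1, 1, 2], [0, 1, 1, 2, 3]]
    print(extract_sits(demo, skill_len=3, top_k=2))
```

In the transfer step described by the abstract, each mined sequence would presumably be exposed to the target-task agent as a single temporally extended macro-action, letting a higher-level policy choose among skills rather than primitive actions during exploration.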
