Speeding up hierarchical reinforcement learning using state-independent temporal skills

Abstract

Hierarchical reinforcement learning (HRL) has the potential to expedite long-horizon decision-making by abstracting policies into multiple levels. Skills, defined as sequences of primitive actions, have yielded encouraging results in environments with challenging reward structures. While existing methods based on offline data have shown promise, the resulting lower-level policies can be unreliable due to limited demonstration coverage or distribution shift. To address this limitation, we propose a novel approach that autonomously identifies state-independent temporal skills (SITS) by extracting the most repetitive action sequences from a trained agent. These skills, acquired in a simpler source task, can then be transferred to a more complex target task to improve training efficiency, particularly during exploration. Our method is independent of other HRL techniques and can be used in conjunction with them. Experimental results demonstrate the efficacy of incorporating SITS when tackling complex reinforcement learning problems.
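The abstract does not give implementation details, but the core idea of extracting the most repetitive action sequences from a trained agent can be sketched as frequent n-gram mining over source-task trajectories. The sketch below is illustrative only: the function name `extract_sits` and the parameters `skill_len` and `top_k` are hypothetical, and the paper's actual skill-discovery procedure may differ.

```python
from collections import Counter
from typing import List, Tuple

def extract_sits(trajectories: List[List[int]],
                 skill_len: int = 4,
                 top_k: int = 8) -> List[Tuple[int, ...]]:
    """Return the top_k most frequent fixed-length action
    sub-sequences observed in source-task trajectories.

    trajectories: per-episode lists of discrete action IDs
    skill_len:    length of each candidate skill (assumed fixed here)
    top_k:        number of candidate skills to keep
    """
    counts = Counter()
    for actions in trajectories:
        # Slide a window over each episode and count action n-grams.
        for i in range(len(actions) - skill_len + 1):
            counts[tuple(actions[i:i + skill_len])] += 1
    return [seq for seq, _ in counts.most_common(top_k)]

if __name__ == "__main__":
    # Toy rollouts from a trained source-task agent; the repeated
    # pattern (0, 1, 1) surfaces as the most frequent candidate skill.
    demo = [[0, 1, 1, 2, 0, 1, 1, 2], [0, 1, 1, 2, 3]]
    print(extract_sits(demo, skill_len=3, top_k=2))
```

In the transfer step described by the abstract, each mined sequence would presumably be exposed to the target-task agent as a single temporally extended macro-action, letting a higher-level policy choose among skills rather than primitive actions during exploration.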
