M-Learning: A Computationally Efficient Heuristic for Reinforcement Learning with Delayed Rewards


Abstract

Current reinforcement learning methods demand extensive computation. Algorithms such as Deep Q-Network have achieved outstanding results and advanced the field, yet the need for thousands of parameters and training episodes remains a problem. This paper therefore presents a comparative analysis of the Q-Learning algorithm (the foundation on which Deep Q-Learning was built) and our proposed method, termed M-Learning. The two algorithms are compared on Markov decision processes with delayed rewards, used as a general testbench. We first describe the main difficulties of implementing Q-Learning, chiefly its many parameters. We then present the foundations of the proposed heuristic, its formulation, and the complete algorithm in detail. Finally, both algorithms are compared by training them in the Frozen Lake environment. The experimental results and an analysis of the best solutions found by each algorithm highlight the differences in the number of episodes required and their standard deviations. The code will be made available in a GitHub repository once the paper is published.
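For context, the sketch below shows the kind of tabular Q-Learning baseline the paper compares against, trained on the Frozen Lake environment where only reaching the goal yields a reward (a delayed-reward setting). This is a generic illustration, not the authors' implementation, and it does not reproduce M-Learning; the hyperparameters (alpha, gamma, epsilon, episode count) are illustrative assumptions.

import numpy as np
import gymnasium as gym

# Frozen Lake: reward is 1 only at the goal state, 0 elsewhere (delayed reward)
env = gym.make("FrozenLake-v1", is_slippery=True)
n_states, n_actions = env.observation_space.n, env.action_space.n
Q = np.zeros((n_states, n_actions))

alpha, gamma, epsilon = 0.1, 0.99, 0.1  # assumed values, not the paper's settings

for episode in range(10_000):
    state, _ = env.reset()
    done = False
    while not done:
        # epsilon-greedy action selection
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # standard one-step Q-Learning update
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
        state = next_state

Even this minimal baseline exposes the sensitivity to its several hyperparameters, which is the implementation difficulty the abstract points to as motivation for M-Learning.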
