Actor-Critic Networks with Analogue Memristors Mimicking Reward-Based Learning
Abstract
Advances in memristive devices have given rise to a new generation of specialized hardware for bio-inspired computing. However, most of these implementations draw only partial inspiration from the architecture and functionality of the mammalian brain. Moreover, the use of memristive hardware is typically restricted to specific elements of the learning algorithm, leaving the computationally expensive operations to be executed in software. Here, we demonstrate reinforcement learning through an actor-critic temporal difference (TD) algorithm implemented on analogue memristors, mirroring the principles of reward-based learning in a neural network architecture similar to the one found in biology. The memristors serve as multi-purpose elements within the learning algorithm: they act as synaptic weights that are trained online, they compute the weight updates associated with the TD-error directly in hardware, and they determine the actions used to navigate the environment. Thanks to these features, weight training takes place entirely in-memory, eliminating data movement. We test our framework on two navigation tasks, the T-maze and the Morris water maze, using analogue memristors based on the valence change memory (VCM) effect. Our approach represents a first step towards fully in-memory, online neuromorphic computing engines based on bio-inspired learning schemes.
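For readers unfamiliar with the learning rule, the sketch below is a minimal tabular actor-critic TD loop in software. It is an illustrative assumption, not the authors' method: the paper realizes the weights and TD-driven updates directly on analogue memristor arrays, which this simulation does not model. All names, sizes, and the toy corridor environment (a stand-in for the T-maze) are hypothetical.

```python
import numpy as np

# Minimal software sketch of actor-critic TD learning (not the paper's
# hardware implementation; all parameters below are illustrative).

rng = np.random.default_rng(0)

n_states, n_actions = 8, 2           # toy 1-D corridor standing in for a maze
gamma, alpha_v, alpha_p = 0.9, 0.1, 0.1

V = np.zeros(n_states)               # critic: state values (synaptic weights)
P = np.zeros((n_states, n_actions))  # actor: action preferences

def select_action(s):
    """Softmax policy over the actor's preferences (action selection)."""
    z = np.exp(P[s] - P[s].max())
    return rng.choice(n_actions, p=z / z.sum())

def td_step(s, a, r, s_next, done):
    """One TD update: the TD-error drives both critic and actor weights,
    analogous to the in-memory weight updates on the memristor arrays."""
    target = r if done else r + gamma * V[s_next]
    delta = target - V[s]            # TD-error
    V[s] += alpha_v * delta          # critic update
    P[s, a] += alpha_p * delta       # actor update (reward-based learning)
    return delta

# Toy episode loop: reward is delivered at the right end of the corridor.
for episode in range(200):
    s, done = 0, False
    while not done:
        a = select_action(s)
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        done = s_next == n_states - 1
        td_step(s, a, 1.0 if done else 0.0, s_next, done)
        s = s_next
```

The key property mirrored from the abstract is that a single scalar TD-error updates both the critic and the actor weights, which is what allows the hardware version to compute all weight changes in-memory without moving data to a host processor.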