SIT-Conversion: Transformer Spiking Neural Networks with Spiking-Softmax Function
Abstract
Integrating Spiking Neural Networks (SNNs) with the Transformer architecture holds promise for models that combine ultra-low energy consumption with Transformer-level performance. Current ANN-to-SNN conversion studies for Transformer-based models, however, focus mainly on the simple activation functions in MLPs and have not addressed the mismatch between the Softmax activation function in the self-attention mechanism and the computation rules of SNNs. As a result, existing conversion efforts have been unable to make the Transformer architecture directly applicable to SNNs. To address this challenge, we propose the Spiking-Softmax method, which integrates a Spiking Exponential Neuron (SI-exp) and a Spiking Collaboration Normalized Neuron (SI-norm) and accurately simulates the Softmax activation function within only 12 time steps. Building on this, we propose the Spike Integrated Transformer conversion (SIT-conversion) method, which converts the Transformer architecture into SNNs. The SNNs generated by SIT-conversion from Transformer models of various sizes achieve accuracy nearly identical to that of their ANN counterparts, realizing nearly lossless and ultra-low-latency ANN-to-SNN conversion. To our knowledge, this work is the first to simulate the Softmax activation function through spike firing and to fully convert the Transformer architecture into SNNs.
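To make the idea concrete, the sketch below shows one generic way a spiking circuit can approximate Softmax by rate coding over a small number of time steps (here T = 12, matching the abstract). This is an illustrative assumption, not the paper's method: the function name rate_coded_softmax and the simple soft-reset integrate-and-fire dynamics are placeholders, since the actual SI-exp and SI-norm neuron definitions are not given in this abstract.

```python
import numpy as np

def rate_coded_softmax(logits, T=12):
    """Rate-coded spiking approximation of softmax over T time steps.

    Illustrative sketch only: each entry drives an integrate-and-fire
    neuron at a rate proportional to exp(logit); spike counts are then
    normalized by the per-row total, mirroring exp(x) / sum(exp(x)).
    Simple soft-reset neurons stand in for the paper's SI-exp / SI-norm
    neurons, whose exact dynamics are not described in the abstract.
    """
    x = np.asarray(logits, dtype=float)
    x = x - x.max(axis=-1, keepdims=True)              # numerical stability
    drive = np.exp(x)
    drive = drive / drive.max(axis=-1, keepdims=True)  # at most one spike per step

    membrane = np.zeros_like(drive)
    counts = np.zeros_like(drive)
    for _ in range(T):
        membrane += drive                              # integrate constant input
        spikes = (membrane >= 1.0).astype(float)       # fire on threshold crossing
        membrane -= spikes                             # soft reset
        counts += spikes

    # Normalize spike counts per row, analogous to the Softmax denominator.
    return counts / counts.sum(axis=-1, keepdims=True)

if __name__ == "__main__":
    logits = np.array([[1.0, 2.0, 0.5, -1.0]])
    exact = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
    print("exact :", np.round(exact, 3))
    print("approx:", np.round(rate_coded_softmax(logits, T=12), 3))
```

With only 12 time steps the spike-count estimate is coarse (it quantizes each exponential into at most 12 spikes), which is why a conversion method needs carefully designed neurons to reach the near-lossless accuracy reported above.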