Efficient and reproducible pipelines for spike sorting large-scale electrophysiology data

Curation statements for this article:
  • Curated by eLife

    eLife Assessment

    This study presents a valuable and well-documented computational pipeline for the scalable analysis and spike sorting of large extracellular electrophysiology datasets, with particular relevance for high-density recordings such as Neuropixels. The authors demonstrate the pipeline's utility for benchmarking spike sorter performance and evaluating the effects of data compression, supported by thorough testing, clear figures, and openly available code. The workflow is reproducible, portable, and practical, providing concrete guidance on computational cost and runtime. Overall, the evidence supporting the pipeline's performance and output quality is compelling, and this work will be of broad interest to the systems neuroscience community.


Abstract

The scale of in vivo electrophysiology has expanded in recent years, with simultaneous recordings across thousands of electrodes now becoming routine. These advances have enabled a wide range of discoveries, but they also impose substantial computational demands. Spike sorting, the procedure that extracts spikes from extracellular voltage measurements, remains a major bottleneck: a dataset collected in a few hours can take days to spike sort on a single machine, and the field lacks rigorous validation of the many spike sorting algorithms and preprocessing steps that are in use. Advancing the speed and accuracy of spike sorting is essential to fully realize the potential of large-scale electrophysiology. Here, we present an end-to-end spike sorting pipeline that leverages parallelization to scale to large datasets. The same workflow can run reproducibly on individual workstations, high-performance computing clusters, or cloud environments, with computing resources tailored to each processing step to reduce costs and execution times. In addition, we introduce a benchmarking pipeline, also optimized for parallel processing, that enables systematic comparison of multiple sorting pipelines. Using this framework, we show that <monospace>Kilosort4</monospace>, a widely used spike sorting algorithm, outperforms <monospace>Kilosort2.5</monospace> (Pachitariu et al. 2024). We also show that 7× lossy compression, which substantially reduces the cost of data storage, has minimal impact on spike sorting performance. Together, these pipelines address the urgent need for scalable and transparent spike sorting of electrophysiology data, preparing the field for the coming flood of multi-thousand-channel experiments.

Article activity feed

  2. Reviewer #1 (Public review):

    Summary:

    Extracellular electrophysiology datasets are growing in both number and size, and recordings with thousands of sites per animal are now commonplace. Analyzing these datasets to extract the activity of single neurons (spike sorting) is challenging: the signal-to-noise ratio is low, the analysis is computationally expensive, and small changes in analysis parameters and code can alter the output. The authors address the problem of data volume by packaging the well-characterized SpikeInterface pipeline in a framework that can distribute individual sorting jobs across many workers in a compute cluster or cloud environment. Reproducibility is ensured by running containerized versions of the processing components.

    The authors apply the pipeline in two important examples. The first is a thorough study comparing the performance of two widely used spike-sorting algorithms (Kilosort 2.5 and Kilosort 4). They use hybrid datasets created by injecting measured spike waveforms (templates) into existing recordings, adjusting those waveforms according to the measured drift in the recording. These hybrid ground truth datasets preserve the complex noise and background of the original recording. Similar to the original Kilosort 4 paper, which uses a different method for creating ground truth datasets that include drift, the authors find Kilosort 4 significantly outperforms Kilosort 2.5. The second example measures the impact of compression of raw data on spike sorting with Kilosort 4, showing that accuracy, precision, and recall of the ground truth units are not significantly impacted even by lossy compression. As important as the individual results, these studies provide good models for measuring the impact of particular processing steps on the output of spike sorting.
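    The hybrid ground-truth approach described above can be illustrated with a toy sketch. This is a simplified stand-in using synthetic data, not the authors' actual implementation: a known template is added to an existing noise background at known times, with a one-channel shift standing in for drift, so the injected spike times can later be used to score a sorter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "recording": Gaussian noise, 4 channels x 1 s at 30 kHz.
fs = 30_000
n_channels, n_samples = 4, fs
recording = rng.normal(0.0, 10.0, size=(n_channels, n_samples))

# Toy spike template: a biphasic waveform, strongest on channel 1.
t = np.arange(60)
waveform = (-80.0 * np.exp(-((t - 20) ** 2) / 30.0)
            + 25.0 * np.exp(-((t - 35) ** 2) / 60.0))
template = np.zeros((n_channels, t.size))
template[1] = waveform        # peak channel
template[0] = 0.4 * waveform  # attenuated on neighboring channels
template[2] = 0.4 * waveform

# Known ("ground-truth") spike times to inject.
spike_times = np.sort(rng.integers(0, n_samples - t.size, size=50))

hybrid = recording.copy()
for st in spike_times:
    # Crude stand-in for drift: shift the template down one channel
    # for spikes in the second half of the recording.
    shift = 1 if st > n_samples // 2 else 0
    hybrid[:, st:st + t.size] += np.roll(template, shift, axis=0)

# `hybrid` keeps the original noise/background; `spike_times` is the
# ground truth against which a sorter's output can be scored.
```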

    Strengths:

    The pipeline uses the Nextflow framework, which makes it adaptable to different job schedulers and environments. The high-level documentation is useful, and the GitHub code is well organized. The two example studies are thorough and well-designed, and address important questions in the analysis of extracellular electrophysiology data.

    Weaknesses:

    The pipeline is very complete, but also complex. Optimal workflows (the best artifact removal, or the best curation for data from a particular brain area or species) will vary by experiment. A discussion of the pipeline's adaptability in the "Limitations" section would therefore be helpful for readers.

  3. Reviewer #2 (Public review):

    Summary:

    This work presents a reproducible, scalable workflow for spike sorting that leverages parallelization to handle large neural recording datasets. The authors introduce both a processing pipeline and a benchmarking framework that can run across different computing environments (workstations, HPC clusters, cloud). Key findings include demonstrating that Kilosort4 outperforms Kilosort2.5 and that 7× lossy compression has minimal impact on spike sorting performance while substantially reducing storage costs.
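    As a concrete illustration of what benchmarking against ground truth means here, a minimal sketch of spike-time matching follows. The function name and the 0.4 ms tolerance are illustrative assumptions, not the pipeline's actual comparison code: sorted spike times are matched to injected ground-truth times within a small tolerance, and accuracy, precision, and recall are computed from the resulting true/false positives.

```python
import numpy as np

def match_metrics(gt_times, sorted_times, tol=0.4e-3):
    """Greedy two-pointer matching of sorted spike times (seconds) to
    ground-truth spike times within +/- tol; returns (accuracy,
    precision, recall)."""
    gt = np.sort(np.asarray(gt_times, dtype=float))
    st = np.sort(np.asarray(sorted_times, dtype=float))
    i = j = tp = 0
    while i < gt.size and j < st.size:
        if st[j] < gt[i] - tol:
            j += 1                    # spurious spike: false positive
        elif st[j] > gt[i] + tol:
            i += 1                    # missed spike: false negative
        else:
            tp += 1                   # matched within tolerance
            i += 1
            j += 1
    fp = st.size - tp
    fn = gt.size - tp
    precision = tp / st.size if st.size else 0.0
    recall = tp / gt.size if gt.size else 0.0
    accuracy = tp / (tp + fp + fn) if (tp + fp + fn) else 1.0
    return accuracy, precision, recall

# 4 injected spikes; the sorter found 2 of them plus 1 spurious event.
gt = [0.010, 0.020, 0.030, 0.040]
found = [0.0101, 0.0202, 0.050]
print(match_metrics(gt, found))  # (0.4, 0.666..., 0.5)
```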

    Strengths:

    (1) Extremely high-quality figures with clear captions that effectively communicate complex workflow information.

    (2) Very detailed, well-written methods section providing thorough documentation.

    (3) Strong focus on reproducibility, scalability, modularity, and portability using established technologies (Nextflow, SpikeInterface, Code Ocean).

    (4) Pipeline publicly available on GitHub with documentation.

    (5) Clear cost analysis showing ~$5/hour for AWS processing, with a transparent breakdown.

    (6) Good overview of previous spike sorting benchmarking attempts in the introduction.

    (7) Practical value for the community by lowering barriers to processing large datasets.

    Weaknesses:

    No significant weaknesses were identified, although it is noted that the limitations section of the discussion could be expanded.

  4. Reviewer #3 (Public review):

    Summary:

    The authors provide a highly valuable and thoroughly documented pipeline to accelerate the processing and spike sorting of high-density electrophysiology data, particularly from Neuropixels probes. The scale of data collection is increasing across the field, and processing times and data storage are growing concerns. This pipeline provides parallelization and benchmarking of performance after data compression that helps address these concerns. The authors also use their pipeline to benchmark different spike sorting algorithms, providing useful evidence that Kilosort4 performs the best out of the tested options. This work, and the ability to implement this pipeline with minimal effort to standardize and speed up data processing across the field, will be of great interest to many researchers in systems neuroscience.

    Strengths:

    The paper is very well written and clear in most places. The accompanying GitHub and ReadTheDocs are well organized and thorough. The authors provide many benchmarking metrics to support their claims, and it is clear that the pipeline has been very thoroughly tested and optimized by users at the Allen Institute for Neural Dynamics. The pipeline incorporates existing software and platforms that have also been thoroughly tested (such as SpikeInterface), so the authors are not reinventing the wheel, but rather putting together the best of many worlds. This is a great contribution to the field, and it is clear that the authors have put a lot of thought into making the pipeline as accessible as possible.

    Weaknesses:

    There are no major weaknesses. I have only a handful of very minor questions and suggestions that could clarify/generalize aspects of the pipeline or make the text more understandable to non-specialists.

    (1) Could the authors please expand on the statement on line 274, that processing their test dataset serially "on a single GPU-capable cloud workstation... would take approximately 75 hours and cost over 90 USD." How were these values calculated? I was a bit surprised that this is a >4-fold slow-down relative to their pipeline but increases the cost by only ~1.35×, if I understood correctly. More context on why this is, and perhaps on what a g4dn.4xlarge instance is compared to the other instances, might help readers who are less familiar with AWS and cloud computing.

    (2) One of the most commonly used preprocessing pipelines for Neuropixels data is the CatGT/ecephys pipeline from the developers of SpikeGLX at Janelia. It may be worth commenting briefly, either in the preprocessing section or in the discussion, on how the preprocessing steps available in this pipeline compare to those available in CatGT. For example, is "destriping" similar to the "-gfix" option in CatGT for removing high-amplitude artifacts?

    (3) Why are there duplicate units (line 194), and how often is this an issue? I understand that this is likely more of a spike sorter issue than an issue with this pipeline, but 1-2 sentences elaborating why might be helpful for readers.

    (4) It seems from the parameter files on GitHub that the cluster curation parameters are customizable - correct? If so, it may be worth saying so explicitly in the curation section of the text, as the presented recipe will not always be appropriate. A presence ratio of >0.8 could be particularly problematic for some recordings. For example, if a cell is only active during a specific part of the behavior, that may be a feature of the experiment; or the animal could be transitioning between sleep and wake states, in which case different units become active at different times.

    (5) The axis labels in Figures 3d-e are too small to see, and Figure 3d would benefit from a brief description of what is shown.

    (6) What is the difference between "neural" and "passing QC" in Figure 4?

    (7) I understand the current paper is focused on spike data, so there may not be an answer to this, but I am curious about NP2.0 probes, which save data in a single wide band. Does the lossy compression negatively affect the LFP data? Is software filtering for the spike band applied before or after compression?
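    On the presence-ratio threshold raised in point (4) above: presence ratio is commonly computed as the fraction of equal-width time bins that contain at least one spike, which makes the concern concrete. The following sketch is illustrative (the function and bin count are assumptions, not the pipeline's code): a genuine unit that is active only during one behavioral epoch can score well below a >0.8 cutoff.

```python
import numpy as np

def presence_ratio(spike_times, duration, n_bins=100):
    """Fraction of equal-width time bins with at least one spike."""
    counts, _ = np.histogram(spike_times, bins=n_bins, range=(0.0, duration))
    return float(np.mean(counts > 0))

duration = 600.0  # 10-minute session, in seconds

steady = np.linspace(1.0, 599.0, 3000)      # unit active throughout
epoch_only = np.linspace(1.0, 199.0, 3000)  # active only in one epoch

print(presence_ratio(steady, duration))      # 1.0 -> passes a >0.8 cutoff
print(presence_ratio(epoch_only, duration))  # 0.34 -> fails, though the unit is real
```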