High performance sorting of motor unit action potentials with EMUsort
Curation statements for this article:-
Curated by eLife
eLife Assessment
This useful study presents a new method to identify the activity of single motor units from intramuscular EMG recordings. Validation against state-of-the-art techniques is limited to a small sample of simulated motor units; consequently, the evidence supporting the method's accuracy remains incomplete. The manuscript would be significantly strengthened by using more unbiased simulations for validation, validating the method with experimental datasets, comparing it against more recent techniques, and investigating how muscle physiology impacts accuracy.
Abstract
Understanding how neural signals control muscle activity during behavior is a key challenge in motor neuroscience. To this end, recent advances in intramuscular multielectrode arrays have enabled high-quality multichannel recordings of many motor unit action potentials (MUAPs) in freely moving subjects. However, identifying individual MUAP events within multichannel recordings is a significant challenge for existing spike sorting methods, which are typically optimized for identifying action potentials from neurons in the brain. To overcome this challenge, we developed the Enhanced Motor Unit sorter (EMUsort), an extension of Kilosort4 (KS4) that achieves high-performance MUAP spike sorting. We applied EMUsort to high-resolution intramuscular recordings from rat forelimb during locomotion and monkey forelimb during a reaching task. EMUsort improves upon prior methods by addressing key challenges encountered with MUAP datasets, including: 1) long time delays across electrodes due to propagation along muscle fibers, 2) more complex waveform shapes compared to neuronal action potentials, and 3) a high degree of MUAP overlap due to cumulative motor unit recruitment. We compared EMUsort to existing spike sorting methods quantitatively using simulated datasets that closely emulated the rat and monkey datasets we recorded. EMUsort provided median error rate reductions of 67.5% and 49.9% during periods of high motor unit activation for the rat and monkey datasets, respectively. In sum, EMUsort provides a substantial improvement to MUAP spike sorter accuracy, especially during regions of high MUAP overlap, in an easy-to-use software package.
Reviewer #1 (Public review):
Summary:
The authors introduce EMUsort, an open-source algorithm for the automatic decomposition of high-resolution intramuscular EMG recordings. The method builds upon the Kilosort4 (KS4) framework and incorporates modifications designed to better handle the spatial and temporal characteristics of intramuscular signals. The performance of EMUsort is evaluated on openly available datasets and compared against KS4 and MUedit, demonstrating improved motor unit accuracy.
Strengths:
(1) The manuscript is clearly written, technically detailed, and well structured.
(2) The open-source software is thoroughly documented, both within the manuscript and in the accompanying repository README, facilitating adoption by the community.
(3) The availability of both code and datasets is a major strength, enabling reproducibility and independent validation.
(4) The authors provide quantitative comparisons with existing decomposition algorithms, which is essential for contextualizing the proposed method.
(5) The methodological details are sufficiently described to allow replication and further development by other researchers.
Weaknesses:
While the manuscript is strong overall, I have several suggestions that could further strengthen its impact and clarity.
(1) Benchmarking and community integration
Recent work has proposed standardized datasets and benchmarking pipelines for high-density surface EMG decomposition ("MUniverse: A Simulation and Benchmarking Suite for Motor Unit Decomposition", Mamidanna*, Klotz*, Halatsis* et al, NeurIPS 2025). A similar effort for intramuscular EMG would be highly valuable to the field. The authors may consider discussing how their dataset and algorithm could be integrated into broader benchmarking initiatives (e.g., platforms such as MUniverse), enabling systematic comparisons across multiple datasets and decomposition methods.
(2) Comparison with additional decomposition algorithms
Since the manuscript compares EMUsort with MUedit, it would be appropriate to also include a comparison with Swarm-Contrastive Decomposition (SCD), which has been proposed for both surface and intramuscular EMG signals. Including this comparison, or explicitly discussing why it was not feasible, would strengthen the positioning of EMUsort relative to the current state of the art.
(3) Manual editing and post-processing
In practical EMG decomposition workflows, manual inspection and editing of motor units are often required after automatic decomposition. It would be useful for readers to know whether EMUsort provides (or is compatible with) a graphical interface or workflow for manual refinement, or how the authors envision this step being handled.
(4) Ablation analysis of algorithmic modifications
EMUsort is described as an extension of Kilosort4. An ablation analysis examining the impact of the main modifications introduced relative to KS4 would help clarify which changes contribute most to the observed performance improvements and under which conditions.
(5) Failure modes and limitations
A more explicit discussion of when EMUsort is likely to fail or degrade in performance would be valuable. For example, sensitivity to the number of channels, recording duration, signal quality, or motor unit density could be discussed to guide users.
(6) Generalisability to surface EMG
Given the shared methodological foundations between surface and intramuscular EMG decomposition, it would be helpful to know whether EMUsort has been tested on high-density surface EMG datasets or whether the authors expect limitations when applied outside the intramuscular domain.
(7) Applicability to human intramuscular recordings
The authors could clarify whether EMUsort has been tested on human intramuscular EMG, and discuss any expected differences in performance due to anatomical or physiological factors.
(8) Parameter sensitivity
Clustering-based methods can be sensitive to parameter choices. Reporting a parameter sensitivity analysis, or at least discussing the robustness of EMUsort to parameter variations, would increase confidence in the method's reliability and ease of use.
(9) Differences between template matching and BSS methods
The manuscript proposes a new template matching algorithm but compares its performance with a blind source separation (BSS) algorithm (MUedit), so BSS algorithms should be described in the introduction. The differences between the two methodologies should be highlighted, along with the pros and cons of each.
Conclusion:
The authors largely achieve their stated aims, and the results mostly support the main conclusions. EMUsort represents a meaningful contribution to the EMG decomposition literature, particularly for researchers working with high-resolution intramuscular recordings. With additional clarification regarding benchmarking, algorithmic ablations, and limitations, the manuscript would be further strengthened and likely to have a substantial impact on the field.
Reviewer #2 (Public review):
Summary:
This work presents a new spike sorter, EMUsort, to target the challenging task of spike sorting Motor Unit Action Potentials (MUAPs). EMUsort is essentially a modified version of Kilosort, with some key extensions to target EMG data: correction for large delays due to propagation across channels, spike detection of highly overlapping and large units via multiple thresholds, an increased number of waveform templates for spike detection, and an extended representation of waveforms to capture complex MUAP spike shapes. The results on simulated data show solid evidence that the applied modifications make a difference for EMG recordings. All in all, I believe that EMUsort will greatly improve spike sorting performance for high-density EMG data.
Strengths:
The manuscript is well written, and the methods and modifications to the Kilosort pipeline are well-motivated, well-explained, and clear. The simulation results provide strong evidence that the presented modifications make spike sorting of high-density EMG data more accurate.
Weaknesses:
Overall, the method is validated on only 15 simulated motor units. The monkey dataset, in particular, seems too "easy" to expose the weaknesses of any of the spike sorters. A second weakness is the distribution of the code: it ships with Kilosort and SpikeInterface as submodules, which makes it hard to maintain long-term and pins these key dependencies to old versions.
Reviewer #3 (Public review):
Summary
This paper introduces EMUsort, an extension of Kilosort4 designed to sort motor unit action potentials from high-density intramuscular EMG recordings. Using rat and monkey forelimb recordings, the authors generate realistic simulated datasets with known ground truth and demonstrate that EMUsort substantially outperforms Kilosort4 and MUedit, particularly during periods of high motor unit overlap.
Strengths
This is a timely study in light of recent advances in intramuscular recording technologies and the growing interest in automated methods for decoding neural and neuromuscular signals. The work leverages state-of-the-art electrode arrays and combines them with advanced signal processing tools to address a challenging and relevant problem in motor unit analysis.
Weaknesses
There are several aspects of the study that substantially limit the interpretation of the main results and conclusions. The following major points should be carefully considered by the authors.
(1) Choice of experimental model and validation framework: The study aims to validate a new methodology for EMG decomposition, yet the rationale for the chosen experimental models is unclear. Specifically, it is not evident why the authors focused on intramuscular recordings from two animal models performing dynamic tasks. Given the extensive literature on the development and validation of EMG decomposition methods, the choice of an experimental design that substantially deviates from established approaches is insufficiently justified. In particular, it is unclear why the authors did not consider more standard validation paradigms based on (i) isometric contractions, (ii) human data, (iii) surface EMG recordings, or (iv) combinations of their recording technologies with previously validated motor unit identification methods. This methodological divergence makes it difficult to interpret the findings in the context of existing evidence.
(2) Lack of manual EMG decomposition as reference: Related to the previous point, it is unclear why standard manual EMG decomposition methods were not used to generate reference datasets for validation. Manual decomposition has been shown to be reliable under specific conditions (low contraction levels, slow dynamics, etc.) and would have substantially strengthened the validation of the proposed algorithm.
(3) Neglect of muscle deformation effects: While the manuscript discusses several factors that complicate EMG decomposition relative to brain recordings, it does not address the well-known effects of muscle deformation during contractions on motor unit action potential shapes. There is extensive literature demonstrating that dynamic muscle contractions lead to systematic changes in action potential morphology, representing a major challenge for EMG decomposition and a fundamental difference from brain recordings. Additionally, even small relative movements of intramuscular electrodes can produce waveform changes that are large relative to muscle fiber dimensions. These issues are particularly relevant given the highly dynamic tasks studied here (e.g., treadmill walking in rats), yet they are not discussed or incorporated into the analysis.
(4) Exclusive reliance on simulated data for validation: The primary validation of EMUsort is based on simulated data, which represents a major limitation of the study. This reliance should be clearly and explicitly stated in the abstract, introduction, and discussion. Moreover, the simulation approach itself raises concerns. The simulated EMG signals are generated using templates derived from the same sorting framework being validated, which introduces a potential methodological bias. The linear combination of components used to synthesize waveforms constitutes an unjustified modeling assumption that may favor template-based approaches such as EMUsort. Additionally, the spike time generation procedure appears unnecessarily complex and insufficiently justified. Previous validation studies typically modeled motor units as firing at relatively stable levels along their recruitment curves, producing long spike trains with pseudo-random relative timing and diverse overlap conditions. This framework would likely provide a more robust and interpretable validation. If the authors believe their simulation approach is superior, a stronger justification is required. Finally, the limited number of simulated motor units is difficult to reconcile with the expected level of motor unit recruitment during the studied behaviors, and this choice is not adequately justified.
(5) Incomplete reporting and visualization of experimental data: The manuscript would benefit from a clearer description of the number of rats and monkeys used, which should be reported explicitly in the abstract. In addition, visualizations of the raw multichannel EMG data across different task phases and activation levels would substantially improve transparency. Providing comprehensive visualizations of motor unit action potential shapes across all channels and identified units (for both rats and monkeys) would also help readers assess the spatiotemporal features that underpin unit identification and sorting reliability.
(6) Physiological limitations of conduction delay correction: The proposed method for correcting conduction delays across channels is physiologically suboptimal. First, motor unit conduction velocities differ substantially across units, implying that delay correction should be applied at the unit level rather than uniformly across channels. Second, conduction delays depend on fiber orientation and distance relative to electrode geometry; if fibers are oriented at different angles with respect to the array, a single delay correction becomes invalid. Additionally, the schematic in Figure 2A appears to contradict the proposed correction approach: if electrode threads are arranged perpendicular to muscle fibers, conduction delays across channels within a single thread should be minimal.
(7) Clarity issues in Figure 4: Figure 4 (panels A-D) is potentially misleading. It should be clearly stated whether the signals shown are artificial examples or derived from real recordings; ideally, real data should be used to illustrate the advantages of dynamic thresholds. In panels B-D, the depiction of overlapping action potentials is difficult to interpret due to the thickness of the traces, and it is unclear whether overlaps with neighboring action potentials are absent by design or expected to occur in real data. If overlaps are expected, one would also expect to observe contamination in the extracted waveforms, which is not evident in the figure.
(8) Concerns regarding method comparisons: The comparison with existing methods raises methodological concerns. It appears that EMUsort was carefully optimized, whereas the competing algorithms were not equivalently fine-tuned. The literature clearly shows that EMG decomposition performance depends strongly on adapting algorithms to the signal type (intramuscular vs. surface, species, electrode geometry). Furthermore, it is surprising that MUedit is reported to perform particularly poorly during periods of motor unit overlap, as blind source separation methods were explicitly developed to handle convolutive mixtures and overlapping sources, especially in surface EMG (which is an extreme case of motor unit overlap). This discrepancy requires further explanation.
(9) Insufficient characterization of motor unit firing properties: The study does not provide sufficient information about the firing characteristics of the identified motor units in experimental data. Relevant metrics that should be reported include average, minimum, and maximum firing rates; coefficients of variation of discharge rate; signal-to-noise ratios of motor unit action potentials; potential evidence of motor unit rotation over time; and stability of firing behavior across recording intervals.
(10) Lack of theoretical framing: Given the scope and claims of the paper, it would be valuable to include a more theory-driven introduction explaining why different sorting approaches (e.g., template matching vs. blind source separation) may be more or less suitable depending on the nature of the recorded signals. A clearer conceptual rationale for why the proposed approach is expected to outperform existing methods would substantially strengthen the manuscript.
(11) Limitations of validation metrics: Some of the metrics used to evaluate performance are not ideal. For example, reporting 0% accuracy for a unit is misleading and should instead be described as a failure to identify that unit. Similarly, comparisons of total spike counts are of limited interpretive value and may be misleading, as correct spike counts do not necessarily imply correct spike identities.
(12) Clarification of computational performance claims: Finally, the discussion of computation times should clarify that some existing methods require substantial time for offline estimation of projection vectors but can operate in near real time once these vectors are learned and remain stable. This distinction is important for a fair comparison of practical usability.
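Point (4) above describes a simpler spike time model, in which each simulated motor unit fires at a relatively stable rate and overlaps arise from pseudo-random relative timing. A minimal Python sketch of that idea follows. It is an illustration only, not the simulation used in the manuscript or in any cited study: the function names, the Gaussian interval jitter, the interval floor, the firing rates, and the 2 ms overlap window are all assumptions chosen for the example.

```python
import random


def simulate_spike_train(mean_rate_hz, isi_cv, duration_s, seed):
    """Spike times (s) for one unit firing at a stable mean rate.

    Inter-spike intervals are drawn from a Gaussian (an assumption made for
    this sketch) and floored so that they always stay positive.
    """
    rng = random.Random(seed)
    mean_isi = 1.0 / mean_rate_hz
    min_isi = 0.2 * mean_isi  # hypothetical floor on the interval
    spikes = []
    t = rng.uniform(0.0, mean_isi)  # random phase, so units are not aligned
    while t < duration_s:
        spikes.append(t)
        t += max(min_isi, rng.gauss(mean_isi, isi_cv * mean_isi))
    return spikes


def count_overlaps(train_a, train_b, window_s):
    """Count spikes in train_a that have a train_b spike within window_s.

    Both inputs must be sorted in increasing order.
    """
    count = 0
    j = 0
    for t in train_a:
        # Skip train_b spikes that are too early to overlap with t.
        while j < len(train_b) and train_b[j] < t - window_s:
            j += 1
        if j < len(train_b) and train_b[j] - t <= window_s:
            count += 1
    return count


# Three hypothetical units with different stable rates over 10 s.
rates_hz = [8.0, 12.0, 20.0]
trains = [
    simulate_spike_train(rate, isi_cv=0.15, duration_s=10.0, seed=i)
    for i, rate in enumerate(rates_hz)
]
overlaps = count_overlaps(trains[0], trains[2], window_s=0.002)
print(len(trains[0]), "spikes in unit 0;", overlaps, "overlap with unit 2")
```

Because the units fire at unrelated rates with independent jitter, long recordings sample many relative timings, which is the property the reviewer highlights as useful for validation.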
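The geometric argument in point (6) can be made concrete with a small calculation. The sketch below assumes the simplest possible model, in which the delay between two electrodes equals the along-fiber distance divided by the conduction velocity, and the along-fiber distance is the electrode spacing scaled by the cosine of the angle between the fiber and the electrode axis. The function name and all numerical values are hypothetical and are not taken from the manuscript.

```python
import math


def conduction_delay_ms(spacing_mm, velocity_m_per_s, angle_deg):
    """Delay (ms) between two electrodes under a straight-fiber assumption.

    angle_deg is the angle between the fiber direction and the line joining
    the two electrodes; 0 means the electrodes lie along the fiber.
    """
    along_fiber_mm = spacing_mm * math.cos(math.radians(angle_deg))
    # mm divided by m/s gives milliseconds directly.
    return along_fiber_mm / velocity_m_per_s


# Two hypothetical units with different conduction velocities, same geometry.
slow_unit = conduction_delay_ms(spacing_mm=1.0, velocity_m_per_s=2.0, angle_deg=0.0)
fast_unit = conduction_delay_ms(spacing_mm=1.0, velocity_m_per_s=4.0, angle_deg=0.0)

# Same unit, but electrodes arranged perpendicular to the fiber.
perpendicular = conduction_delay_ms(spacing_mm=1.0, velocity_m_per_s=4.0, angle_deg=90.0)

print(slow_unit, fast_unit, perpendicular)
```

Under these assumptions, halving the conduction velocity doubles the delay, and a perpendicular arrangement gives a delay close to zero, which is the basis of both concerns raised in point (6).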
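Several of the firing metrics requested in point (9) can be computed directly from a unit's spike times. The following sketch shows one way to do so, assuming spike times in seconds and using the coefficient of variation of the inter-spike interval as the variability measure. The function name and the example spike times are hypothetical.

```python
import statistics


def firing_metrics(spike_times_s):
    """Basic discharge statistics from strictly increasing spike times (s)."""
    isis = [b - a for a, b in zip(spike_times_s, spike_times_s[1:])]
    if len(isis) < 2:
        raise ValueError("need at least three spikes")
    if any(isi <= 0 for isi in isis):
        raise ValueError("spike times must be strictly increasing")
    instantaneous_hz = [1.0 / isi for isi in isis]
    mean_isi = statistics.mean(isis)
    return {
        "mean_rate_hz": 1.0 / mean_isi,
        "min_rate_hz": min(instantaneous_hz),
        "max_rate_hz": max(instantaneous_hz),
        # Sample standard deviation of the intervals over their mean.
        "isi_cv": statistics.stdev(isis) / mean_isi,
    }


# Hypothetical spike train with one short and one long interval.
metrics = firing_metrics([0.0, 0.1, 0.3])
for name, value in metrics.items():
    print(name, round(value, 3))
```

Reporting these values per unit and per recording interval would also give a first indication of firing stability over time.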
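The caution in point (11) about spike counts can be illustrated with a short example. The sketch below matches ground-truth and sorted spike times one-to-one within a tolerance and then computes accuracy as TP / (TP + FN + FP), which is one common definition and not necessarily the one used in the manuscript. The function names, the greedy matching rule, the tolerance, and the spike times (in samples) are assumptions made for illustration.

```python
def match_spikes(truth, estimate, tol):
    """Greedy one-to-one matching of two sorted lists of spike samples.

    Returns (true positives, false negatives, false positives).
    """
    i = j = tp = 0
    while i < len(truth) and j < len(estimate):
        diff = estimate[j] - truth[i]
        if abs(diff) <= tol:
            tp += 1
            i += 1
            j += 1
        elif diff < 0:
            j += 1  # estimated spike is too early to match any later truth
        else:
            i += 1  # true spike was missed
    return tp, len(truth) - tp, len(estimate) - tp


def accuracy(tp, fn, fp):
    """Accuracy as TP / (TP + FN + FP); 0.0 when there is nothing to score."""
    total = tp + fn + fp
    return tp / total if total else 0.0


truth = [100, 400, 700, 1000]
shifted = [250, 550, 850, 1150]  # same spike count, no correct identities
close = [101, 399, 705, 1300]    # same spike count, three correct identities

print(match_spikes(truth, shifted, tol=15), accuracy(*match_spikes(truth, shifted, tol=15)))
print(match_spikes(truth, close, tol=15), accuracy(*match_spikes(truth, close, tol=15)))
```

Both estimates contain exactly four spikes, yet the first matches none of the true events, so a comparison of total counts alone would not distinguish them.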