Automated filtering of particle images in single particle cryoEM

Sony Malhotra
Daniel Hatton
Samuel Jackson
Matthew Iadanza
Agnel Praveen Joseph
Colin M. Palmer
Jeyan Thiyagalingam
Tom Burnley
Yuriy Chaban

Abstract

Continued exponential growth in the number of structures resolved by single particle cryoEM, as seen over the last decade, requires ever more effective data analysis workflows. Datasets are rarely homogeneous, so a multistep procedure is needed to discard outliers. Because individual particles are very noisy, either 2D or 3D averages are normally used for discrimination. This becomes challenging when the 2D classes themselves are heterogeneous, which leads to contaminants being selected or useful rare views/poses being discarded. 3D model-based discrimination requires trustworthy 3D maps and a correct assignment of Euler angles, which in turn depend on the quality of the initial data and might not be available at the very early stages of the analysis. We propose a novel deep-learning approach for improving the quality of single particle datasets. The two-stage procedure consists of denoising single particle images using a Variational AutoEncoder framework, followed by particle quality filtering based on the score inferred for every particle by a Domain Adaptation Neural Network trained on a large dataset of categorised 2D averages. This approach allows automated scoring of noisy raw images using data patterns learned from externally derived 2D classes with a high signal-to-noise ratio. Consequently, a higher quality dataset enters the computationally expensive steps of the data analysis, reducing the need for protracted and expensive calculations. Importantly, our method does not require any prior knowledge about the data or the existence of a 3D model, making it universally applicable. Tests on publicly available datasets demonstrated that our approach largely outperformed 2D class-based particle discrimination. Smaller subsets of the top-scoring particles selected with our method were sufficient to obtain the author-reported 3D model resolution.
When applied to user data in the automated on-the-fly data processing pipeline, the method rescued 30% of cases that would otherwise not have reached the confidence threshold required for deciding to proceed to 3D model refinement. It also led to general improvements in the quality of the 3D models for many datasets that were selected for high-resolution processing.
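
The final filtering step described in the abstract, keeping a subset of the top-scoring particles, can be illustrated with a minimal sketch. It assumes that each particle already has a scalar quality score where higher means better. The function name `select_top_scoring`, the `keep_fraction` parameter, and the example scores are hypothetical and are not taken from the preprint; the denoising and scoring networks themselves are not reproduced here.

```python
from typing import List, Sequence


def select_top_scoring(scores: Sequence[float], keep_fraction: float) -> List[int]:
    """Return indices of the highest-scoring particles, best first.

    Assumption: `scores` holds one quality score per particle and a
    higher score means a better particle. `keep_fraction` is a
    hypothetical tuning parameter, not a value from the preprint.
    """
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    n = len(scores)
    if n == 0:
        return []
    # Keep at least one particle whenever the input is non-empty.
    n_keep = max(1, int(n * keep_fraction))
    # Rank particle indices by score, highest first.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return order[:n_keep]


# Made-up scores for six particles, for illustration only.
example_scores = [0.91, 0.12, 0.55, 0.78, 0.05, 0.67]
kept = select_top_scoring(example_scores, keep_fraction=0.5)
print(kept)  # indices of the three best-scoring particles
```

In practice the retained indices would be used to subset the particle stack before the computationally expensive 3D steps; the fraction to keep is a choice left to the user in this sketch.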

Version published to 10.1101/2025.11.12.688030 on bioRxiv
Nov 13, 2025

