Improved open modification searching via unified spectral search with predicted libraries and enhanced vector representations in ANN-SoLo

Abstract

Summary

The primary computational challenge in mass spectrometry-based proteomics is determining the peptide sequence responsible for generating each measured tandem mass spectrum. This task is traditionally addressed through sequence database searching, with spectral library searching as an alternative approach. ANN-SoLo is a powerful spectral library search engine optimized for open modification searching, enabling the detection of peptides carrying any post-translational modification. Here, we present an enhanced version of ANN-SoLo that combines the strengths of spectral library searching and sequence database searching by integrating with Prosit to generate predicted spectral libraries from protein sequence databases. Additionally, it provides functionality to generate decoys at both the spectrum and the peptide level, introduces an optimized internal file structure for large-scale analytics, and improves search accuracy by incorporating neutral loss information into spectrum vector representations. Together, these advancements address the challenges associated with missing spectral libraries and enhance peptide identification in large-scale and complex proteomics workflows.
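To illustrate what incorporating neutral loss information into a spectrum vector representation can look like, the following is a minimal sketch and not ANN-SoLo's actual implementation. The function name `vectorize`, the 1 m/z bin width, the 0 to 2000 m/z range, and the toy peak lists are all assumptions made for illustration, and precursor and fragment ions are treated as singly charged so that a neutral loss is simply the precursor m/z minus the fragment m/z.

```python
import numpy as np


def vectorize(mz, intensity, precursor_mz, min_mz=0.0, max_mz=2000.0, bin_size=1.0):
    """Hypothetical helper: bin fragment peaks and their neutral losses
    into one concatenated, L2-normalized vector."""
    mz = np.asarray(mz, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    n_bins = int(np.ceil((max_mz - min_mz) / bin_size))

    def binned(values):
        # Sum the intensities of all peaks that fall into the same bin.
        vec = np.zeros(n_bins)
        keep = (values >= min_mz) & (values < max_mz)
        idx = ((values[keep] - min_mz) / bin_size).astype(int)
        np.add.at(vec, idx, intensity[keep])
        return vec

    fragment_vec = binned(mz)
    # Assumption: singly charged ions, so neutral loss = precursor m/z - fragment m/z.
    neutral_loss_vec = binned(precursor_mz - mz)

    vec = np.concatenate([fragment_vec, neutral_loss_vec])
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


# Toy library spectrum (unmodified peptide) and toy query spectrum carrying
# a +16 modification that shifts the precursor and one fragment.
library = vectorize([100.2, 200.4, 300.6], [1.0, 1.0, 1.0], precursor_mz=500.0)
query = vectorize([100.2, 200.4, 316.6], [1.0, 1.0, 1.0], precursor_mz=516.0)

# Both vectors are unit length, so the dot product is the cosine similarity.
similarity = float(library @ query)

# Contribution of the neutral loss half alone (second half of each vector).
half = library.size // 2
neutral_loss_part = float(library[half:] @ query[half:])
```

In this toy case the shifted fragment no longer matches in the fragment half of the vector, but its neutral loss still matches in the second half, so `neutral_loss_part` is greater than zero. That is the intuition for why neutral losses can help when a modification shifts part of the fragment series.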

Availability and Implementation

ANN-SoLo is available as open source under the permissive Apache 2.0 license on GitHub at https://github.com/bittremieux-lab/ANN-SoLo.
