ProteoParc: A tool to generate protein reference databases for ancient and non-model organisms

Guillermo Carrillo-Martin
Johanna Krueger
Tomas Marques-Bonet
Esther Lizano

Abstract

In recent years, growing interest in analysing the proteomes of extinct and non-model organisms has opened a new field of research that expands the scope of proteomics. Because curated databases and molecular data are often lacking for these organisms, researchers must manually search several public repositories for related protein sequences, either for MS/MS peptide identification or for ZooMS marker annotation. This can lead to format inconsistencies and hinder reproducibility between studies. To address this issue, we introduce ProteoParc, a user-friendly software tool that generates reference databases by systematically downloading and processing protein sequences from the most widely used public repositories. The pipeline’s output is a non-redundant protein database, formatted for use with typical peptide identification software. Moreover, the user can adjust the size and composition of the database by applying different criteria to include only a limited set of genes or species. ProteoParc is thus an easy-to-use, fast, and customisable bioinformatic tool for future paleoproteomics analyses of ancient samples from understudied organisms.
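To make the idea of a "non-redundant protein database" concrete, the following is a minimal sketch of one way such a step could work: parse FASTA records, keep the first entry for each distinct sequence, and write the result back as FASTA. It is not ProteoParc's actual implementation, which the abstract does not describe. The function names, the exact-match (case-insensitive) redundancy criterion, and the example headers are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def remove_redundancy(records: Iterable[Record]) -> List[Record]:
    """Keep the first record for each distinct sequence.

    Assumption: two entries are redundant when their sequences are
    identical, ignoring letter case. A real pipeline may use a
    different criterion.
    """
    seen = set()
    unique: List[Record] = []
    for header, seq in records:
        key = seq.upper()
        if key in seen:
            continue
        seen.add(key)
        unique.append((header, seq))
    return unique


def to_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Serialise records as FASTA text, wrapping sequences at `width`."""
    out: List[str] = []
    for header, seq in records:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)


# Hypothetical input: the second entry duplicates the first sequence.
example = [
    ">sp|P0001|COL1A1_HYPOTHETICAL",
    "GPPGPAGK",
    ">tr|A0001|COL1A1_COPY",
    "gppgpagk",
    ">sp|P0002|ALB_HYPOTHETICAL",
    "MKWVTF",
]
database = remove_redundancy(parse_fasta(example))
fasta_text = to_fasta(database)
```

Keeping the first occurrence means that the order of the input decides which header survives, so a real workflow would likely sort or prioritise sources (for example, curated entries first) before this step.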

Version published to 10.1101/2025.07.31.667843 on bioRxiv
Aug 2, 2025

Related articles

META-DIFF: a k-mer-based pipeline that detects differentially abundant sequences in metagenomics whole genome sequencing

This article has 8 authors:
1. Louis-Maël Guéguen
2. Alban Mathieu
3. Simon Pelletier
4. Anthony Woo
5. Namita Misra
6. Magali Moreau
7. Olivier Perin
8. Arnaud Droit
This article has no evaluations. Latest version: Jan 29, 2026

Sequenoscope: A Modular Tool for Nanopore Adaptive Sequencing Analytics and Beyond

This article has 9 authors:
1. Abdallah Meknas
2. Kyrylo Bessonov
3. Shannon H.C. Eagle
4. Christy-Lynn Peterson
5. James Robertson
6. Nicole Ricker
7. Tara Signorelli
8. John Nash
9. Aleisha Reimer
Reviewed by Access Microbiology

This article has 7 evaluations. Latest version: Dec 18, 2025. Latest activity: Jan 25, 2026

TaxoFlow: The Tutorial. An Educational Nextflow Pipeline for Metagenomics Taxonomic Profiling

This article has 2 authors:
1. Jeferyd Yepes-García
2. Laurent Falquet
This article has no evaluations. Latest version: Dec 22, 2025
