MetaPepticon: automated prediction of anticancer peptides from microbial genomes and metagenomes
Abstract
Anticancer peptides (ACPs) are increasingly recognized as promising therapeutic candidates due to their ability to selectively target cancer cells. However, the systematic discovery of novel ACPs, particularly from high-throughput sequencing datasets, remains hindered by technical and methodological limitations. Current prediction frameworks require pre-extracted peptide sequences, involve manual preprocessing, and yield variable results, which restricts their applicability for large-scale, data-driven discovery.
To address these limitations, we developed MetaPepticon, a modular, end-to-end pipeline for the discovery of candidate ACPs from diverse sequencing inputs, including raw genomic, metagenomic, transcriptomic, and metatranscriptomic reads, as well as assembled contigs and peptide sequences. MetaPepticon automates quality control, filtering, assembly, small open reading frame prediction, ACP classification using multiple predictive algorithms, and in silico toxicity filtering. By employing a consensus-based strategy and supporting heterogeneous data types, MetaPepticon facilitates scalable, reproducible, and high-confidence identification of candidate ACPs.
Applied to 41,171 microbial genomes and 4,072,884 peptides, MetaPepticon identified 79,587 novel candidate ACPs, including 13,149 high-confidence, non-toxic peptides. By providing a standardized, automated framework for large-scale ACP discovery across various input types, MetaPepticon facilitates therapeutic peptide exploration and is freely available at: https://github.com/arikanlab/MetaPepticon
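The consensus-based classification and toxicity filtering described above can be illustrated with a minimal sketch. It is not MetaPepticon's implementation: the abstract does not state the voting rule, score thresholds, or predictor names, so the 0.5 threshold, the two-vote minimum, the "all predictors agree" definition of high confidence, and every identifier below are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PeptideCall:
    """Per-peptide predictor outputs. All field names are hypothetical."""
    peptide_id: str
    acp_scores: Dict[str, float]  # predictor name -> estimated probability of being an ACP
    toxic: bool                   # outcome of an in silico toxicity check


def consensus_label(call: PeptideCall, threshold: float = 0.5, min_votes: int = 2) -> str:
    """Label a peptide by counting predictors whose score meets the threshold.

    Assumed rule (not taken from the paper):
      - fewer than `min_votes` positive predictors -> "rejected"
      - enough votes but flagged toxic             -> "candidate_toxic"
      - every predictor positive and non-toxic     -> "high_confidence"
      - otherwise                                  -> "candidate"
    """
    votes = sum(score >= threshold for score in call.acp_scores.values())
    if votes < min_votes:
        return "rejected"
    if call.toxic:
        return "candidate_toxic"
    if votes == len(call.acp_scores):
        return "high_confidence"
    return "candidate"


if __name__ == "__main__":
    # Made-up scores from three placeholder predictors.
    example = PeptideCall(
        peptide_id="pep_0001",
        acp_scores={"predictor_a": 0.91, "predictor_b": 0.77, "predictor_c": 0.64},
        toxic=False,
    )
    print(example.peptide_id, consensus_label(example))
```

Requiring agreement among several predictors trades recall for precision, which is one plausible reading of why the high-confidence, non-toxic set (13,149 peptides) is much smaller than the full candidate set (79,587 peptides).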