PepPharmaHub: A Cloud-Based Platform Integrating Multimodel Language Architectures with Curated Data Resources for Therapeutic Peptide Discovery
Abstract
Background
Therapeutic peptides represent a rapidly expanding class of drug candidates due to their diverse biological activities and high specificity. However, accurately predicting peptide functions directly from sequence information remains a major challenge in computational peptidomics. Current tools, typically standalone applications or functionally constrained web servers, lack the flexibility and scalability essential for modern peptide discovery workflows. Therefore, it is necessary to develop a cloud-based, no-code platform that enables customizable modeling and high-throughput functional screening of therapeutic peptides.

Results
PepPharmaHub (http://bioinmed.jflab.ac.cn:18090/peppharmahub/) provides a cloud-based, end-to-end platform that integrates advanced sequence-based language modeling with curated benchmark datasets and interactive visualization modules. The platform features a high-throughput screening module powered by a diverse set of 24 models targeting 20 therapeutic properties, alongside a customizable model training pipeline for user-defined screening tasks. Comprehensive benchmarking on 24 public datasets demonstrates that PepPharmaHub matches or surpasses state-of-the-art predictors, significantly improving the efficiency of large-scale peptide screening. Compared with existing public web servers, PepPharmaHub attains a higher, more tightly distributed accuracy on 3,475 newly reported bioactive peptides from 2023–2025 (20 independent tasks), indicating stronger generalization and practical utility.

Conclusions
PepPharmaHub enables accurate, high-throughput prediction of peptide functions through customizable deep learning models and a no-code interface. By outperforming existing tools across multiple benchmarks and supporting interpretable sequence analysis, the platform offers a practical solution for accelerating peptide-based drug discovery.
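
The claim of "higher, more tightly distributed accuracy" across independent tasks can be made concrete with a small sketch. The Python below is an illustration only, not the authors' evaluation code: it assumes per-task accuracy is available as a mapping from task name to a value in [0, 1], and it takes "more tightly distributed" to mean a smaller sample standard deviation across tasks. The function names and the toy numbers are hypothetical and are not results from the preprint.

```python
from statistics import mean, stdev


def summarize_accuracy(per_task_accuracy):
    """Return (mean, sample standard deviation) of per-task accuracies."""
    values = list(per_task_accuracy.values())
    if len(values) < 2:
        raise ValueError("need at least two tasks to estimate spread")
    return mean(values), stdev(values)


def is_higher_and_tighter(candidate, baseline):
    """True if candidate has a higher mean accuracy AND a smaller spread."""
    cand_mean, cand_sd = summarize_accuracy(candidate)
    base_mean, base_sd = summarize_accuracy(baseline)
    return cand_mean > base_mean and cand_sd < base_sd


# Toy placeholder values for illustration; NOT taken from the preprint.
toy_platform = {"task_a": 0.90, "task_b": 0.88, "task_c": 0.92}
toy_baseline = {"task_a": 0.85, "task_b": 0.70, "task_c": 0.95}

if __name__ == "__main__":
    print(is_higher_and_tighter(toy_platform, toy_baseline))
```

Standard deviation is only one possible measure of spread; an interquartile range or a per-task paired comparison would be equally reasonable readings of the same sentence.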