ProVenTL: A Transfer-Learning Framework for Predicting Peptide–Protein Interactions Derived from Snake Venom for Cancer Therapeutics
Abstract
Accurate prediction of peptide–protein interactions (PepPI) is crucial for advancing peptide-based anticancer drug design. In this study, we introduce ProVenTL, a computer-aided molecular design framework that leverages transfer learning and protein language model embeddings to enhance PepPI prediction accuracy and interpretability. Two complementary strategies were explored: (i) fine-tuning a CAMP model pretrained on large-scale PepPI data from the Protein Data Bank (PDB) using a curated dataset of Calloselasma rhodostoma venom peptides and cancer-related proteins, and (ii) integrating ProtT5 embeddings with stacked autoencoder–deep neural network (SAE–DNN) and TabNet classifiers. Models were benchmarked against existing deep-learning approaches using standard classification metrics, while biological relevance was evaluated via functional enrichment and pathway analysis of top-ranked predictions. The ProtT5-based SAE–DNN achieved the highest performance (accuracy = 0.78; ROC–AUC = 0.86) and identified key targets such as TRBC2, CD274, HIF1AN, PCSK9, and PLAU, which are associated with pathways involved in immune suppression, hypoxia regulation, lipid metabolism, and metastasis. These findings demonstrate the potential of transfer learning and molecular representation models for computational peptide–protein interaction design and provide a basis for subsequent experimental validation of snake-venom-derived anticancer peptides.
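For readers unfamiliar with the two headline metrics, the following is a minimal sketch of how accuracy and ROC–AUC can be computed from binary interaction labels and predicted scores. It is illustrative only: the function names `accuracy` and `roc_auc`, the 0.5 decision threshold, and the toy labels and scores are assumptions made here, not the study's evaluation code or data.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of pairs classified correctly at the given score threshold.

    The 0.5 default is an assumption for illustration.
    """
    preds = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == t for p, t in zip(preds, y_true))
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a randomly chosen interacting pair
    scores higher than a randomly chosen non-interacting pair
    (ties count as one half)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up labels (1 = interacting) and model scores; not study data.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.6]

print(f"accuracy = {accuracy(labels, scores):.2f}")
print(f"ROC-AUC  = {roc_auc(labels, scores):.2f}")
```

Accuracy depends on the chosen threshold, whereas ROC–AUC summarizes ranking quality across all thresholds, which is one reason the two values can differ for the same model.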