pCPPs-sADNN: Predicting cell-penetrating Peptides using Self-attention based Deep Neural Network
Abstract
Cell-penetrating peptides (CPPs) are short peptides of 5 to 50 amino acids that are useful for drug delivery and intracellular localization. Laboratory-based techniques for identifying them are often lengthy and resource-intensive, whereas computational approaches offer a rapid and cost-effective alternative. However, the precision and dependability of these computational approaches require further improvement to meet rigorous scientific standards. To address these limitations, this study introduces pCPPs-DNN, a predictive framework based on feature fusion that integrates embeddings from the pre-trained protein language models ProtT5 and ESM-2 with CTF-based features. Combining these distinct feature sets yields an enhanced and robust feature vector. We then applied Random Forest-based Recursive Feature Elimination (RF-RFE) for feature selection and used the Adaptive Synthetic Sampling Approach (ADASYN), an advanced variant of SMOTE, to address class imbalance by generating synthetic minority samples. The resulting hybrid feature set was used to train a deep neural network enhanced with an attention mechanism. The proposed pCPPs-DNN model achieved a training accuracy of 98.58% and an AUC of 0.99. On the test dataset, pCPPs-DNN achieved an accuracy of 96.84% and an AUC of 0.99.
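To illustrate the class-balancing step, the following is a minimal sketch of the ADASYN allocation rule using only the Python standard library. It is not the authors' implementation: the function name `adasyn_sketch`, its parameters (`k`, `beta`, `seed`), and the rounding of per-point sample counts are assumptions made for illustration. In practice a maintained implementation, for example the `ADASYN` class in imbalanced-learn, would normally be used instead.

```python
import math
import random


def adasyn_sketch(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Return synthetic minority samples following the ADASYN allocation rule.

    X: list of feature vectors (lists of floats); y: list of class labels.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    majority_count = len(y) - len(minority_idx)
    # Total number of synthetic samples needed to reach the requested balance.
    total = int(round((majority_count - len(minority_idx)) * beta))
    if total <= 0 or len(minority_idx) < 2:
        return []

    def nearest(i, candidates, n):
        # Indices of the n candidates closest to X[i] (Euclidean), excluding i.
        others = [j for j in candidates if j != i]
        others.sort(key=lambda j: math.dist(X[i], X[j]))
        return others[:n]

    # r_i: fraction of majority-class points among the k nearest neighbours.
    ratios = []
    for i in minority_idx:
        neigh = nearest(i, range(len(X)), k)
        majority_neigh = sum(1 for j in neigh if y[j] != minority)
        ratios.append(majority_neigh / max(len(neigh), 1))
    norm = sum(ratios)
    if norm == 0:
        # No minority point borders the majority class: spread samples evenly.
        weights = [1.0 / len(minority_idx)] * len(minority_idx)
    else:
        weights = [r / norm for r in ratios]

    synthetic = []
    for i, w in zip(minority_idx, weights):
        # Points with more majority neighbours receive more synthetic samples.
        # Rounding means the final count may differ slightly from `total`.
        for _ in range(int(round(w * total))):
            j = rng.choice(nearest(i, minority_idx, k))
            lam = rng.random()
            # Interpolate between the point and a nearby minority neighbour.
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic
```

Unlike plain SMOTE, which samples around every minority point equally, this rule concentrates synthetic samples on minority points that sit close to the majority class, which is the behaviour the abstract refers to when it calls ADASYN an advanced variant of SMOTE.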