AmyloDeep: pLM-based ensemble model for predicting amyloid propensity from the amino acid sequence

Alisa Davtyan
Anahit Khachatryan
Rafayel Petrosyan

Abstract

Amyloids are predominantly β-sheet-rich, stable protein structures that can persist in the human body for years. Amyloid protein aggregates contribute to the development of several neurodegenerative diseases, such as Alzheimer’s, Parkinson’s, and Huntington’s, but they are also involved in vital functions such as memory formation and immune system function. Here, we used machine learning and deep learning techniques to predict amyloid propensity from the amino acid sequence. First, we aggregated labeled amino acid sequence data from multiple sources, obtaining a roughly balanced dataset of 2366 sequences for binary classification. We used these data both to fine-tune the ESM2 model and to train new models on protein embeddings from ESM2 and UniRep. The predictions from these models were then combined into a single soft-voting ensemble, which yielded robust and accurate results. We also built a tool that takes an amino acid sequence and returns the amyloid formation probability for different segments of that sequence. A light version of AmyloDeep is available through the web server at https://amylodeep.com/, and the full model is available as a Python package at https://pypi.org/project/amylodeep/.
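
The two mechanisms named in the abstract, soft voting over several models and per-segment probabilities, can be illustrated with a minimal sketch. This is not the AmyloDeep implementation or its package API: the function names, the window length of 6, the step of 1, and the toy scoring functions are all assumptions chosen for illustration, and the toy scorers stand in for the fine-tuned ESM2 model and the embedding-based classifiers.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical type: any callable mapping a sequence to P(amyloid) in [0, 1].
Model = Callable[[str], float]


def soft_vote(probabilities: Sequence[float],
              weights: Optional[Sequence[float]] = None) -> float:
    """Soft voting: (weighted) mean of the per-model probabilities."""
    if not probabilities:
        raise ValueError("need at least one probability")
    if weights is None:
        weights = [1.0] * len(probabilities)
    if len(weights) != len(probabilities):
        raise ValueError("weights and probabilities differ in length")
    total = sum(weights)
    return sum(p * w for p, w in zip(probabilities, weights)) / total


def sliding_windows(sequence: str, window: int, step: int) -> List[Tuple[int, str]]:
    """Return (start index, segment) pairs; a short sequence is one segment."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if len(sequence) <= window:
        return [(0, sequence)]
    return [(i, sequence[i:i + window])
            for i in range(0, len(sequence) - window + 1, step)]


def segment_probabilities(sequence: str, models: Sequence[Model],
                          window: int = 6, step: int = 1
                          ) -> List[Tuple[int, int, float]]:
    """Score every segment with every model and soft-vote the results.

    Returns (start, end, probability) with end exclusive.
    """
    results = []
    for start, segment in sliding_windows(sequence, window, step):
        votes = [model(segment) for model in models]
        results.append((start, start + len(segment), soft_vote(votes)))
    return results


# Toy placeholder scorers (assumptions, not trained models).
def toy_hydrophobic_fraction(segment: str) -> float:
    return sum(residue in "VILFYWM" for residue in segment) / len(segment)


def toy_constant(segment: str) -> float:
    return 0.5


if __name__ == "__main__":
    demo = "KLVFFAEDVGSNK"  # arbitrary example sequence
    for start, end, prob in segment_probabilities(
            demo, [toy_hydrophobic_fraction, toy_constant]):
        print(f"{start:2d}-{end:2d} {demo[start:end]} {prob:.2f}")
```

Averaging probabilities rather than hard class labels lets a confident model outweigh an uncertain one, which is the usual motivation for soft voting; the optional `weights` argument shows how unequal trust in the models could be expressed.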

Version published to 10.1101/2025.09.16.676495 on bioRxiv
Sep 18, 2025
