Protein Language Model Based Structure-guided Antibody Screening for Disordered Protein Targets

Akshay Chenna
Prasoon Priyadarshi
Keshav Kolluru
Saurabh Singal
Gaurav Goel

Abstract

A crucial step in the pathogenesis of Parkinson’s disease involves cell-to-cell transmission of α-Synuclein proto-fibrils via endocytosis, driven primarily by the interaction of its disordered C-terminal peptide with domain 1 of Lymphocyte Activation Gene 3 (LAG3) neuronal receptors. High-affinity antibodies have been proposed as therapeutic modalities to delay this progression and subsequent amyloid formation. In our work, we develop an end-to-end computational pipeline to enable rapid screening of antibody sequences that have a high affinity for the disordered C-terminal peptide of α-Synuclein, without using any information about known binders. This de novo screening was enabled by a structural-bioinformatics-based in silico data generation pipeline combined with a deep learning framework. Our simple feed-forward network model, built upon sequence embeddings from a protein language model, ranked the binding affinities (ΔG) of antibodies to α-Synuclein with high accuracy (Spearman ρ = 0.86) when the training and evaluation datasets contained sequences with some overlap in the complementarity determining regions (CDRs). However, for vastly different CDR sequences, a transformer encoder model trained using the antibody sequence embeddings showed a low Spearman rank correlation of ρ = 0.18. The two models have a mean Precision@100 of 38 and 12, respectively, significantly outperforming a random process. Overall, our work demonstrates a computational protocol for generating a high-quality dataset of antibody-antigen complexes spanning a very large diversity in antibody sequences, followed by training of a deep learning model to predict high-affinity antibody sequences for a specific protein target with no known binders.
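The two ranking metrics quoted in the abstract can be illustrated with a minimal sketch. This is not the authors' evaluation code. The abstract does not define Precision@100, so the definition used here (the fraction of the k best-predicted antibodies that are also among the k truly best) is an assumption, as is the convention that a more negative ΔG means tighter binding. The function names and the ΔG values are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ascending ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def precision_at_k(true_dg: Sequence[float], pred_dg: Sequence[float], k: int) -> float:
    """Fraction of the k lowest predicted ΔG values that are in the true lowest k.

    Assumed definition; the abstract does not spell out its own.
    """
    true_top = set(sorted(range(len(true_dg)), key=lambda i: true_dg[i])[:k])
    pred_top = sorted(range(len(pred_dg)), key=lambda i: pred_dg[i])[:k]
    return sum(i in true_top for i in pred_top) / k


# Hypothetical ΔG values (more negative = stronger binding, by assumption).
true_dg = [-12.0, -9.5, -11.0, -7.0, -8.0, -10.0]
pred_dg = [-11.5, -9.0, -9.8, -7.5, -8.5, -10.5]

rho = spearman_rho(true_dg, pred_dg)
p_at_2 = precision_at_k(true_dg, pred_dg, k=2)
print(f"Spearman rho = {rho:.3f}, Precision@2 = {p_at_2:.2f}")
```

Under this assumed definition, a uniformly random ranking of N candidates has an expected Precision@k of k/N, which is the kind of baseline the comparison with "a random process" suggests.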

Version published to 10.1101/2025.06.21.660895 on bioRxiv
Jun 26, 2025

Related articles

Protein Language Models Rescue Variant Pathogenicity Prediction in Intrinsically Disordered Regions Through Synergistic Integration with Structure-Based Methods
Author: Hayden Farquhar
No evaluations; latest version Feb 4, 2026

The Evolution of the AlphaFold Architecture
Author: Y.C.B.J. Dissanayaka
No evaluations; latest version Jan 9, 2026

Discovery of β-Sheet Peptide Assembly Codes via an Experimentally Validated Predictive Computational Platform
Authors: Wei Han, Hang Zheng, Ke Huang, Chi-Sing Lee
No evaluations; latest version Jan 14, 2026
