Machine Learning Dataset and Benchmark for Accurate T Cell Receptor-pHLA Binding Prediction

Fuli Feng
Xinyuan Zhu
Jiadong Lu
Yeqing Lu
Yuyan Zhang

Abstract

A central challenge in immunology and therapeutic design is accurately predicting the diverse interactions between T cell receptors (TCRs) and peptide-HLA (pHLA) complexes. Existing machine learning tools are hindered by incomplete sequence data and biased non-binding examples. To overcome this, we present Hi-TPH, a large-scale hierarchical dataset featuring an on-the-fly selection strategy for generating non-binding data. We further develop Hi-TPH-PLMs, a collection of Protein Language Models (PLMs) with varied architectures and scales, fine-tuned on Hi-TPH. These models achieve a 17.4% performance gain over state-of-the-art tools on an external wet-lab test set. Leveraging the hierarchical structure of Hi-TPH, detailed analyses dissect the contribution of different molecular components to binding prediction and reveal their synergistic interplay—for instance, the prediction contribution of HLA relies on the presence of full TCR chains. Hi-TPH and Hi-TPH-PLMs are publicly released to support the development of more reliable tools for advanced immunoinformatics research and personalized immunotherapy.

Version published to 10.21203/rs.3.rs-7286169/v1 on Research Square
Sep 11, 2025

