A Unified Framework for TCR-pMHC Structural Model Assessment
Abstract
T cell receptor (TCR) recognition of peptide-MHC complexes (pMHCs) is a central determinant of cellular immunity. Structural insights are key to understanding TCR-pMHC interactions; however, progress is limited by the scarcity of experimental structures (232 TCR-pMHC class I complexes in the PDB as of 01/2025). Although protein modelling tools have advanced rapidly, their accuracy on TCR-pMHC complexes is hampered by CDR loop hypervariability, conformational flexibility, and the lack of reliable quality assessment in the absence of experimental references. To evaluate the performance of recent protein modelling algorithms on TCR-pMHC complexes, we benchmarked three general-purpose (AlphaFold3, Boltz-2, Chai-1) and two TCR-specific (tFold-TCR, TCRmodel2) modelling tools on a set of 20 experimentally determined PDB structures, and found that AlphaFold3 performed best. To expand structural coverage beyond the existing experimental landscape, we present a framework that enables quality assessment of TCR-pMHC structural models without requiring experimental reference structures. To this end, we integrate multiple modelling and interface confidence metrics (pLDDT, ipTM, iPAE, pDockQ) into explainable random forest classifiers trained on 1160 models of 232 experimentally determined PDB TCR-pMHC class I complexes. This approach reliably distinguishes low-, acceptable-, medium-, and high-quality models, and outperforms combining the same metrics using literature-established thresholds. Leveraging our quality tier framework, we used AlphaFold3 to generate and evaluate the largest synthetic dataset of TCR-pMHC class I structural models to date from VDJdb-annotated sequences, comprising 33,820 complexes (169,100 models, 5 models per complex) and representing a >70-fold expansion of available structures. We demonstrate the applicability of our quality tier stratification in three settings: 1) filtering validating vs. non-validating TCR-pMHC interactions in sequence databases (VDJdb), 2) enrichment of biologically validated interactions versus synthetic negatives amongst higher-quality complexes, and 3) enhancing the predictive performance of TCR-pMHC pairing models. Altogether, our TCR-pMHC structural quality tier framework provides a scalable and interpretable approach for improving modelling and functional analyses of TCR-pMHC class I complexes, with translational applications in T-cell and TCR-based immunotherapies.
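As an illustration of how a quality tier stratification of this kind might be applied downstream, the following minimal sketch keeps the best-tier model for each complex and then filters complexes by a minimum tier. It is not the authors' implementation: the function names, the tuple layout, the "best model per complex" selection rule, and the toy predictions are all assumptions made for illustration. Only the four tier names and the idea of several models per complex come from the abstract.

```python
# Tier names follow the abstract; the numeric ranking is an assumed encoding.
TIER_RANK = {"low": 0, "acceptable": 1, "medium": 2, "high": 3}


def best_model_per_complex(models):
    """Return {complex_id: (model_id, tier)} keeping the highest-tier model.

    `models` is an iterable of (complex_id, model_id, tier) tuples
    (a hypothetical layout). Ties keep the model that appears first.
    """
    best = {}
    for complex_id, model_id, tier in models:
        current = best.get(complex_id)
        if current is None or TIER_RANK[tier] > TIER_RANK[current[1]]:
            best[complex_id] = (model_id, tier)
    return best


def filter_by_tier(best, min_tier="medium"):
    """Keep only complexes whose best model reaches `min_tier` or higher."""
    threshold = TIER_RANK[min_tier]
    return {cid: mt for cid, mt in best.items() if TIER_RANK[mt[1]] >= threshold}


# Toy, made-up tier predictions (not data from the study).
predictions = [
    ("complex_A", 0, "low"),
    ("complex_A", 1, "high"),
    ("complex_A", 2, "medium"),
    ("complex_B", 0, "acceptable"),
    ("complex_B", 1, "low"),
]

best = best_model_per_complex(predictions)
kept = filter_by_tier(best, min_tier="medium")
print(kept)  # {'complex_A': (1, 'high')}
```

In this toy example, complex_B is dropped because none of its models reaches the medium tier. The choice of cut-off is a design decision that would depend on the downstream analysis.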