REPTRA: Mapping Immune T Cell Receptor Activity from Full Sequences with a Debiased Contrastive Loss

Abstract

Contrastive learning is well suited to representing the vast space of interactions between highly specific T cell receptors (TCRs) and the epitopes to which they bind, potentially enabling diverse applications in immune engineering. However, progress in mapping TCR-epitope recognition has been limited by skewed datasets and by training approaches that can exacerbate biases and obscure model performance. Furthermore, most TCR-epitope models represent only one of the three complementarity-determining regions (CDRs) of the TCR, potentially limiting performance. Here, we present a CLIP-style contrastive-learning model for representation of epitopes and T cell receptor activity (REPTRA), which incorporates full TCR sequences and is trained with a debiased InfoNCE loss. We trained this model on a dataset with over fivefold more epitope diversity than previous reports, collected using the DECODE platform (Repertoire Immune Medicines). We demonstrate the resulting improvements in model performance, and ablate the modified loss and the TCR representation to quantify the contribution of each. Furthermore, we apply an interpretability analysis to the REPTRA attention-pooling projection heads, revealing that CDR1 and CDR2, in addition to CDR3, are important for learning TCR-epitope recognition. In doing so, we develop a performant model for contrastive mapping of T cell receptor activity.
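To make the objective concrete, the sketch below implements a debiased InfoNCE loss in the style of Chuang et al. (2020) over paired TCR and epitope embeddings. This is an illustrative assumption, not REPTRA's actual implementation: the function name, the `temperature` and `tau_plus` parameters (the assumed prior that an in-batch "negative" is in fact a positive), and the toy embeddings are all hypothetical.

```python
import numpy as np

def debiased_infonce(tcr, epi, temperature=0.1, tau_plus=0.1):
    """CLIP-style contrastive loss over L2-normalized embeddings.

    tcr, epi: (B, D) arrays; row i of each is a matched TCR-epitope pair.
    tau_plus: assumed probability that a random in-batch "negative"
    is actually a positive; this drives the debiasing correction.
    """
    tcr = tcr / np.linalg.norm(tcr, axis=1, keepdims=True)
    epi = epi / np.linalg.norm(epi, axis=1, keepdims=True)
    sim = np.exp(tcr @ epi.T / temperature)   # (B, B) exponentiated similarities
    pos = np.diag(sim)                        # matched pairs
    B = sim.shape[0]
    n = B - 1
    neg = sim.sum(axis=1) - pos               # in-batch negatives
    # Debiased negative term: subtract the estimated contribution of
    # false negatives, clamped at the theoretical floor n * e^{-1/t}.
    neg_debiased = np.maximum(
        (neg - n * tau_plus * pos) / (1.0 - tau_plus),
        n * np.exp(-1.0 / temperature),
    )
    return float(np.mean(-np.log(pos / (pos + neg_debiased))))
```

With `tau_plus=0` the clamp never binds (each cosine similarity is at least -1), so the expression reduces exactly to the standard InfoNCE loss; increasing `tau_plus` shrinks the negative term when matched pairs score highly, which is the intended correction for contaminated negatives.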