Unsupervised Machine Learning for Adaptive Immune Receptors with immuneML

Milena Pavlović
Charlotte Würtzen
Chakravarthi Kanduri
Maria Mamica
Lonneke Scheffer
Christin Lund-Andersen
John Gubatan
Theresa Ullmann
Victor Greiff
Geir Kjetil Sandve

Abstract

Machine learning (ML) enables the analysis of adaptive immune receptor repertoires (AIRRs) for biomarker identification and therapeutic development. Because the majority of AIRR data is partially or imperfectly labeled, unsupervised ML is essential for motif discovery, biologically meaningful clustering, and generation of novel receptor sequences. However, no unified framework for unsupervised ML exists in the AIRR field, which hinders the assessment of model robustness and generalizability. Here, we present an immuneML release that advances unsupervised ML in the AIRR field through unified clustering workflows, interpretable generative modeling, integration with protein language model embeddings, dimensionality reduction, and visualization. We demonstrate immuneML’s utility in three use cases: (i) benchmarking generative models for epitope-specific sequence generation, assessing specificity and novelty; (ii) systematic evaluation of clustering approaches on experimental receptor sequences against biological properties, such as epitope specificity and MHC; and (iii) unsupervised analysis of an experimental AIRR dataset to examine potential confounding, a practice widespread in related fields but unexplored in AIRR analyses.

Version published to 10.64898/2026.04.15.718648 on bioRxiv
Apr 18, 2026

Related articles

Machine learning uncovers circulating biomarkers and molecular heterogeneity in obesity and type 2 diabetes

This article has 9 authors:
1. Erdenetsetseg Nokhoijav
2. Miklós Káplár
3. Sándor Csaba Aranyi
4. András Berzi
5. Göran Bergström
6. Konstantinos Antonopoulos
7. Fredrik Edfors
8. Miklós Emri
9. Éva Csősz
This article has no evaluations. Latest version: Apr 20, 2026

Knowledge Inclusive Machine Learning for Disease Gene Prioritisation

This article has 16 authors:
1. Chathura J. Gamage
2. Yu Xia
3. Ravisha Rupasinghe
4. Sachith Seneviratne
5. Damith Senanayake
6. Tamasha Malepathirana
7. Asela Hevapathige
8. Mark Corbett
9. Terence J. O’Brien
10. Steven Petrou
11. Samuel F. Berkovic
12. Ingrid E. Scheffer
13. Jozef Gecz
14. Melanie Bahlo
15. Mark F. Bennett
16. Saman Halgamuge
This article has no evaluations. Latest version: May 2, 2026

AI-enabled virtual immunopeptidomics links quantitative neoantigen presentation to immunogenicity

This article has 8 authors:
1. Yuhao Tan
2. Ziqi Yang
3. Tong Wang
4. Hailong Hu
5. Julia Fleming
6. Mingyao Pan
7. Laurence C. Eisenlohr
8. Bo Li
This article has no evaluations. Latest version: May 10, 2026
