Lab-in-the-loop therapeutic antibody design with deep learning
This article has been reviewed by the following groups:
Listed in:
- Evaluated articles (Arcadia Science)
Abstract
Therapeutic antibody design is a complex multi-property optimization problem with substantial promise for improvement with the application of machine-learning methods. Towards realizing that promise, we introduce “Lab-in-the-loop,” a new approach that orchestrates state-of-the-art repertoire mining methods, generative machine learning models, multi-task property predictors, active learning ranking and selection, and in vitro experimentation in a semi-autonomous, iterative optimization loop. By automating the design of antibody variants, property prediction, ranking and selection of designs to assay in the lab, and ingestion of in vitro data, we enable an end-to-end approach to developing computationally-informed therapeutic antibody design pipelines. We apply lab-in-the-loop to eleven seed antibodies obtained via animal immunization with four clinically relevant antigen targets: EGFR, IL-6, HER2, and OSM. Over 1,800 unique antibody variants are tested throughout four rounds of iterative optimization, identifying 3–100× better binding variants for all targets and 10/11 seeds, with the best binders exceeding 100 pM affinity, demonstrating a process by which end-to-end machine learning can be developed for therapeutic antibody development.
Article activity feed
-
In the first round of design, a maximum edit distance cap of 6 edits from the lead is enforced. In the second round, this cap is increased to 8, and in the third round, it is increased to 12.
This still seems quite close in sequence space to the original sequences. How do the results from this method compare to other protein engineering efforts on the same targets? Are there cases where the same residues were mutated?
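The per-round cap quoted above amounts to a simple filter on candidate variants. A minimal sketch of that schedule is below; the cap values (6, 8, 12) come from the quoted text, but the substitution-count distance and all function names are illustrative assumptions, not the authors' implementation (which may use a true edit/Levenshtein distance allowing indels).

```python
# Per-round edit-distance caps quoted from the paper.
EDIT_CAP_BY_ROUND = {1: 6, 2: 8, 3: 12}

def mutation_count(seed: str, variant: str) -> int:
    """Count substitutions between equal-length sequences (Hamming distance)."""
    if len(seed) != len(variant):
        raise ValueError("this sketch assumes equal-length sequences")
    return sum(a != b for a, b in zip(seed, variant))

def within_cap(seed: str, variant: str, design_round: int) -> bool:
    """Keep a variant only if it stays within the round's edit cap."""
    return mutation_count(seed, variant) <= EDIT_CAP_BY_ROUND[design_round]

# Hypothetical fragment of a heavy-chain framework region.
seed = "QVQLVESGGG"
print(within_cap(seed, "QVQLVESGGA", design_round=1))  # 1 edit -> True
```

Under this schedule a variant with 7 substitutions would be rejected in round 1 but accepted in round 3, which matches the reviewer's observation that designs stay close to the seed in sequence space.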
-
DCS uses a likelihood under a joint density of statistical properties, including log-probability under a protein language model, and sequence-based properties like hydrophobicity and molecular weight, calculated with BioPython
It seems, from a first pass, a little bit circular to use OOD detection on PLM-generated sequences. Presumably PLMs have "learned" various aspects of what makes a good (in-distribution) protein and already incorporate (implicitly) some concept of hydrophobicity, MW, etc. In other words, it would be interesting to see, for cases of major disagreement between the generative model and the OOD-detection model, which one is correct.
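To make the quoted DCS idea concrete, here is a minimal sketch of a joint-density score over sequence properties: each sequence is summarized by a few statistics (here GRAVY hydrophobicity and length; the paper also uses PLM log-probability and molecular weight via BioPython), and scored by its log-likelihood under a density fit to in-distribution sequences. The independent-Gaussian density, the reference statistics, and all names are illustrative assumptions, not the authors' model.

```python
import math

# Kyte-Doolittle hydropathy values (standard scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy (what BioPython's ProtParam calls GRAVY)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

def log_density(x: float, mean: float, std: float) -> float:
    """Log-pdf of a univariate Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))

def dcs_score(seq: str, ref_stats: dict) -> float:
    """Joint log-density of the sequence's properties under reference Gaussians.
    Lower scores flag a candidate as out-of-distribution."""
    props = {"gravy": gravy(seq), "length": float(len(seq))}
    return sum(log_density(props[k], m, s) for k, (m, s) in ref_stats.items())

# Reference statistics would be fit on trusted antibody sequences;
# these (mean, std) values are made up for the example.
ref = {"gravy": (-0.4, 0.3), "length": (120.0, 10.0)}
print(dcs_score("QVQLVESGGGLVQPGGSLRLSCAAS", ref))
```

The reviewer's circularity point maps cleanly onto this sketch: if the PLM log-probability were added as another property, the OOD score would partly restate what the generator already optimizes, so the disagreement cases are exactly where the extra sequence-based properties carry independent signal.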
-
3
How was this metric chosen?
-
Our results demonstrate the powerful generalization capabilities of LitL to perform antibody design across diverse antigen targets and epitopes, without human intervention, while producing real therapeutic antibodies that are viable candidates to progress in the drug discovery pipeline.
How does this connect to ultimate success or failure and profitability in the clinic? I.e., if we apply this (powerful!) approach across all drug targets from now on, will we get a significant boost in clinical performance? Or are failures in the clinic caused by other factors not addressed by this method, e.g. incomplete understanding of disease states, selecting the wrong targets, or choosing the wrong patient population?