Antibody affinity engineering using antibody repertoire data and machine learning
This article has been reviewed by the following groups
Listed in
- Evaluated articles (Arcadia Science)
Abstract
Advanced antibody discovery and engineering workflows combine high-throughput screening, deep sequencing and machine learning (ML). Most high-throughput methods, however, lack the resolution to provide absolute affinity values of antibody-antigen interactions, limiting their utility for precise engineering of binding kinetics. In this study, we utilize antibody repertoire data, affinity characterization and ML for antibody affinity engineering. Leveraging natural antibody sequence information from repertoires of immunized mice, we identified and experimentally measured affinities for 35 antigen-specific variants. Supervised ML models trained on these sequences achieved remarkable accuracy in predicting affinity, despite the limited dataset size. We utilized the trained ML model to design eight synthetic antibody variants in silico, of which seven exhibited the desired affinities. Our study illustrates the potential of this streamlined and efficient approach for precise engineering of the affinity of antibodies while reducing extensive experimental screening.
Article activity feed
To visually examine the sequence-function relationship of the characterized antibody variants, both a network plot and a phylogenetic tree were generated.
Given that your results clearly show a strong relationship between sequence similarity and binding affinity (in both the phylogenetic tree and network analysis), did you consider alternative strategies for sequence encoding, in particular ones that might capture some of this evolutionary signal? Examples include additional features derived from the phylogenetic tree, network-based distances, or embeddings from protein language models (like ESM).
These kinds of features might be especially valuable in a small-sample setting like this one and could further boost the predictive power of your models. Very nice study! Great to see creative and effective ways to leverage the power of small experimental datasets for protein function prediction.
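As a rough illustration of the network-based distance idea, the sketch below encodes each sequence as a vector of edit distances to a set of reference sequences. It is only a minimal example under stated assumptions: the function names `levenshtein` and `distance_features` and the toy sequences are hypothetical, and nothing here reflects the encoding or the data actually used in the study.

```python
from typing import List, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def distance_features(sequences: Sequence[str],
                      references: Sequence[str]) -> List[List[int]]:
    """Encode each sequence as its edit distances to every reference sequence."""
    return [[levenshtein(seq, ref) for ref in references] for seq in sequences]


# Hypothetical toy sequences for illustration only; not taken from the study.
toy_sequences = ["ARDYYGSS", "ARDYYGST", "ARGYYGSS", "TRDWYFDV"]

# Using the characterized variants themselves as references yields a square,
# symmetric distance matrix whose rows can serve as feature vectors.
features = distance_features(toy_sequences, toy_sequences)
for seq, row in zip(toy_sequences, features):
    print(seq, row)
```

Each row could then be passed to a regression model alongside, or in place of, a per-position encoding. Whether this helps with a few dozen measured variants would need to be checked with cross-validation.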