Evaluating Expert Specialization in Mixture-of-Experts Antibody Language Models

Sarah M. Burbach
Simone Spandau
Jonathan Hurtado
Bryan Briney


Abstract

Antibody language models (AbLMs) show an impressive aptitude for learning antibody features, but tend to struggle to learn the highly diverse, non-templated regions of antibodies. Existing AbLMs use dense architectures, in which all model parameters are applied to every amino acid token. We hypothesized that the modular nature of antibodies could benefit from a sparse mixture-of-experts (MoE) architecture, allowing specific parameters (referred to as ‘experts’) to specialize in distinct antibody features. While MoE architectures are widely adopted and optimized in natural language processing, they are less common in biological modeling. We therefore assess existing MoE routing strategies and find that token-choice routing strategies outperform expert-choice routing, presumably due to their specialization in CDRH3 residues. We further optimize the token-choice router for AbLMs by minimizing the routing of padding tokens, which enables pre-training with varying sequence lengths. Finally, we show that a large-scale baseline antibody language model with a Top-2 MoE architecture (BALM-MoE), trained on a mixture of unpaired and paired antibody sequences, outperforms its dense counterpart with the same number of active parameters.
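The two routing families compared in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy logits, the `capacity` parameter, and the choice to skip padding tokens with a boolean mask are all assumptions made for illustration, and the abstract does not say how padding routing is actually minimized in BALM-MoE. In token-choice routing each token selects its top-k experts; in expert-choice routing each expert selects the tokens it scores highest.

```python
import math


def softmax(row):
    """Numerically stable softmax over one token's expert logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def token_choice_route(logits, k=2, pad_mask=None):
    """Token-choice: every non-padding token picks its top-k experts.

    Returns, per token, a list of (expert_index, renormalized_weight).
    Skipping masked tokens is an illustrative assumption.
    """
    assignments = []
    for t, row in enumerate(logits):
        if pad_mask is not None and pad_mask[t]:
            assignments.append([])  # padding token is routed to no expert
            continue
        probs = softmax(row)
        top = sorted(range(len(probs)), key=lambda e: probs[e], reverse=True)[:k]
        norm = sum(probs[e] for e in top)
        assignments.append([(e, probs[e] / norm) for e in top])
    return assignments


def expert_choice_route(logits, capacity, pad_mask=None):
    """Expert-choice: every expert picks its `capacity` highest-scoring tokens."""
    probs = [softmax(row) for row in logits]
    valid = [t for t in range(len(logits)) if not (pad_mask and pad_mask[t])]
    chosen = {}
    for e in range(len(logits[0])):
        ranked = sorted(valid, key=lambda t: probs[t][e], reverse=True)
        chosen[e] = ranked[:capacity]
    return chosen


# Toy router logits: 4 tokens x 4 experts; the last token is padding.
logits = [
    [2.0, 1.0, 0.0, -1.0],
    [0.0, 0.5, 3.0, 1.0],
    [0.5, 1.0, 1.5, 5.0],
    [0.0, 0.0, 0.0, 0.0],
]
pad_mask = [False, False, False, True]

assignments = token_choice_route(logits, k=2, pad_mask=pad_mask)
chosen = expert_choice_route(logits, capacity=1, pad_mask=pad_mask)
```

With `k=2`, the token-choice function corresponds to the Top-2 setting named in the abstract: every real token is processed by exactly two experts, whereas under expert-choice routing a token may be selected by several experts or by none.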

Version published to 10.64898/2026.04.17.719246 on bioRxiv
Apr 22, 2026

Related articles

sdAbs-LLM: Generative Large Language Models For de novo Antibody Design and Agentic Evaluation

This article has 4 authors:
1. Delower Hossain
2. Fuad Al Abir
3. Sixue Zhang
4. Jake Y. Chen
This article has no evaluations. Latest version Apr 21, 2026

Autoresearch Discovery of Interpretable Filter Rules for Antibody Binder Classification

This article has 1 author:
1. Mikel Landajuela
This article has no evaluations. Latest version May 11, 2026

Simple baselines rival protein language models in mutation-dense design of function tasks

This article has 2 authors:
1. Itay Talpir
2. Sarel J. Fleishman
This article has no evaluations. Latest version May 6, 2026
