peleke-1: A Suite of Protein Language Models Fine-Tuned for Targeted Antibody Sequence Generation
Abstract
Therapeutic antibody discovery is traditionally an arduous process. Today, lab-based antibody discovery consists of several time-consuming steps: live animal immunization, B-cell harvesting, hybridoma creation, and downstream engineering and evaluation. However, artificial intelligence in drug design has been shown to be effective for rapidly generating protein-specific binders, small molecules, and even antibody therapeutics, thereby replacing some of the primary steps of the drug discovery process. Here we present peleke-1, a suite of protein language models fine-tuned from state-of-the-art large language models using curated antibody-antigen complex data. Given an antigen sequence as input, these models generate targeted antibody Fv sequences at scale. This suite of models provides a reliable, artificial intelligence-driven approach for in silico therapeutic antibody discovery, along with an open-source framework for future antibody language model tuning.