The Virtual Lab: AI Agents Design New SARS-CoV-2 Nanobodies with Experimental Validation

Abstract

Science frequently benefits from teams of interdisciplinary researchers. However, most scientists don’t have access to experts from multiple fields. Fortunately, large language models (LLMs) have recently shown an impressive ability to aid researchers across diverse domains by answering scientific questions. Here, we expand the capabilities of LLMs for science by introducing the Virtual Lab, an AI-human research collaboration to perform sophisticated, interdisciplinary science research. The Virtual Lab consists of an LLM principal investigator agent guiding a team of LLM agents with different scientific backgrounds (e.g., a chemist agent, a computer scientist agent, a critic agent), with a human researcher providing high-level feedback. We design the Virtual Lab to conduct scientific research through a series of team meetings, where all the agents discuss a scientific agenda, and individual meetings, where an agent accomplishes a specific task. We demonstrate the power of the Virtual Lab by applying it to design nanobody binders to recent variants of SARS-CoV-2, which is a challenging, open-ended research problem that requires reasoning across diverse fields from biology to computer science. The Virtual Lab creates a novel computational nanobody design pipeline that incorporates ESM, AlphaFold-Multimer, and Rosetta and designs 92 new nanobodies. Experimental validation of those designs reveals a range of functional nanobodies with promising binding profiles across SARS-CoV-2 variants. In particular, two new nanobodies exhibit improved binding to the recent JN.1 or KP.3 variants of SARS-CoV-2 while maintaining strong binding to the ancestral viral spike protein, suggesting exciting candidates for further investigation. This demonstrates the ability of the Virtual Lab to rapidly make impactful, real-world scientific discovery.

Article activity feed

  1. This Zenodo record is a permanently preserved version of a PREreview. You can view the complete PREreview at https://prereview.org/reviews/15776765.

    Suggestions for the manuscript "The Virtual Lab: AI Agents Design New SARS-CoV-2 Nanobodies with Experimental Validation": Below, we outline key strengths of the study as well as areas for improvement, intended to support refinement of the manuscript and broader impact of the work.

    Strengths

    Innovative Multi-Agent System Design

    • Novel orchestration of collaboration between multiple LLM agents with distinct roles (e.g., scientific critic, ML expert), guided by a human-in-the-loop.

    • Modular and customizable architecture — tasks can be assigned to different agents or LLMs; tools like ESM, AlphaFold-Multimer, and Rosetta can be swapped.

    Automated and Efficient Workflow

    • Pipeline automates team formation, tool selection, role assignment, code generation, and iterative improvement steps.

    • Demonstrates low-lift, low-cost screening of trillions of nanobody sequences in minutes, accelerating discovery.

    • Faster iteration via parallel meetings (5–10 mins) followed by human adjudication to select optimal outputs.

    Biological Validation

    • Two promising nanobody candidates targeting new SARS-CoV-2 variants (JN.1 and KP.3) identified and experimentally validated via wet-lab expression and ELISA.

    • Strong hybrid validation: AlphaFold-Multimer and Rosetta scores matched prior high-accuracy benchmarks for RBD-antibody binding.

    Transparency and Reproducibility

    • All steps documented; easy to install and run in Jupyter notebooks.

    • GitHub repo well-organized and allows re-running experiments or customizing prompts.

    Areas for Improvement

    Validation and Evaluation Gaps

    • Only one use case (SARS-CoV-2); need additional domains to demonstrate generalizability.

    • Results not benchmarked against simpler alternatives (e.g., using ChatGPT alone or running same prompts without agent structure).

    • No quantitative assessment of how much each metric (e.g., ESM, Rosetta) contributed to final ranking — weighting system not validated.
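
    One way the missing assessment could be approached is a leave-one-metric-out ablation: recompute the ranking with each metric removed and measure how far it drifts from the full ranking. The sketch below is illustrative only. The candidate names, metric values, metric keys (esm_llr, af_iplddt, rosetta_dg), and weights are all hypothetical placeholders, not values from the manuscript, and the z-score normalization is an assumption made so that metrics on different scales can be combined.

```python
from statistics import mean, pstdev

# Hypothetical candidates and metric values, for illustration only.
CANDIDATES = {
    "nb_A": {"esm_llr": 2.1, "af_iplddt": 78.0, "rosetta_dg": -45.0},
    "nb_B": {"esm_llr": 0.4, "af_iplddt": 85.0, "rosetta_dg": -38.0},
    "nb_C": {"esm_llr": 3.0, "af_iplddt": 62.0, "rosetta_dg": -51.0},
    "nb_D": {"esm_llr": 1.2, "af_iplddt": 70.0, "rosetta_dg": -30.0},
}

# Placeholder weights (not from the manuscript). The sign encodes direction:
# a more negative binding energy is treated as better, hence the negative weight.
WEIGHTS = {"esm_llr": 0.25, "af_iplddt": 0.5, "rosetta_dg": -0.25}


def zscores(values):
    """Standardize values so metrics on different scales are comparable."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def weighted_scores(candidates, weights):
    """Weighted sum of z-scored metrics for every candidate."""
    names = sorted(candidates)
    totals = dict.fromkeys(names, 0.0)
    for metric, weight in weights.items():
        standardized = zscores([candidates[n][metric] for n in names])
        for name, z in zip(names, standardized):
            totals[name] += weight * z
    return totals


def ranks(scores):
    """Rank 1 is the best score; ties are broken by name order, not averaged."""
    ordered = sorted(sorted(scores), key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def spearman(rank_a, rank_b):
    """Spearman rank correlation for two rankings without tied ranks."""
    n = len(rank_a)
    d_squared = sum((rank_a[k] - rank_b[k]) ** 2 for k in rank_a)
    return 1 - 6 * d_squared / (n * (n**2 - 1))


def leave_one_out(candidates, weights):
    """Correlation between the full ranking and each single-metric ablation."""
    full = ranks(weighted_scores(candidates, weights))
    result = {}
    for dropped in weights:
        reduced = {m: w for m, w in weights.items() if m != dropped}
        result[dropped] = spearman(full, ranks(weighted_scores(candidates, reduced)))
    return result


if __name__ == "__main__":
    for metric, rho in leave_one_out(CANDIDATES, WEIGHTS).items():
        print(f"without {metric}: Spearman rho vs. full ranking = {rho:.2f}")
```

    A correlation near 1 after dropping a metric would suggest that metric contributes little to the final ordering, while a low correlation would suggest the ranking depends heavily on it. The same loop could be repeated over a grid of weights to test how sensitive the selection is to the chosen weighting.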

    Multi-Agent Design Transparency

    • Lacks detailed description of how the multi-agent architecture was built or optimized.

    • Decisions such as number of parallel meetings (N=3) or agenda structure are not justified; optimization unclear.

    • Would benefit from defining how agents are "given" tool access — Appendix suggests tools may have been run externally.

    Output Quality and Reliability

    • Although agents wrote 99% of words, unclear how many were useful — could be excessive verbosity or hallucinations.

    • Human had to correct errors, including basic file I/O issues and tool execution bugs — suggests limited agent autonomy.

    • Scientific critic, while helpful, could introduce misleading conclusions without oversight.

    Usability and Performance Concerns

    • Agents cannot access real-time data (e.g., newly emerging COVID variants).

    • Pipeline is not fully automated — requires step-by-step human prompting (e.g., between ESM → AlphaFold → Rosetta stages).

    • Variability in output not well quantified; how consistent is the pipeline when re-run?
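
    As a concrete suggestion for quantifying run-to-run consistency, the pipeline could be re-run several times and the overlap of the top-ranked designs compared across runs. The sketch below uses the mean pairwise Jaccard index of the top-k sets. The function names, the example scores, and the choice of the Jaccard index are assumptions for illustration, not part of the manuscript.

```python
from itertools import combinations


def top_k(scores, k):
    """Names of the k highest-scoring designs in one pipeline run."""
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def jaccard(a, b):
    """Size of the intersection over size of the union; 1.0 for two empty sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(runs, k):
    """Average top-k overlap across every pair of runs."""
    if len(runs) < 2:
        raise ValueError("need at least two runs to compare")
    tops = [top_k(run, k) for run in runs]
    pairs = list(combinations(tops, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical scores from three re-runs of the same pipeline.
    runs = [
        {"nb_A": 0.9, "nb_B": 0.7, "nb_C": 0.4, "nb_D": 0.1},
        {"nb_A": 0.8, "nb_C": 0.6, "nb_B": 0.5, "nb_D": 0.2},
        {"nb_B": 0.9, "nb_A": 0.8, "nb_D": 0.3, "nb_C": 0.2},
    ]
    print(f"mean top-2 Jaccard overlap: {mean_pairwise_jaccard(runs, k=2):.2f}")
```

    A value of 1.0 means every run selected the same top-k designs; values near 0 mean the selection changes substantially between runs. Reporting this for a few values of k would give readers a direct measure of reproducibility.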

    Presentation and Accessibility

    • Figures could use larger text and more interpretation.

    • Would benefit from a walkthrough video showing pipeline from start to finish.

    • Nanobody biology and software tool functions could be better explained for broader audiences.

    Scientific Rigor and Detail

    • Methods section lacks detail, especially around how LLMs interface with software (e.g., was AlphaFold run internally?).

    • Evaluation could be strengthened by adding in vitro neutralization assays or cell-based functional screens beyond ELISA.

    • Could compare LLM-designed structures against traditional bioinformatics pipelines to assess the true added value.

    Competing interests

    The authors declare that they have no competing interests.