Multimodal AI agents for capturing and sharing laboratory practice

Abstract

We present a multimodal AI laboratory agent that captures and shares tacit experimental practice by linking written instructions with hands-on laboratory work through the analysis of video, speech, and text. While current AI tools have proven effective in literature analysis and code generation, they do not address the critical gap between documented knowledge and implicit lab practice. Our framework bridges this divide by integrating protocol generation directly from researcher-recorded videos, systematic detection of experimental errors, and evaluation of instrument readiness by comparing current performance against historical decisions. Evaluated on mass spectrometry-based proteomics, the agent captures and shares practical expertise beyond conventional documentation and identifies common mistakes, although its domain-specific and spatial recognition still require improvement. This agentic approach enhances reproducibility and accessibility in proteomics and provides a generalizable model for other fields dominated by complex, hands-on procedures. This study lays the groundwork for community-driven, multimodal AI systems that augment rather than replace the rigor of scientific practice.