Self-Explained Thinking Agent for Autonomous Microscopy Restoration
Abstract
Choosing image restoration algorithms is a ubiquitous task in microscopy, especially when the image degradation is unknown. However, the process of choosing and tuning algorithms remains labor-intensive, subjective, and bottlenecked by human attention. Most existing methods rely on traditional hand-crafted features or specialist deep models, lacking the ability to generalize to diverse degradations or to reason adaptively like human scientists. Here, we present the next paradigm: the rise of multimodal large language model (MLLM)-based thinking agents that integrate perception, reasoning, and action. Our agent engages with both visual and textual inputs to identify degradation types in microscopy images, select appropriate restoration strategies, and iteratively refine its decisions based on feedback. Unlike static pipelines, the agent improves its performance over time by interacting with users or simulated environments, progressively aligning its internal representations with expert reasoning patterns. This enables it not only to choose image restoration algorithms with high accuracy, but also to explain its actions, justify its choices, and adapt to unfamiliar scenarios. We demonstrate the effectiveness of our approach across diverse microscopy modalities and degradation contexts, showing that it can rival or surpass human-level performance while continuously evolving its domain expertise. Our work marks a step toward autonomous systems that do not merely execute image restoration but also provide explainable reasoning. To help biologists use this tool to accelerate scientific discoveries, we provide a comprehensive software suite, including a training framework, pre-trained models, and a web-based user interface.