Self-Explained Thinking Agent for Autonomous Microscopy Restoration
Abstract
Choosing image restoration algorithms is a ubiquitous task in microscopy, especially when the image degradation is unknown. However, the process of choosing and tuning algorithms remains labor-intensive, subjective, and bottlenecked by human attention. Most existing methods rely on traditional hand-crafted features or specialist deep models, lacking the ability to generalize to diverse degradations or to reason adaptively like human scientists. Here, we present the next paradigm: the rise of multimodal large language model (MLLM)-based thinking agents that integrate perception, reasoning, and action. Our agent engages with both visual and textual inputs to identify degradation types in microscopy images, select appropriate restoration strategies, and iteratively refine its decisions based on feedback. Unlike static pipelines, the agent improves its performance over time by interacting with users or simulated environments, progressively aligning its internal representations with expert reasoning patterns. This enables it not only to choose image restoration algorithms with high accuracy, but also to explain its actions, justify its choices, and adapt to unfamiliar scenarios. We demonstrate the effectiveness of our approach across diverse microscopy modalities and degradation contexts, showing that it can rival or surpass human-level performance while continuously evolving its domain expertise. Our work marks a step toward autonomous systems that do not merely execute image restoration but also provide explainable reasoning. To help biologists use this tool to accelerate scientific discoveries, we provide a comprehensive software suite, including a training framework, pre-trained models, and a web-based user interface.