Med-AgentX: Multimodal Large Language Model Agents with Explainable Reinforcement Learning for Trustworthy Biomedical Decision Support

Abstract

Biomedical decision support systems face persistent challenges in integrating multimodal patient data and providing clinicians with trustworthy, transparent recommendations. While large language models (LLMs) and multimodal foundation models demonstrate remarkable reasoning across text, images, and structured records, their black-box nature and brittleness under noisy or adversarial conditions hinder deployment in safety-critical healthcare settings. We propose Med-AgentX, a multimodal LLM-agent framework enhanced with reinforcement learning (RL) and attribution-based explainability. Med-AgentX fuses clinical text, imaging, and structured data through cross-modal attention, while an RL policy optimizes for diagnostic accuracy, robustness, and explanation fidelity. Human-in-the-loop feedback further aligns the system with clinical expertise. Experiments on benchmark datasets demonstrate that Med-AgentX outperforms strong baselines in predictive accuracy and robustness while offering interpretable rationales validated by clinicians. Taken together, Med-AgentX represents a step toward next-generation biomedical AI that is powerful, accountable, and clinician-aligned.
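For intuition, the sketch below shows one way a composite RL objective over accuracy, robustness, and explanation fidelity could be expressed as a weighted sum. This is a minimal illustration under assumed weights and signal definitions; the function names and coefficients are hypothetical and do not reflect the paper's actual formulation.

```python
# Hypothetical sketch of a composite RL reward in the spirit of Med-AgentX.
# Weights and component signals are illustrative assumptions, not the paper's method.
from dataclasses import dataclass


@dataclass
class RewardWeights:
    accuracy: float = 1.0      # weight on diagnostic correctness
    robustness: float = 0.5    # weight on agreement under perturbed/noisy inputs
    fidelity: float = 0.5      # weight on attribution/explanation faithfulness


def composite_reward(correct: bool,
                     perturbed_agreement: float,
                     explanation_fidelity: float,
                     w: RewardWeights = RewardWeights()) -> float:
    """Combine accuracy, robustness, and explanation-fidelity signals
    into a single scalar reward for the RL policy."""
    r_acc = 1.0 if correct else 0.0
    return (w.accuracy * r_acc
            + w.robustness * perturbed_agreement
            + w.fidelity * explanation_fidelity)


# Example: a correct prediction that stays stable under noise (0.9 agreement)
# with a fairly faithful explanation (0.8) yields a reward of 1.85.
print(composite_reward(True, 0.9, 0.8))
```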
