A Multimodal Large Reasoning Model For Fair and Interpretable Dermatological Diagnosis Across Skin Tones

Abstract

The clinical translation of dermatological artificial intelligence is severely limited by opaque decision reasoning and systematic performance disparities across skin tones. Here, we introduce SkinGPT-R1, a multimodal large reasoning model explicitly designed for trustworthy, interpretable, and fairness-aware skin disease diagnosis. SkinGPT-R1 unifies chain-of-thought diagnostic reasoning with a fairness-aware mixture-of-experts architecture to enable equitable and human-readable diagnostic outputs. Trained on 334,168 clinical samples, SkinGPT-R1 generates comprehensive diagnostic reports comprising visual findings, differential reasoning, and a final diagnosis. On six independent external validation datasets covering diverse dermatological conditions and imaging settings, SkinGPT-R1 achieves state-of-the-art diagnostic accuracy. On a challenging 40-class long-tail classification task, it attains 82.5% accuracy, an absolute improvement of 19.3 percentage points over strong baseline models. In a blinded evaluation by five board-certified dermatologists using 1,000 phenotypically balanced cases, SkinGPT-R1 achieves a mean overall score of 3.6 out of 5, with the highest ratings for safety (3.8/5) and reasoning coherence (3.6/5), confirming that its generated rationales are clinically valid, logically consistent, and suitable for supporting clinical decision-making. Critically, SkinGPT-R1 effectively mitigates algorithmic bias across the full Fitzpatrick skin tone spectrum, achieving a robust worst-group performance of 41.40% on the Fitz17k benchmark and a five-fold improvement in lower-bound accuracy on the DDI dataset relative to standard multimodal baselines. These results establish a generalizable framework for fair, interpretable, and clinically trustworthy AI-assisted dermatological diagnosis, addressing key obstacles to real-world clinical deployment and advancing health equity in dermatological care.
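The worst-group figure quoted in the abstract can be read as the accuracy of the skin-tone group on which a model performs worst. The abstract does not give the exact grouping or metric definition, so the following is only a minimal sketch under that assumption. The function name `worst_group_accuracy`, the Fitzpatrick groupings, and the toy labels are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def worst_group_accuracy(y_true, y_pred, groups):
    """Return (worst_group, worst_accuracy, per_group_accuracy).

    Assumption: "worst-group performance" means the minimum per-group
    accuracy, where each sample belongs to exactly one skin-tone group.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    if not total:
        raise ValueError("at least one sample is required")
    per_group = {g: correct[g] / total[g] for g in total}
    worst = min(per_group, key=per_group.get)
    return worst, per_group[worst], per_group


# Toy, made-up labels purely for illustration (not data from the paper).
y_true = ["eczema", "psoriasis", "eczema", "melanoma", "eczema", "psoriasis", "melanoma"]
y_pred = ["eczema", "psoriasis", "psoriasis", "melanoma", "eczema", "eczema", "nevus"]
groups = ["I-II", "I-II", "III-IV", "III-IV", "V-VI", "V-VI", "V-VI"]

worst, acc, per_group = worst_group_accuracy(y_true, y_pred, groups)
print(worst, round(acc, 3), per_group)
```

Reporting the minimum alongside the overall accuracy matters because a high average can hide a group on which the model fails far more often.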
