The Guideline for Building Fair Multimodal Medical AI with Large Vision-Language Model

Bo Yan
Wenqi Zeng
Yuqi Sun
Weimin Tan
Xue Zhou
Chenxi Ma

Abstract

Multimodal medical artificial intelligence (AI) is increasingly recognized as superior to traditional single-modality approaches because it integrates diverse data sources, which aligns closely with clinical diagnostic processes. However, the impact of multimodal information interactions on model fairness remains unknown, which poses a critical challenge for equitable AI deployment in healthcare. Here, we extend fairness research to multimodal medical AI and leverage large-scale medical vision-language models (VLMs) to provide guidelines for building fair multimodal AI. Training on large and diverse datasets enables medical VLMs to discern variances across populations, thereby offering more equitable insight than single data sources. Our analysis covers three key medical domains (dermatology, radiology, and ophthalmology), focusing on how patient metadata interacts with medical images to affect model fairness across dimensions such as gender, age, and skin tone. Our findings reveal that the indiscriminate inclusion of all metadata may negatively impact fairness for protected subgroups and show how multimodal AI utilizes demographic information in metadata to influence fairness. In addition, we conduct an in-depth analysis of how clinical attributes affect model performance and fairness, covering more than 20 different attributes in dermatology. Finally, we propose a fairness-oriented metadata selection strategy that uses recent advances in large medical VLMs to guide attribute selection. Remarkably, we find that the fairness correlations computed by the medical VLM closely align with our experimental results, which required over 500 GPU hours, demonstrating a resource-efficient approach to guide multimodal integration. Our work underscores the importance of careful metadata selection in achieving fairness in multimodal medical AI. We anticipate that our analysis will be a starting point for more sophisticated, fairness-aware multimodal medical AI models.

Version published to 10.21203/rs.3.rs-5015239/v1 on Research Square
Oct 24, 2024

Related articles

An In-Depth Survey of Multimodal Foundation Models and Their Challenges

This article has 2 authors:
1. Haoran Yijun
2. Shufen Zhihao
This article has no evaluations. Latest version: Jul 1, 2025

A synthetic data generation framework for scalable and resource-efficient medical AI assistants

This article has 10 authors:
1. Abdurrahim Yilmaz
2. Furkan Yuceyalcin
3. Rahmetullah Varol
4. Ece Gokyayla
5. Ozan Erdem
6. Donghee Choi
7. Ali Anil Demircali
8. Gulsum Gencoglan
9. Joram M. Posma
10. Burak Temelkuran
This article has no evaluations. Latest version: May 18, 2025

Medical Lie Detector (MLD): A Hybrid System for Validating AI Clinical Compiled Summaries

This article has 7 authors:
1. Iyad Sultan
2. Mais Altarawneh
3. Belal Lahham
4. Haitham Aryan
5. Ahmad Nasayreh
6. Hasan Gharaibeh
7. Bayan Altalla
This article has no evaluations. Latest version: Jun 3, 2025
