Hybrid Active Learning with Privacy-Preserving Synthetic Data for Medical Multimodal LLM Enhancement

Abstract

We propose a novel framework for enhancing medical multimodal learning that integrates hybrid active learning (HAL) with privacy-preserving synthetic data generation, addressing the twin challenges of data scarcity and patient privacy. The framework reformulates the Large Language Model (LLM) module as a dynamic multi-agent system in which modality-specific models collaborate through reinforcement learning to optimize both diagnostic accuracy and privacy compliance. The HAL component selects the most informative unlabeled samples by combining uncertainty and diversity metrics, minimizing annotation cost while maximizing model performance. Synthetic medical data are generated under rigorous local differential privacy guarantees using a modified GAN architecture, ensuring that synthetic samples do not replicate real patient information. The multi-agent reinforcement learning mechanism dynamically adjusts key parameters, such as the trade-off between active learning criteria and privacy constraints, enabling adaptive optimization during fine-tuning. Experimental validation on multimodal medical datasets demonstrates significant improvements in diagnostic accuracy over conventional methods, particularly in low-data regimes. The proposed framework not only mitigates the privacy risks inherent in medical data but also strengthens multimodal fusion by aligning cross-modal representations. By unifying active learning, privacy preservation, and adaptive optimization in a single cohesive system, this work advances medical AI, with broad applicability to clinical decision support and automated diagnostics.
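The abstract does not spell out the hybrid acquisition function, but combinations of uncertainty and diversity are commonly instantiated as a greedy weighted score. Below is a minimal sketch, assuming predictive-entropy uncertainty, nearest-neighbor embedding distance for diversity (k-center style), and a mixing weight `lam`; all function and variable names are hypothetical, not taken from the paper.

```python
import numpy as np

def predictive_entropy(probs):
    """Per-sample predictive entropy; probs has shape (n_pool, n_classes)."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

def hybrid_select(probs, pool_emb, labeled_emb, k, lam=0.5):
    """Greedily pick k pool indices by lam * uncertainty + (1 - lam) * diversity,
    where diversity is the distance to the nearest labeled or already-selected
    embedding."""
    unc = predictive_entropy(probs)
    unc = unc / (unc.max() + 1e-12)  # normalize to [0, 1]
    # distance from every pool point to its nearest labeled point
    dists = np.min(
        np.linalg.norm(pool_emb[:, None, :] - labeled_emb[None, :, :], axis=2),
        axis=1,
    )
    selected = []
    for _ in range(k):
        div = dists / (dists.max() + 1e-12)
        score = lam * unc + (1.0 - lam) * div
        if selected:
            score[selected] = -np.inf  # never re-pick a chosen sample
        i = int(np.argmax(score))
        selected.append(i)
        # a chosen point now counts as labeled for the diversity term
        dists = np.minimum(dists, np.linalg.norm(pool_emb - pool_emb[i], axis=1))
    return selected
```

Raising `lam` favors uncertain samples; lowering it favors coverage of the embedding space, which is precisely the kind of knob the adaptive controller sketched further below could tune.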
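The "modified GAN architecture" is likewise left abstract. One standard way to obtain local differential privacy is to perturb each patient record with the Laplace mechanism before any model sees it, so the GAN is trained only against privatized data. The sketch below follows that assumption; the feature range, `eps`, and record shapes are illustrative, not the paper's configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_ldp(record, eps, lo=0.0, hi=1.0):
    """Release one record under eps-local-DP via the Laplace mechanism.
    With every feature clipped to [lo, hi], the L1 sensitivity of the
    identity map on a d-dimensional record is d * (hi - lo), so Laplace
    noise with scale sensitivity / eps yields eps-LDP for the record."""
    x = np.clip(np.asarray(record, dtype=float), lo, hi)
    sensitivity = x.size * (hi - lo)
    return x + rng.laplace(loc=0.0, scale=sensitivity / eps, size=x.size)

# Privatize each record once, locally; the discriminator then compares
# generator output against these noisy records, never against raw data.
raw_records = rng.random((4, 8))  # 4 hypothetical patients, 8 features each
private_records = np.stack([laplace_ldp(r, eps=4.0) for r in raw_records])
```

Because the noise scale grows with the record dimension, practical variants typically privatize a low-dimensional embedding rather than raw features; the abstract does not say which the authors use.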
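Finally, the multi-agent reinforcement-learning controller is described only at a high level. As a deliberately simplified single-agent stand-in, an epsilon-greedy bandit over a small grid of candidate trade-off weights captures the core idea of adapting `lam` (and, analogously, the privacy budget) from observed rewards. The reward shaping here, validation accuracy minus a privacy penalty, is an assumption, not the paper's stated objective.

```python
import numpy as np

class TradeoffBandit:
    """Epsilon-greedy bandit over candidate trade-off weights. Each round
    it proposes a lam, the training loop runs one acquisition/fine-tune
    step, and the observed reward updates that arm's running value."""

    def __init__(self, arms=(0.1, 0.3, 0.5, 0.7, 0.9), eps=0.1, seed=0):
        self.arms = np.asarray(arms)
        self.eps = eps
        self.counts = np.zeros(len(arms))
        self.values = np.zeros(len(arms))
        self.rng = np.random.default_rng(seed)
        self._last = None

    def pick(self):
        """Explore a random arm with probability eps, else exploit."""
        if self.rng.random() < self.eps:
            self._last = int(self.rng.integers(len(self.arms)))
        else:
            self._last = int(np.argmax(self.values))
        return float(self.arms[self._last])

    def update(self, reward):
        """Incremental-mean update of the last-picked arm's value."""
        i = self._last
        self.counts[i] += 1
        self.values[i] += (reward - self.values[i]) / self.counts[i]
```

Each fine-tuning round would call `pick()`, run `hybrid_select` with the returned weight, and feed back `update(val_accuracy - beta * privacy_cost)`, where `beta` is a hypothetical penalty coefficient.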
