Wavelet-Domain Privacy SGD (WDP-SGD): Frequency-Selective Privacy-Preserving Medical AI


Abstract

Protecting sensitive medical data during training is critical because transformer gradients can leak patient-specific information. We introduce a privacy-preserving clinical AI framework that integrates three complementary elements: (i) Bayesian synthetic data generation to produce epidemiologically realistic yet non-identifiable electronic health records, (ii) Wavelet-Domain Privacy Stochastic Gradient Descent (WDP-SGD) to apply frequency-selective noise to gradient updates of BERT-based classifiers, and (iii) multi-modal privacy auditing to empirically monitor potential information leakage. Unlike conventional differential privacy, which injects uniform noise, WDP-SGD perturbs high-frequency gradient components that disproportionately encode patient-specific information while preserving low-frequency components containing generalisable medical knowledge. Applied to a large synthetic medical text corpus covering multiple conditions, our approach consistently delivers stronger privacy protection and improved model performance relative to standard DP-SGD while maintaining convergence behaviour close to a non-private baseline. Privacy attack simulations, including membership inference, attribute inference and gradient reconstruction, further demonstrate enhanced resilience to adversarial attempts to extract sensitive information. These results indicate that wavelet-based differential privacy offers a practical pathway to privacy-conscious clinical language models, achieving a more favourable balance between privacy and utility than existing uniform-noise methods.
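The abstract does not specify the wavelet family, decomposition depth, or noise calibration used in WDP-SGD, so the following is only an illustrative sketch of the frequency-selective idea: decompose a gradient vector with a single-level Haar transform, add Gaussian noise to the high-frequency (detail) coefficients only, and reconstruct. The function names (`haar_dwt`, `haar_idwt`, `wdp_noise`) and the single-level Haar choice are assumptions for illustration; a real implementation would also need per-sample gradient clipping and privacy-accountant-calibrated noise to obtain formal differential-privacy guarantees.

```python
import numpy as np

def haar_dwt(x):
    """Single-level Haar transform of an even-length vector.

    Returns (approximation, detail): low- and high-frequency halves.
    """
    a = (x[0::2] + x[1::2]) / np.sqrt(2)  # low-frequency (approximation)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)  # high-frequency (detail)
    return a, d

def haar_idwt(a, d):
    """Inverse of haar_dwt: interleave reconstructed even/odd samples."""
    x = np.empty(a.size * 2)
    x[0::2] = (a + d) / np.sqrt(2)
    x[1::2] = (a - d) / np.sqrt(2)
    return x

def wdp_noise(grad, sigma, rng):
    """Frequency-selective perturbation (hypothetical sketch):
    Gaussian noise on detail coefficients only, approximation kept intact.
    """
    a, d = haar_dwt(grad)
    d_noisy = d + rng.normal(0.0, sigma, size=d.shape)
    return haar_idwt(a, d_noisy)

# Usage: the approximation coefficients survive the perturbation exactly,
# which is the sense in which low-frequency "generalisable" structure is kept.
rng = np.random.default_rng(0)
g = np.arange(8, dtype=float)           # toy stand-in for a gradient slice
g_priv = wdp_noise(g, sigma=0.5, rng=rng)
a_before, _ = haar_dwt(g)
a_after, _ = haar_dwt(g_priv)
assert np.allclose(a_before, a_after)   # low-frequency content unchanged
```

In contrast, standard DP-SGD would add isotropic noise to every coordinate of `g`, disturbing the approximation coefficients as well; the wavelet-domain variant concentrates the same perturbation budget on the components claimed to carry patient-specific detail.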
