NoisyFlow: Differentially Private Optimal Transport Using Neural Networks for Secure Biomedical Data Sharing
Abstract
Motivation
Advancing data sharing in biomedical research, particularly for sensitive genomic and clinical datasets, is crucial for improving model performance across diverse patient populations. However, stringent privacy concerns hinder collaboration and limit the insights that can be derived from multi-institutional datasets. Current approaches to privacy-preserving data sharing fail to address the distribution gaps between datasets held by different institutions.
Results
We introduce NoisyFlow, a differentially private neural network-based optimal transport framework designed to enable secure and unbiased biomedical data sharing. By integrating optimal transport theory with neural networks and differential privacy mechanisms, our framework aligns data distributions across institutions while preserving individual privacy. NoisyFlow eliminates the need for direct data sharing and reduces distribution shifts caused by covariate and batch effects. Empirical evaluations demonstrate the framework’s effectiveness in handling high-dimensional single-cell genomic data and histopathology images, achieving superior privacy guarantees while maintaining high utility in downstream tasks such as disease classification.
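As a rough illustration of what a differential privacy mechanism can look like when training a neural network, the sketch below clips each per-example gradient to a fixed L2 norm and adds Gaussian noise to the sum before averaging. This is the generic Gaussian-mechanism pattern used in DP-SGD-style training; the abstract does not state which mechanism NoisyFlow uses, so the function names (`clip_by_l2_norm`, `private_mean_gradient`) and the parameters `max_norm` and `noise_multiplier` are assumptions made for this example, not the authors' implementation.

```python
import math
import random


def clip_by_l2_norm(vec, max_norm):
    """Scale vec down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    scale = max_norm / norm
    return [x * scale for x in vec]


def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    Hypothetical helper: the noise standard deviation is
    noise_multiplier * max_norm, applied independently per coordinate.
    """
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        clipped = clip_by_l2_norm(grad, max_norm)
        for i, value in enumerate(clipped):
            total[i] += value
    sigma = noise_multiplier * max_norm
    noisy_total = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / n for v in noisy_total]


# Toy per-example gradients (made-up values, for illustration only).
rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]

# noise_multiplier=0.0 disables the noise and shows the effect of clipping alone.
noiseless = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=0.0, rng=rng)
noisy = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.0, rng=rng)
```

Clipping bounds how much any single record can move the update, and the noise scale is tied to that bound. Setting `noise_multiplier` to zero gives no privacy protection. The privacy budget that results from a given noise level depends on a separate accounting step that this sketch does not cover.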
Availability and implementation
The implementation of NoisyFlow is available at https://github.com/liyy2/NoisyFlow.
Contact
mark@gersteinlab.org
Supplementary information
Supplementary data are available online.