NoisyFlow: Differentially Private Optimal Transport Using Neural Networks for Secure Biomedical Data Sharing

Abstract

Motivation

Advancing data sharing in biomedical research, particularly for sensitive genomic and clinical datasets, is crucial for improving model performance across diverse patient populations. However, stringent privacy concerns hinder collaboration and limit the insights that can be derived from multi-institutional datasets. Current approaches to privacy-preserving data sharing do not address the gaps between the data distributions of different institutions.

Results

We introduce NoisyFlow, a differentially private neural network-based optimal transport framework designed to enable secure and unbiased biomedical data sharing. By integrating optimal transport theory with neural networks and differential privacy mechanisms, our framework aligns data distributions across institutions while preserving individual privacy. NoisyFlow eliminates the need for direct data sharing and reduces distribution shifts caused by covariate and batch effects. Empirical evaluations demonstrate the framework’s effectiveness in handling high-dimensional single-cell genomic data and histopathology images, achieving superior privacy guarantees while maintaining high utility in downstream tasks such as disease classification.
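The abstract does not spell out how the differential privacy mechanism is applied, so the following is only a generic illustration, not NoisyFlow's actual implementation. It sketches one common way to privatize neural network training: clip each per-example gradient to a fixed L2 norm, sum the clipped gradients, and add Gaussian noise scaled to that clipping bound. The function names (`clip_by_l2_norm`, `private_mean_gradient`) and the parameters (`max_norm`, `noise_multiplier`) are hypothetical and chosen for this example.

```python
import math
import random


def clip_by_l2_norm(grad, max_norm):
    """Scale grad so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0.0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    Illustrative assumption: a Gaussian mechanism whose standard deviation
    is noise_multiplier * max_norm. This is not taken from the article.
    """
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        clipped = clip_by_l2_norm(grad, max_norm)
        total = [t + c for t, c in zip(total, clipped)]
    # The noise scale is tied to the clipping bound, which caps the
    # contribution of any single example to the summed gradient.
    sigma = noise_multiplier * max_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(per_example_grads) for v in noisy]


# Toy per-example gradients (made-up values for illustration only).
rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
print(private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any one record can change the aggregate, which is what lets the added noise translate into a formal privacy guarantee. The actual privacy budget depends on the noise multiplier, the sampling scheme, and the number of training steps, none of which are specified here.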

Availability and implementation

The implementation of NoisyFlow is available at https://github.com/liyy2/NoisyFlow.

Contact

mark@gersteinlab.org

Supplementary information

Supplementary data are available online.
