Synthetic Image Learning: Preserving Performance and Preventing Membership Inference Attacks

Eugenio Lomurno
Matteo Matteucci


Abstract

Generative artificial intelligence has transformed the generation of synthetic data, providing innovative solutions to challenges like data scarcity and privacy, which are particularly critical in fields such as medicine. However, the effective use of this synthetic data to train high-performance models remains a significant challenge. This paper addresses this issue by introducing Knowledge Recycling (KR), a pipeline designed to optimise the generation and use of synthetic data for training downstream classifiers. At the heart of this pipeline is Generative Knowledge Distillation, the proposed technique that significantly improves the quality and usefulness of the information provided to classifiers through a synthetic dataset regeneration and soft labelling mechanism. The KR pipeline has been tested on a variety of datasets, with a focus on six highly heterogeneous medical image datasets, ranging from retinal images to organ scans. The results show a significant reduction in the performance gap between models trained on real and synthetic data, with models based on synthetic data outperforming those trained on real data in some cases. Furthermore, the resulting models show almost complete immunity to Membership Inference Attacks, manifesting privacy properties missing in models trained with conventional techniques.
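
The abstract names a soft labelling mechanism but does not spell out how it is computed. The following is a minimal sketch of one common way to soft-label synthetic images, not the authors' implementation. It assumes a hypothetical teacher classifier whose logits are available for each synthetic image, a softmax temperature, and a cross-entropy loss against the resulting probability vectors. All names and values below (`teacher_logits`, `temperature`, `soft_label_cross_entropy`) are illustrative assumptions.

```python
import numpy as np


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over the class axis."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))


def softmax(logits, temperature=1.0):
    """Class probabilities; a higher temperature gives softer labels."""
    return np.exp(log_softmax(logits, temperature))


def soft_label_cross_entropy(student_logits, soft_labels):
    """Mean cross-entropy between soft targets and student predictions."""
    return float(-(soft_labels * log_softmax(student_logits)).sum(axis=1).mean())


# Hypothetical teacher logits for three synthetic images and three classes.
teacher_logits = np.array([
    [4.0, 1.0, 0.0],   # confident prediction
    [0.5, 0.4, 0.3],   # ambiguous image
    [0.0, 0.0, 5.0],   # confident prediction
])

# Soft labels keep the teacher's uncertainty (assumed temperature of 2.0).
soft_labels = softmax(teacher_logits, temperature=2.0)

# Hard labels discard that uncertainty by keeping only the arg-max class.
hard_labels = np.eye(3)[teacher_logits.argmax(axis=1)]

# An untrained student that predicts a uniform distribution.
student_logits = np.zeros((3, 3))
loss_soft = soft_label_cross_entropy(student_logits, soft_labels)
loss_hard = soft_label_cross_entropy(student_logits, hard_labels)

print(np.round(soft_labels, 3))
print(loss_soft, loss_hard)
```

In this sketch the ambiguous second image receives a nearly uniform soft label, whereas its hard label would assert a single class with full confidence. Passing that uncertainty to the downstream classifier is the general idea behind soft labelling; how the KR pipeline produces and uses its labels is described in the full article.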

Version published to 10.31224/4060
Nov 1, 2024

Related articles

Evaluating the Utility of Synthetic Image Generation for Medical AI: A Review

This article has 3 authors:
1. Israa Atike
2. Asifa Mehmood Qureshi
3. Abhishek Kaushik
This article has no evaluations. Latest version: Dec 22, 2025

SULBA: A Task-Agnostic Data Augmentation Framework for Deep Learning in Medical Image Analysis

This article has 2 authors:
1. Ayomide Adeyemi Abe
2. Mpumelelo Nyathi
This article has no evaluations. Latest version: Jan 19, 2026

Medical Image Generation using Denoising Diffusion Probabilistic Model

This article has 6 authors:
1. Saritha A N
2. Sarvesh Rastogi
3. Shreya Bharamanna Patil
4. Basavaraj Talawar
5. Shreya Soni
6. Setu Mishra
This article has no evaluations. Latest version: Jan 12, 2026
