Metrics That Matter: A Practical Survey on Synthetic Data Evaluation


Abstract

Assessing the quality of synthetic data (SD) is vital to determine whether it can provide a viable alternative to real data. A wide variety of metrics exist to examine the three archetypal dimensions of SD evaluation: realism (fidelity), task-specific usefulness (utility), and remaining disclosure risk (privacy). Current work in SD generation often relies on the ad hoc selection of evaluation metrics without clear justification, even though the suitability of a metric depends strongly on the dataset and other contextual factors. This paper surveys the field of SD evaluation, offers guidance on metric selection based on four key questions pertaining to the task, goal, data type, and domain of the SD, and provides general practical recommendations for SD evaluation. Finally, experiments on an illustrative dataset of electronic health records show how researchers can put our insights and recommendations for SD evaluation into practice. In doing so, we aim to support researchers and practitioners seeking to generate and evaluate SD.
