Exploring the Role of Synthetic Data in the Future of AI in Healthcare: Frameworks, Challenges, and Implications
Abstract
Synthetic data generation is gaining traction as an approach to address privacy, accessibility, and representation challenges in healthcare research. This scoping review examined the breadth of existing techniques and their implications for clinical and research use. A systematic search of PubMed, IEEE Xplore, and the ACM Digital Library returned 2,906 records, of which 42 studies were included. The review identified a diverse range of approaches to generating synthetic healthcare data, which vary in how well they preserve data utility, privacy, and realism. Key findings indicate growing interest in multimodal synthesis, privacy-preserving frameworks, and evaluation strategies tailored to healthcare needs. However, inconsistent validation methods and the absence of standard benchmarks remain major limitations of the field. This review highlights the need for clearer guidance, robust evaluation protocols, and cross-sector collaboration to support the responsible integration of synthetic data into healthcare systems.