Evaluating the Potential of AI-Generated Synthetic Diaries in Parkinson’s Disease Research
Abstract
The integration of Artificial Intelligence (AI), particularly large language models such as GPT-4o, into Parkinson’s Disease (PD) research presents a novel approach for generating synthetic patient diaries. Such models offer potential benefits, including addressing data privacy concerns, overcoming limited sample sizes, and accelerating research timelines by providing alternative data sources. Relying only on its internal knowledge, GPT-4o replicated the overall symptom prevalence distributions observed in a real PD patient dataset, with no statistically significant deviation.
Despite these advantages, the widespread utility of AI-generated diaries based solely on internal knowledge is hindered by significant limitations identified in this case study. Key challenges include the failure to capture complex inter-variable correlations essential for understanding symptom co-occurrence, and a lack of the narrative richness, contextual depth, and linguistic nuance found in authentic patient reports. These findings underscore the constraints of current models in replicating real-world patient experiences without specific domain grounding. Addressing these challenges requires a multifaceted approach, including domain-specific fine-tuning, enhanced prompt engineering, and potentially hybrid data strategies to improve fidelity for high-stakes research applications.
This case study explored the baseline capabilities and limitations of using GPT-4o’s internal knowledge for synthetic PD diary generation. It emphasizes the need for a balanced approach, acknowledging the potential for exploratory uses while highlighting the necessity for rigorous validation and further development before deployment in contexts requiring high fidelity. By fostering continued research and methodological refinement, AI-driven synthetic data generation can be better harnessed to support PD research and ultimately improve patient understanding.
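The contrast drawn above, matching symptom prevalence while missing symptom co-occurrence, can be illustrated with a small sketch. It is not the study's analysis code. The two symptom columns, the toy counts, and the helper names (`prevalence`, `phi`, `max_prevalence_gap`, `max_correlation_gap`) are all assumptions made up for illustration, and the abstract does not state which statistical tests were actually used.

```python
from itertools import combinations
from statistics import mean

def prevalence(rows):
    """Fraction of diary entries reporting each symptom (columns are 0/1)."""
    return [mean(r[j] for r in rows) for j in range(len(rows[0]))]

def phi(rows, a, b):
    """Pearson (phi) correlation between two binary symptom columns."""
    xs = [r[a] for r in rows]
    ys = [r[b] for r in rows]
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = mean((x - mx) ** 2 for x in xs)
    vy = mean((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (vx * vy) ** 0.5

def max_prevalence_gap(real, synth):
    """Largest absolute per-symptom prevalence difference."""
    return max(abs(p - q) for p, q in zip(prevalence(real), prevalence(synth)))

def max_correlation_gap(real, synth):
    """Largest absolute difference in pairwise symptom correlation."""
    pairs = combinations(range(len(real[0])), 2)
    return max(abs(phi(real, a, b) - phi(synth, a, b)) for a, b in pairs)

# Hypothetical toy data: columns are (symptom_a, symptom_b), one row per entry.
# In "real", the two symptoms tend to co-occur; in "synth", they are independent
# but each symptom is reported at the same overall rate.
real = [(1, 1)] * 40 + [(0, 0)] * 40 + [(1, 0)] * 10 + [(0, 1)] * 10
synth = [(1, 1)] * 25 + [(0, 0)] * 25 + [(1, 0)] * 25 + [(0, 1)] * 25

print("max prevalence gap:", max_prevalence_gap(real, synth))
print("max correlation gap:", round(max_correlation_gap(real, synth), 3))
```

In this toy case both datasets report each symptom in half of the entries, so a prevalence-only comparison finds no difference, while the pairwise correlation gap exposes the missing co-occurrence structure. A real evaluation would add formal hypothesis tests and cover many more variables.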