Patient-Reported Challenges in Lymphoma Diagnosis: Analysis of Online Forum Narratives Using Artificial Intelligence

Abstract

Background

Lymphoma diagnosis remains challenging due to diverse subtypes and nonspecific presentations. While prior research has focused primarily on clinical accuracy, how patients experience and describe these challenges remains understudied. This study systematically analyzed online patient narratives to investigate their perspectives on diagnostic difficulties.

Methods

We developed an artificial intelligence (AI) pipeline integrating the DeepSeek large language model, optical character recognition, and transformer-embedded keyword clustering to analyze online narratives reporting diagnostic discrepancies or misdiagnosis from China’s largest lymphoma forum for patients and caregivers (house086.com). The pipeline extracted patient demographics, timelines, diagnostic barriers, facilitators, outcomes, and AI-graded severity. External validation (n=400) against manually derived labels assessed pipeline reliability. Multivariable logistic regression examined associations between barriers, facilitators, and the binarized severity scores.
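The regression step can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, the two predictors (`barrier`, `facilitator`) are hypothetical stand-ins for the extracted factors, the outcome is assumed to be already binarized (1 = severe), and the model is fitted with plain gradient ascent rather than a statistics package.

```python
import math

def fit_logistic(X, y, lr=1.0, steps=2000):
    """Fit a logistic regression by batch gradient ascent on the log-likelihood.

    Returns [intercept, coef_1, ..., coef_k] on the log-odds scale.
    """
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of a severe outcome
            err = yi - p
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

# Synthetic cells (illustrative only): (barrier, facilitator, n_total, n_severe)
cells = [(0, 0, 40, 10), (1, 0, 40, 30), (0, 1, 40, 5), (1, 1, 40, 20)]

X, y = [], []
for barrier, facilitator, n_total, n_severe in cells:
    for i in range(n_total):
        X.append([barrier, facilitator])
        y.append(1 if i < n_severe else 0)  # binarized severity: 1 = severe

weights = fit_logistic(X, y)

# Exponentiating a coefficient gives the odds ratio adjusted for the other predictor.
odds_ratios = {
    "barrier": math.exp(weights[1]),
    "facilitator": math.exp(weights[2]),
}
print(odds_ratios)
```

In this toy data the barrier yields an odds ratio above 1 and the facilitator an odds ratio below 1, which is the pattern the abstract reports for barriers and facilitators respectively.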

Results

Over the study period (2011-2025), patients reporting diagnostic difficulties doubled, while the rate of severe outcomes declined. From 2,016 narratives (median patient age 47; 59% family-authored), AI-assisted keyword taxonomy identified 7 diagnostic facilitators, 11 barriers, and 5 consequences. Psychological distress was the most common consequence of diagnostic challenges (90%). Clinician-related issues (91%) and case complexity (77%) were the most prevalent barriers, but inappropriate initial treatment conferred the greatest risk (OR 19.06, 11.30-32.17). Among facilitators, specialist input was associated with 40% lower odds of severe outcomes (OR 0.60, 0.44-0.81), and peer networks (OR 0.62) and clinician expertise (OR 0.65) were also associated with lower odds.
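As a reading aid, the sketch below converts the reported odds ratios into percent changes in odds and checks each point estimate against its interval. The helper names are hypothetical, and the check assumes the intervals are symmetric on the log scale (as Wald intervals are), which the abstract does not state.

```python
import math

def percent_change_in_odds(odds_ratio):
    """Percent change in the odds of a severe outcome implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0

def ci_geometric_midpoint(lower, upper):
    """Geometric mean of the interval bounds.

    For an interval symmetric on the log scale, this should roughly match
    the reported point estimate.
    """
    return math.sqrt(lower * upper)

# Values as reported in the abstract: (OR, lower bound, upper bound)
reported = {
    "inappropriate initial treatment": (19.06, 11.30, 32.17),
    "specialist input": (0.60, 0.44, 0.81),
}

for factor, (odds_ratio, lower, upper) in reported.items():
    change = percent_change_in_odds(odds_ratio)
    midpoint = ci_geometric_midpoint(lower, upper)
    print(f"{factor}: {change:+.0f}% odds, interval midpoint {midpoint:.2f}")
```

An odds ratio of 0.60 therefore corresponds to 40% lower odds, not a 40% lower probability; the two diverge when the outcome is common.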

Conclusion

This large-scale analysis of patient narratives identified factors underlying patient-perceived diagnostic difficulties in lymphoma. AI enables scalable analysis of patient-generated data, offering insights into targeted quality improvement and digital health interventions in patient safety.