Character-Level Linguistic Biomarkers for Precision Assessment of Cognitive Decline: A Symbolic Recurrence Approach
Abstract
Early and accurate detection of Alzheimer’s disease (AD) remains a critical challenge for precision health. Traditional cognitive assessments often miss subtle, individualized patterns of decline, while conventional linguistic analyses focus on word-level features that may overlook fine-grained speech disruptions. We test the hypothesis that character-level features in speech transcripts, which capture pauses, repetitions, and hesitations at the finest linguistic granularity, can serve as novel biomarkers for cognitive decline, revealing personalized linguistic signatures that manifest uniquely in each individual. Our biomarker discovery framework applies symbolic character-level encoding followed by recurrence quantification analysis, transforming speech transcripts into visual recurrence plots that reveal temporal speech dynamics. Siamese networks then learn embeddings from these plots to capture discriminative patterns at the character level. We validate our hypothesis on the DementiaBank corpus, where character-level biomarkers discriminate better than conventional word-level approaches (AUC of 95.9% vs. 87.5%) while also providing interpretable recurrence plot visualizations. Our findings indicate that character-level linguistic features carry significant biomarker information for cognitive assessment, representing a fundamental shift from word-based to character-based analysis for precision health applications in dementia screening.
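To make the encoding and recurrence steps concrete, here is a minimal sketch of how a transcript might be turned into a character-level recurrence matrix. It is an illustration only, not the authors' implementation: the function names (encode_chars, recurrence_matrix, recurrence_rate), the integer encoding, and the exact-match recurrence rule are all assumptions, and the abstract does not specify the actual encoding or which recurrence measures are used.

```python
def encode_chars(text):
    """Map each character to an integer symbol.

    Assumption: a simple sorted-alphabet index; the paper's encoding may differ.
    """
    alphabet = sorted(set(text))
    index = {ch: i for i, ch in enumerate(alphabet)}
    return [index[ch] for ch in text]


def recurrence_matrix(symbols):
    """Build a binary matrix with R[i][j] = 1 when symbols i and j are identical."""
    n = len(symbols)
    return [[1 if symbols[i] == symbols[j] else 0 for j in range(n)]
            for i in range(n)]


def recurrence_rate(matrix):
    """Return the fraction of recurrent points, excluding the main diagonal."""
    n = len(matrix)
    if n < 2:
        return 0.0
    off_diagonal = sum(matrix[i][j]
                       for i in range(n) for j in range(n) if i != j)
    return off_diagonal / (n * (n - 1))


if __name__ == "__main__":
    # Hypothetical transcript fragment containing a repetition and a filler.
    transcript = "the the uh the boy"
    matrix = recurrence_matrix(encode_chars(transcript))
    # Text rendering of the recurrence plot: '#' marks a recurrent point.
    for row in matrix:
        print("".join("#" if cell else "." for cell in row))
    print("recurrence rate:", round(recurrence_rate(matrix), 3))
```

In a plot of this kind, repeated words and fillers appear as short diagonal line segments away from the main diagonal, which is the sort of structure an image-based model such as a Siamese network could then learn from.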