Analysis of Conceptual Overlap Among Formal Thought Disorder Rating Scales in Psychosis: A Systematic Semantic Synthesis
Abstract
Measuring Formal Thought Disorder (FTD), a common, cross-diagnosed symptom dimension across mental disorders, is plagued by numerous inconsistencies. Clinicians use either FTD-specific scales or items from generic scales. While these tools are based on extensive clinical observations, they suffer from inconsistent terminology. Different scales may use the same term for distinct concepts or different terms for the same concept. This lack of conceptual standardization prevents the identification of underlying FTD subconstructs. Using natural language processing, we compared the definitions, labeling and overlap of FTD symptoms across psychopathological scales.

We used a three-pronged validation approach to analyze semantic clusters of FTD scale items. First, we used sentence-BERT to divide 30 Thought and Language Disorder scale (TALD) items into positive or negative FTD clusters, validating this approach by checking for correspondence with published factor-analytic divisions (approach validation). Second, we created a sparse item-to-item similarity matrix from 103 items across seven scales to identify semantically converging cross-scale FTD items; a clinician-researcher described the resulting four clusters, and we compared our automated classification with that of six blinded experts to establish expert-machine semantic correspondence. Finally, we analyzed data from 98 participants (49 healthy controls and 49 with schizophrenia/affective psychosis), identifying the highest-correlating Clinical Language Disorder Scale (CLANG) item for each Thought, Language and Communication (TLC) scale item and mapping these to our BERT-derived clusters to establish data-level correspondence.

When assigning TALD items to BERT-derived positive or negative FTD groupings, we observed a 73% match with prior factor analyses. The BERT-informed clustering of cross-scale items highlighted four coherent FTD groupings: 1) muddled communication & incomprehension, 2) abrupt topic shifts, 3) inconsistent narrative structure, and 4) restricted speech. Expert raters showed moderate-to-high overlap (Fleiss’ kappa = .617) with computational clusters. A binomial test indicated that, at the level of individual participants, correlations among CLANG-TLC item pairs were significantly more likely than chance to fall into the expected semantic cluster (p < .001). Our results indicate that FTD rating scales measure overlapping, semantically related constructs that drive item-level correlations. Semantic clustering offers a novel method to harmonize multi-scale data and pinpoint discrepancies between expert and machine classifications. Computational linguistics has the potential to improve consistency across rating scales, especially when measuring complex constructs such as FTD.
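The three quantities named in the abstract (a sparse item-to-item similarity matrix, Fleiss' kappa, and a binomial test against chance) can be illustrated with a minimal sketch. This is not the authors' pipeline. The function names are hypothetical, and the input vectors stand in for sentence-BERT embeddings that would be computed separately. The similarity threshold and the chance level passed to the binomial test are assumptions the abstract does not specify.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sparse_similarity(vectors: List[Sequence[float]],
                      threshold: float) -> List[List[float]]:
    """Item-to-item cosine matrix; entries below `threshold` are set to 0.

    `threshold` is an assumed sparsification rule, not taken from the paper.
    """
    n = len(vectors)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            s = 1.0 if i == j else cosine(vectors[i], vectors[j])
            sim[i][j] = s if s >= threshold else 0.0
    return sim


def fleiss_kappa(counts: List[List[int]]) -> float:
    """Fleiss' kappa.

    counts[i][j] is the number of raters who assigned item i to category j;
    every row must sum to the same number of raters.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Observed agreement, averaged over items.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_obs = sum(p_items) / n_items
    # Chance agreement from the marginal category proportions.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    p_exp = sum(p * p for p in p_cat)
    return (p_obs - p_exp) / (1 - p_exp)


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """One-sided exact binomial p-value: P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1)
    )


# Toy embeddings (hypothetical): items 0 and 1 point the same way, item 2 does not.
toy_vectors = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
toy_similarity = sparse_similarity(toy_vectors, threshold=0.5)

# Six raters in full agreement on two items gives kappa = 1.
toy_kappa = fleiss_kappa([[6, 0], [0, 6]])
```

With four clusters, a chance level of 0.25 would be one plausible choice for `p` in `binomial_upper_tail`, but that value is an assumption here. For real analyses, maintained implementations such as `scipy.stats.binomtest` and `statsmodels.stats.inter_rater.fleiss_kappa` are likely preferable to hand-rolled versions.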