Automated Speech-Fluency Explanations for Schizophrenia Diagnosis

Abstract

Schizophrenia is a chronic and severe mental disorder whose diagnosis still relies on time-intensive, clinician-administered assessments. Although several automated approaches have been proposed to support diagnosis, these systems often lack the explainability needed for informed clinical decision-making. In this study, we present a fully automated and explainable pipeline for detecting schizophrenia from audio recordings of verbal fluency tests, collected from 126 Slovene-speaking participants (68 healthy controls, 58 individuals diagnosed with schizophrenia). The pipeline builds on recent advances in automatic speech recognition (ASR) and large language model (LLM) systems. We evaluated three ASR models (Truebar, Whisper, and Soniox) for transcription quality and selected the best-performing system for further processing. We semantically enriched the transcriptions using the generative capabilities of LLMs and extracted both verbal and non-verbal features grounded in established diagnostic criteria. We assessed the relevance of these features using a Bayesian statistical framework and trained multiple classical machine learning models for automatic classification. Our best-performing model, an Explainable Boosting Machine, achieved a classification accuracy of 0.82 and an AUC of 0.90. We further generated visual explanations for the model's predictions, establishing the first fully automated and explainable schizophrenia detection framework developed for the Slovene language. Our approach prioritizes explainability by using an inherently transparent model, while still achieving performance comparable to existing automated systems for speech-based schizophrenia detection.
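The abstract does not state which metric was used to compare transcription quality across the ASR systems. As an illustration only, the sketch below assumes word error rate (WER), a common choice for this kind of comparison, computed as word-level edit distance divided by the reference length. The function names `word_error_rate` and `rank_asr_systems`, the system labels, and the sample word lists are all hypothetical and are not taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: WER is used here for illustration; the source does not
    name the transcription-quality metric it actually used.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def rank_asr_systems(reference: str, hypotheses: dict[str, str]) -> list[tuple[str, float]]:
    """Return (system name, WER) pairs sorted from lowest to highest WER."""
    scores = {name: word_error_rate(reference, text) for name, text in hypotheses.items()}
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical verbal-fluency response (animal category) and made-up outputs.
    reference = "pes mačka konj krava"
    hypotheses = {
        "system_a": "pes mačka krava",       # one word dropped
        "system_b": "pes mačka konj krava",  # exact match
    }
    for name, wer in rank_asr_systems(reference, hypotheses):
        print(f"{name}: WER = {wer:.2f}")
```

In a setup like this, the system with the lowest average WER over a manually transcribed subset would be the one carried forward. For verbal fluency tests, where responses are lists of isolated words rather than sentences, a word-level metric is a natural fit, though the study may have used a different criterion.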
