Scalable depression monitoring with smartphone speech: a multimodal benchmark and topic analysis

Abstract

Objective, scalable biomarkers are needed for continuous monitoring of major depressive disorder (MDD). Smartphone-collected speech is promising, yet extracting clinically useful signals remains difficult. We analysed 3,151 weekly voice diaries from 284 German-speaking adults (128 with MDD, 156 controls) and used regression to predict Beck Depression Inventory (BDI) scores. Sentence embeddings from the open-source 8-billion-parameter Qwen3-8B model predicted scores with MAE = 4.45 and R² = 0.35, explaining 16 percentage points more variance than the best traditional feature set (TF-IDF). Adding lexical–prosodic or TF-IDF features provided only marginal improvement (best MAE = 4.39). To interpret the embeddings, we applied BERTopic and uncovered ten coherent themes; BDI scores peaked for “Persistent Low Mood” and “Pain Distress”, confirming clinical relevance. Large-language-model embeddings therefore capture the dominant signal of depression severity in everyday speech and, paired with interpretable topic analysis, offer a privacy-preserving, scalable route to digital mental-health phenotyping.
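For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how mean absolute error (MAE) and the coefficient of determination (R²) are conventionally computed from true and predicted scores. The function names and the toy BDI values are hypothetical illustrations, not the authors' code or data, and the abstract does not state which evaluation protocol (for example, cross-validation) was used.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between true and predicted scores."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true scores are identical")
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only (NOT data from the study).
bdi_true = [4, 12, 25, 31, 8]
bdi_pred = [6, 10, 20, 28, 12]

mae = mean_absolute_error(bdi_true, bdi_pred)
r2 = r_squared(bdi_true, bdi_pred)
print(f"MAE = {mae:.2f}, R^2 = {r2:.2f}")
```

Under this definition, a difference of 0.16 in R² between two models corresponds to the "16 percentage points" of explained variance mentioned in the abstract.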
