Comorbidity classification from clinical free-text using large language models: application to sleep disorder patients

Yihan Deng
Fabio Dennstädt
Irina Filchenko
Julia van der Meer
Xiaoli Yang
Markus H. Schmidt
Claudio L. A. Bassetti
Athina Tzovara
Kerstin Denecke

Abstract

Patients presenting to neurology clinics commonly have a complex history of comorbidities and partially documented health trajectories, making it essential to reliably extract comorbidity information from historical records. However, existing extraction methods, ranging from rule-based systems to classical machine learning (ML), often fall short in accuracy, scalability, or adaptability across diverse document types. In this study, we present a large language model (LLM)-based framework for comorbidity extraction from diagnostic texts, capable of handling various prompt formats and textual sources such as patient history, prior diagnoses, and structured sleep assessments. The fine-tuned Mistral-24B (Instruct-2501) model achieves 95% macro classification accuracy and 92% F1 measure across six common classes of comorbidities, substantially outperforming prior state-of-the-art approaches. The proposed method extracts comorbidities through a transparent hierarchical approach, thereby supporting clinical analysis and providing interpretable insights for disease modeling and personalized treatment planning in sleep medicine.

Version published to 10.21203/rs.3.rs-7763721/v1 on Research Square
Oct 24, 2025

Related articles

Personalized Disease Risk Prediction from Multimodal Health Data Using Large Language Models

This article has 2 authors:
1. Hanieh Arjmand
2. Alexandre Tomberg
This article has no evaluations. Latest version: Jan 25, 2026

Voice as a Digital Biomarker: Foundation Model-Based COPD Assessment

This article has 9 authors:
1. Sang Mee Lee
2. Hyein Ryu
3. Sunga Kong
4. Sun Hye Shin
5. Wooseong Huh
6. Myung Jin Chung
7. Juhee Cho
8. Taeyoung Kim
9. Hye Yun Park
This article has no evaluations. Latest version: Dec 18, 2025

Parkinson’s disease in real life healthcare organization database: Medication based algorithm, incidence and prodromal symptoms

This article has 6 authors:
1. Hila Avisar
2. Ruth Djaldetti
3. Amir Krivoy
4. Anat Mirelman
5. Roy N. Alcalay
6. Nir Giladi
This article has no evaluations. Latest version: Dec 22, 2025
