Comorbidity classification from clinical free-text using large language models: application to sleep disorder patients


Abstract

Patients presenting to neurology clinics commonly have a complex history of comorbidities and partially documented health trajectories, making it essential to reliably extract comorbidity information from historical records. However, existing extraction methods, ranging from rule-based systems to classical machine learning (ML), often fall short in accuracy, scalability, or adaptability across diverse document types. In this study, we present a large language model (LLM)-based framework for comorbidity extraction from diagnostic texts, capable of handling various prompt formats and textual sources such as patient history, prior diagnoses, and structured sleep assessments. The fine-tuned Mistral-24B (Instruct-2501) model achieves 95% macro classification accuracy and a 92% F1 score across six common classes of comorbidities, substantially outperforming prior state-of-the-art approaches. The proposed method extracts comorbidities through a transparent hierarchical approach, thereby supporting clinical analysis and providing interpretable insights for disease modeling and personalized treatment planning in sleep medicine.
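
To make the reported metrics concrete, the following is a minimal sketch of how macro-averaged accuracy and F1 could be computed for a multi-label comorbidity classifier. It is not the authors' evaluation code: the abstract does not define its averaging scheme or name the six classes, so the per-class averaging, the function names, and the placeholder labels (`class_a`, `class_b`, `class_c`) are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def per_class_counts(
    gold: List[Set[str]], pred: List[Set[str]], label: str
) -> Tuple[int, int, int, int]:
    """Count true/false positives and negatives for one label across documents."""
    tp = fp = fn = tn = 0
    for gold_labels, pred_labels in zip(gold, pred):
        in_gold, in_pred = label in gold_labels, label in pred_labels
        if in_gold and in_pred:
            tp += 1
        elif in_pred:
            fp += 1
        elif in_gold:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def macro_scores(
    gold: List[Set[str]], pred: List[Set[str]], classes: List[str]
) -> Dict[str, float]:
    """Average per-class accuracy and F1 with equal weight per class (assumed scheme)."""
    accuracies, f1_scores = [], []
    for label in classes:
        tp, fp, fn, tn = per_class_counts(gold, pred, label)
        total = tp + fp + fn + tn
        accuracies.append((tp + tn) / total if total else 0.0)
        denom = 2 * tp + fp + fn
        # F1 is defined as 0.0 here when a class has no positives at all.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return {
        "macro_accuracy": sum(accuracies) / len(classes),
        "macro_f1": sum(f1_scores) / len(classes),
    }


# Placeholder labels; the real class names are not given in the abstract.
CLASSES = ["class_a", "class_b", "class_c"]
gold = [{"class_a"}, {"class_a", "class_b"}, set(), {"class_c"}]
pred = [{"class_a"}, {"class_b"}, {"class_c"}, {"class_c"}]
scores = macro_scores(gold, pred, CLASSES)
```

Because each class contributes equally to a macro average, rare comorbidities weigh as much as common ones, which is one plausible reason for reporting macro rather than micro figures.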
