Automated Insomnia Phenotyping from Electronic Health Records: Leveraging Large Language Models to Decode Clinical Narratives

Guillermo Lopez-Garcia
Davy Weissenbacher
Matthew Stadler
Karen O’Connor
Dongfang Xu
Lauren Gryboski
Jared Heavens
Noor Abu-el-Rub
Diego R. Mazzotti
Subhajit Chakravorty
Graciela Gonzalez-Hernandez

Abstract

Insomnia is a highly prevalent but often underdiagnosed condition in clinical practice. Its inconsistent documentation in electronic health records (EHRs) limits population-level analyses and obstructs efforts to evaluate treatment patterns or outcomes. We present a novel, fully automated approach for phenotyping insomnia directly from unstructured clinical notes using generative large language models (LLMs). Leveraging prompt engineering with few-shot learning and chain-of-thought reasoning, we evaluated our system on two distinct corpora: inpatient clinical notes from MIMIC-III and outpatient primary care notes from the University of Kansas Health System (KUMC). Our models—Llama 70B and Llama 405B—achieved F1 scores of 93.0 on the MIMIC corpus and 85.7 on the KUMC corpus, substantially outperforming domain-adapted BERT-based classifiers. Ultimately, our framework offers a scalable and interpretable solution for clinical phenotyping of insomnia and can serve as a blueprint for similar efforts targeting other underdiagnosed or under-documented conditions in the EHR.
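
The abstract names few-shot learning with chain-of-thought reasoning and reports results as F1 scores, but it does not give the prompts, the label scheme, or how the metric was averaged. As a minimal sketch only, the snippet below shows one way such a prompt could be assembled and how a positive-class F1 on a 0-100 scale could be computed. Every name here (`Example`, `build_prompt`, `f1_percent`), the yes/no labels, and the instruction wording are assumptions made for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One hypothetical few-shot demonstration: a note, a rationale, a label."""
    note: str
    reasoning: str
    label: str  # assumed binary scheme: "yes" or "no"


def build_prompt(note: str, examples: list[Example]) -> str:
    """Assemble a few-shot, chain-of-thought prompt for one clinical note."""
    # Instruction wording is invented for illustration.
    parts = [
        "Decide whether the patient described in the note has insomnia.",
        "Explain your reasoning step by step, then answer 'yes' or 'no'.",
        "",
    ]
    # Each demonstration shows the reasoning before the answer.
    for ex in examples:
        parts += [
            f"Note: {ex.note}",
            f"Reasoning: {ex.reasoning}",
            f"Answer: {ex.label}",
            "",
        ]
    # The target note ends with an open "Reasoning:" slot for the model to fill.
    parts += [f"Note: {note}", "Reasoning:"]
    return "\n".join(parts)


def f1_percent(gold: list[str], pred: list[str], positive: str = "yes") -> float:
    """F1 for the positive class, scaled to 0-100."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 100 * 2 * precision * recall / (precision + recall)
```

The string returned by `build_prompt` would be sent to whichever model is under evaluation, and the predicted labels parsed from its answers would then be compared with the annotated labels using `f1_percent`.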

Version published to 10.1101/2025.06.02.25328701v1 on medRxiv
Jun 3, 2025

Related articles

Empirical Review of LLM-driven Classification of Multidimensional Sleep Health Mentions from Free-Text Clinical Notes

This article has 5 authors:
1. Syed-Amad Hussain
2. Ariana Calloway
3. Joseph Sirrianni
4. Eric Fosler-Lussier
5. Mattina Davenport
This article has no evaluations
Latest version Jun 5, 2025

clickBrick Prompt Engineering: Optimizing Large Language Model Performance in Clinical Psychiatry

This article has 10 authors:
1. F Gerrik Verhees
2. Fabian Huth
3. Vincent Meyer
4. Fabian Wolf
5. Michael Bauer
6. Andrea Pfennig
7. Philipp Ritter
8. Jakob N Kather
9. Isabella C Wiest
10. Pavol Mikolas
This article has no evaluations
Latest version Jun 30, 2025

From Keywords to Context: Bridging Expert Insight and Language Models for Multidimensional Sleep Health Classification in Clinical Notes

This article has 5 authors:
1. Syed-Amad Hussain
2. Ariana Calloway
3. Joseph W Sirrianni
4. Eric Fosler-Lussier
5. Mattina Davenport
This article has no evaluations
Latest version Jun 7, 2025
