PsyRoBERTa: A Large Language Model for Predicting Psychiatric Outcomes from Clinical Notes - A Population-Based Danish EHR study
Abstract
Importance
Machine learning has been widely applied to structured electronic health records (EHR) to predict psychiatric outcomes. However, the complex symptom descriptions and anamneses of psychiatric patients are found in unstructured clinical notes, which have rarely been utilized for psychiatric research and outcome prediction. Large language models (LLMs) now enable large-scale utilization of clinical notes.
Objective
To develop an LLM for predicting psychiatric outcomes from clinical notes using population-based EHR data from the entire eastern part of Denmark.
Design, setting and participants
This prognostic study included ∼44 million Danish clinical notes, written between 2000 and 2022, from 255,944 patients who had contact with the mental health services. An LLM was pretrained on ∼40 million notes and finetuned on a dataset of 85,547 psychiatric admissions to predict psychiatric acute readmission. The model was evaluated against three publicly available models, pretrained on either public general- or medical-domain text, and a baseline logistic regression classifier. Data were analyzed between April 2023 and November 2024.
Exposure
At least one contact with the mental health services in the eastern part of Denmark.
Main outcomes and measures
Predictability of 1) masked tokens (word pieces), measured by cross-entropy loss; 2) psychiatric acute readmission, defined as an unplanned readmission within 30 days after discharge; and 3) psychiatric diagnosis recognition, evaluated by AUC, MCC, and additional metrics, as well as explainability and fairness analyses.
Results
Our specialized LLM, PsyRoBERTa, outperformed three Danish LLMs in predicting psychiatric acute readmission (AUC: 0.736; MCC: 0.303) and was significantly better (p<0.05) than the baseline logistic regression classifier (AUC: 0.718; MCC: 0.258). A public medical-domain model, MeDa-BERT, was a close second (AUC: 0.734; MCC: 0.295). Five categories of important features were identified: psychosis, medication, level of function, alcohol and substances, and lack of insight into own illness. PsyRoBERTa was furthermore able to recognize patients’ main current diagnoses through diagnosis-based clustering of the clinical notes (AUC: 0.832; MCC: 0.489).
Conclusions and relevance
To the best of our knowledge, we present the first clinical LLM specialized for the psychiatric domain. This is a first step toward large-scale utilization of unstructured EHR data, with the prospect of improving patient care.
Key points
Question
Can large language models be used for psychiatric outcome prediction based on clinical notes?
Findings
In this prognostic study including 255,944 individuals and ∼40 million clinical notes, the performance of a large language model for predicting psychiatric acute readmission improved when the model was adapted to the clinical domain by pretraining on clinical notes. The model showed decent classification performance and demonstrated multi-purpose abilities by excelling at recognizing psychiatric diagnoses.
Meaning
These findings demonstrate that adapting large language models to the clinical domain facilitates large-scale utilization of clinical notes for both psychiatric outcome prediction and recognition.