PsyRoBERTa: A Large Language Model for Predicting Psychiatric Outcomes from Clinical Notes - A Population-Based Danish EHR study


Abstract

Importance

Machine learning has been widely applied to structured Electronic Health Records (EHR) to predict psychiatric outcomes. However, the complex symptom descriptions and anamneses of psychiatric patients are found in the unstructured clinical notes, which have rarely been utilized for psychiatric research and outcome prediction. Large language models (LLM) now enable large-scale utilization of clinical notes.

Objective

To develop an LLM for predicting psychiatric outcomes from clinical notes using population-based EHR data from the entire eastern part of Denmark.

Design, setting and participants

This prognostic study included ∼44 million Danish clinical notes, written between 2000 and 2022, from 255,944 patients who had contact with the mental health services. An LLM was pretrained on ∼40 million notes and finetuned on a dataset of 85,547 psychiatric admissions to predict psychiatric acute readmissions. The model was evaluated against three publicly available models, pretrained on either public general-domain or medical-domain text, and a baseline logistic regression classifier. Data were analyzed between April 2023 and November 2024.

Exposure

At least one contact with the mental health services in the eastern part of Denmark.

Main outcomes and measures

Predictability of 1) masked tokens (word-pieces) measured by cross-entropy loss, 2) psychiatric acute readmission, defined as an unplanned readmission within 30 days after discharge, and 3) psychiatric diagnosis recognition, evaluated by AUC, MCC and additional metrics, as well as explainability and fairness analyses.
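The first two outcome measures can be made concrete with a minimal Python sketch. It is an illustration only, not the study's pipeline: the function names, the dictionary-based probability format, and the choice to count day 30 as inside the window are all assumptions introduced here, and the study's actual data fields and edge-case handling are not described in this abstract.

```python
import math
from datetime import date


def masked_token_cross_entropy(predicted_probs, true_tokens):
    """Mean negative log-probability of the true token at each masked position.

    predicted_probs: one dict per masked position, mapping token -> probability
                     (hypothetical format chosen for this sketch).
    true_tokens:     the token that was actually masked at each position.
    """
    if len(predicted_probs) != len(true_tokens) or not true_tokens:
        raise ValueError("need one non-empty prediction per masked token")
    losses = [-math.log(probs[token])
              for probs, token in zip(predicted_probs, true_tokens)]
    return sum(losses) / len(losses)


def is_acute_readmission(discharge_date, next_admission_date,
                         next_admission_planned, window_days=30):
    """Label a discharge as followed by an unplanned readmission within the window.

    Assumption: a readmission exactly `window_days` after discharge still counts.
    """
    if next_admission_date is None or next_admission_planned:
        return False
    days_since_discharge = (next_admission_date - discharge_date).days
    return 0 <= days_since_discharge <= window_days


# A model that splits probability evenly between two tokens has loss ln(2).
loss = masked_token_cross_entropy([{"a": 0.5, "b": 0.5}], ["a"])

# Unplanned admission 19 days after discharge counts as an acute readmission.
label = is_acute_readmission(date(2020, 1, 1), date(2020, 1, 20), False)
```

A lower cross-entropy means the model assigns higher probability to the masked word-pieces, which is how the pretraining quality of the compared models would be ranked under this measure.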

Results

Our specialized LLM, PsyRoBERTa, outperformed three Danish LLMs in predicting psychiatric acute readmissions (AUC: 0.736; MCC: 0.303) and was significantly better (p<0.05) than a baseline logistic regression (LR) classifier (AUC: 0.718; MCC: 0.258). A public medical-domain model, MeDa-BERT, was a close second (AUC: 0.734; MCC: 0.295). Five categories of important features were identified (psychosis, medication, level of function, alcohol and substances, and lack of insight into own illness). PsyRoBERTa was furthermore able to recognize patients’ main current diagnoses through diagnosis-based clustering of the clinical notes (AUC: 0.832; MCC: 0.489).
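For readers less familiar with the two headline metrics, the following is a minimal pure-Python sketch of how AUC and MCC are computed for a binary outcome. The toy labels, scores, and the 0.5 decision threshold are assumptions made for illustration; they are unrelated to the study data, and the abstract does not state which threshold or software the authors used.

```python
import math


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def matthews_corrcoef(labels, predictions):
    """Matthews correlation coefficient from the binary confusion matrix."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denominator if denominator else 0.0


# Toy example (not study data): 1 = readmitted, 0 = not readmitted.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed threshold

auc = roc_auc(toy_labels, toy_scores)
mcc = matthews_corrcoef(toy_labels, toy_predictions)
```

AUC is threshold-free and ranges from 0.5 (chance) to 1, whereas MCC depends on the chosen threshold and ranges from -1 to 1, which is why the two are reported together.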

Conclusions and relevance

To the best of our knowledge, we present the first clinical LLM specialized for the psychiatric domain. This is a first step towards large-scale utilization of the unstructured EHR data with the prospect of improving patient care.

Key points

Question

Can large language models be used for psychiatric outcome prediction based on clinical notes?

Findings

In this prognostic study including 255,944 individuals and ∼40 million clinical notes, the performance of a large language model for predicting psychiatric acute readmission improved when the model was adapted to the clinical domain by pretraining on clinical notes. The model showed decent classification performance and demonstrated multi-purpose abilities by also performing strongly in recognizing psychiatric diagnoses.

Meaning

These findings demonstrate that adapting large language models to the clinical domain facilitates large-scale utilization of clinical notes for both psychiatric outcome prediction and diagnosis recognition.
