Using Generative AI for the Objective Assessment of Language in Healthcare

James O'Sullivan
Pilar Garces
Eduardo A. Aponte
Julian Tillmann
Christopher Chatham
Florian Lipsmeier
David Nobbs

Abstract

Traditional methods for language assessment in psychiatric and neurological disorders, such as clinical scales, are time- and resource-intensive, and can be hampered by rater bias and subjectivity. These limitations can compromise their reliability and sensitivity, as well as their practical use for measuring change over time, which is of particular importance in clinical trials. Objective methods are required to improve the evaluation of language function across a spectrum of psychiatric and neurological conditions. To address these challenges, we introduce a method that uses a state-of-the-art artificial intelligence (AI) model, GPT4, to provide an objective evaluation of language. As a test case, we focus on measuring expressive communication capabilities in autistic participants as they naturally converse with their study partner during an observational clinical trial. The conversations were recorded, professionally transcribed, and then processed with GPT4 with the aim of predicting the individuals' Vineland Adaptive Behaviour Scales (VABS-II) expressive communication scores. The model’s predictions were also compared with several benchmark linguistic features (e.g., the number of words spoken per sentence) to determine the added benefit of using such a complex model. We found that GPT4’s predictions correlated strongly with the actual VABS-II scores (Pearson's r > 0.65) and demonstrated high test-retest reliability (ICC(2,1) = 0.97). The model's predictions also accounted for significantly more variance than the benchmark linguistic features. These findings demonstrate that GPT4 can provide a holistic, reliable, objective, and time-efficient assessment of expressive communication abilities. This suggests that generative AI models like GPT4 could transform the assessment of communicative abilities, supporting the evaluation of treatment efficacy in clinical trials and providing a faster, more scalable tool for assessing patients in clinical practice.
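For readers unfamiliar with the statistics named in the abstract, the following is a minimal sketch of how Pearson's r, ICC(2,1), and a words-per-sentence benchmark feature can be computed. It is not the authors' analysis code, which the abstract does not describe; the function names, the sentence-splitting rule, and the input layout (one row per participant, one column per repeated session) are assumptions made for illustration only.

```python
import math
import re
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    `ratings` is a list of rows (assumed layout: one row per participant),
    each row holding one score per session or rater.
    """
    n, k = len(ratings), len(ratings[0])
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-participant mean square
    msc = ss_cols / (k - 1)              # between-session mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def mean_words_per_sentence(text):
    """Crude benchmark feature: split on ., !, ? and average the word counts."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return mean(len(s.split()) for s in sentences)
```

In this sketch, `pearson_r` would be applied to model-predicted versus clinician-rated scores, and `icc_2_1` to predictions from repeated runs or sessions for the same participants. Established statistics packages provide tested implementations and confidence intervals, and would be preferable for any real analysis.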

Version published to 10.21203/rs.3.rs-7611391/v1 on Research Square
Nov 4, 2025

Related articles

Artificial Intelligence in Mental Health Treatment and Research

This article has 2 authors:
1. Steven Mesquiti
2. Erik C Nook
This article has no evaluations
Latest version Oct 8, 2025

Psychiatric Voice Biomarkers: Methodological flaws in pediatric populations

This article has 9 authors:
1. Hammza Jabbar Abd Sattar Hamoudi
2. Mon-Ju Wu
3. Marsal Sanches
4. Cesar A. Soutullo
5. Carolina Olmos
6. Leslie K. Taylor
7. Giovanna Zunta-Soares
8. Jair C. Soares
9. Benson Mwangi
This article has no evaluations
Latest version Oct 15, 2025

Computational Analysis of Expressive Behavior in Clinical Assessment

This article has 4 authors:
1. Jeffrey M. Girard
2. Dasha A. Yermol
3. Albert Ali Salah
4. Jeffrey F Cohn
This article has no evaluations
Latest version Sep 8, 2025
