Using Generative AI for the Objective Assessment of Language in Healthcare
Abstract
Traditional methods for language assessment in psychiatric and neurological disorders, such as clinical scales, are time- and resource-intensive and can be hampered by rater bias and subjectivity. These limitations can compromise their reliability and sensitivity, as well as their practical use for measuring change over time, which is of particular importance in clinical trials. Objective methods are required to improve the evaluation of language function across a spectrum of psychiatric and neurological conditions. To address these challenges, we introduce a method that uses a state-of-the-art artificial intelligence (AI) model, GPT4, to provide an objective evaluation of language. As a test case, we focus on measuring expressive communication capabilities in autistic participants as they converse naturally with their study partner during an observational clinical trial. The conversations were recorded, professionally transcribed, and then processed with GPT4 with the aim of predicting the individuals' Vineland Adaptive Behaviour Scales (VABS-II) expressive communication scores. The model's predictions were also compared with several benchmark linguistic features (e.g., the number of words spoken per sentence) to determine the added benefit of using such a complex model. We found that GPT4's predictions correlated strongly with the actual VABS-II scores (Pearson's r > 0.65) and demonstrated high test-retest reliability (ICC(2,1) = 0.97). The model's predictions also explained significantly more variance than the benchmark linguistic features. These findings demonstrate that GPT4 can provide a holistic, reliable, objective, and time-efficient assessment of expressive communication abilities.
This suggests that generative AI models like GPT4 could transform the assessment of communicative abilities, supporting the evaluation of treatment efficacy in clinical trials and providing a faster, more scalable tool for assessing patients in clinical practice.
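The three quantities named in the abstract (Pearson's r, ICC(2,1), and a words-per-sentence benchmark feature) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `pearson_r`, `icc_2_1`, and `mean_words_per_sentence` are hypothetical, the sentence splitter is a deliberately naive assumption, and ICC(2,1) is computed with the standard two-way random-effects, absolute-agreement, single-measurement formula.

```python
import re
import numpy as np


def pearson_r(x, y):
    """Pearson correlation between predicted and reference scores."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))


def icc_2_1(ratings):
    """ICC(2,1) for an (n_subjects x k_measurements) array.

    Columns would be repeated measurements of the same subjects,
    e.g. test and retest predictions (an assumption of this sketch).
    """
    y = np.asarray(ratings, dtype=float)
    n, k = y.shape
    grand = y.mean()
    ss_rows = k * np.sum((y.mean(axis=1) - grand) ** 2)  # between subjects
    ss_cols = n * np.sum((y.mean(axis=0) - grand) ** 2)  # between measurements
    ss_err = np.sum((y - grand) ** 2) - ss_rows - ss_cols  # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return float((msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n))


def mean_words_per_sentence(text):
    """Naive benchmark feature: average whitespace-separated words per sentence.

    Sentences are split on runs of '.', '!' or '?'; real transcripts would
    need a more careful segmenter.
    """
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)
```

Given per-participant predictions and reference scores, `pearson_r(predicted, reference)` would give the agreement statistic, and `icc_2_1` applied to a two-column array of test and retest predictions would give the reliability statistic.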