Demonstrating High Validity of a New AI-Language Assessment of PTSD: A Sequential Evaluation with Model Pre-registration

Oscar Nils Erik Kjell
Adithya V Ganesan
Ryan Boyd
Joshua R. Oltmanns
Alfredo Rivero
Scott Feltman
Melissa Anne Carr
Benjamin Luft
Roman Kotov
H. Andrew Schwartz

Abstract

BACKGROUND: Modern Artificial Intelligence (AI) has shown promise in identifying psychopathology based on the language used by patients, providing a scalable method for obtaining relevant behavioral markers. However, no existing models for assessing posttraumatic stress disorder (PTSD) have successfully demonstrated out-of-sample replicability. We develop a language-based AI model for PTSD and rigorously evaluate replicability in a prospective sample.

METHODS: Participants from the Stony Brook World Trade Center (WTC) Health and Wellness Program described their lives in an automated interview during a clinical monitoring visit. The language was analysed using AI to estimate the PTSD CheckList (PCL) total symptom severity score and four symptom subscales, and the estimates were validated against PTSD diagnosis in medical records. To yield realistic accuracy estimates in this cross-sectional study, we propose the Sequential Evaluation with Model Pre-registration design, consisting of an iterative, two-phase pre-registration paradigm. The first pre-registration specifies the data split, the model development, and the initial hypotheses. The second pre-registration specifies the exact pre-trained models, data cleaning procedures, and the refined hypotheses.

RESULTS: The data split included a development (N=1437) and a prospective (N=346) dataset. Within the prospective sample, the pre-registered models produced scores that significantly correlated with their targets: PCL total (r=.38, p-value<.001) and the four subscales (r=.28–.37, p-value<.001). The pre-registered model for PCL total showed a robust association with PTSD diagnosis (AUC=.76), significantly outperforming demographics (AUC=.61, p-value=.006), WTC attack exposures (AUC=.61, p-value=.007) and a validated depression language model (AUC=.60, p-value<.001).

CONCLUSIONS: We developed new AI-language assessments of PTSD symptom severity. Within a clinical setting and over prospectively collected participant data, the assessments replicated with high convergent validity with self-report and high external validity against diagnosis in medical records. Analyses of observable behavioral markers in automated clinical interview language can produce robust psychiatric assessments, overcoming limitations found in traditional assessments.
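As a rough illustration of the evaluation logic described in the abstract, the sketch below holds out a prospective set, then scores already-fixed predictions against a continuous target (Pearson r) and a binary diagnosis (AUC). It is not the authors' pipeline. All data are synthetic, the field names (`visit_order`, `pcl_total`, `predicted`, `diagnosis`) are hypothetical, and splitting by visit order is an assumption about how a sequential split might be done; only the two sample sizes come from the abstract.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def auc(scores, labels):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sequential_split(records, n_dev):
    """Assumption: earlier records form the development set and later
    records form the prospective set."""
    ordered = sorted(records, key=lambda r: r["visit_order"])
    return ordered[:n_dev], ordered[n_dev:]


# Synthetic stand-in data; none of these values come from the study.
random.seed(0)
records = []
for i in range(1437 + 346):
    severity = random.gauss(0, 1)  # synthetic stand-in for PCL total
    records.append({
        "visit_order": i,
        "pcl_total": severity,
        # Synthetic "model output", fixed before the prospective evaluation.
        "predicted": 0.4 * severity + random.gauss(0, 1),
        "diagnosis": 1 if severity + random.gauss(0, 1) > 1 else 0,
    })

dev, prospective = sequential_split(records, n_dev=1437)

# Evaluate only on the prospective set; the development set is untouched here.
pred = [p["predicted"] for p in prospective]
r = pearson_r(pred, [p["pcl_total"] for p in prospective])
a = auc(pred, [p["diagnosis"] for p in prospective])
print(f"prospective r = {r:.2f}, AUC = {a:.2f}")
```

The point of the design is that nothing about the model changes after the second pre-registration, so the prospective metrics are not tuned to the data they are computed on. The sketch omits the significance tests used to compare AUCs.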

Version published to 10.31234/osf.io/xw24e on OSF Preprints
Oct 4, 2024

The Broad Structure of Psychopathology in the All of Us Research Program

This article has 2 authors:
1. Alireza Ehteshami
2. Irwin D. Waldman
This article has no evaluations
Latest version May 21, 2025

Generative Artificial Intelligence in PTSD Treatment: Exploring Five Different Use Cases

This article has 4 authors:
1. Philip Held
2. Elizabeth Cameron Stade
3. Katherine Dondanville
4. Shannon Wiltsey Stirman
This article has no evaluations
Latest version May 15, 2025
