Exploring the therapeutic competencies of large language models: Observational study and comparison with meta-analytical estimates for human therapists


Abstract

Are large language models competent psychotherapists? This paper presents a comparative study benchmarking an AI chatbot’s therapeutic competencies against meta-analytical estimates for human therapists’ competencies. Sixty-five young adults with mild-to-moderate psychological distress received one session of cognitive behavioral therapy delivered by a chatbot. A systematic review and meta-analysis of 18 studies of human therapists was conducted to derive reference estimates for therapist competence. Comparative analyses showed that the chatbot demonstrated significantly lower competence than therapists overall (k = 17, p = .020, d = −.296). However, compared to studies judged to be of high methodological quality (k = 6), no significant difference was observed (p = .506, d = −.083). The chatbot exceeded the average competence in medium-quality studies (k = 5, p < .001, d = .516). As lapses in therapeutic competence were observed, research into when and why therapy chatbots perform more or less skillfully is warranted.
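The comparisons reported above pit a single observed sample of competence ratings against a fixed meta-analytic reference value, for which a one-sample effect size (Cohen's d) and t statistic are the natural summaries. A minimal sketch of that computation is below; the function name, the example scores, and the reference mean are all hypothetical illustrations, not the study's data or code.

```python
# Hypothetical sketch: comparing a sample of competence ratings against a
# fixed meta-analytic reference mean, using a one-sample Cohen's d and the
# corresponding t statistic (t = d * sqrt(n)).
import math
from statistics import mean, stdev

def one_sample_comparison(scores, reference_mean):
    """Return (t statistic, Cohen's d) for scores vs. a fixed reference mean."""
    n = len(scores)
    sd = stdev(scores)                      # sample SD (denominator n - 1)
    d = (mean(scores) - reference_mean) / sd  # one-sample Cohen's d
    t = d * math.sqrt(n)                    # one-sample t statistic
    return t, d

# Illustrative (made-up) ratings on an arbitrary competence scale:
example_scores = [3.9, 4.1, 4.4, 3.8, 4.0, 4.2, 4.3, 3.7]
t_stat, d_effect = one_sample_comparison(example_scores, reference_mean=4.2)
```

A negative d here would indicate ratings below the meta-analytic reference, mirroring the sign convention of the effect sizes reported in the abstract.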
