Exploring the therapeutic competencies of large language models: Observational study and comparison with meta-analytical estimates for human therapists
Abstract
Are large language models competent psychotherapists? This paper presents a comparative study benchmarking an AI chatbot’s therapeutic competencies against meta-analytical estimates for human therapists’ competencies. Sixty-five young adults with mild-to-moderate psychological distress received one session of cognitive behavioral therapy delivered by a chatbot. A systematic review and meta-analysis of 18 studies of human therapists was conducted to derive reference estimates for therapist competence. Comparative analyses showed that the chatbot demonstrated significantly lower competence than therapists overall (k = 17, p = .020, d = −.296). However, compared to studies judged to be of high methodological quality (k = 6), no significant difference was observed (p = .506, d = −.083). The chatbot exceeded the average competence in medium-quality studies (k = 5, p < .001, d = .516). As lapses in therapeutic competence were observed, research into when and why therapy chatbots perform more or less skillfully is warranted.