Detection of Induced Psychosis-Like Thinking in User–GenAI Interaction: A Retrieval-Augmented Generation (RAG) Approach
Abstract
With the rapid growth of generative artificial intelligence (GenAI), concerns about its psychological safety have increased. Recent psychiatric reports show that some users—especially those with vulnerable mental health—may develop delusional thoughts or psychosis-like experiences during interactions with GenAI. Detecting such risk language is therefore important for the safe deployment of large language models (LLMs). This study examined whether LLMs can identify delusional statements and whether retrieval-augmented generation (RAG) can improve performance. A dataset of 200 simulated user statements (50% delusional, 50% non-delusional) was created based on DSM-5 criteria and validated by a licensed psychotherapist. The dataset was divided into a 100-item retrieval corpus and a 100-item test set. Two GPT-4o-mini classifiers were evaluated: a baseline non-RAG model and a RAG-enhanced model that retrieved the three most semantically similar examples from the corpus. When a statement was classified as delusional, models also assigned a severity rating (1–3). The non-RAG model achieved 81% accuracy with a recall of 0.66, whereas the RAG model reached 94% accuracy with improved recall (0.89) and perfect specificity. However, both models systematically underestimated severity, with the RAG model showing greater underestimation. These findings indicate that RAG can enhance early detection of psychosis-like language but is insufficient for reliable severity assessment, highlighting both the potential and the limits of current LLM-based cognitive-risk detection.
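The RAG setup described above (retrieve the three most semantically similar labeled examples from a 100-item corpus, then prompt the classifier with them) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the real study uses sentence embeddings and GPT-4o-mini, which are replaced here with a toy bag-of-words cosine similarity and a prompt-building step; all corpus texts, labels, and function names are invented for the example.

```python
# Hedged sketch of a top-k retrieval + few-shot prompt pipeline, as described
# in the abstract. Embeddings and the LLM call are mocked: `embed` is a toy
# bag-of-words vectorizer, and the assembled prompt would be sent to a real
# classifier (GPT-4o-mini in the study) rather than evaluated here.
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real encoder)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_top_k(query, corpus, k=3):
    """Return the k corpus examples most similar to the query statement."""
    q = embed(query)
    ranked = sorted(corpus, key=lambda ex: cosine(q, embed(ex["text"])), reverse=True)
    return ranked[:k]

def build_prompt(statement, examples):
    """Assemble a few-shot classification prompt from the retrieved examples."""
    shots = "\n".join(f'- "{ex["text"]}" -> {ex["label"]}' for ex in examples)
    return (
        "Classify the user statement as delusional or non-delusional; "
        "if delusional, also assign a severity rating (1-3).\n"
        f"Labeled examples:\n{shots}\n"
        f'Statement: "{statement}"'
    )

# Invented mini-corpus standing in for the study's 100-item retrieval corpus.
corpus = [
    {"text": "The television sends me secret orders every night", "label": "delusional, severity 2"},
    {"text": "I felt anxious before my job interview", "label": "non-delusional"},
    {"text": "Strangers on the street are all spying on me", "label": "delusional, severity 2"},
    {"text": "I enjoy chatting with the assistant about recipes", "label": "non-delusional"},
]

query = "I think my neighbors are spying on me through the walls"
prompt = build_prompt(query, retrieve_top_k(query, corpus, k=3))
# `prompt` would then be sent to the LLM classifier; the baseline (non-RAG)
# condition simply omits the retrieved examples.
```

In the study's actual pipeline, `embed` would be a semantic sentence encoder and the model's response would be parsed into a binary label plus, where applicable, a 1–3 severity score for comparison against the psychotherapist-validated ground truth.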