Evaluating Voice-Enabled Generative AI for Mental Health: Real-Time Performance and Safety Analyses

Nhat Ngo
Akane Sano

Abstract

This study investigates the integration of Voice AI into a locally hosted generative AI chatbot designed to function as a mental health assistant, with the goal of enabling intuitive, voice-based therapeutic interaction. Leveraging the Llama3.1 8B language model for privacy-preserving generation, the system combines Deepgram’s Speech-to-Text API and OpenAI’s Text-to-Speech API within a WebRTC-based framework to support low-latency, bi-directional communication. A custom pipeline facilitates real-time voice input and output, aiming to reduce barriers to engagement and foster a more natural conversational flow. Technical evaluation focuses on latency across short, long-form, and multi-turn dialogues, revealing response times within tolerable bounds for synchronous use. Prompt engineering and system prompt customization guide empathetic, context-aware responses in standard therapeutic scenarios, though limitations persist in handling edge cases. These findings suggest that locally hosted voice-enabled LLMs can support responsive, privacy-conscious mental health applications, with future work directed toward fine-tuning for high-risk interactions.
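
The latency evaluation described above can be illustrated with a minimal sketch that times each stage of one speech-to-text, generation, and text-to-speech turn. This is not the authors' pipeline: the names `TurnTiming`, `timed`, and `run_turn` and the three stub stages are hypothetical placeholders, and no real Deepgram, OpenAI, Llama, or WebRTC calls are made. In practice each stub would be replaced by a call to the corresponding service.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple


@dataclass
class TurnTiming:
    """Per-stage wall-clock latency (seconds) for one conversational turn."""
    stages: Dict[str, float] = field(default_factory=dict)

    @property
    def total(self) -> float:
        # End-to-end latency is approximated as the sum of the stage latencies.
        return sum(self.stages.values())


def timed(stage_fn: Callable[[Any], Any], arg: Any) -> Tuple[Any, float]:
    """Run one stage and return its result with the elapsed time."""
    start = time.perf_counter()
    result = stage_fn(arg)
    return result, time.perf_counter() - start


def run_turn(audio_in: bytes,
             stt: Callable[[bytes], str],
             llm: Callable[[str], str],
             tts: Callable[[str], bytes]) -> Tuple[bytes, TurnTiming]:
    """Pass audio through STT -> LLM -> TTS, recording each stage's latency."""
    timing = TurnTiming()
    transcript, timing.stages["stt"] = timed(stt, audio_in)
    reply, timing.stages["llm"] = timed(llm, transcript)
    audio_out, timing.stages["tts"] = timed(tts, reply)
    return audio_out, timing


# Hypothetical stand-ins for the real services (assumptions, not real APIs).
def fake_stt(audio: bytes) -> str:
    return "I have been feeling anxious lately."


def fake_llm(prompt: str) -> str:
    return "That sounds difficult. Can you tell me more about it?"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    _, timing = run_turn(b"\x00\x01", fake_stt, fake_llm, fake_tts)
    for stage, seconds in timing.stages.items():
        print(f"{stage}: {seconds * 1000:.3f} ms")
    print(f"total: {timing.total * 1000:.3f} ms")
```

Recording stages separately rather than only the end-to-end time makes it possible to see which component dominates latency as utterances grow longer or as dialogue history accumulates across turns.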

Version published to 10.1101/2025.11.14.25340246 on medRxiv
Nov 17, 2025

Related articles

Efficient and Responsible Transformer Based Conversational Agents for Emotionally Supportive Dialogue

This article has 8 authors:
1. DIVYA SALEELA
2. Akhil Mathew Philip
3. Reji R
4. Rincy Merlin Mathew
5. Teena Joseph
6. Sujith Kumar P S
7. Supriya L P
8. Chinchu M S
This article has no evaluations. Latest version: Feb 2, 2026

Retrieval-Augmented Generation in LLMs for Mental Health: Examining the Impact on User Intent Detection in Wysa

This article has 5 authors:
1. Anand Gupta
2. Akshat Surolia
3. Shubham Mishra
4. Shakil Imtiaz
5. Chaitali Sinha
This article has no evaluations. Latest version: Dec 30, 2025

Title: AI, Autism, and the Architecture of Voice: From Engineered Exclusion to Designed Dignity

This article has 1 author:
1. Hari Srinivasan
This article has no evaluations. Latest version: Jan 19, 2026