A Modular Prototype of Emotion-Aware Proactive Voice Agent with Live2D Embodiment

Abstract

We present a voice-based conversational agent that advances beyond reactive dialogue by integrating speech-to-text transcription with Whisper, emotion recognition, simple policy mechanisms, and Live2D embodiment. The system delivers supportive guidance either as inline prompts or as card-style recommendations, while empathetic dialogue and expressive avatar cues enhance both transparency and user engagement. A log-based evaluation across ten sessions showed stable operation, with an average latency of 7.1 seconds. This prototype illustrates a practical foundation for developing emotion-aware, proactive companions aligned with the vision of human-centered dialogue systems.
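
The abstract does not spell out the "simple policy mechanisms", so the following is only a minimal sketch of how a rule-based policy might map a transcribed, emotion-labeled turn to one of the two guidance formats and an avatar cue. Every name here (`Turn`, `Action`, `choose_action`), the emotion labels, the thresholds, and the message texts are hypothetical and are not taken from the preprint; the transcript and emotion are assumed to come from upstream speech-to-text and emotion-recognition stages.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Turn:
    transcript: str      # text from the speech-to-text stage (assumed input)
    emotion: str         # label from the emotion recognizer (assumed label set)
    confidence: float    # recognizer confidence in [0, 1] (assumed)


@dataclass
class Action:
    kind: str                # "none", "inline_prompt", or "card"
    text: Optional[str]      # guidance shown to the user, if any
    avatar_expression: str   # expression cue passed to the avatar layer


# Hypothetical set of emotions that trigger proactive support.
NEGATIVE_EMOTIONS = {"sad", "angry", "fearful"}


def choose_action(turn: Turn,
                  prompt_threshold: float = 0.6,
                  card_threshold: float = 0.85) -> Action:
    """Pick a proactive action using two illustrative confidence thresholds."""
    # Stay reactive when the emotion is not negative or the signal is weak.
    if turn.emotion not in NEGATIVE_EMOTIONS or turn.confidence < prompt_threshold:
        return Action("none", None, "neutral")
    # Strong signal: offer a card-style recommendation.
    if turn.confidence >= card_threshold:
        return Action("card", f"Suggested next step for feeling {turn.emotion}",
                      "concerned")
    # Moderate signal: weave a short supportive prompt into the reply.
    return Action("inline_prompt", "Would you like to talk about it?", "concerned")


if __name__ == "__main__":
    print(choose_action(Turn("I failed my exam again", "sad", 0.9)))
    print(choose_action(Turn("It was an okay day", "neutral", 0.95)))
```

Keeping the policy as a pure function of the turn makes each decision easy to log and replay, which fits the log-based evaluation the abstract describes.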
