A Modular Prototype of Emotion-Aware Proactive Voice Agent with Live2D Embodiment
Abstract
We present a voice-based conversational agent that advances beyond reactive dialogue by integrating speech-to-text transcription with Whisper, emotion recognition, simple policy mechanisms, and Live2D embodiment. The system delivers supportive guidance either as inline prompts or card-style recommendations, while empathetic dialogue and expressive avatar cues enhance both transparency and user engagement. A log-based evaluation across ten sessions showed consistent stability, with an average latency of 7.1 seconds. This prototype illustrates a practical foundation for developing emotion-aware, proactive companions aligned with the vision of human-centered dialogue systems.
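As an illustration only, the "simple policy mechanisms" mentioned above could be sketched as a rule that maps a recognized emotion to a delivery mode (inline prompt or card-style recommendation) and an avatar expression. The abstract does not specify the actual policy, so the emotion labels, the confidence threshold, the expression names, and the function names below are all hypothetical. The transcript is assumed to arrive as plain text from the speech-to-text stage, and the latency helper simply averages per-turn values as they might be read from a session log.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Turn:
    transcript: str    # text as it might arrive from the speech-to-text stage
    emotion: str       # label from a hypothetical emotion recognizer
    confidence: float  # recognizer confidence in [0, 1] (assumed)


@dataclass
class Action:
    delivery: str      # "inline", "card", or "none"
    expression: str    # hypothetical avatar expression name


# Hypothetical mapping from emotion label to avatar expression.
EXPRESSIONS = {"sad": "concerned", "angry": "calm", "happy": "smile"}


def choose_action(turn: Turn, threshold: float = 0.6) -> Action:
    """Pick a delivery mode and avatar expression for one user turn.

    The rules are assumptions for illustration, not the authors' policy.
    """
    # Stay passive when the emotion estimate is unreliable.
    if turn.confidence < threshold:
        return Action("none", "neutral")
    expression = EXPRESSIONS.get(turn.emotion, "neutral")
    # Negative emotions get a more prominent card-style recommendation.
    if turn.emotion in ("sad", "angry"):
        return Action("card", expression)
    # Positive emotion gets a lightweight inline prompt.
    if turn.emotion == "happy":
        return Action("inline", expression)
    # Unknown or neutral emotion: no proactive guidance.
    return Action("none", expression)


def average_latency(latencies_s: list[float]) -> float:
    """Mean end-to-end latency in seconds over logged turns."""
    return mean(latencies_s)
```

A confident "sad" turn would yield a card-style recommendation with a "concerned" expression under these assumed rules, while a low-confidence turn produces no proactive output. A real system would likely replace the fixed rules with whatever policy the prototype actually uses.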