Development of an Interactive Digital Human with Context-Sensitive Facial Expressions

Abstract

With the increasing complexity of human-computer interaction scenarios, conventional digital human facial expression systems show notable limitations in handling multi-emotion co-occurrence, dynamic expression, and semantic responsiveness. This paper proposes a digital human system framework that integrates multimodal emotion recognition with compound facial expression generation. The system establishes a complete pipeline for real-time interaction and compound emotional expression, following the sequence "speech semantic parsing → multimodal emotion recognition → Action Unit (AU)-level 3D facial expression control". First, a ResNet18-based model is employed for robust emotion classification on the AffectNet dataset. Then, an AU motion curve driving module is constructed on the Unreal Engine platform, where dynamic synthesis of basic emotions is achieved via a state-machine mechanism. Finally, a Generative Pre-trained Transformer (GPT) is used for semantic analysis, generating structured emotional weight vectors that are mapped to the AU layer to enable language-driven facial responses. The software interface is shown in the accompanying figure. Experimental results demonstrate that the proposed system outperforms traditional methods in recognition accuracy, expression naturalness, and interaction efficiency, effectively supporting realistic, natural, and semantically aligned facial expressions. This research provides a complete technical framework and practical foundation for high-fidelity digital humans with affective interaction capabilities.
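
The abstract does not include implementation details, so the following is only a minimal sketch of the recognition stage: a ResNet18 backbone fine-tuned for emotion classification in PyTorch. The eight-category AffectNet label set, the ImageNet-pretrained weights, the preprocessing, and the helper names (build_emotion_model, predict_emotion) are illustrative assumptions, not the authors' code.

    # Sketch of a ResNet18-based emotion classifier (assumptions noted above).
    import torch
    import torch.nn as nn
    from torchvision import models, transforms
    from PIL import Image

    # AffectNet's eight basic categories (assumed label order).
    EMOTIONS = ["neutral", "happy", "sad", "surprise",
                "fear", "disgust", "anger", "contempt"]

    def build_emotion_model(num_classes: int = len(EMOTIONS)) -> nn.Module:
        """ResNet18 with its final fully connected layer replaced for emotion classes."""
        model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
        model.fc = nn.Linear(model.fc.in_features, num_classes)
        return model

    # Standard ImageNet-style preprocessing for a single face crop.
    preprocess = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])

    def predict_emotion(model: nn.Module, image_path: str) -> dict:
        """Return a probability per emotion label for one face image."""
        model.eval()
        x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
        with torch.no_grad():
            probs = torch.softmax(model(x), dim=1).squeeze(0)
        return {label: float(p) for label, p in zip(EMOTIONS, probs)}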
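
Likewise, the step from a language-derived emotion weight vector to AU-level targets can be illustrated with a small sketch. The emotion-to-AU table, the AU identifiers, and the weighted-max blending rule below are assumptions chosen for illustration; the paper's actual mapping, and how the targets drive the Unreal Engine rig, may differ.

    # Sketch: turn a structured emotion weight vector (e.g. parsed from a GPT
    # response) into Action Unit (AU) intensity targets for a 3D face rig.
    # The AU table and blending rule are illustrative assumptions.

    # Hypothetical mapping from basic emotions to the AUs they activate (0..1 intensities).
    EMOTION_TO_AUS = {
        "happy":    {"AU6": 0.8, "AU12": 1.0},                 # cheek raiser, lip corner puller
        "sad":      {"AU1": 0.7, "AU4": 0.5, "AU15": 0.8},
        "anger":    {"AU4": 1.0, "AU5": 0.6, "AU23": 0.7},
        "surprise": {"AU1": 0.9, "AU2": 0.9, "AU26": 0.8},
    }

    def emotion_weights_to_au_targets(weights: dict) -> dict:
        """Blend per-emotion AU intensities into one AU target vector.

        `weights` is a structured emotion weight vector such as
        {"happy": 0.6, "surprise": 0.3}; for overlapping AUs the strongest
        weighted contribution wins, so compound expressions can co-occur.
        """
        targets = {}
        for emotion, w in weights.items():
            for au, intensity in EMOTION_TO_AUS.get(emotion, {}).items():
                targets[au] = max(targets.get(au, 0.0), w * intensity)
        return targets

    if __name__ == "__main__":
        # Example: a mostly happy, slightly surprised utterance.
        print(emotion_weights_to_au_targets({"happy": 0.6, "surprise": 0.3}))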
