ELEGANT: Combining Simultaneous Node and Edge Generation with Landmark Multi-Task Learning for Facial Action Unit Recognition
Abstract
Facial Action Units (AUs) have recently been used in dementia detection, pain detection, talking head generation, and even facial reconstruction. The success of each of these applications is at least partially due to the performance of the underlying AU recognition model. The metric most commonly reported for comparing AU recognition models is the average F1 score, so improving it will directly benefit each application that relies on AU recognition. To improve the average F1 score, we propose simultaneously generating the nodes and edges for the graph neural network, strategically using landmark data, presenting additional multi-task learning methods for AU recognition, and introducing ensemble learning to AU recognition. Although most current solutions for AU recognition generate the nodes and the edges separately, our proposed method demonstrates the improvement that comes from generating them simultaneously. In addition, our method uses the available landmark data in a multi-task learning setting and applies ensemble learning to AU recognition. Through extensive experimentation, we demonstrate an improvement in the state-of-the-art average F1 score from 66.3 to 67.3 on the BP4D dataset and from 66.9 to 67.8 on the DISFA dataset, a considerable improvement in this field. These results underscore the gains our proposed method brings to AU recognition.
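For readers unfamiliar with the metric, the following is a minimal sketch of how an average F1 score over AUs is commonly computed: a binary F1 score is calculated for each AU across all frames, and the per-AU scores are then averaged. The function names (`f1_score`, `average_f1`) and the toy data are illustrative assumptions and are not taken from the paper, whose exact evaluation protocol (for example, cross-validation folds) is not described in this abstract.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1 for one AU; labels are 1 (AU active) or 0 (inactive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    # Assumption: an AU with no positives and no predictions scores 0.0.
    return 2 * tp / denom if denom else 0.0


def average_f1(labels: Sequence[Sequence[int]],
               preds: Sequence[Sequence[int]]) -> float:
    """Mean of per-AU F1 scores; labels[i][j] is frame i, AU j."""
    num_aus = len(labels[0])
    per_au = [
        f1_score([row[j] for row in labels], [row[j] for row in preds])
        for j in range(num_aus)
    ]
    return sum(per_au) / num_aus


# Toy example (hypothetical data): 4 frames, 2 AUs.
labels = [[1, 0], [1, 1], [0, 1], [0, 0]]
preds = [[1, 0], [0, 1], [0, 1], [1, 0]]
print(average_f1(labels, preds))
```

Because the average weights every AU equally, rarely occurring AUs influence the reported score as much as frequent ones. Reported values such as 66.3 or 67.3 are this average expressed as a percentage.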