SCAT: The Self-Correcting Aesthetic Transformer for Explainable Facial Beauty Prediction

Abstract

Modeling human aesthetic perception is a fundamental challenge in computer vision. While deep learning has significantly advanced Facial Beauty Prediction (FBP), state-of-the-art models suffer from two critical, interlinked limitations: a performance plateau, with Pearson Correlation (PC) coefficients seldom exceeding 0.90, and a "black box" nature that offers no insight into their reasoning. We posit that these limitations stem from a failure to emulate the hierarchical, part-based reasoning inherent to human aesthetic judgment. In this work, we propose the Self-Correcting Aesthetic Transformer (SCAT), a novel, explainable-by-design framework that overcomes these challenges. SCAT introduces a two-stage architecture featuring a Semantic Parser to disentangle the face into explicit part embeddings (e.g., eyes, mouth) and a Corrector Aggregator to reason about their harmonious interplay. The model is trained with a novel self-correcting loss that enforces internal consistency between its part-based and holistic evaluations. To facilitate this, we present FBP5500-Subscores, a large-scale dataset with granular part-level aesthetic annotations. Extensive experiments demonstrate that SCAT achieves a new state-of-the-art Pearson Correlation of 0.935, thereby breaking the long-standing performance barrier, while simultaneously providing transparent, human-intelligible predictions. Our work bridges the critical gap between predictive power and interpretability in FBP and suggests a structured reasoning paradigm for other subjective visual assessment tasks.
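To make the self-correcting objective described in the abstract concrete, the following PyTorch-style snippet is a minimal, hypothetical sketch of a loss that enforces consistency between part-based and holistic evaluations. The class name SelfCorrectingLoss, the mean aggregation of part sub-scores, and the weighting coefficients are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn


class SelfCorrectingLoss(nn.Module):
    """Hypothetical sketch of a self-correcting training objective.

    Combines (i) a supervised regression term on the holistic beauty
    score, (ii) a supervised term on per-part sub-scores (as provided by
    a dataset like FBP5500-Subscores), and (iii) a consistency term that
    penalizes disagreement between the holistic prediction and an
    aggregate of the part-level predictions. The lambda weights and the
    mean aggregation rule are assumptions for illustration only.
    """

    def __init__(self, lambda_part: float = 1.0, lambda_consist: float = 0.5):
        super().__init__()
        self.lambda_part = lambda_part
        self.lambda_consist = lambda_consist
        self.mse = nn.MSELoss()

    def forward(
        self,
        holistic_pred: torch.Tensor,  # (B,)   predicted overall score
        part_preds: torch.Tensor,     # (B, P) predicted sub-scores (eyes, mouth, ...)
        holistic_gt: torch.Tensor,    # (B,)   ground-truth overall score
        part_gt: torch.Tensor,        # (B, P) ground-truth sub-scores
    ) -> torch.Tensor:
        holistic_loss = self.mse(holistic_pred, holistic_gt)
        part_loss = self.mse(part_preds, part_gt)
        # Internal consistency: the holistic score should agree with the
        # (here, mean-)aggregated part scores.
        consist_loss = self.mse(holistic_pred, part_preds.mean(dim=1))
        return (
            holistic_loss
            + self.lambda_part * part_loss
            + self.lambda_consist * consist_loss
        )
```

In the paper's framework the Corrector Aggregator presumably learns a richer aggregation than the simple mean used here; the sketch is only meant to convey the structure of a consistency term linking part-level and holistic predictions.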
