ISGD: A Dataset for Demographically-Aware Facial Analysis and Privacy-First Skincare Recommendation

Abstract

Facial attribute recognition plays a crucial role in applications ranging from human-computer interaction to personalised digital health. However, the effectiveness of existing systems is often limited by demographic bias in training data and the absence of domain-specific annotations, particularly for nuanced tasks such as skincare and grooming analysis. Large-scale datasets like CelebA are predominantly Western-centric and lack critical attributes including Oily Skin, Wrinkles, and grooming-related characteristics. To address these limitations, we introduce the Indian Skincare and Grooming Dataset (ISGD), a manually curated dataset comprising 30,141 facial images from the Indian subcontinent, annotated across 33 fine-grained binary attributes specifically designed for skincare and grooming analysis. Building upon ISGD, we propose AKRTI, a privacy-first inference pipeline that decouples visual processing from report generation. The system employs a ConvNeXt-Tiny backbone for multi-label facial attribute prediction. Importantly, only the predicted binary attribute vector, and never the raw facial image, is passed to a large language model (LLM) to generate a personalised, human-readable skincare and grooming report, thereby preserving user privacy. Experimental results demonstrate that models trained on ISGD significantly outperform those trained on a size-matched subset of CelebA, achieving 94.26% overall accuracy and an F1-score of 0.8851. Furthermore, per-attribute evaluation indicates more consistent and reliable predictions for skincare-critical features such as beard presence, skin condition, and wrinkles. By introducing a demographically representative dataset alongside a privacy-aware framework, this work establishes a robust foundation for equitable and practical AI-driven facial analysis systems in personalised healthcare and wellness.
The source code for all experiments and implementations is publicly available in our GitHub repository: https://github.com/HimalRana2610/ISGD. The code is also archived at Zenodo (DOI: https://doi.org/10.5281/zenodo.18837811).
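
The decoupling described in the abstract, where only a binary attribute vector reaches the LLM, can be illustrated with a minimal sketch. This is not the AKRTI implementation. The attribute names, the 0.5 threshold, the prompt wording, and the helper names `logits_to_binary` and `build_prompt` are all assumptions made for illustration, and the logits stand in for the output of a multi-label classifier such as the ConvNeXt-Tiny backbone.

```python
import math
from typing import List, Sequence

# Hypothetical subset of attribute names, for illustration only.
# ISGD itself defines 33 binary attributes.
ATTRIBUTES = ["Oily_Skin", "Wrinkles", "Beard"]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logits_to_binary(logits: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Multi-label decoding: each attribute is thresholded independently.

    The 0.5 threshold is an assumption, not a value taken from the paper.
    """
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


def build_prompt(names: Sequence[str], vector: Sequence[int]) -> str:
    """Build an LLM prompt from attribute labels only; no image data is used."""
    if len(names) != len(vector):
        raise ValueError("names and vector must have the same length")
    present = [n.replace("_", " ") for n, v in zip(names, vector) if v == 1]
    absent = [n.replace("_", " ") for n, v in zip(names, vector) if v == 0]
    return (
        "Write a personalised skincare and grooming report.\n"
        f"Present: {', '.join(present) or 'none'}\n"
        f"Absent: {', '.join(absent) or 'none'}"
    )


# Example logits standing in for the classifier output on one image.
vector = logits_to_binary([2.0, -1.5, 0.3])
prompt = build_prompt(ATTRIBUTES, vector)
print(prompt)
```

In this sketch the prompt is built from attribute names and 0/1 flags alone, so the text sent to the language model carries no pixel data, which is the privacy property the abstract describes.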
