Capturing Children’s Emotions Through Artwork: A Vision Transformer-Based Analysis

Abstract

Children’s drawings are recognized as a potent medium for understanding their emotional states, particularly when language skills are still developing. This study presents a novel framework that integrates the Mood and Feelings Questionnaire (MFQ) with an advanced Vision Transformer (ViT-B/16) model to examine indicators of depression in children’s artwork. We recruited 80 children aged 6 to 10, who first completed the MFQ and then created free-form drawings. Each drawing was digitized and labeled as indicating either high or low levels of depressive symptoms according to MFQ thresholds. Our ViT-B/16 architecture, initialized with pretrained weights and fine-tuned on a dataset of 514 scanned images, achieved a validation accuracy of 89.27%, aligning well with the MFQ-based labels. Key model metrics, including precision, recall, and F1-score, highlighted the self-attention mechanism’s ability to capture subtle visual cues dispersed across each drawing. Despite these encouraging results, the study acknowledges limitations such as the subjectivity of artistic interpretation, potential biases from a limited sample, and the need for further ethical and explainability considerations. Overall, this work demonstrates the potential of state-of-the-art deep learning techniques to enhance early detection and intervention for children’s emotional wellbeing.
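
As an illustration of the metrics named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed for a binary high/low classification. It is not the authors' evaluation code. The function name `binary_metrics`, the encoding of "high depressive symptoms" as 1, and the toy label lists are all assumptions made for this example; none of the numbers come from the study.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumption for this sketch: 1 = high depressive symptoms (positive class),
    0 = low depressive symptoms.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels invented for illustration only (not data from the study).
mfq_labels = [1, 1, 1, 0, 0, 0, 1, 0]   # hypothetical MFQ-based labels
model_preds = [1, 1, 0, 0, 0, 1, 1, 0]  # hypothetical model predictions
metrics = binary_metrics(mfq_labels, model_preds)
print(metrics)
```

Reporting precision and recall alongside accuracy matters for a screening task like this one, because accuracy alone can look strong when one class dominates the dataset.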
