ChildLens: An Egocentric Video Dataset for Activity Analysis in Children
Abstract
We present ChildLens, an egocentric video and audio dataset that captures the naturalistic everyday experiences of children aged three to five years, together with detailed activity labels. A total of 109 hours of experiences were recorded from 62 children in their home environments using a 140° wide-angle camera with an integrated microphone, worn in a child-friendly vest. Annotations include five location classes and 14 activity classes, covering audio-only, video-only, and multimodal activities. Captured by a camera worn at chest height, ChildLens provides a rich resource for analyzing children’s daily interactions and behaviors. We provide an overview of the dataset, the collection process, and the labeling strategy. Finally, we present benchmark performance of two state-of-the-art models on the dataset: the Boundary-Matching Network for temporal activity localization and the Voice Type Classifier for detecting and classifying speech in audio. The ChildLens dataset will be freely available for research purposes via an institutional repository, listed on the ChildLens website (https://www.eva.mpg.de/comparative-cultural-psychology/technical-development/childlens/). It provides rich data to advance computer vision and audio analysis techniques and thereby removes a critical obstacle to studying the context of child development.
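
As a minimal sketch of how temporally localized activity annotations like those described above could be handled, the example below represents a labeled segment and scores it against a model prediction using temporal Intersection-over-Union (tIoU), the metric commonly used for temporal activity localization benchmarks such as the Boundary-Matching Network. The field names (`start`, `end`, `label`) and the example values are illustrative assumptions, not the dataset's actual annotation schema.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A temporally localized activity annotation (hypothetical schema)."""
    start: float  # segment start, in seconds from the start of the recording
    end: float    # segment end, in seconds from the start of the recording
    label: str    # one of the activity classes


def temporal_iou(a: Segment, b: Segment) -> float:
    """Temporal Intersection-over-Union between two segments."""
    intersection = max(0.0, min(a.end, b.end) - max(a.start, b.start))
    union = (a.end - a.start) + (b.end - b.start) - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    # Illustrative values only; not taken from the ChildLens annotations.
    ground_truth = Segment(start=12.0, end=45.5, label="drawing")
    prediction = Segment(start=15.0, end=48.0, label="drawing")
    print(f"tIoU = {temporal_iou(ground_truth, prediction):.3f}")
```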