Explainable Infant Cry Recognition Using Reinforcement-Learned Feature Fusion and SHAP Interpretation

Abstract

Crying is one of the most fundamental ways an infant communicates with the outside world. A cry carries vital information about the baby's needs, whether hunger, pain, fatigue, or simple discomfort [1]. Accurate interpretation of the subtle acoustic patterns in cry signals is therefore crucial for proper care and early diagnosis. This study presents an approach to infant cry classification based on explainable reinforcement learning and feature fusion. A lightweight policy agent, trained with the REINFORCE algorithm [2], dynamically assigns attention weights to already extracted features. The model is trained and validated on the widely used Donate-a-Cry corpus, which labels cries with five categories: hunger, tiredness, belly pain, burping, and discomfort. To mitigate the severe class imbalance in this dataset, we apply targeted data augmentation, and we introduce a dynamic reward-shaping mechanism into the reinforcement loop that improves the agent's focus on underrepresented classes. After augmentation and balancing, the most salient acoustic features (MFCC, GFCC, and prosodic features) are extracted and passed to a lightweight multilayer perceptron (MLP) classifier for the final decision. Evaluated with 3-fold cross-validation, the model achieves an accuracy of 94.44%.
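The core idea above can be illustrated with a minimal sketch: a stochastic policy, updated by the REINFORCE score-function estimator with a running baseline, learns which feature group to attend to based on classification reward. This is not the authors' implementation; the synthetic data, the per-group threshold "classifier", and all parameter values here are illustrative assumptions (the paper uses MFCC/GFCC/prosodic features and an MLP).

```python
import math
import random


def train_feature_policy(episodes=5000, lr=0.1, seed=0):
    """REINFORCE sketch: learn attention over 3 feature groups.

    Group 0 is informative about the label; groups 1 and 2 are noise.
    The policy samples one group per episode, receives reward 1 for a
    correct classification, and updates its logits with the
    score-function gradient (reward - baseline) * d log pi / d logits.
    """
    random.seed(seed)
    logits = [0.0, 0.0, 0.0]   # unnormalized attention scores
    baseline = 0.0             # running-mean reward baseline

    for _ in range(episodes):
        # Synthetic episode: label and one scalar feature per group.
        y = random.randint(0, 1)
        feats = [
            y + random.gauss(0.0, 0.1),      # informative group
            random.gauss(0.5, 0.5),          # noise group
            random.gauss(0.5, 0.5),          # noise group
        ]

        # Softmax policy over feature groups.
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        z = sum(exps)
        probs = [e / z for e in exps]

        # Sample a group (the "attention" action).
        r, cum, action = random.random(), 0.0, 0
        for i, p in enumerate(probs):
            cum += p
            if r < cum:
                action = i
                break

        # Toy classifier on the chosen group: threshold at 0.5.
        pred = 1 if feats[action] > 0.5 else 0
        reward = 1.0 if pred == y else 0.0

        # REINFORCE update with baseline.
        advantage = reward - baseline
        baseline += 0.01 * (reward - baseline)
        for i in range(3):
            grad_log_pi = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad_log_pi

    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]


if __name__ == "__main__":
    probs = train_feature_policy()
    print("learned attention:", [round(p, 3) for p in probs])
```

After training, the policy concentrates its probability mass on the informative group, mirroring how the paper's agent learns to weight the feature streams that help discriminate the cry classes.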
