CrySenseNet: A Deep Learning-Based Acoustic Intelligence System for Decoding Infant Cries
Abstract
Crying is the primary communication medium for infants, used to express their fundamental needs and medical issues. Accurate infant cry classification can help parents and clinicians identify problems early and apply the correct interventions. Here, we examine infant cry signal classification with various machine learning and deep learning techniques. Hand-crafted features were extracted from the audio signals and classified with traditional machine learning models, i.e., Support Vector Machine (SVM), Hidden Markov Model (HMM), Probabilistic Neural Network (PNN), Multi-Layer Perceptron (MLP), and Recurrent Neural Network (RNN). In addition, deep learning models such as a 1D Convolutional Neural Network (CNN), a hybrid Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) model, and transfer learning models such as GoogLeNet, ShuffleNet, and ResNet-18 were applied to spectrogram-based representations. The experiments were performed on the Infant Cry Audio Corpus dataset, which contains five different classes. Of all the models, the SVM achieved the highest classification accuracy of 96.07%, followed by GoogLeNet (84.98%), ShuffleNet (84.78%), CNN-BiLSTM (84%), PNN (83.70%), MLP (82.61%), CNN (82%), ResNet-18 (80.43%), RNN (80%), and HMM (66%). The outcomes show that transfer learning models and conventional machine learning classifiers, especially when paired with hand-crafted features, outperform standalone deep learning models on this task. Overall, the findings confirm the effectiveness of combining signal processing techniques with advanced classification methods for stable and accurate infant cry analysis.
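The sketch below illustrates the kind of hand-crafted-feature + SVM pipeline the abstract refers to, since that combination gave the best reported accuracy. It is a minimal, illustrative example only: the abstract does not specify the feature set, toolkit, or hyperparameters, so the use of MFCC statistics via librosa, scikit-learn's SVC with an RBF kernel, and a class-per-subfolder corpus layout are all assumptions.

```python
# Minimal sketch of a hand-crafted-feature + SVM cry classifier.
# Assumptions (not stated in the abstract): MFCC mean/std features via librosa,
# scikit-learn SVC with an RBF kernel, and a corpus laid out as one
# subdirectory per cry class under a hypothetical "infant_cry_corpus" folder.

from pathlib import Path

import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score


def extract_features(path, sr=22050, n_mfcc=40):
    """Load one cry recording and summarise it as a fixed-length MFCC vector."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape: (n_mfcc, frames)
    # Mean and standard deviation over time give a compact, fixed-size descriptor
    # suitable for a conventional classifier such as an SVM.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])


# Hypothetical corpus layout: infant_cry_corpus/<class_name>/<recording>.wav
DATA_DIR = Path("infant_cry_corpus")
files = sorted(DATA_DIR.glob("*/*.wav"))
labels = [f.parent.name for f in files]  # class name taken from the subfolder

X = np.stack([extract_features(f) for f in files])
y = np.array(labels)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Standardise features, then fit an RBF-kernel SVM (illustrative hyperparameters).
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10, gamma="scale"))
clf.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

The spectrogram-based deep learning and transfer learning models mentioned above would replace this feature extraction step with time-frequency images fed to a CNN backbone; the split, training, and evaluation structure stays the same.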