CrySenseNet: A Deep Learning-Based Acoustic Intelligence System for Decoding Infant Cries
Abstract
Crying is the primary communication medium for infants, used to express their fundamental needs and medical issues. Accurate infant cry classification can help parents and clinicians identify problems early and apply the correct interventions. Here, we examine infant cry signal classification with various machine learning and deep learning techniques. Hand-crafted features were extracted from the audio signals and classified with traditional machine learning models, i.e., Support Vector Machine (SVM), Hidden Markov Model (HMM), Probabilistic Neural Network (PNN), Multi-Layer Perceptron (MLP), and Recurrent Neural Network (RNN). In addition, deep learning models such as a 1D Convolutional Neural Network (CNN), a hybrid Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) model, and transfer learning models such as GoogLeNet, ShuffleNet, and ResNet-18 were applied to spectrogram-based representations. The experiments were performed on the Infant Cry Audio Corpus dataset, which contains five different classes. Of all the models, the SVM achieved the highest classification accuracy of 96.07%, followed by GoogLeNet (84.98%), ShuffleNet (84.78%), CNN-BiLSTM (84%), PNN (83.70%), MLP (82.61%), CNN (82%), ResNet-18 (80.43%), RNN (80%), and HMM (66%). The outcomes show that transfer learning models and conventional machine learning classifiers, especially when paired with hand-crafted features, outperform standalone deep learning models on this task. Overall, the findings confirm the effectiveness of combining signal processing techniques with advanced classification methods for stable and accurate infant cry analysis.
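The sketch below illustrates the kind of hand-crafted-feature + SVM pipeline the abstract refers to, since that combination gave the best reported accuracy. It is a minimal, illustrative example only: the abstract does not specify the feature set, toolkit, or hyperparameters, so the use of MFCC statistics via librosa, scikit-learn's SVC with an RBF kernel, and a class-per-subfolder corpus layout are all assumptions.

```python
# Minimal sketch of a hand-crafted-feature + SVM cry classifier.
# Assumptions (not stated in the abstract): MFCC mean/std features via librosa,
# scikit-learn SVC with an RBF kernel, and a corpus laid out as one
# subdirectory per cry class under a hypothetical "infant_cry_corpus" folder.

from pathlib import Path

import numpy as np
import librosa
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score


def extract_features(path, sr=22050, n_mfcc=40):
    """Load one cry recording and summarise it as a fixed-length MFCC vector."""
    y, _ = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # shape: (n_mfcc, frames)
    # Mean and standard deviation over time give a compact, fixed-size descriptor
    # suitable for a conventional classifier such as an SVM.
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])


# Hypothetical corpus layout: infant_cry_corpus/<class_name>/<recording>.wav
DATA_DIR = Path("infant_cry_corpus")
files = sorted(DATA_DIR.glob("*/*.wav"))
labels = [f.parent.name for f in files]  # class name taken from the subfolder

X = np.stack([extract_features(f) for f in files])
y = np.array(labels)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Standardise features, then fit an RBF-kernel SVM (illustrative hyperparameters).
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10, gamma="scale"))
clf.fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

The spectrogram-based deep learning and transfer learning models mentioned above would replace this feature extraction step with time-frequency images fed to a CNN backbone; the split, training, and evaluation structure stays the same.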