CNNMC: A Convolutional Neural Network with Monte Carlo Dropout for Speaker Recognition
Abstract
Speaker recognition is the task of identifying or verifying a person's identity from their voice. The problem is challenging because speech varies with emotional state and health condition, and because recordings differ in microphone model, environment, and background noise. Accurate speaker recognition is critical for applications in security, personalized user experiences, and forensic analysis. Applying a CNN with Monte Carlo Dropout can enhance speaker recognition by leveraging unlabeled data to improve feature extraction and model robustness. This approach helps mitigate overfitting and improves generalization, making it particularly effective at handling the diverse and variable speech patterns found in speaker recognition tasks. The proposed deep learning model achieves a peak validation accuracy of 92.07% and an equal error rate (EER) of 0.038 for speaker recognition on a dataset recorded in the wild by phone, which compares favorably with related prior work.
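As a rough illustration of the Monte Carlo Dropout idea named above, the sketch below keeps dropout active at prediction time, runs several stochastic forward passes, and averages the class probabilities; the per-class variance across passes gives a simple uncertainty signal. This is not the CNNMC architecture: a toy linear classifier stands in for the CNN, and the feature vector, weights, dropout rate, and number of passes are all hypothetical values chosen for the example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def forward(features, weights, rate, rng):
    """One stochastic pass: dropout on the features, then a linear layer and softmax."""
    dropped = dropout(features, rate, rng)
    logits = [sum(w * x for w, x in zip(row, dropped)) for row in weights]
    return softmax(logits)


def mc_dropout_predict(features, weights, rate=0.5, passes=50, seed=0):
    """Average `passes` forward passes with dropout left on (Monte Carlo Dropout)."""
    rng = random.Random(seed)
    samples = [forward(features, weights, rate, rng) for _ in range(passes)]
    n_classes = len(weights)
    mean = [sum(s[c] for s in samples) / passes for c in range(n_classes)]
    var = [sum((s[c] - mean[c]) ** 2 for s in samples) / passes
           for c in range(n_classes)]
    return mean, var


# Hypothetical 4-dimensional voice features and weights for three enrolled speakers.
features = [0.9, 0.1, 0.4, 0.7]
weights = [
    [2.0, 0.0, 0.5, 1.5],
    [0.0, 2.0, 0.5, 0.0],
    [0.5, 0.5, 2.0, 0.0],
]

mean, var = mc_dropout_predict(features, weights)
predicted = max(range(len(mean)), key=mean.__getitem__)
print("mean probabilities:", [round(p, 3) for p in mean])
print("per-class variance:", [round(v, 3) for v in var])
print("predicted speaker index:", predicted)
```

In a real system the stochastic passes would run through the trained network with its dropout layers left enabled, and a high variance for the winning class could be used to flag utterances whose speaker decision is unreliable.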