Analysis of individual identification and age-class classification of wild macaque vocalizations without pitch- and formant-based acoustic parameter measurements

Abstract

In recent years, deep learning has achieved high performance in bioacoustic classification tasks by leveraging automatically processed acoustic features extracted from large datasets. However, because deep learning requires large datasets, few performance evaluations of automatically processed acoustic features have been conducted on small-scale data. To test whether mel spectrograms (an automatically processed acoustic feature) are effective for classifying relatively small acoustic datasets, we evaluated the performance of two classifiers (random forest and support vector machine) on mel spectrograms of 651 coo calls from six wild female Japanese macaques in two tasks: 1) individual identification and 2) age-class classification between younger (<10 yrs) and older (>20 yrs) animals. For the individual identification task, the mean balanced accuracy was 0.81 for the random forest and 0.82 for the support vector machine. For the age-class classification task, the mean balanced accuracy was 0.91 for the random forest and 0.93 for the support vector machine. Considering that all the calls were recorded in the wild, methods using automatically processed acoustic features, such as mel spectrograms, are effective for classifying small acoustic datasets in the individual identification task. The high performance in the age-class classification task might be due to the ability of mel spectrograms to capture characteristics (e.g. harshness) of older individuals' calls.