Deep Personalized Semantic Audio Classification (DPSAC): Bridging the Subjective Gap via YAMNet Transfer Learning
Abstract
Traditional music recommendation systems rely on collaborative filtering, which suffers from "cold start" limitations and popularity bias. This paper introduces Deep Personalized Semantic Audio Classification (DPSAC), a novel framework that shifts the recommendation paradigm toward individual acoustic semantics. By applying transfer learning with the YAMNet architecture pre-trained on the Google AudioSet, the system extracts 1024-dimensional embeddings to model a specific user's subjective preferences. Unlike generalized genre classifiers, DPSAC is trained on a small, personalized dataset (400 tracks) to capture idiosyncratic listener logic. Experimental results demonstrate high technical robustness, achieving a validation accuracy of 0.94. In a blind test against 100 novel, out-of-distribution songs, the model achieved 93% predictive accuracy, successfully identifying preference without any prior social metadata or interaction history. This research demonstrates that deep semantic embeddings can effectively bridge the "semantic gap" in music information retrieval, offering a content-driven, privacy-focused solution for personalized music recommendation.
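The pipeline the abstract describes — a lightweight personal classifier trained on 1024-dimensional YAMNet embeddings of a few hundred tracks — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the embedding extraction step (YAMNet via TensorFlow Hub) is assumed to have already been run, so synthetic embeddings stand in for real ones, and the two-class setup, classifier head, and all hyperparameters are illustrative choices.

```python
import numpy as np

# Sketch of DPSAC's classification head: a simple logistic-regression
# classifier over fixed 1024-dim embeddings. In the actual framework
# the embeddings would come from YAMNet (pre-trained on AudioSet);
# here synthetic "liked" vs. "disliked" embeddings are substituted so
# the head can be demonstrated in isolation.
rng = np.random.default_rng(0)

EMBED_DIM = 1024   # YAMNet embedding size (per the abstract)
N_TRACKS = 400     # personalized dataset size (per the abstract)

# Synthetic stand-in data: two slightly shifted Gaussian clusters in
# embedding space, labeled 1 (liked) / 0 (disliked).
labels = rng.integers(0, 2, size=N_TRACKS)
centers = np.where(labels[:, None] == 1, 0.1, -0.1)
X = centers + rng.normal(size=(N_TRACKS, EMBED_DIM))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Train the head with plain batch gradient descent on the
# cross-entropy loss (learning rate and iteration count illustrative).
w = np.zeros(EMBED_DIM)
b = 0.0
lr = 0.1
for _ in range(300):
    p = sigmoid(X @ w + b)
    w -= lr * (X.T @ (p - labels)) / N_TRACKS
    b -= lr * np.mean(p - labels)

train_acc = np.mean((sigmoid(X @ w + b) > 0.5) == labels)
print(f"training accuracy: {train_acc:.2f}")
```

Because the embedding network stays frozen, only the small head is fit to the 400-track personal dataset, which is what makes the transfer-learning setup feasible at this data scale; preference prediction for a new track is then a single forward pass through YAMNet followed by this classifier.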