Deep Personalized Semantic Audio Classification (DPSAC): Bridging the Subjective Gap via YAMNet Transfer Learning

Abstract

Traditional music recommendation systems rely on collaborative filtering, which suffers from “cold start” limitations and popularity bias. This paper introduces Deep Personalized Semantic Audio Classification (DPSAC), a novel framework that shifts the recommendation paradigm toward individual acoustic semantics. By utilizing transfer learning with the YAMNet architecture pre-trained on the Google AudioSet, the system extracts 1024-dimensional embeddings to map a specific user’s subjective preference. Unlike generalized genre classifiers, DPSAC is trained on a small, personalized dataset (400 tracks) to model idiosyncratic listener logic. Experimental results demonstrate high technical robustness, achieving a validation accuracy of 0.94. In a blind test against 100 novel, out-of-distribution songs, the model achieved a 93% predictive accuracy, successfully identifying preference without any prior social metadata or interaction history. This research demonstrates that deep semantic embeddings can effectively bridge the “semantic gap” in music information retrieval, offering a content-driven, privacy-focused solution for personalized music recommendation.
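The embedding-plus-head design the abstract describes can be sketched as follows. Everything below is an illustrative assumption rather than the paper's implementation: the random features stand in for 1024-dimensional YAMNet embeddings (in practice extracted per track, e.g. via the TensorFlow Hub `yamnet/1` model), the binary labels stand in for the listener's liked/not-liked judgments, and a logistic-regression head is used as the simplest possible preference classifier since the paper's exact head architecture is not given here.

```python
# Sketch of a DPSAC-style preference head on frozen audio embeddings.
# Assumes embeddings were already extracted; random data is a placeholder.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder personalized dataset: 400 tracks, each a mean-pooled
# 1024-d embedding, labeled liked (1) / not liked (0) by one listener.
X = rng.normal(size=(400, 1024)).astype(np.float32)
y = rng.integers(0, 2, size=400)

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Lightweight classifier on top of the frozen embeddings; no social
# metadata or interaction history is used, only audio content.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
preds = clf.predict(X_val)
print(preds.shape)  # one preference prediction per held-out track
```

Because the embedding extractor stays frozen, only this small head needs training, which is what makes a 400-track personalized dataset feasible.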
