Deep Personalized Semantic Audio Classification (DPSAC): Bridging the Subjective Gap via YAMNet Transfer Learning
Abstract
Traditional music recommendation systems rely on collaborative filtering, which suffers from "cold start" limitations and popularity bias. This paper introduces Deep Personalized Semantic Audio Classification (DPSAC), a novel framework that shifts the recommendation paradigm toward individual acoustic semantics. By applying transfer learning with the YAMNet architecture pre-trained on the Google AudioSet, the system extracts 1024-dimensional embeddings to model a specific user's subjective preferences. Unlike generalized genre classifiers, DPSAC is trained on a small, personalized dataset (400 tracks) to capture idiosyncratic listener logic. Experimental results demonstrate high technical robustness, achieving a validation accuracy of 0.94. In a blind test against 100 novel, out-of-distribution songs, the model achieved 93% predictive accuracy, successfully identifying preference without any prior social metadata or interaction history. This research demonstrates that deep semantic embeddings can effectively bridge the "semantic gap" in music information retrieval, offering a content-driven, privacy-focused solution for personalized music recommendation.
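The pipeline the abstract describes — a lightweight personal classifier trained on 1024-dimensional YAMNet embeddings of a few hundred tracks — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the embedding extraction step (YAMNet via TensorFlow Hub) is assumed to have already been run, so synthetic embeddings stand in for real ones, and the two-class setup, classifier head, and all hyperparameters are illustrative choices.

```python
import numpy as np

# Sketch of DPSAC's classification head: a simple logistic-regression
# classifier over fixed 1024-dim embeddings. In the actual framework
# the embeddings would come from YAMNet (pre-trained on AudioSet);
# here synthetic "liked" vs. "disliked" embeddings are substituted so
# the head can be demonstrated in isolation.
rng = np.random.default_rng(0)

EMBED_DIM = 1024   # YAMNet embedding size (per the abstract)
N_TRACKS = 400     # personalized dataset size (per the abstract)

# Synthetic stand-in data: two slightly shifted Gaussian clusters in
# embedding space, labeled 1 (liked) / 0 (disliked).
labels = rng.integers(0, 2, size=N_TRACKS)
centers = np.where(labels[:, None] == 1, 0.1, -0.1)
X = centers + rng.normal(size=(N_TRACKS, EMBED_DIM))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Train the head with plain batch gradient descent on the
# cross-entropy loss (learning rate and iteration count illustrative).
w = np.zeros(EMBED_DIM)
b = 0.0
lr = 0.1
for _ in range(300):
    p = sigmoid(X @ w + b)
    w -= lr * (X.T @ (p - labels)) / N_TRACKS
    b -= lr * np.mean(p - labels)

train_acc = np.mean((sigmoid(X @ w + b) > 0.5) == labels)
print(f"training accuracy: {train_acc:.2f}")
```

Because the embedding network stays frozen, only the small head is fit to the 400-track personal dataset, which is what makes the transfer-learning setup feasible at this data scale; preference prediction for a new track is then a single forward pass through YAMNet followed by this classifier.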