Genre and Temporal Dynamics in Spotify Popularity Prediction

Abstract

The rise of music streaming platforms has made track-level data widely accessible, enabling quantitative analysis of song popularity. While existing literature has explored audio features as predictors of popularity, less attention has been given to the combined role of genre and temporal dynamics; this paper addresses that gap. Using a public dataset of 114,000 tracks released from 2000 to 2022, we apply a data science framework combining iterative OLS regression, interaction modeling, random forest, and rolling coefficient analysis to assess the predictive power of five Spotify audio characteristics (loudness, danceability, energy, liveness, and valence) together with genre and release year. Four iterative OLS regression models are developed using an 80/20 train/test split, showing that genre accounts for the largest gain in explained variation, increasing R-squared from 0.042 to 0.434. A genre-year interaction model further improves R-squared to 0.641. A partial F-test confirms that the interaction terms are jointly significant (F(80,318,243)=653.72, p<.001), implying that the effect of genre on popularity varies across time: different genres rise and fall in prevalence at different periods. A random forest model corroborates these findings, ranking genre and year well above the audio features in feature importance based on impurity reduction. The most accurate model achieves RMSE=9.64 on a popularity scale of 0-100, with the remaining variance likely attributable to unmeasured factors such as Spotify playlist algorithms and social media exposure. Rolling coefficient analysis further reveals that the effects of audio features are unstable over time: energy's contribution to popularity turned strongly negative after 2010, while danceability's peaked around 2015-2016. This suggests that the streaming era has fundamentally reshaped which acoustic properties drive popularity.
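To make the partial F-test mentioned above concrete, the following is a minimal sketch of how a genre-year interaction can be tested against a main-effects OLS model. It is not the authors' code: the data are synthetic, the three genres, the slopes, the noise level, and the helper names `ols_rss` and `partial_f` are all assumptions chosen for illustration, and only NumPy is used.

```python
import numpy as np


def ols_rss(X, y):
    """Fit OLS by least squares; return residual sum of squares and column rank."""
    beta, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid), int(rank)


def partial_f(X_restricted, X_full, y):
    """Partial F statistic for the extra columns in X_full (nested models)."""
    rss_r, k_r = ols_rss(X_restricted, y)
    rss_f, k_f = ols_rss(X_full, y)
    q = k_f - k_r                # number of added (interaction) parameters
    df_resid = len(y) - k_f      # residual degrees of freedom of the full model
    f_stat = ((rss_r - rss_f) / q) / (rss_f / df_resid)
    return f_stat, q, df_resid


# Synthetic stand-in data (assumption): 3 genres whose popularity
# trends differ over "years since 2000".
rng = np.random.default_rng(0)
n = 3000
genre = rng.integers(0, 3, size=n)
year = rng.uniform(0, 22, size=n)
slopes = np.array([0.0, 1.0, -1.0])          # genre-specific yearly trend
y = 40 + 5 * genre + slopes[genre] * year + rng.normal(0, 5, size=n)

# Dummy-code genre with genre 0 as the baseline.
G = np.eye(3)[genre][:, 1:]

# Restricted model: intercept + genre dummies + year.
X_main = np.column_stack([np.ones(n), G, year])
# Full model: adds one genre-by-year interaction column per non-baseline genre.
X_inter = np.column_stack([X_main, G * year[:, None]])

f_stat, q, df_resid = partial_f(X_main, X_inter, y)
print(f"F({q}, {df_resid}) = {f_stat:.2f}")
```

A large F statistic here indicates that the interaction columns jointly reduce the residual sum of squares by more than chance would explain, which is the same logic the abstract applies to its 80 genre-year interaction terms. A p-value could be obtained from the F distribution with `q` and `df_resid` degrees of freedom.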
