MixSense: AI Optimization for Contiguous Music Segmentation at Scale
Abstract
This paper casts long-form music stream segmentation as an AI optimization problem over a self-similarity manifold. It unifies evolutionary search for parameter discovery with globally optimal dynamic-programming inference to recover contiguous boundaries consistent with either a track-count prior or a data-driven estimate. Starting from Fourier-derived spectral embeddings, the method constructs cosine self-similarity and time-aware cost surfaces that encode symmetry, contiguity, and evolutionary stability, then solves for the minimum-cost partition without heuristic change-point thresholds. The pipeline is learning-free yet intelligent: it relies on search and global reasoning instead of supervised labels. It is stress-tested on a hand-annotated corpus exceeding 640 hours, with a human-variance analysis to contextualize error and tolerance around true boundaries. Results show robust, scalable segmentation under both known and estimated segment counts, highlighting AI-style optimization as a powerful alternative to local novelty detectors and ad-hoc rules in music structure recovery.
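As an illustration of the inference step described in the abstract, the following is a minimal sketch of minimum-cost contiguous partitioning over a cosine self-similarity matrix with a known segment count. It is not the paper's implementation: the function names (`cosine_self_similarity`, `segment_costs`, `dp_segment`), the toy feature matrix, and the segment cost (summed within-segment dissimilarity) are assumptions made for the example. The time-aware cost surfaces and the evolutionary parameter search are omitted.

```python
import numpy as np

def cosine_self_similarity(X):
    """X has shape (n_frames, n_features); returns the (n, n) cosine similarity matrix."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.maximum(norms, 1e-12)  # guard against all-zero frames
    return Xn @ Xn.T

def segment_costs(S):
    """cost[i, j] = sum of (1 - S) over the square block of frames i..j-1.

    Assumed stand-in cost: a homogeneous segment has low internal dissimilarity.
    Computed with 2D prefix sums so each entry is O(1).
    """
    n = S.shape[0]
    D = 1.0 - S
    P = np.zeros((n + 1, n + 1))
    P[1:, 1:] = D.cumsum(axis=0).cumsum(axis=1)
    cost = np.full((n + 1, n + 1), np.inf)
    for i in range(n):
        for j in range(i + 1, n + 1):
            cost[i, j] = P[j, j] - P[i, j] - P[j, i] + P[i, i]
    return cost

def dp_segment(cost, k):
    """Globally optimal split of frames 0..n-1 into k contiguous segments.

    Returns the k - 1 interior boundaries (start index of segments 2..k).
    """
    n = cost.shape[0] - 1
    best = np.full((k + 1, n + 1), np.inf)   # best[s, j]: min cost of s segments covering 0..j-1
    back = np.zeros((k + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            # the last segment is [i, j); the first s - 1 segments cover [0, i)
            cands = best[s - 1, s - 1:j] + cost[s - 1:j, j]
            a = int(np.argmin(cands))
            best[s, j] = cands[a]
            back[s, j] = a + (s - 1)
    # backtrack from the end of the stream to recover segment starts
    starts, j = [], n
    for s in range(k, 0, -1):
        j = int(back[s, j])
        starts.append(j)
    return starts[::-1][1:]  # drop the leading 0

# Toy stream: three "tracks" of 4, 5, and 3 frames with orthogonal features.
X = np.repeat(np.eye(3), [4, 5, 3], axis=0)
boundaries = dp_segment(segment_costs(cosine_self_similarity(X)), k=3)
print(boundaries)  # [4, 9]
```

Because the dynamic program evaluates every admissible partition implicitly, no change-point threshold is needed; the segment count `k` plays the role of the track-count prior. This sketch runs in O(k·n²) time and O(n²) memory, so a long stream would need coarser frames or a bounded segment length.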