Where is the melody? Spontaneous attention orchestrates melody formation during polyphonic music listening

Abstract

Humans seamlessly process multi-voice music into a coherent perceptual whole, yet the neural strategies supporting this experience remain unclear. One fundamental component of this process is the formation of melody, a core structural element of music. Previous work on monophonic listening has provided strong evidence for the neurophysiological basis of melody processing, for example identifying predictive processing as a foundational mechanism underlying melody encoding. However, considerable uncertainty remains about how melodies are formed during polyphonic music listening, as existing theories (e.g., divided attention, figure–ground model, stream integration) fail to unify the full range of empirical findings. Here, we combined behavioral measures with non-invasive electroencephalography (EEG) to probe spontaneous attentional bias and melodic expectation while participants listened to two-voice classical excerpts. By removing explicit task instructions, our uninstructed listening paradigm eliminated a major experimental constraint and created a more ecologically valid setting. We found that attention bias was significantly influenced by both the high-voice superiority effect and intrinsic melodic statistics. We then employed transformer-based models to generate next-note expectation profiles and to test competing theories of polyphonic perception. Drawing on our findings, we propose a weighted-integration framework in which attentional bias dynamically calibrates the degree to which the competing streams are integrated. This framework reconciles previously divergent accounts by showing that, even under free-listening conditions, melodies emerge through an attention-guided statistical integration mechanism.
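
The weighted-integration idea lends itself to a simple formalization. Below is a minimal sketch, assuming the expectation profiles are probability distributions over candidate next notes and that attentional bias enters as a scalar mixing weight; the function and variable names (weighted_integration, p_attended, p_integrated, attention_bias) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def weighted_integration(p_attended: np.ndarray,
                         p_integrated: np.ndarray,
                         attention_bias: float) -> np.ndarray:
    """Blend two next-note expectation profiles as a function of attention bias.

    p_attended:     distribution for the attended voice alone, e.g. from a
                    monophonic music transformer (figure-ground limit).
    p_integrated:   distribution for the merged two-voice stream, e.g. from a
                    polyphonic model (stream-integration limit).
    attention_bias: scalar in [0, 1]; 1 means attention fully locked on one voice.
    """
    w = np.clip(attention_bias, 0.0, 1.0)
    p = w * p_attended + (1.0 - w) * p_integrated
    return p / p.sum()  # renormalize so the blend is a proper distribution

# Toy example over 12 pitch-class bins: a strong bias (0.9) approximates the
# figure-ground account, a weak bias (0.2) approximates stream integration.
rng = np.random.default_rng(0)
p_a = rng.dirichlet(np.ones(12))
p_i = rng.dirichlet(np.ones(12))
print(weighted_integration(p_a, p_i, 0.9))
print(weighted_integration(p_a, p_i, 0.2))
```

A convex mixture is only one plausible form of the calibration; the key property it captures is that the two competing accounts fall out as limiting cases of a single attention-weighted mechanism.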

Highlights

  • EEG can be used to track spontaneous attention during uninstructed listening to polyphonic music.

  • Behavioral and neural data indicate that spontaneous attention is influenced by both high-voice superiority and melodic contour.

  • Attention bias impacts the neural encoding of the polyphonic streams, with the strongest effects occurring within 200 ms after note onset.

  • Strong attention bias leads to melodic expectations consistent with a monophonic music transformer, in line with a figure–ground model; weak attention bias leads to melodic expectations consistent with a stream integration model.

  • We propose a bidirectional influence between attention and prediction mechanisms, with horizontal statistics shaping attention (i.e., salience) and attention shaping melody extraction.
