The Interplay Between Multisensory Processing and Attention in Working Memory: Behavioral and Neural Indices of Audio-Visual Object Storage

Abstract

Although real-life events are multisensory, how audio-visual objects are stored in working memory is an open question. At a perceptual level, evidence shows that both top-down and bottom-up attentional processes can play a role in multisensory interactions. To understand how attention and multisensory processes interact in working memory, we designed an audio-visual delayed match-to-sample task in which participants were presented with one or two audio-visual memory items, followed by an audio-visual probe. In three different blocks, participants were instructed to either (a) attend to the auditory features, (b) attend to the visual features, or (c) attend to both auditory and visual features. Participants then indicated whether the task-relevant feature(s) of the probe matched one of the features or objects held in working memory. Behavioral results showed interference from task-irrelevant features, suggesting bottom-up integration of audio-visual features and their automatic encoding into working memory, irrespective of their task relevance. Yet, ERP analyses revealed no evidence for active maintenance of these task-irrelevant features, although they clearly demanded greater attentional resources during recall. Notably, alpha oscillatory activity revealed that linking information between the auditory and visual modalities placed greater attentional demands on retrieval. Overall, these results offer critical insights into how, and at which processing stage, multisensory interactions occur in working memory.

Public Significance Statement

Current working memory research is dominated by investigations of the visual domain. Yet, understanding how more complex representations, e.g., those based on multisensory inputs, are formed, stored, and recalled is crucial for a more realistic understanding of working memory function. The present study shows that when audio-visual inputs are presented at the same time and location, features from both modalities are combined into a working memory representation, irrespective of their task relevance. During maintenance, alpha oscillations serve to flexibly gate information flow in the cortex, allowing attentional resources to be redistributed between modalities depending on their task relevance. Notably, when the task instructions explicitly involve storing audio-visual objects as a whole, recall requires more attentional resources.