Advancing Audio-Visual Attention Analysis in 360° Videos Through Real-Time Visualization

Abstract

This study presents an open-source, real-time visualization tool designed to analyse audio-visual attention in 360° video environments under varying sound conditions. Traditional methods, such as static saliency maps and post-hoc analyses, often fail to capture the dynamic and participant-specific nature of attention shifts in immersive environments. To address these limitations, the proposed tool dynamically integrates head pose fixation maps with sound intensity heatmaps, enabling real-time tracking of attention patterns across different audio conditions, including No Sound (NS), Stereo (ST), First-Order Ambisonics (FO), and Third-Order Ambisonics (HO). Attention shifts across sound conditions were quantified using the Jaccard Index, which measures the overlap of the top 5% most-viewed regions across participants. The results demonstrate that increasing auditory complexity—from silence to spatial audio—significantly broadens visual exploration. First-Order Ambisonics (FO) led to the most dispersed attention patterns, with a 62.4% reduction in attention overlap indoors and 58.8% outdoors compared to NS. Third-Order Ambisonics (HO) resulted in a 61.2% reduction indoors and 52.0% outdoors, suggesting that while FO encourages broader exploration, HO facilitates a more focused distribution of attention. Notably, HO conditions led to a 3.2% increase indoors and a 16.6% increase outdoors in attention overlap compared to FO, indicating that higher-order spatial audio helps guide attention more precisely in complex environments. Unlike conventional approaches, which rely on static analyses, this tool provides real-time, participant-specific insights into attention shifts, offering a dynamic perspective on how spatial audio influences exploration. These capabilities empower VR content creators and researchers with actionable insights, optimizing spatial audio design and enhancing user engagement. 
By offering a robust and adaptable framework, this study advances the understanding of audio-visual interactions in immersive media environments.
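
The overlap measure described in the abstract can be illustrated with a minimal sketch. It assumes each sound condition is summarised as a 2D grid of fixation counts, keeps the highest-valued fraction of cells (5% by default, matching the abstract), and computes the Jaccard Index between the two resulting sets of cells. The function names, the grid representation, the toy data, and the tie-breaking rule are hypothetical and chosen for illustration; the abstract does not specify how the tool aggregates maps across participants.

```python
def top_fraction_cells(fixation_map, fraction=0.05):
    """Return the set of (row, col) cells with the highest fixation values.

    `fixation_map` is a list of rows of numbers (an assumed representation).
    """
    cells = [(value, r, c)
             for r, row in enumerate(fixation_map)
             for c, value in enumerate(row)]
    # Keep at least one cell so very small grids still yield a region.
    keep = max(1, round(len(cells) * fraction))
    # Highest values first; ties are broken by position so results are deterministic.
    cells.sort(key=lambda t: (-t[0], t[1], t[2]))
    return {(r, c) for _, r, c in cells[:keep]}


def jaccard_index(a, b):
    """Jaccard Index |A ∩ B| / |A ∪ B|; defined here as 1.0 when both sets are empty."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


# Toy 4x4 grids standing in for two conditions (made-up values, not study data).
map_a = [[9, 8, 1, 0],
         [7, 6, 1, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
map_b = [[9, 1, 0, 0],
         [8, 1, 0, 0],
         [0, 0, 7, 6],
         [0, 0, 0, 0]]

# A 25% threshold is used only because the toy grids are tiny; the abstract uses 5%.
top_a = top_fraction_cells(map_a, fraction=0.25)
top_b = top_fraction_cells(map_b, fraction=0.25)
overlap = jaccard_index(top_a, top_b)
print(f"{overlap:.3f}")  # prints 0.333
```

In this toy case the two top regions share two of six distinct cells, so the index is 1/3. A lower value between two conditions corresponds to the more dispersed attention patterns the abstract reports.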
