Equilibrium Propagation Discovers Top-Down Feedback for Audio-Visual Binding in Continuous Wave Fields


Abstract

Cross-modal binding — the fusion of simultaneous sensory streams into a unified percept — has not been achieved in physical neural networks without backpropagation. Whether top-down feedback between hierarchical field layers can emerge from local learning rules alone remains untested. We extend a Landau-Ginzburg wave field architecture trained by Equilibrium Propagation to a two-layer system: primary audio and visual fields drive a binding field that sends top-down feedback to both primaries through coupling coefficients initialized to zero. Trained on the GRID audiovisual corpus, the coupling coefficients grow from 0.0 to 0.051 over ten epochs — a result absent in the unimodal case — confirming that Equilibrium Propagation discovers top-down feedback when cross-modal binding is required. The binding field outperforms late fusion; replacing phase-sensitive measurement with amplitude-only readout costs 9.2 percentage points, exceeding the analogous unimodal penalty. When presented with conflicting audiovisual inputs, the system produces fusion responses in 83% of trials, stable under contrastive readout training and therefore reflecting field dynamics rather than readout bias. Symmetric noise degradation — 33.3 versus 33.7 percentage points for audio and video respectively — confirms genuine integration.
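The claim that a coupling coefficient initialized to zero can grow under Equilibrium Propagation can be illustrated with a deliberately tiny example. The sketch below is not the authors' Landau-Ginzburg wave field model. It assumes a single scalar state `s`, a hypothetical energy `E = 0.5*s**2 - theta*x*s`, and a squared-error cost `C = 0.5*(s - y)**2`; the names `relax`, `ep_train`, `theta`, `beta`, and all numeric values are illustrative assumptions. It shows only the generic two-phase rule: settle once freely, settle once with the output weakly nudged toward the target, and update the coupling from the difference in `dE/dtheta` between the two equilibria.

```python
def relax(theta, x, y, beta, steps=200, dt=0.1):
    """Settle the state s by gradient descent on the total energy E + beta*C.

    Assumed toy energy: E = 0.5*s**2 - theta*x*s
    Assumed toy cost:   C = 0.5*(s - y)**2
    """
    s = 0.0
    for _ in range(steps):
        dE_ds = s - theta * x
        dC_ds = s - y
        s -= dt * (dE_ds + beta * dC_ds)
    return s


def ep_train(x=1.0, y=1.0, beta=0.5, lr=0.1, epochs=200):
    """Train the coupling theta with a two-phase Equilibrium Propagation update."""
    theta = 0.0  # coupling starts at zero, as in the abstract's setup
    history = [theta]
    for _ in range(epochs):
        s_free = relax(theta, x, y, beta=0.0)     # free phase
        s_nudged = relax(theta, x, y, beta=beta)  # weakly nudged phase
        # dE/dtheta = -x*s, evaluated at each equilibrium; their scaled
        # difference estimates the gradient of the cost w.r.t. theta.
        grad = (-(x * s_nudged) - (-(x * s_free))) / beta
        theta -= lr * grad
        history.append(theta)
    return history


history = ep_train()
print(f"coupling after first update: {history[1]:.4f}")
print(f"coupling after training:     {history[-1]:.4f}")
```

In this toy the update uses only quantities available at the coupling itself (the input and the state at two equilibria), which is the sense in which the rule is local. Whether the same mechanism accounts for the reported growth to 0.051 depends on the field dynamics described in the full article.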
