A dynamic spatiotemporal normalization model for continuous vision
Abstract
How does the visual system process dynamic inputs? Perception and neural activity are shaped by the spatial and temporal context of sensory input, which has been modeled by divisive normalization over space or time. However, theoretical work has largely treated normalization separately within these dimensions and has not explained how future stimuli can suppress past ones. Here we introduce a dynamic spatiotemporal normalization model (D-STAN) with a unified spatiotemporal receptive field structure that implements normalization across both space and time, and we ask whether this model captures the bidirectional effects of temporal context on neural responses and behavior. D-STAN implements temporal normalization through excitatory and suppressive drives that depend on the recent history of stimulus input, controlled by separate temporal windows. We found that biphasic temporal receptive fields emerged from this normalization computation, consistent with empirical observations. The model also reproduced several neural response properties, including surround suppression, nonlinear response dynamics, subadditivity, response adaptation, and backward masking. Further, spatiotemporal normalization captured bidirectional temporal suppression that depended on stimulus contrast, consistent with human behavior. Thus, D-STAN captured a wide range of neural and behavioral effects, demonstrating that a unified spatiotemporal normalization computation could underlie dynamic stimulus processing and perception.
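The core temporal computation described above can be illustrated with a minimal sketch: excitatory and suppressive drives are each running averages of the stimulus, weighted by separate causal temporal windows, and the response is their ratio. The exponential windows, time constants, and semisaturation constant `sigma` below are illustrative assumptions, not the model's fitted parameters.

```python
import numpy as np

def exp_window(tau, length):
    # Causal exponential temporal window (an illustrative choice;
    # D-STAN's actual temporal filters may differ).
    t = np.arange(length)
    w = np.exp(-t / tau)
    return w / w.sum()

def temporal_normalization(stimulus, tau_e=2.0, tau_s=10.0, sigma=0.1):
    """Divisive normalization over time: the excitatory drive pools
    recent input with a short window, the suppressive drive with a
    longer one, and the response divides one by the other."""
    n = len(stimulus)
    excit = np.convolve(stimulus, exp_window(tau_e, n))[:n]
    suppr = np.convolve(stimulus, exp_window(tau_s, n))[:n]
    return excit / (sigma + suppr)

# A brief high-contrast pulse: suppression builds more slowly than
# excitation, so the response shows an onset transient, adapts while
# the stimulus is on, and recovers slowly after offset because the
# lingering suppressive drive keeps dividing the excitatory drive.
stim = np.zeros(60)
stim[10:20] = 1.0
r = temporal_normalization(stim)
```

Because the suppressive drive integrates over a longer history than the excitatory drive, past input continues to divide the response after stimulus offset, one simple way in which normalization with separate temporal windows produces adaptation-like dynamics.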