Causal neural mechanisms of context-based object recognition

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    This study will be of interest to scientists involved in high-level vision. The data provide a compelling demonstration of the causal role of three key visual areas in context-based object recognition. The key claims of the manuscript are supported by the data, and are strengthened by the pre-registration of each of the three experiments.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #3 agreed to share their names with the authors.)

Abstract

Objects can be recognized based on their intrinsic features, including shape, color, and texture. In daily life, however, such features are often not clearly visible, for example when objects appear in the periphery, in clutter, or at a distance. Interestingly, object recognition can still be highly accurate under these conditions when objects are seen within their typical scene context. What are the neural mechanisms of context-based object recognition? According to parallel processing accounts, context-based object recognition is supported by the parallel processing of object and scene information in separate pathways. Output of these pathways is then combined in downstream regions, leading to contextual benefits in object recognition. Alternatively, according to feedback accounts, context-based object recognition is supported by (direct or indirect) feedback from scene-selective to object-selective regions. Here, in three pre-registered transcranial magnetic stimulation (TMS) experiments, we tested a key prediction of the feedback hypothesis: that scene-selective cortex causally and selectively supports context-based object recognition before object-selective cortex does. Early visual cortex (EVC), object-selective lateral occipital cortex (LOC), and scene-selective occipital place area (OPA) were stimulated at three time points relative to stimulus onset while participants categorized degraded objects in scenes and intact objects in isolation, in different trials. Results confirmed our predictions: relative to isolated object recognition, context-based object recognition was selectively and causally supported by OPA at 160–200 ms after onset, followed by LOC at 260–300 ms after onset. These results indicate that context-based expectations facilitate object recognition by disambiguating object representations in the visual cortex.

Article activity feed

  1. Author Response:

    **Reviewer #1 (Public Review):**

    [...] I have two overarching concerns regarding the Results as reported:

    1. In their preregistration the authors make specific hypotheses regarding TMS effects on the scene-only condition and stated that their plan was to include a 3-level factor of stimulus type (object-related, context-related, scene-only) in their ANOVAs. However the scene-only condition has not been included in the statistics. A justification for this alteration should be provided or the statistics should be run as originally planned.

    The pre-registered statistics for the scene-alone condition in the OPA experiment are now included in the manuscript (p.9-10). The relevant figure (Figure 3) has also been updated such that the scene-alone condition results are now in the main text rather than the Supplement. Results confirmed our predictions, showing a reduction of scene-alone performance when OPA was stimulated 160-200 ms after stimulus onset. Note that the scene-alone condition was only included in the pre-registration of the OPA experiment, which is why we had not reported the corresponding statistics previously. (This condition was not relevant for the LOC and EVC experiments.)

    2. All participants were screened and only included in the study if TMS stimulation of the relevant area produced a reduction in object recognition. More detail on the specific procedures used should be provided. The authors should clarify which SOAs were used as part of the screening and how many participants were excluded based on this screening. The use of this screening procedure should be flagged in the main text so that the reader can interpret the results accordingly.

    We now introduce the screening procedure in the main text and point the reader to a recent publication that documents the methods and results of this experiment (Wischnewski & Peelen, J Neurosci 2021). The screening experiment followed the design of Dilks et al. (J Neurosci 2013), stimulating OPA and LOC using 5 TMS pulses at a rate of 10 Hz (i.e., no SOAs were used). No participants were excluded – all participants were assigned to one of the three conditions (OPA, LOC, EVC). This is now more clearly explained in the manuscript.

    3. Based on the fact that TMS to LOC and EVC disrupts performance >150 ms after stimulus onset, the authors conclude that this reflects the role of feedback from scene-selective areas. Can the authors really exclude alternative possibilities? Would the same results not be expected if areas like LOC and EVC exhibit recurrent activity, perhaps reflecting continued processing of a representation of the stimulus held in iconic memory? Similarly, the authors conclude that the longer latency of the TMS effects on LOC in the context-based vs object-based condition reflects the role of feedback. But the object stimulus is degraded in the context-based condition, so could it not be that LOC remains active over longer periods of time to support a more difficult discrimination?

    We have added a paragraph to the Discussion section in which we discuss the alternative interpretation of local recurrence (p.13-14).
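As a worked illustration of the screening protocol mentioned in the response to point 2 (5 TMS pulses at a rate of 10 Hz), the following minimal Python sketch computes the pulse times of such a train. The function name `pulse_times_ms` and the choice to place the first pulse at 0 ms are assumptions made for illustration only; the text above does not state when the train began relative to the stimulus.

```python
def pulse_times_ms(n_pulses: int, rate_hz: float, first_pulse_ms: float = 0.0) -> list[float]:
    """Return the pulse times (in ms) of a regularly spaced TMS train.

    first_pulse_ms is a hypothetical reference point, not a value
    taken from the manuscript.
    """
    if n_pulses < 1 or rate_hz <= 0:
        raise ValueError("n_pulses must be >= 1 and rate_hz must be > 0")
    interval_ms = 1000.0 / rate_hz  # 10 Hz -> one pulse every 100 ms
    return [first_pulse_ms + i * interval_ms for i in range(n_pulses)]


# Screening train as described: 5 pulses at 10 Hz.
screening_train = pulse_times_ms(n_pulses=5, rate_hz=10)
print(screening_train)  # [0.0, 100.0, 200.0, 300.0, 400.0]
```

At 10 Hz the pulses are 100 ms apart, so five pulses span 400 ms from first to last. This is consistent with the authors' remark that no specific SOAs were used in the screening experiment.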

  2. Evaluation Summary:

    This study will be of interest to scientists involved in high-level vision. The data provide a compelling demonstration of the causal role of three key visual areas in context-based object recognition. The key claims of the manuscript are supported by the data, and are strengthened by the pre-registration of each of the three experiments.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #3 agreed to share their names with the authors.)

  3. Reviewer #1 (Public Review):

    This paper reports a set of pre-registered experiments devised to examine the role that scene-selective visual areas play in supporting object identification. Participants performed an object discrimination task under two conditions: one in which the object stimuli were viewed in isolation and another in which the stimuli were degraded and presented in a congruent scene. Separate groups of participants underwent TMS applied to OPA (a scene-selective area), LOC (an object-selective area) and EVC (early visual cortex) at three different stimulus onset asynchronies. OPA stimulation had no discernible impact on performance in the object-only condition but led to reduced discrimination in the context-based condition if delivered 160-200 ms post-stimulus. LOC stimulation affected performance in both conditions but TMS disruption was evident even at the longest delay. EVC disruption also affected performance in both conditions but with different timecourses.

    I have two overarching concerns regarding the Results as reported:

    1. In their preregistration the authors make specific hypotheses regarding TMS effects on the scene-only condition and stated that their plan was to include a 3-level factor of stimulus type (object-related, context-related, scene-only) in their ANOVAs. However the scene-only condition has not been included in the statistics. A justification for this alteration should be provided or the statistics should be run as originally planned.

    2. All participants were screened and only included in the study if TMS stimulation of the relevant area produced a reduction in object recognition. More detail on the specific procedures used should be provided. The authors should clarify which SOAs were used as part of the screening and how many participants were excluded based on this screening. The use of this screening procedure should be flagged in the main text so that the reader can interpret the results accordingly.

    3. Based on the fact that TMS to LOC and EVC disrupts performance >150 ms after stimulus onset, the authors conclude that this reflects the role of feedback from scene-selective areas. Can the authors really exclude alternative possibilities? Would the same results not be expected if areas like LOC and EVC exhibit recurrent activity, perhaps reflecting continued processing of a representation of the stimulus held in iconic memory? Similarly, the authors conclude that the longer latency of the TMS effects on LOC in the context-based vs object-based condition reflects the role of feedback. But the object stimulus is degraded in the context-based condition, so could it not be that LOC remains active over longer periods of time to support a more difficult discrimination?

  4. Reviewer #2 (Public Review):

    The authors investigate the causal role of the EVC, LOC and OPA in facilitating context-based object recognition, by specifically and systematically targeting each of these regions across a set of three pre-registered experiments. The results indicate that context-based object recognition is mediated first by the OPA, and later by the LOC. The authors conclude that context-based expectations facilitate object recognition by disambiguating object representations in visual cortex.

    Overall, this paper makes a strong contribution to the field by advancing our understanding of the neural mechanisms underlying object recognition. The study seemed to be adequately powered (N=24 per experiment) based on a medium effect size and behavioural data from Brandman & Peelen (2017). A significant strength was that the hypotheses surrounding the three experiments were pre-registered. The design was overall solid. The manuscript was overall well-written and easy to follow, and the introduction provided a good survey of the key literature. The approach of selecting participants in whom stimulation of the EVC, LOC and OPA led to performance change in a preliminary behavioural task allowed the authors to ensure that stimulation was specifically targeted to the corresponding brain sites.

  5. Reviewer #3 (Public Review):

    Wischnewski and Peelen use state-of-the-art transcranial stimulation techniques to study the causal role of contextual feedback on object recognition.

    Importantly, whereas most studies of object recognition have solely presented high contrast images of objects in isolation, Wischnewski and Peelen crucially also presented degraded objects in scene contexts - arguably a more naturalistic setting.

    They find that late stimulation of scene-selective areas hinders recognition of degraded images of objects presented in a scene context, but not of objects presented in isolation, demonstrating a causal role of scene-selective cortex in guiding object recognition. The fact that later stimulation of object-selective cortex also interferes with object recognition only for objects-in-context images suggests that scene-selective cortex provides feedback to object-selective cortex, in line with previous fMRI and MEG work (e.g. Brandman & Peelen 2017).

    The effects of transcranial stimulation on object recognition are very strong and clear cut, which is especially impressive for a TMS study. Furthermore, the results are presented side-by-side with the (pre-registered) hypotheses, which is a delight.

    It is also clearly shown and acknowledged when results diverge from the predictions, as is the case for early visual cortex (EVC) stimulation. The involvement of EVC in (later stages of) object recognition remains somewhat of a mystery.

    This paper is likely to have a big impact on how seriously the field takes the role of contextual feedback in object recognition, a field that has been dominated by a focus on fast feedforward feature extraction.