Preparatory attentional templates in prefrontal and sensory cortex encode target-associated information

Curation statements for this article:
  • Curated by eLife


    eLife Assessment

    This valuable study decoded target-associated information in prefrontal and sensory cortex during the preparatory period of a visual search task, suggesting a memory-driven attentional template. The evidence supporting this claim is convincing, based on multivariate pattern analyses of fMRI data. The results will be of interest to psychologists and cognitive neuroscientists.


Abstract

Visual search relies on the ability to use information about the target in working memory to guide attention and make target-match decisions. The representation of target features is referred to as the “attentional” or “target” template and is thought to be encoded within an IFJ-visual cortical network (Baldauf & Desimone, 2014; Bichot et al., 2015b). The contents of the template typically contain veridical target information that is used to modulate sensory processing in preparation for guiding attention during search. However, many behavioral studies have shown that target-associated information is used to guide attention, especially when target discrimination is difficult (Battistoni et al., 2017; de Lange et al., 2018; Peelen et al., 2024; Vo et al., 2019; Yu et al., 2023; Zhou & Geng, 2024). Thus, while target-associated information is known to impact search performance, its presence within the IFJ-visual attentional network during the preparatory period has never been demonstrated. Here, we use fMRI and multivariate pattern analysis to test if attentional guidance by target-associated information is explicitly represented in the preparatory period before search begins, either in conjunction with the target or even in place of it. Participants were first trained on four face-scene category pairings, after which they completed a cued visual search task for the same faces. Each trial began with a face cue, followed by a delay period, and then a search display with two lateralized faces superimposed on scene images. The critical results showed that while face information could be decoded in the fusiform face area (FFA), superior parietal lobule (SPL), and dorsolateral prefrontal cortex (dLPFC) during the cue period, it could not be decoded in any brain region during the delay period.
In contrast, the associated scene was decoded only in ventrolateral prefrontal cortex (vLPFC) during the cue period but, most importantly, in the inferior frontal junction (IFJ) and the parahippocampal place area (PPA) during the delay period. Our results are a novel demonstration that target-associated information from memory can supplant veridical target information in the brain’s “target template” in anticipation of difficult visual search.
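The decoding logic the abstract describes can be illustrated with a minimal, hypothetical sketch: within each ROI, multivoxel activity patterns from each trial period are classified with leave-one-run-out cross-validation, and above-chance accuracy indicates that the information (face identity or associated scene) is present in that region. Everything below is simulated and assumed for illustration only (the voxel and run counts, and the correlation-based nearest-centroid classifier); it is not the authors' actual analysis pipeline.

```python
# Illustrative sketch of ROI-based MVPA decoding with leave-one-run-out
# cross-validation. All data are simulated; dimensions and classifier
# choice are assumptions, not the pipeline used in the study.
import numpy as np

rng = np.random.default_rng(0)
n_runs, n_classes, n_voxels = 8, 4, 50  # assumed dimensions

# Simulate one pattern per class per run: a weak class signal plus noise,
# standing in for trial-period beta estimates within one ROI.
class_means = rng.normal(0, 1, (n_classes, n_voxels))
X = np.array([[m + rng.normal(0, 2, n_voxels) for m in class_means]
              for _ in range(n_runs)])           # (runs, classes, voxels)
y = np.tile(np.arange(n_classes), (n_runs, 1))   # (runs, classes)

def decode_loro(X, y):
    """Leave-one-run-out nearest-centroid decoding accuracy."""
    accs = []
    for test_run in range(X.shape[0]):
        train = np.delete(X, test_run, axis=0)
        # One centroid per class, averaged over the training runs.
        centroids = train.mean(axis=0)           # (classes, voxels)
        # Classify each held-out pattern by highest pattern correlation.
        preds = [np.argmax([np.corrcoef(p, c)[0, 1] for c in centroids])
                 for p in X[test_run]]
        accs.append(np.mean(np.array(preds) == y[test_run]))
    return float(np.mean(accs))

acc = decode_loro(X, y)
print(f"decoding accuracy: {acc:.2f} (chance = {1/n_classes:.2f})")
```

Cross-validating across runs rather than across trials within a run is the standard safeguard in fMRI decoding, since temporal autocorrelation within a run can otherwise inflate accuracy.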

Article activity feed

  1. eLife Assessment

    This valuable study decoded target-associated information in prefrontal and sensory cortex during the preparatory period of a visual search task, suggesting a memory-driven attentional template. The evidence supporting this claim is convincing, based on multivariate pattern analyses of fMRI data. The results will be of interest to psychologists and cognitive neuroscientists.

  2. Reviewer #1 (Public review):

    When you search for something, you need to maintain some representation (a "template") of that target in your mind/brain. Otherwise, how would you know what you were looking for? If your phone is in a shocking pink case, you can guide your attention to pink things based on a target template that includes the attribute 'pink'. That guidance should get you to the phone pretty effectively if it is in view. Most real-world searches are more complicated. If you are looking for the toaster, you will make use of your knowledge of where toasters can be. Thus, if you are asked to find a toaster, you might first activate a template of a kitchen or a kitchen counter. You might worry about pulling up the toaster template only after you are reasonably sure you have restricted your attention to a sensible part of the scene.

    Zhou and Geng are looking for evidence of this early stage of guidance by information about the surrounding scene in a search task. They train Os to associate four faces with four places. Then, with Os in the scanner, they show one face - the target for a subsequent search. After an 8 sec delay, they show a search display where the face is placed on the associated scene 75% of the time. Thus, attending to the associated scene is a good idea. The questions of interest are "When can the experimenters decode which face Os saw from fMRI recording?", "When can the experimenters decode the associated scene?", and "Where in the brain can the experimenters see evidence of this decoding?" The answer is that the face but not the scene can be read out during the face's initial presentation. The key finding is that the scene can be read out (imperfectly but above chance) during the subsequent delay when Os are looking at just a fixation point. Apparently, seeing the face conjures up the scene in the mind's eye.

    This is a solid and believable result. The only issue, for me, is whether it is telling us anything specifically about search. Suppose you trained Os on the face-scene pairing but never did anything connected to the search. If you presented the face, would you not see evidence of recall of the associated scene? Maybe you would see the activation of the scene in different areas and you could identify some areas as search specific. I don't think anything like that was discussed here.

    You might also expect this result to be asymmetric. The idea is that the big scene gives the search information about the little face. The face should activate the larger useful scene more than the scene should activate the more incidental face, if the task was reversed. That might be true if the finding is related to a search where the scene context is presumed to be the useful attention guiding stimulus. You might not expect an asymmetry if Os were just learning an association.

    It is clear in this study that the face and the scene have been associated and that this can be seen in the fMRI data. It is also clear that a valid scene background speeds the behavioral response in the search task. The linkage between these two results is not entirely clear but perhaps future research will shed more light.

    It is also possible that I missed the clear evidence of the search-specific nature of the activation by the scene during the delay period. If so, I apologize and suggest that the point be underlined for readers like me.

  3. Reviewer #2 (Public review):

    Summary:

    This work is one of the best instances of a well-controlled experiment and theoretically impactful findings within the literature on templates guiding attentional selection. I am a fan of the work that comes out of this lab and this particular manuscript is an excellent example as to why that is the case. Here, the authors use fMRI (employing MVPA) to test whether during the preparatory search period, a search template is invoked within the corresponding sensory regions, in the absence of physical stimulation. By associating faces with scenes, a strong association was created between two types of stimuli that recruit very specific neural processing regions - FFA for faces and PPA for scenes. The critical results showed that scene information that was associated with a particular cue could be decoded from PPA during the delay period. This result strongly supports the invoking of a very specific attentional template.

    Strengths:

    There is so much to be impressed with in this report. The writing of the manuscript is incredibly clear. The experimental design is clever and innovative. The analysis is sophisticated and also innovative. The results are solid and convincing.

    Weaknesses:

    I only have a few weaknesses to point out.
    This point is not so much of a weakness, but a further test of the hypothesis put forward by the authors. The delay period was long - 8 seconds. It would be interesting to split the delay period into the first 4 seconds and the last 4 seconds and run the same decoding analyses. The hypothesis here is that semantic associations take time to evolve, and it would be great to show that decoding gets stronger in the second delay period as opposed to the period right after the cue. I don't think this is necessary for publication, but I think it would be a stronger test of the template hypothesis.
    Typo in the abstract: "curing" vs "during."
    It is hard to know what to do with significant results in ROIs that are not motivated by specific hypotheses. However, for Figure 3, what are the explanations for ROIs that show significant differences above and beyond the direct hypotheses set out by the authors?

  4. Reviewer #3 (Public review):

    The manuscript contains a carefully designed fMRI study, using multivariate pattern analysis (MVPA) to investigate which high-level association cortices contain target-related information to guide visual search. A special focus is hereby on so-called 'target-associated' information, which has previously been shown to help in guiding attention during visual search. For this purpose the authors trained their participants and made them learn specific target associations, in order to then test which brain regions may contain neural representations of those learnt associations. They found that at least some of the associations tested were encoded in prefrontal cortex during the cue and delay period.

    The manuscript is very carefully prepared. As far as I can see, the statistical analyses are all sound and the results integrate well with previous findings.

    I have no strong objections against the presented results and their interpretation.

  5. Author response:

    Public Reviews:

    Reviewer #1 (Public review):

    When you search for something, you need to maintain some representation (a "template") of that target in your mind/brain. Otherwise, how would you know what you were looking for? If your phone is in a shocking pink case, you can guide your attention to pink things based on a target template that includes the attribute 'pink'. That guidance should get you to the phone pretty effectively if it is in view. Most real-world searches are more complicated. If you are looking for the toaster, you will make use of your knowledge of where toasters can be. Thus, if you are asked to find a toaster, you might first activate a template of a kitchen or a kitchen counter. You might worry about pulling up the toaster template only after you are reasonably sure you have restricted your attention to a sensible part of the scene.

    Zhou and Geng are looking for evidence of this early stage of guidance by information about the surrounding scene in a search task. They train Os to associate four faces with four places. Then, with Os in the scanner, they show one face - the target for a subsequent search. After an 8 sec delay, they show a search display where the face is placed on the associated scene 75% of the time. Thus, attending to the associated scene is a good idea. The questions of interest are "When can the experimenters decode which face Os saw from fMRI recording?", "When can the experimenters decode the associated scene?", and "Where in the brain can the experimenters see evidence of this decoding?" The answer is that the face but not the scene can be read out during the face's initial presentation. The key finding is that the scene can be read out (imperfectly but above chance) during the subsequent delay when Os are looking at just a fixation point. Apparently, seeing the face conjures up the scene in the mind's eye.

    This is a solid and believable result. The only issue, for me, is whether it is telling us anything specifically about search. Suppose you trained Os on the face-scene pairing but never did anything connected to the search. If you presented the face, would you not see evidence of recall of the associated scene? Maybe you would see the activation of the scene in different areas and you could identify some areas as search specific. I don't think anything like that was discussed here.

    You might also expect this result to be asymmetric. The idea is that the big scene gives the search information about the little face. The face should activate the larger useful scene more than the scene should activate the more incidental face, if the task was reversed. That might be true if the finding is related to a search where the scene context is presumed to be the useful attention guiding stimulus. You might not expect an asymmetry if Os were just learning an association.

    It is clear in this study that the face and the scene have been associated and that this can be seen in the fMRI data. It is also clear that a valid scene background speeds the behavioral response in the search task. The linkage between these two results is not entirely clear but perhaps future research will shed more light.

    It is also possible that I missed the clear evidence of the search-specific nature of the activation by the scene during the delay period. If so, I apologize and suggest that the point be underlined for readers like me.

    We will respond to this question by acknowledging that the reviewer is right that the delay period activation of the scene is not necessarily search-specific. We will then discuss how this possibility affects the interpretation of our results and what kind of studies would need to be conducted to fully establish a causal link between delay period activity and visual search performance. We will also discuss the literature on cued attention and situate our work within the context of these other studies that have used similar task paradigms to infer attentional processes. Finally, we will discuss the interpretation of delay period activity in PPA and IFJ.

    Reviewer #2 (Public review):

    Summary:

    This work is one of the best instances of a well-controlled experiment and theoretically impactful findings within the literature on templates guiding attentional selection. I am a fan of the work that comes out of this lab and this particular manuscript is an excellent example as to why that is the case. Here, the authors use fMRI (employing MVPA) to test whether during the preparatory search period, a search template is invoked within the corresponding sensory regions, in the absence of physical stimulation. By associating faces with scenes, a strong association was created between two types of stimuli that recruit very specific neural processing regions - FFA for faces and PPA for scenes. The critical results showed that scene information that was associated with a particular cue could be decoded from PPA during the delay period. This result strongly supports the invoking of a very specific attentional template.

    Strengths:

    There is so much to be impressed with in this report. The writing of the manuscript is incredibly clear. The experimental design is clever and innovative. The analysis is sophisticated and also innovative. The results are solid and convincing.

    Weaknesses:

    I only have a few weaknesses to point out.

    This point is not so much of a weakness, but a further test of the hypothesis put forward by the authors. The delay period was long - 8 seconds. It would be interesting to split the delay period into the first 4 seconds and the last 4 seconds and run the same decoding analyses. The hypothesis here is that semantic associations take time to evolve, and it would be great to show that decoding gets stronger in the second delay period as opposed to the period right after the cue. I don't think this is necessary for publication, but I think it would be a stronger test of the template hypothesis.

    We will conduct the suggested analysis. Depending on the outcome, we will include it in supplemental materials or the main text.

    Typo in the abstract: "curing" vs "during."

    We will fix this.

    It is hard to know what to do with significant results in ROIs that are not motivated by specific hypotheses. However, for Figure 3, what are the explanations for ROIs that show significant differences above and beyond the direct hypotheses set out by the authors?

    We will address how each of the ROIs was selected, based on the use of a priori networks as masks with ROIs as sub-parcels. We will explain why specific ROIs were associated with the strongest hypotheses but how the entire networks are relevant and related to existing literatures on attentional control and working memory. This content will be included in the introduction and discussion sections.

    Reviewer #3 (Public review):

    The manuscript contains a carefully designed fMRI study, using multivariate pattern analysis (MVPA) to investigate which high-level association cortices contain target-related information to guide visual search. A special focus is hereby on so-called 'target-associated' information, which has previously been shown to help in guiding attention during visual search. For this purpose the authors trained their participants and made them learn specific target associations, in order to then test which brain regions may contain neural representations of those learnt associations. They found that at least some of the associations tested were encoded in prefrontal cortex during the cue and delay period.

    The manuscript is very carefully prepared. As far as I can see, the statistical analyses are all sound and the results integrate well with previous findings.

    I have no strong objections against the presented results and their interpretation.

    Thank you.