Article activity feed

  1. Author Response:

    Reviewer #1:

    The authors found a switch between "retrospective", sensory recruitment-like representations in visual regions when a motor response could not be planned in advance, and "prospective" action-like representations in motor regions when a specific button response could be anticipated. The use of classifiers trained on multiple tasks - an independent spatial working memory task, spatial localizer, and a button-pressing task - to decode working memory representations makes this a strong study with straightforward interpretations well-supported by the data. These analyses provide a convincing demonstration that not only are different regions involved when a retrospective code is required (or alternatively when a prospective code can be used), but the retrospective representations resemble those evoked by perceptual input, and the prospective representations resemble those evoked by actual button presses.

    I have just a couple of points that could be elaborated on:

    1. While there is a clear transition from representations in visual cortex to representations in sensorimotor regions when a button press can be planned in advance, the visual cortex representations do not disappear completely (Figs 2B and C). Is the most plausible interpretation that participants just did not follow the cue 100% of the time, or that some degree of sensory recruitment is happening in visual cortex obligatorily (despite being unnecessary for the task) and leading to a more distributed, and potentially more robust code?

    This is a very good point, and indeed could be considered surprising. While previous work suggests that sensory recruitment is not obligatory when an item can be dropped from memory entirely (e.g., Harrison & Tong, 2009; Lewis-Peacock et al., 2012; Sprague et al., 2014, Sprague et al., 2016; Lorenc et al., 2020), other work suggests that an item which might still be relevant later in a trial (i.e., a socalled “unattended memory item”) can still be decoded during the delay (see the re-analyses in Iamshchinina et al., 2021 from the original Christophel et al. 2018 paper). In short, we cannot exclude that in our paradigm there is some low-grade sensory recruitment happening in visual cortex, even when an action-oriented code can theoretically be used. This would be consistent with a more distributed code, which could potentially increase the overall robustness of working memory.

    At the same time, as the reviewer points out, there is a possibility that on some fraction of trials, participants failed to perfectly encode the cue, or forgot the cue, which might mean they were using a sensory-like code even on some trials in the informative cue condition. This is a reasonable possibility given that we used a trial-by-trial interleaved design, where participants needed to pay close attention on each trial in order to know the current condition. Since we averaged decoding performance across all trials, the above-chance decoding accuracy could be driven by a small fraction of trials during which spatial strategies were used despite the informative nature of the preview disk.

    Finally, another factor is the averaging of data across multiple TRs from the delay period. In Figure 2B, the decoding was performed using data that was averaged over several TRs around the middle of the delay period (8-12.8 seconds from trial start). This interval is early enough that the process of re-coding a representation from sensory to motor cortex may not be complete yet, so this might be an explanation for the relatively high decoding accuracy seen in the informative condition in Figure 2B. Indeed, the time-resolved analyses (Figure 2C, Figure 2 – figure supplement 1) show that the decoding accuracy for the informative condition continues to decline later in the delay period, though it does not go entirely to chance (with the possible exception of area V1).

    Of course, our ability to decode spatial position despite participants having the option to use a pure action-oriented code may be due to a combination of all of the above: some amount of low-grade obligatory sensory recruitment, as well as occasional trials with higher-precision spatial memory due to a missed cue. We have added a paragraph to the discussion to now acknowledge these possibilities.

    Finally, although it is conceptually important to consider the reasons why decoding in the uninformative condition did not drop entirely to chance, we also note that whether the decoding goes to chance in one condition is not critical to the main findings of our paper. The data show a robust difference between the spatial decoding accuracy in visual cortex between the two conditions, which indicates that the relative amount of information in visual cortex was modulated by the task condition, regardless of what the absolute information content was in each condition.

    1. To what extent might the prospective code reflect an actual finger movement (even just increased pressure on the button to be pressed) in advance of the button press? For instance, it could be the case that the participant with extremely high button press-trained decoding performance in 4B, especially, was using such a strategy. I know that participants were instructed not to make overt button presses in advance, but I think it would be helpful to elaborate a bit on the evidence that these action-related representations are truly "working memory" representations.

    This is a good point, and we acknowledge the possibility of some amount of preparatory motor activity during the delay period on trials in the informative condition. However, we still interpret the delayperiod representations during the informative condition as a signature of working memory, for several reasons.

    First, the participants were explicitly instructed to withhold overt finger movements until the final probe disk was shown. We monitored participants closely during their task training phase, which took place outside the scanner, for early button presses, and ensured that they understood and followed the directive to withhold a button press until the correct time. We also confirmed that participants were not engaging in any noticeable motor rehearsal behaviors, such as tapping their fingers just above the buttons. During the scans, we also monitored participants using a video feed that was positioned in a way that allowed us to see their hands on the response box and confirmed that participants were not making any overt finger movements during the delay period. Additionally, most of our participants were relatively experienced, having participated in at least one other fMRI study with our group in the past, and therefore we expect them to have followed the task instructions accurately.

    The distribution of response times for trials in the informative condition also provides some evidence against the idea that participants were already making a button press ahead of the response window. The earliest presses occurred around 250 ms (see below figure, left panel). This response time is consistent with the typical range of human choice response times observed experimentally (e.g. Luce, 1991), suggesting that participants did not execute a physical response in advance of the probe disk appearance, but waited until the response disk stimulus appeared to begin motor response execution.

    Finally, even if we assume that some amount of low-grade motor preparatory activity was occurring, this is still broadly consistent with the way that working memory has been defined in past literature. Past work has distinguished between retrospective and prospective working memory, with retrospective memory being similar in format to previously encountered sensory stimuli, and prospective memory being more closely aligned with upcoming events or actions (Funahashi, Chafee, & Goldman-Rakic, 1993; Rainer, Rao & D’Esposito, 1999; Curtis, Rao, & D’Esposito, 2004; Rahmati et al., 2018; Nobre & Stokes, 2019). Indeed, the transformation of a memory representation from a retrospective code to prospective memory code is often associated with increased engagement of circuits directly related to motor control (Schneider, Barth, & Wascher, 2017; Myers, Stokes, & Nobre, 2017). According to this framework, covert motor preparation could be considered a representation at the extreme end of the prospective memory continuum. Also consistent with this idea, past work has demonstrated that the selection and manipulation of items in working memory can be accompanied by systematic eye movements biased to the locations at which memoranda were previously presented (Spivey & Geng, 2001; Ferreira et al., 2008; van Ede et al., 2019b; van Ede et al. 2020). These physical eye movements may indeed play a functional role in the retrieval of items from memory (Ferreira et al., 2008; van Ede et al., 2019b). These findings suggest that working memory is tightly linked with both the planning and execution of motor actions, and that the mnemonic representations in our task, even if they include some degree of covert motor preparatory activity, are within the realm of representations that can be defined as working memory.

    We have now included a discussion of this issue in the text of our manuscript.

    Reviewer #2:

    Henderson, Rademaker and Serences use fMRI to arbitrate between theories of visual working memory proposing fixed x flexible loci for maintaining information. By comparing activation patterns in tasks with predictable x unpredictable motor responses, they find different extents of information retrieval in sensory- x motor-related areas, thus arguing that the amount/format of retrospective sensory-related x prospective motor-related information maintained depends on what is strategically beneficial for task performance.

    I share the importance of this fundamental question and the enthusiasm for the conclusions, and I applaud the advanced methodology. I did, however, struggle with some aspects of the experimental design and (therefore) the logic of interpretation. I hope these are easily addressable.

    Conceptual points:

    1. The main informative x non-informative conditions differ more than just in the knowledge about the response. In the informative case, participants could select both the relevant sensory information (light, dark shade) and the corresponding response. In essence, their task was done, and they just needed to wait for a later go signal - the second disk. (The activity in the delay could be considered to be one of purely motor preparation or of holding a decision/response.) In the uninformative condition, neither was sensory information at the spatial location relevant and nor could the response be predicted. Participants had, instead, to hold on to the spatial location to apply it to the second disk. These conditions are more different than the authors propose and therefore it is not straightforward to interpret findings in the framework set up by the authors. A clear demonstration for the question posed would require participants to hold the same working-memory content for different purposes, but here the content that needs to be held differs vastly between conditions. The authors may argue this is, nevertheless, the essence of their point, but this is a weak strawman to combat.

    It is true that the conditions in our task differ in several respects, including the content of the representation that must be stored. The uninformative condition trials required the participant to maintain a high-precision, sensory-like spatial representation of the target stimulus, without the ability to plan a motor response or re-code the representation into a coarser format. In contrast, the informative condition trials allowed the participant to re-code their representation into a more actionoriented format than the representation needed for the uninformative condition trials, and the code is also binary (right or left) rather than continuous.

    However, we do not think these differences present an issue for the interpretation of our study. The primary goal of our study was to demonstrate that the brain regions and representational formats utilized for working memory storage may differ depending on parameters of the task, rather than having fixed loci or a single underlying neural mechanism. To achieve this, we intentionally created conditions that are meant to sit at fairly extreme ends of the continuum of working memory task paradigms employed in past work. Our uninformative condition is similar to past studies of spatial working memory with human participants that encourage high-precision, sensory-like codes (i.e., Bays & Husain, 2008; Sprague et al., 2014; Sprague et al., 2016; Rahmati et al., 2018) and our informative condition is more similar to classic delayed-saccade task studies in non-human primates, which often allowed explicit motor planning (Funahashi et al., 1989; Goldman-Rakic, 1995). By having the same participants perform these distinct task conditions on interleaved trials, we can better understand the relationship between these task paradigms and how they influence the mechanisms of working memory.

    Importantly, it is not trivial or guaranteed that we should have found a difference in neural representations across our task conditions. In particular, an alternative perspective presented in past work is that the memory representations detected in early visual cortex in various tasks are actually not essential to mnemonic storage (Leavitt, Mendoza-Halliday, & Martinez-Trujillo, 2017; Xu, 2020). On this view, if visual cortex representations are not functionally relevant for the task, one might have predicted that our spatial decoding accuracy in early visual areas would have been similar across conditions, with visual cortex engaged in an obligatory manner regardless of the exact format of the representation required. Instead, we found a dramatic difference in decoding accuracy across our task conditions. This finding underscores the functional importance of early visual cortex in working memory maintenance, because its engagement appears to be dependent on the format of the representation required for the current task.

    Relatedly, some past work has also suggested that in the context of an oculomotor delayed response task, the maintenance of action-oriented motor codes can be associated with topographically specific patterns of activation in early visual cortex which resemble those recorded during sensory-like spatial working memory maintenance (Saber et al., 2015; Rahmati et al., 2018). This is true for both prosaccade trials, in which saccade goals are linked to past sensory inputs, and anti-saccade trials, in which motor plans are dissociated from past sensory inputs. These findings indicate that even for task conditions which on the surface would appear to require very different cognitive strategies, there can, at least in some contexts, be a substantial degree of overlap between the neural mechanisms supporting sensory-like and action-oriented working memory. This again highlights the novelty of our findings, in which we demonstrate a robust dissociation between the brain areas and neural coding format that support working memory maintenance for different task conditions, rather than overlapping mechanisms for all types of working memory.

    Additionally, there are important respects in which the task conditions have similarities, rather than being entirely different. As pointed out by Reviewer #1, the decoding of spatial information in early visual cortex regions did not drop entirely to chance in the informative condition, even by the end of the delay period (Figure 2C, Figure 2 – figure supplement 1). As discussed above in our reply to R1, this finding may suggest that the neural code in the informative condition continues to rely on visual cortex activation to some extent, even when an action-oriented coding strategy is available. This possibility of a partially distributed code suggests that while the two conditions in our task appear different in terms of the optimal strategy associated with each one, in practice the neural mechanisms supporting the tasks may be somewhat overlapping (although the different mechanisms are differentially recruited based on task demands, which is our main point).

    Another aspect of our results which suggests a degree of similarity between the task conditions is that the univariate delay period activation in early visual cortex (V1-hV4) was not significantly different between conditions (Figure 1 – figure supplement 1). Thus, it is not simply the case that the participants switched from relying purely on visual cortex to purely on motor cortex – the change in information content instead reflects a much more strategically graded change to the pattern of neural activation. This point is elaborated further in the response to point (2) below.

    1. Given the nature of the manipulation and the fact that the nature of the upcoming trial (informative x uninformative) was cued, how can effects of anticipated difficulty, arousal, or other nuisance variables be discounted? Although pattern-based analyses suggest the effects are not purely related to general effects (authors argue this in the discussion, page 14), general variables can interact with specific aspects of information processing, leading to modulation of specific effects.

    There are several aspects of our results which suggest that our results are not due to effects such as anticipated difficulty or general arousal. First, we designed our experiment using a randomly interleaved trial order, such that participants could not anticipate experimental condition on a trialby-trial basis. Participants only learned which condition each trial was in when the condition cue (color change at fixation; Figure 1A) appeared, which happened 1.5 seconds into the delay period. Thus, any potential effects of anticipated difficulty could not have influenced the initial encoding of the target stimulus, and would have had to take effect later in the trial. Second, as the reviewer pointed out, we did not observe any statistically significant modulation of the univariate delay period BOLD signal in early visual ROIs V1-hV4 between task conditions (Figure 1D, Figure 1 – figure supplement 1), which argues against the idea that there is a global modulation of early visual cortex induced by arousal or changes in difficulty.

    Additionally, our results demonstrate a dissociation between univariate delay period activation in IPS and sensorimotor cortex ROIs as a function of task condition (Figure 1D, Figure 1 – figure supplement 1). In each IPS subregion (IPS0-IPS3), the average BOLD signal was significantly greater during the uninformative versus the informative condition at several timepoints in the delay period, while in S1, M1, and PMc, average signal was significantly greater for the informative than the uninformative condition at several timepoints. If a global change in mean arousal or anticipated difficulty were a main driving factor in our results, then we would have expected to see an increase in the univariate response throughout the brain for the more difficult task condition (i.e., the uninformative condition). Instead, we observed effects of task condition on univariate BOLD signal that were specific to particular ROIs. This indicates that modulations of neural activation in our task reflect a more finegrained change in neural processing, rather than a global change in arousal or anticipated difficulty.

    Furthermore, to determine whether the changes in decoding accuracy in early visual cortex were specific to the memory representation or reflected a more general change in signal-to-noise ratio, we provide a new analysis assessing the possibility that processing of incoming sensory information differed between our two conditions. As mentioned above, initial sensory processing of the memory target stimulus was equated across conditions, since participants didn’t know the task condition until the cue was presented 1.5s into the trial. However, because the “preview disk” was presented after the cue, it is possible that the preview disk stimulus was processed differently as a function of task condition. If evidence for differential processing of the preview disk stimulus is present, this might suggest that non-mnemonic factors – such as arousal – might influence the observed differences in decoding accuracy because they should interact with the processing of all stimuli. However, a lack of evidence for differential processing of the preview disk would be consistent with a mnemonic source of differences between task conditions.

    As shown in the new figure below (now Figure 2 – figure supplement 3), we used a linear decoder to measure the representation of the “preview disk” stimulus that was shown to participants early in the delay period, just after the condition cue (Figure 1A). This disk has a light and dark half separated by a linear boundary whose orientation can span a range of 0°-180°. To measure the representation of the disk’s orientation, we binned the data into four bins centered at 0°, 45°, 90°, and 135°, and trained two binary decoders to discriminate the bins that were 90° apart (an adapted version of the approach shown in Figure 2A; similar to Rademaker et al., 2019). Importantly, the orientation of this disk was random with respect to the memorized spatial location, allowing us to run this analysis independently from the spatial-position decoding in the main manuscript text.

    We found that in both conditions, the orientation of the preview disk boundary could be decoded from early visual cortex (all p-values<0.001 for V1-hV4 in both conditions; evaluated using nonparametric statistics as described in Methods), with no significant difference between our two task conditions (all p-values>0.05 for condition difference in V1-hV4). This indicates that in both task conditions, the incoming sensory stimulus (“preview disk”) was represented with similar fidelity in early visual cortex. At the same time, and in the same regions, the representation of the remembered spatial stimulus was significantly stronger in the uninformative condition than the informative condition. Therefore, the difference between task conditions appears to be specific to the quality of the spatial memory representation itself, rather than a change in the overall signal-to-noise ratio of representations in early visual cortex. This suggests that the difference between task conditions in early visual cortex reflects a difference in the brain networks that support memory maintenance in the two conditions, rather than extra processing of the preview disk in one condition over the other, a more general effect of arousal, or anticipated difficulty.

    This result is also relevant to the concerns raised by the reviewer in point (1) regarding the possibility that the selection of relevant sensory information (i.e., the light/dark side of the disk) was different between the two task conditions. Since the decoding accuracy for the preview disk orientation did not differ between task conditions, this argues against the idea that differential processing of the preview disk may have contributed to the difference in memory decoding accuracy that we observed.

    1. I see what the authors mean by retrospective and prospective codes, but in a way all the codes are prospective. Even the sensory codes, when emphasized, are there to guide future discriminations or to add sensory granularity to responses, etc. Perhaps casting this in terms of sensory/perceptual x motor/action~ may be less problematic.

    This is a good point, and we agree that in some sense all the memory codes could be considered prospective because in both conditions, the participant has some knowledge of the way that their memory will be probed in the future, even when they do not know their exact response yet. We have changed our language in the text to reflect the suggested terms “perceptual” and “action”, which will hopefully also make the difference between the conditions clearer to the reader.

    1. In interpreting the elevated univariate activation in the parietal IPSO-3 area, the authors state "This pattern is consistent with the use of a retrospective spatial code in the uninformative condition and a prospective motor code in the informative condition". (page 6) (Given points 1 and 3 above) Instead, one could think of this as having to hold onto a different type of information (spatial location as opposed to shading) in uninformative condition, which is prospectively useful for making the necessary decision down the line.

    It is true that a major difference between the two conditions was the type of information that the participants had to retain, with a sensory-like spatial representation being required for the uninformative condition, and a more action-oriented (i.e., left or right finger) representation being required for the informative condition. To clarify, the participant never had to explicitly hold onto the shading (light or dark gray side of the disk), since the shading was always linked to a particular finger, and this mapping was known in advance at the start of each task run (although we did change this mapping across task runs within each participant to counterbalance the mapping of light/dark and the left/right finger – one mapping used in the first scanner session, the other mapping used in the second scanning session). We have clarified this sentence and we have removed the use of the terms “retrospective” and “prospective” as suggested in the previous comment. The sentence now reads: “This pattern is consistent with the use of a spatial code in the uninformative condition and a motor code in the informative condition.”

    Other points to consider:

    1. Opening with the Baddeley and Hitch 1974 reference when defining working memory implicitly implies buying into that particular (multi-compartmental) model. Though Baddeley and Hitch popularised the term, the term was used earlier in more neutral ways or in different models. It may be useful to add a recent more neutral review reference too?

    This is a nice suggestion. We have added a few more references to the beginning of the manuscript, which should together present a more neutral perspective (Atkinson & Shiffron, 1968; and Jonides, Lacey and Nee, 2005).

    1. The body of literature showing attention-related selection/prioritisation in working memory linked to action preparation is also relevant to the current study. There's a nice review by Heuer, Ohl, Rolfs 2020 in Visual Cognition.

    We thank the reviewer for pointing out this interesting body of work, which is indeed very relevant here. We have added a new paragraph to our discussion which includes a discussion of this paper and its relation to our work.

    Was this evaluation helpful?
  2. Evaluation Summary:

    This rigorous, carefully designed and executed functional magnetic-resonance imaging study provides compelling evidence against a rigid, fixed model for how working-memory representations are maintained in the human brain. By analyzing patterns and strength of brain activity, the authors show that networks for maintaining contents in mind vary depending on the task demands and foreknowledge of anticipated responses. This manuscript will be of interest to scientists studying working memory, both in humans and in non-human primates.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #2 agreed to share their names with the authors.)

    Was this evaluation helpful?
  3. Reviewer #1 (Public Review):

    The authors found a switch between "retrospective", sensory recruitment-like representations in visual regions when a motor response could not be planned in advance, and "prospective" action-like representations in motor regions when a specific button response could be anticipated. The use of classifiers trained on multiple tasks - an independent spatial working memory task, spatial localizer, and a button-pressing task - to decode working memory representations makes this a strong study with straightforward interpretations well-supported by the data. These analyses provide a convincing demonstration that not only are different regions involved when a retrospective code is required (or alternatively when a prospective code can be used), but the retrospective representations resemble those evoked by perceptual input, and the prospective representations resemble those evoked by actual button presses.

    I have just a couple of points that could be elaborated on:

    1. While there is a clear transition from representations in visual cortex to representations in sensorimotor regions when a button press can be planned in advance, the visual cortex representations do not disappear completely (Figs 2B and C). Is the most plausible interpretation that participants just did not follow the cue 100% of the time, or that some degree of sensory recruitment is happening in visual cortex obligatorily (despite being unnecessary for the task) and leading to a more distributed, and potentially more robust code?

    2. To what extent might the prospective code reflect an actual finger movement (even just increased pressure on the button to be pressed) in advance of the button press? For instance, it could be the case that the participant with extremely high button press-trained decoding performance in 4B, especially, was using such a strategy. I know that participants were instructed not to make overt button presses in advance, but I think it would be helpful to elaborate a bit on the evidence that these action-related representations are truly "working memory" representations.

    Was this evaluation helpful?
  4. Reviewer #2 (Public Review):

    Henderson, Rademaker and Serences use fMRI to arbitrate between theories of visual working memory proposing fixed x flexible loci for maintaining information. By comparing activation patterns in tasks with predictable x unpredictable motor responses, they find different extents of information retrieval in sensory- x motor-related areas, thus arguing that the amount/format of retrospective sensory-related x prospective motor-related information maintained depends on what is strategically beneficial for task performance.

    I share the importance of this fundamental question and the enthusiasm for the conclusions, and I applaud the advanced methodology. I did, however, struggle with some aspects of the experimental design and (therefore) the logic of interpretation. I hope these are easily addressable.

    Conceptual points:

    1. The main informative x non-informative conditions differ more than just in the knowledge about the response. In the informative case, participants could select both the relevant sensory information (light, dark shade) and the corresponding response. In essence, their task was done, and they just needed to wait for a later go signal - the second disk. (The activity in the delay could be considered to be one of purely motor preparation or of holding a decision/response.) In the uninformative condition, neither was sensory information at the spatial location relevant and nor could the response be predicted. Participants had, instead, to hold on to the spatial location to apply it to the second disk. These conditions are more different than the authors propose and therefore it is not straightforward to interpret findings in the framework set up by the authors. A clear demonstration for the question posed would require participants to hold the same working-memory content for different purposes, but here the content that needs to be held differs vastly between conditions. The authors may argue this is, nevertheless, the essence of their point, but this is a weak strawman to combat.

    2. Given the nature of the manipulation and the fact that the nature of the upcoming trial (informative x uninformative) was cued, how can effects of anticipated difficulty, arousal, or other nuisance variables be discounted? Although pattern-based analyses suggest the effects are not purely related to general effects (authors argue this in the discussion, page 14), general variables can interact with specific aspects of information processing, leading to modulation of specific effects.

    3. I see what the authors mean by retrospective and prospective codes, but in a way all the codes are prospective. Even the sensory codes, when emphasized, are there to guide future discriminations or to add sensory granularity to responses, etc. Perhaps casting this in terms of sensory/perceptual x motor/action~ may be less problematic.

    4. In interpreting the elevated univariate activation in the parietal IPSO-3 area, the authors state "This pattern is consistent with the use of a retrospective spatial code in the uninformative condition and a prospective motor code in the informative condition". (page 6) (Given points 1 and 3 above) Instead, one could think of this as having to hold onto a different type of information (spatial location as opposed to shading) in uninformative condition, which is prospectively useful for making the necessary decision down the line.

    Other points to consider:

    1. Opening with the Baddeley and Hitch 1974 reference when defining working memory implicitly implies buying into that particular (multi-compartmental) model. Though Baddeley and Hitch popularised the term, the term was used earlier in more neutral ways or in different models. It may be useful to add a recent more neutral review reference too?

    2. The body of literature showing attention-related selection/prioritisation in working memory linked to action preparation is also relevant to the current study. There's a nice review by Heuer, Ohl, Rolfs 2020 in Visual Cognition.

    Was this evaluation helpful?