A generalized cortical activity pattern at internally generated mental context boundaries during unguided narrative recall

Curation statements for this article:
  • Curated by eLife

Abstract

Current theory and empirical studies suggest that humans segment continuous experiences into events based on the mismatch between predicted and actual sensory inputs; detection of these ‘event boundaries’ evokes transient neural responses. However, boundaries can also occur at transitions between internal mental states, without relevant external input changes. To what extent do such ‘internal boundaries’ share neural response properties with externally driven boundaries? We conducted an fMRI experiment where subjects watched a series of short movies and then verbally recalled the movies, unprompted, in the order of their choosing. During recall, transitions between movies thus constituted major boundaries between internal mental contexts, generated purely by subjects’ unguided thoughts. Following the offset of each recalled movie, we observed stereotyped spatial activation patterns in the default mode network, especially the posterior medial cortex, consistent across different movie contents and even across the different tasks of movie watching and recall. Surprisingly, the between-movie boundary patterns did not resemble patterns at boundaries between events within a movie. Thus, major transitions between mental contexts elicit neural phenomena shared across internal and external modes and distinct from within-context event boundary detection, potentially reflecting a cognitive state related to the flushing and reconfiguration of situation models.

Article activity feed

  1. Evaluation Summary:

    This manuscript is of interest to cognitive neuroscientists working on topics broadly related to memory, event segmentation and mental context. It presents an interesting set of new analyses related to internally versus externally driven changes in mental context. The idea is innovative and the analyses and methods are thoughtful and rigorous. There are some concerns about the degree to which the interpretations are supported by the data, but they could potentially be resolved with additional control analyses.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1, Reviewer #2 and Reviewer #3 agreed to share their name with the authors.)

  2. Reviewer #1 (Public Review):

    Lee and Chen investigate the representation of between-movie boundaries in the brain, with a particular focus on the spontaneous boundaries that occur as people shift between movie recalls. Are these sorts of recall boundaries represented the same as those that occur (a) between the visual presentation of different movies (between-movies boundaries at encoding) and/or (b) between events within a single movie (within-movie boundaries at encoding), or are these recall boundaries different? The main findings were that between-movie boundaries were quite similar regardless of the task phase (encoding/retrieval), but dissimilar from within-movie boundaries.

    This paper has many strengths, including the interesting research question and sophisticated analytic approach. The authors have done an excellent job presenting many important controls and, despite its complexity, the work is presented in a way that is clear and enjoyable to engage with. Although this is a relatively brief paper with a simple story, as the authors note in the discussion, there are many possible interpretations or underlying mechanisms that could be giving rise to this phenomenon, which is quite exciting. So, the paper may be a source of more questions than answers! I see this as a great feature. I think this work will inspire many new investigations into boundary representation and event segmentation.

    I do have a few suggestions and questions for the authors to address the current weaknesses, as outlined below:

    1. I am generally interested in better understanding how differences in the sensory experience (most notably, presentation of visual input in the movies versus its absence during the between-movie boundaries) across timepoints could be playing a role in these results. If I understand correctly, the between-movie boundaries will always contain a (mostly) blank screen (with simple white text at encoding) along with silence, for both encoding and recall phases. In contrast, the no-boundary periods as well as the within-movie boundaries would always contain visual input (movies). There are a few reasons why this is concerning to me. First, the boundary periods are relatively much more homogeneous in terms of input/experience, and so it intuitively makes sense to me that the neural pattern would also therefore be quite similar across different boundary periods (even across phases). The primary comparisons as shown in Fig 2 are comparing these homogeneous boundary experiences with highly variable within-movie experiences (either due to ongoing recall/speech, or movie viewing). It seems to follow that for this reason alone one should expect the boundary patterns to be more similar to one another, and I am not sure whether that is the sort of boundary processing that is of interest here. Second, as is evident in Figure 3A, the "middle" (no-boundary movie) patterns are a much more heterogeneous bunch, with some pairs showing positive and others showing negative correlations with one another, potentially reflecting variability in the input. Given this, it of course has to be the case that the average correlation of within-movie patterns is low (near zero; Figure 3C) but it also may follow that the boundary patterns are negatively correlated with the event offset (within-movie boundary) patterns. I appreciated the analysis related to the audio controls, but am not sure the authors were able to account for the visual differences.

    2. I was not sure I fully understood the offset vs. onset yoking analysis: both how it was performed, and how the conclusions followed from the results. First, I was a bit confused about how the difference in delay duration between movies at encoding (6s) versus at recall (9.3s on average, but variable; see also comment #5) would play into this and whether those are meaningful time points to display on the Figure 3D charts that might help the reader interpret those findings. Second, the authors state that this analysis shows the boundary patterns were driven by offset (more than onset) responses, but I was not sure what aspect of the results led to that conclusion. Can the authors say more about the evidence supporting this conclusion? It looks to me like there are strong correlations that emerge both after offset and onset (i.e., just above and to the right of the origin there are numerous time points with positive [red] correlations). Perhaps it is because the red positive correlations start earlier, prior to the recall itself, when yoked to the onset, but I am not sure why this means it is related to offset and not some preparation for the onset of recall (see also comment #5). Also, is it interesting or meaningful that the patterns seem more compressed at recall than at encoding (i.e., the outlined red areas are skinnier than they are tall)?

    3. I am not sure of the reason for masking Figure 2B and C with the a>0 and c>0 maps. First of all, it seems as though in RSA the actual correlation being positive or negative is not terribly meaningful and can depend on preprocessing decisions, etc. In addition to that potential issue, though, I'm also just generally interested in better understanding the logic behind this decision. Can the authors explain that and include it in the main paper? Were there any regions that showed for example a

    4. I am inferring (though could be incorrect) that some of the pattern similarity analyses would be directly comparing (i.e., correlating) patterns derived from the same scanning run. Can the authors confirm if this is the case? If so, it would be important to consider in the paper how temporal autocorrelation within scanning run may be impacting the results (for example, how does temporal distance between the different events vary (or not) across the different comparisons?). Ideally, the authors would be able to demonstrate that the same pattern of results can be found when limiting to cross-run comparisons only. Relatedly, it would be important to know whether the across-phase comparisons (e.g., in that there are more regions that show significant recall-recall similarity vs. encoding-recall in Figure 2B/C) might also be impacted by differences in whether the patterns were derived from the same run (whereas half of the comparisons could be same-run for recall-recall, none of the comparisons would be from the same run in encoding-recall, and so the overall correlation may be higher for recall-recall... or encoding-encoding).

    5. The definition of the time periods of interest was a bit confusing to me. For example, in the main analysis, the duration seemed arbitrary at 15 seconds, and I believe it always began at the offset of the preceding movie (shifted by 4.5s for hemodynamic lag). To clarify, at encoding, this means that it would always include the beginning of the "next" movie, but never the end of the preceding movie, is that correct? So the boundary between movie A and movie B (looking at Fig 1) would include some activation associated with the beginning of movie B viewing, is that correct? It seems a bit strange to me given the goals and framing of the study that movie B would be included here, given I thought most of the "action" would be happening with the movie A memory in this case. It seems as though this definition may also produce systematic differences between encoding and recall: for one, the delays between recalls are variable and longer (9.3s on average with an SD of 16.8s) than the fixed 6s title screen, so the contents going into the neural patterns at recall would be different (contain less recall time and more blank-screen time); but also during recall, it seems as though the participant would be bringing to mind memories of the upcoming movie B they are about to recall, while there is no way for participants to anticipate anything specific about the upcoming movie during encoding. Can the authors clarify these points in the paper?
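
To make comments 4 and 5 above concrete, the following is a minimal sketch, not the authors' pipeline, of how a lag-shifted boundary window and a cross-run-only similarity estimate could be computed. The function names `boundary_window` and `cross_run_similarity` are hypothetical, and the repetition time `tr_s=1.5` is an assumed value that does not come from the review; the 4.5 s lag and 15 s duration are the values mentioned in comment 5.

```python
import numpy as np


def boundary_window(offset_s, lag_s=4.5, duration_s=15.0, tr_s=1.5):
    """Return the TR indices of a boundary window.

    The window starts at the movie (or recall) offset, shifted by the
    hemodynamic lag, and lasts `duration_s` seconds. `tr_s` is an assumed
    repetition time, not a value taken from the review.
    """
    start = int(round((offset_s + lag_s) / tr_s))
    stop = int(round((offset_s + lag_s + duration_s) / tr_s))
    return np.arange(start, stop)


def cross_run_similarity(patterns, run_ids):
    """Mean Pearson correlation over pattern pairs from different runs.

    patterns: (n_patterns, n_voxels) array, one boundary pattern per row.
    run_ids:  length n_patterns, the scanning run each pattern came from.
    """
    patterns = np.asarray(patterns, dtype=float)
    run_ids = np.asarray(run_ids)
    corr = np.corrcoef(patterns)  # pattern-by-pattern correlation matrix
    different_run = run_ids[:, None] != run_ids[None, :]
    upper = np.triu(np.ones_like(corr, dtype=bool), k=1)  # each pair once
    keep = different_run & upper
    if not keep.any():
        raise ValueError("no cross-run pattern pairs available")
    return corr[keep].mean()
```

Restricting the average to pairs with different `run_ids` is one simple way to keep within-run temporal autocorrelation out of the similarity estimate; whether it is appropriate here depends on how the boundaries are distributed across runs.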

  3. Reviewer #2 (Public Review):

    This experiment by Lee and Chen sought to examine internally-guided boundaries between events during recall, specifically recall of audiovisual short movies viewed a few minutes prior. They performed a set of well thought out pattern similarity analyses to determine if activity in the Default Mode Network (DMN) and more specifically Posterior Medial Cortex (PMC) was related across encoding and retrieval of separate movies. Briefly, the authors found characteristic univariate activation patterns in the brain's 'Default Mode Network' (DMN) during event transitions, but extended prior work to show that these activations are present during internally-guided event transitions. Furthermore, fascinatingly, the authors report increases in pattern similarity at event offsets that persist across encoding and recall, and which were not present during the middle of events. This is taken as evidence of a general cognitive state that exists at event transitions, and exists beyond the level of a single event.

    In our view, the authors' results support their interpretations. Not only are there internally generated boundaries that mark shifts between broader contexts, but these boundaries appear to be distinct from those that are found within continuous narratives. This is a very interesting dataset, cleverly analyzed, which points to interesting new directions for the fields of event cognition and event memory. Namely, the distinction between within-event and between-event boundaries adds new depth to the discussion of event boundaries more broadly, and the notion of a general common cognitive state during event transitions is a thought-provoking result that will certainly influence our group's thinking on this topic, and likely many others.

    We truthfully do not have substantive criticisms of this manuscript. We think the study is well done, the analyses seem properly conducted, and the manuscript is generally written well and clearly. There are a few minor requests for clarification that we will note in our separate recommendation to the authors, but the only critique bordering on a 'major' concern is the very short 'Discussion' portion of the manuscript. While we recognize that this is a short-format submission, and while the authors did a fine job of trying to synthesize their results and situate them in the context of the field in 2 short paragraphs, we honestly think there should be more discussion of the findings and their implications. In sum, however, we think that this is a solid paper that we found very exciting.

  4. Reviewer #3 (Public Review):

    The aim of this paper is to investigate whether internally driven changes in mental context involve similar neural mechanisms as externally driven changes. In particular, the paper investigates whether there are consistent neural responses that align with the transition between stimuli when watching or recalling movies. The authors show that there is a consistent pattern of neural responses, particularly in precuneus and angular gyrus, that is evoked by the offset of a movie during movie watching and by the self-generated transitions between movies during movie-recall. Their results suggest that self-generated shifts in mental context involve similar neural processes as externally-generated shifts.

    The paper is well written and the results are interesting. The analyses and methods are thoughtful and rigorous and provide an interesting new perspective on internally driven changes in mental context that (as far as I know) have not been investigated before in this way. It is clear that the authors spent a lot of time and effort on additional analyses to understand in detail what is going on and which factors might be driving the observed differences.

    I do have a couple of concerns about the interpretation of the findings. In the abstract the authors state that the findings reflect: 'a cognitive state related to the flushing and reconfiguration of situation models'. If the between-movie activity patterns reflect the flushing/reconfiguration of the prior context, it is very surprising that there is a negative correlation to within-movie boundary patterns. The premise of event segmentation theory is that event boundaries (within a given context/movie) result in a reset of the event model, therefore also resulting in a 'flushing' of the prior context. I have three main concerns related to this point:

    1. To what extent can the similarity between recall and encoding be driven by the (sudden lack of) external input that occurs at transitions between movies? The authors already investigated the role of auditory input, but of course there is also a sudden lack/reduction of visual input during the boundaries between movies at encoding. The authors state that: "Visual features (i.e., black screen) or pauses in speech cannot explain boundary-specific similarity between encoding and recall phases, because boundary and non-boundary periods were identical in terms of visual input during recall and speech generation during movie watching." I do not agree with this statement. When looking at the similarity between encoding and recall during boundaries, the characteristics of the input are more similar (absent visual and auditory input) than when looking at the similarity between encoding and recall in the middle of movies (present vs. absent visual and auditory input). This confound should be taken into account in the analyses.

    2. Could the negative correlation between within-movie boundary patterns and between-movie boundary patterns be due to the long time window that is averaged (15 seconds)? Both within-movie and between-movie boundaries might result in similar transient activity patterns, which persist longer for the between-movie boundaries, possibly due to the 6 seconds of black/title screen at movie offset.

    3. In addition to these two concerns, it is unclear to me at this point to what extent these findings can be related to a reset of context that might occur in a real-life setting or if it is specific to the current experimental setup. Do the authors think that subtle shifts in context (e.g. within a given movie/context) involve fundamentally different mechanisms as compared to more stark transitions that occur between contexts (e.g. between movies)?
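
One way the window-length question in concern 2 could be probed is to recompute the pattern correlation while varying how many TRs are averaged. The sketch below is an illustration under stated assumptions rather than the authors' analysis: the function name `similarity_by_window_length` is hypothetical, and both inputs are assumed to be already aligned so that their first row is the first TR after the lag-shifted boundary.

```python
import numpy as np


def similarity_by_window_length(ts_a, ts_b, window_lengths):
    """Correlate window-averaged patterns for several window lengths.

    ts_a, ts_b: (n_trs, n_voxels) arrays, each aligned so that row 0 is the
                first TR after the (lag-shifted) boundary.
    window_lengths: iterable of window lengths in TRs.
    Returns a dict mapping window length -> Pearson r between mean patterns.
    """
    ts_a = np.asarray(ts_a, dtype=float)
    ts_b = np.asarray(ts_b, dtype=float)
    result = {}
    for n in window_lengths:
        if n < 1 or n > min(len(ts_a), len(ts_b)):
            raise ValueError(f"window length {n} is out of range")
        mean_a = ts_a[:n].mean(axis=0)  # average pattern over the first n TRs
        mean_b = ts_b[:n].mean(axis=0)
        result[n] = np.corrcoef(mean_a, mean_b)[0, 1]
    return result
```

If the two boundary types shared only an early transient response, the correlation would be expected to be higher for short windows and to fall as later TRs are averaged in; a flat or consistently negative profile would argue against that account.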