Perceptual difficulty modulates the direction of information flow in familiar face recognition
###Reviewer #3:
This is a manuscript by Karimi-Rouzbahani et al. about the neural encoding of facial familiarity using EEG and MVPA.
I found the article interesting and clear, and its methods solid. Besides a few minor comments, which I list below, I found only one major issue that has to be addressed.
Major comment:
My only major problem with the results lies in the simple interpretation of anterior contributions to the encoding of familiarity as feed-back. Using a clever partialling-out method, you find that removing the occipital contributions from the frontal (or rather anterior, as it also involves temporal cortex) electrode pattern reduces familiarity decoding more strongly, earlier, and for longer than the opposite manipulation, in which you partial out the frontal information from that of the occipital/posterior electrode pattern. The former is interpreted as a signal of feed-back, the latter as feed-forward information flow. This makes sense, but only if the frontal cortex does not play a role, in its own right, in face processing. However, the inferior frontal face area (see e.g. Collins and Olson, 2014) is known to be associated with the STS and to play a role in social, dynamic, and eye-movement-related information processing. If we assume that these tasks are more related to the frontal than to the posterior areas, as for example Duchaine and Yovel (2015) do, then the results of the partialling-out analysis merely mean that the functions of the frontal areas are modulated more by the posterior areas (in other words, the parietal areas also play a role in those functions) than the other way around. The lower-level functions of the posterior sites, on the other hand, are modulated less, for a shorter time, and later by the removal of frontal information; in other words, the frontal cortices do not play much of a role in them.
This is different from your conclusion, in which you posit feed-forward vs. feed-back connections. I don't see any good way to rule out this alternative conclusion, which is simpler than your assumption about connectivity. Timing would be a potential factor to resolve it, with feed-back occurring later, but your figures make clear that the two periods overlap entirely and that the peaks fall into almost identical windows.
Unless I overlooked something and you can give a convincing way to exclude this possibility, I would recommend that you a) discuss this in the paper and b) tone down the respective conclusions throughout the manuscript.
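To make the point about partialling out concrete, here is a minimal sketch of a first-order partial correlation on toy vectors. It is not the authors' pipeline: the variable names (`occipital`, `frontal`, `model`) and the numbers are hypothetical stand-ins for vectorized RDMs, and the toy data are built so that everything the "frontal" vector shares with the "model" vector is also present in the "occipital" vector.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

# Hypothetical toy vectors standing in for the off-diagonal entries of three RDMs.
occipital = [1, 2, 3, 4, 5, 6, 7, 8]
frontal = [2, 1, 2, 5, 6, 5, 6, 9]   # occipital plus its own noise
model = [2, 3, 2, 3, 4, 5, 8, 9]     # occipital plus different, unrelated noise

raw = pearson(frontal, model)                       # frontal vs. model, nothing removed
partial = partial_corr(frontal, model, occipital)   # same, with occipital partialled out
```

In this toy case the raw correlation is 0.84 and the partial correlation is essentially zero. The statistic quantifies how much of the frontal-model relationship is shared with the occipital pattern; by itself it carries no label for the direction in which that shared information travelled.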
###Reviewer #2:
The authors employed a clever experimental paradigm to investigate how the brain integrates visual information to reach a decision on the familiarity of a presented face. Eighteen subjects performed an EEG experiment while they were presented with images of themselves, close friends, famous individuals, or unfamiliar individuals. They were required to perform a 2AFC task to decide on the familiarity of the image (familiar/unfamiliar). The authors report behavioral differences in accuracy and reaction times depending on the task difficulty (more or less degraded images) and depending on the familiarity of the face, with self and personally familiar faces being recognized more easily and faster. Some of these behavioral differences were reflected in brain activity as evaluated by ERPs, decoding, and RSA analyses. Adopting a novel RSA-based connectivity method, the authors claim that under conditions with limited visual information (more degraded images), top-down effects from frontal areas to occipital areas are stronger than in conditions with increased visual information (less degraded images).
The main question of this work is of interest and important in the face processing literature. The paradigm is clever and has the potential to address the question of interest. However, I have strong concerns about the methods, as well as some issues with the interpretation and framework in which the authors place the results of this work.
Methods:
There is little information about single-subject results or effect sizes, except for behavioral results. Only the mean values across subjects are reported with significance values (however, the reader cannot be sure about this as it is not explicitly mentioned anywhere). It's unclear from the description of the methods how data from different subjects were pooled for group analysis. Similarly, it's unclear how the null distributions were generated across subjects for permutation testing.
Different analyses use either correct trials only or both incorrect and correct trials, without any clear rationale of why this is warranted. This is especially important in a task with highly different accuracy values depending on the conditions of interest. Figure 1B shows different levels of behavioral accuracy depending on coherence levels, while Figure 1D shows different levels of accuracy depending on familiarity type. This is very interesting, but it creates challenges for the analysis of brain data.
On the one hand, if only correct trials are selected for the analysis (as in the decoding results), then different conditions will have a different number of trials. In turn, this will change the distribution of samples into classes, it will change the theoretical chance level, and it will change the levels of noise for estimates of central tendency. For example, the difference in decoding results between different familiarity types in Figure 3B could potentially be driven by a different number of trials belonging to each of the subclasses of familiarity.
On the other hand, if both correct and incorrect trials are selected for the analysis (as in the RSA analysis), then results are confounded by potentially different brain processes that take place for correct and incorrect trials. Consider that in a 2AFC task, participants can be correct in one way only (correct classification), while they can be incorrect in many ways (slow RT, low attention level, or true misclassification). Given this experimental paradigm, I think the more straightforward approach would be to analyze correct and incorrect trials separately for all analyses and report both results. This would limit confounding effects in the interpretation of the data.
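The trial-count concern can be illustrated with a small sketch. The counts below are hypothetical, not taken from the manuscript, and the helper names are mine. The sketch shows how excluding incorrect trials can shift the accuracy a trivial classifier would reach, and how subsampling to equal class sizes restores a 50% baseline.

```python
import random
from collections import Counter

def majority_baseline(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)

def subsample_balanced(labels, seed=0):
    """Return sorted trial indices with the same number of trials per class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    n_min = min(len(idxs) for idxs in by_class.values())
    keep = []
    for idxs in by_class.values():
        keep.extend(rng.sample(idxs, n_min))  # random draw without replacement
    return sorted(keep)

# Hypothetical numbers of correct trials left after excluding errors.
labels = ["familiar"] * 90 + ["unfamiliar"] * 60

baseline_before = majority_baseline(labels)     # baseline on the unbalanced set
keep = subsample_balanced(labels)
balanced = [labels[i] for i in keep]
baseline_after = majority_baseline(balanced)    # baseline after equalizing counts
```

Subsampling discards data, so in practice it would be repeated over several random draws and the results averaged; that repetition is omitted here for brevity.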
For the decoding analyses, I find it suboptimal (and potentially problematic) to use a binary classifier (familiar vs. unfamiliar) to investigate a multiclass problem (levels of familiarity). A better approach would be to run a 4-way classification from the beginning, and then use this classifier to generate a 2-way classifier. This approach would preserve the actual structure of the data, which is divided into four classes of interest and not only two. In addition, I cannot tell from the methods whether the labels were permuted appropriately for permutation testing. Since there is a different number of trials in each class, the label permutation should maintain the same proportion of trials in each class to preserve the original structure and generate an appropriate null distribution (Etzel, 2015; Etzel & Braver, 2013; Nichols & Holmes, 2002).
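As a rough illustration of the permutation scheme I have in mind, the sketch below shuffles the label vector as a whole, which keeps the number of trials per class fixed, and scores a toy classifier with balanced accuracy. Everything here is an assumption for illustration: the 1-D simulated feature, the leave-one-out nearest-class-mean classifier, and the function names are not the authors' pipeline, and a within-subject scheme would still need to be combined at the group level as discussed by Etzel (2015).

```python
import random

def balanced_accuracy(true, pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for c in sorted(set(true)):
        idx = [i for i, t in enumerate(true) if t == c]
        recalls.append(sum(pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def loo_nearest_mean(x, y):
    """Leave-one-out nearest-class-mean predictions for a 1-D feature."""
    preds = []
    for i in range(len(x)):
        means = {}
        for c in sorted(set(y)):
            vals = [x[j] for j in range(len(x)) if j != i and y[j] == c]
            means[c] = sum(vals) / len(vals)
        preds.append(min(means, key=lambda c: abs(x[i] - means[c])))
    return preds

def permutation_p_value(x, y, n_perm=200, seed=0):
    """Observed balanced accuracy and its permutation p-value."""
    rng = random.Random(seed)
    observed = balanced_accuracy(y, loo_nearest_mean(x, y))
    exceed = 0
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)  # reorders labels; class counts stay the same
        null = balanced_accuracy(shuffled, loo_nearest_mean(x, shuffled))
        exceed += null >= observed
    return observed, (exceed + 1) / (n_perm + 1)

# Simulated unbalanced data: 30 trials of class 1, 10 trials of class 0.
data_rng = random.Random(1)
x = [data_rng.gauss(1.5, 1.0) for _ in range(30)] + \
    [data_rng.gauss(-1.5, 1.0) for _ in range(10)]
y = [1] * 30 + [0] * 10

observed, p_value = permutation_p_value(x, y)
```

The `+ 1` in the numerator and denominator counts the observed labelling as one of the permutations, which keeps the p-value from ever being exactly zero.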
It's unclear to me what the brain-behavior correlation analysis is meant to represent (Figure 3C) when the decoding analysis is performed on correct trials only, while behavioral accuracy is (necessarily) computed on all trials. In addition, I am left to wonder whether the overall within-subject behavioral accuracy is predicted by (or correlates with) the overall decoding accuracy across timepoints based on within-subject brain data. If such an effect exists, then the more complicated, time-varying analysis would be warranted. However, this analysis should be reported with individual subjects' results to highlight the effect size of such a correlation. Finally, I would suggest the authors move some of the text describing this analysis from the methods to the main text. I find the description in the main text to be particularly opaque, and the one in the methods section much clearer.
It's unclear how the RSA results were pooled across subjects. In addition, these analyses used both correct and incorrect trials. I don't see why these analyses cannot be performed on correct and incorrect trials separately by sub-selecting rows and columns of the RDMs for each subject. This would make the interpretation of the results much more straightforward. These results are now confounded by whether the image was correctly or incorrectly classified by the participant.
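The sub-selection I am suggesting is a small operation. The sketch below uses a hypothetical 4-trial RDM and hypothetical per-trial accuracy flags; the helper name `subselect_rdm` is mine and does not refer to any existing toolbox function.

```python
def subselect_rdm(rdm, keep):
    """Keep only the rows and columns of a square RDM listed in `keep`."""
    return [[rdm[i][j] for j in keep] for i in keep]

# Hypothetical 4-trial RDM (symmetric, zero diagonal).
rdm = [
    [0.0, 0.2, 0.7, 0.9],
    [0.2, 0.0, 0.5, 0.8],
    [0.7, 0.5, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
]
# Hypothetical behavioral outcome for each of the four trials.
correct = [True, False, True, True]

correct_idx = [i for i, ok in enumerate(correct) if ok]
incorrect_idx = [i for i, ok in enumerate(correct) if not ok]

rdm_correct = subselect_rdm(rdm, correct_idx)      # 3 x 3 RDM of correct trials
rdm_incorrect = subselect_rdm(rdm, incorrect_idx)  # 1 x 1 RDM of incorrect trials
```

With only one incorrect trial, as in this toy case, the incorrect-trial RDM has no off-diagonal entries left to analyze, which is why the number of trials per condition would need to be reported alongside such an analysis.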
I'm not convinced the partial correlation results with low-level visual features are sufficient to account for the effect of visual differences. These differences necessarily exist when using pictures of famous people with less staged pictures of friends and other individuals. I'd like to know how much each image class can be predicted by image statistics alone either by mimicking the experiment using a classifier or by training a classifier to distinguish familiarity type on the actual images. This would quantify whether the familiarity of the person can be decoded simply based on low-level visual properties (such as luminance values from pixel intensities), or from more biologically inspired features that simulate early visual cortex, such as HMAX features or the first layer of a general recognition visual DNN.
I find the proposed connectivity method quite interesting, but I'm highly concerned whenever a method is developed and tested in a single dataset to support the main hypothesis. I realize it is hard to obtain a real "ground truth" dataset to test this method, especially in our global condition. However, I would be more confident in this method if it were applied to some simulated data to show that it can recover the simulated feedforward/feedback dynamics with different amounts of noise in the dataset. In addition, especially for this analysis, differences between correct and incorrect trials should be analyzed. Otherwise, the interesting findings in Figure 4D could be confounded by a different number of correct trials in each of the coherence levels (with more incorrect trials for the 22% condition).
Interpretation:
Throughout the manuscript, I find the description of the visual pathway and the face processing network to be too simplified. It is described with a simple distinction into "peri-occipital" and "peri-frontal" areas, and a dichotomy between feed-forward/feed-back connections. While EEG cannot afford a more precise spatial resolution, I think both the introduction and the discussion should place the results of this manuscript within the broader and more precise knowledge we have about the visual system and the face processing system. For example, how do these results fit within the framework of (familiar) face processing (Duchaine & Yovel, 2015; Freiwald et al., 2016; Haxby et al., 2000; Visconti di Oleggio Castello et al., 2017)?
While I agree that the evidence for top-down effects from frontal areas in visual recognition is substantial (as the seminal work by Moshe Bar and others has shown), recurrent and feedback connections exist much earlier in the pathway (Kravitz et al., 2013). These recurrent connections have been shown to play a role in tasks with occluded images as well (Tang et al., 2018), which has similarities with the task presented in this manuscript. Thus, for this task, do we really need to assume a contribution from frontal areas? Could it be more easily explained by these recurrent connections in occipital and temporal areas alone? I think the discussion should present a more precise (and nuanced) description of the visual pathway and the face processing network, rather than a simplified dichotomy between frontal/occipital areas.
References:
Duchaine, B., & Yovel, G. (2015). A Revised Neural Framework for Face Processing. Annual Review of Vision Science, 1(1), 393-416.
Etzel, J. A. (2015). MVPA Permutation Schemes: Permutation Testing for the Group Level. 2015 International Workshop on Pattern Recognition in NeuroImaging, 65-68.
Etzel, J. A., & Braver, T. S. (2013). MVPA Permutation Schemes: Permutation Testing in the Land of Cross-Validation. 2013 International Workshop on Pattern Recognition in Neuroimaging, 140-143.
Freiwald, W., Duchaine, B., & Yovel, G. (2016). Face Processing Systems: From Neurons to Real-World Social Perception. Annual Review of Neuroscience, 39(1), 325-346.
Haxby, J. V., Hoffman, E. A., & Gobbini, M. I. (2000). The distributed human neural system for face perception. Trends in Cognitive Sciences, 4(6), 223-233.
Kravitz, D. J., Saleem, K. S., Baker, C. I., Ungerleider, L. G., & Mishkin, M. (2013). The ventral visual pathway: an expanded neural framework for the processing of object quality. Trends in Cognitive Sciences, 17(1), 26-49.
Nichols, T. E., & Holmes, A. P. (2002). Nonparametric permutation tests for functional neuroimaging: a primer with examples. Human Brain Mapping, 15(1), 1-25.
Tang, H., Schrimpf, M., Lotter, W., Moerman, C., Paredes, A., Ortega Caro, J., Hardesty, W., Cox, D., & Kreiman, G. (2018). Recurrent computations for visual pattern completion. Proceedings of the National Academy of Sciences of the United States of America. https://doi.org/10.1073/pnas.1719397115
Visconti di Oleggio Castello, M., Halchenko, Y. O., Swaroop Guntupalli, J., Gors, J. D., & Gobbini, M. I. (2017). The neural representation of personally familiar and unfamiliar faces in the distributed system for face perception. In Sci. Rep. (Issue 1, p. 138297). https://doi.org/10.1038/s41598-017-12559-1
###Reviewer #1:
In this manuscript the authors report a study investigating the "neural familiarity spectrum" of face recognition. The authors used a paradigm via which stimuli (i.e. facial identities with varied levels of familiarity) were gradually revealed. In general, I entirely agree that the previous overemphasis of and/or arguing "for a dominance of feed-forward processing" ought to be replaced by a more "nuanced view". In my opinion, the constraints imposed by our methodological choices, which ultimately determine the nature of our observations, also need to be humbly considered. I commend the authors for their efforts and their well-written, interesting manuscript, which I believe represents a valuable and needed contribution to the field of face cognition and beyond.
Major Points:
Throughout the manuscript, references are warranted to a number of studies that have:
(i) Used similar approaches to a) decelerate the categorization process and b) investigate representations across time by applying uni-/multivariate analyses that were stimulus onset and/or reaction time aligned (e.g., Carlson et al., 2006; Jiang et al., 2011; Ramon et al., 2015; Quek et al., 2018);
(ii) Reported findings related to frontal contributions towards familiar face recognition (numerous EEG studies by Caharel and colleagues, and Ramon et al., 2010, 2015).
What I am missing is an explicit discussion of the challenging effect of expectations related to identities (as well as specific images, since observers provided stimuli themselves). The authors discuss the role of perceptual difficulty and familiarity level, but the latter is in fact confounded with expectations of the specific to-be-presented identities, which moreover appear in the context of the active (vs. orthogonal) task; both factors increase signal strength. (Note: this is not a critique and applies to all studies using personally familiar identities, especially those that have used a relatively small number of identities.)
In light of this, I believe that statements related to the dominance of "feed-forward flow" in relation to perceptual difficulty should be more nuanced. Examples include:
-"perceptual difficulty and the level of familiarity influence the neural representation of familiar faces and the degree to which peri-frontal neural networks contribute to familiar face recognition"
-"We observed that the direction of information flow is influenced by the familiarity of the stimulus"
Level of familiarity and perceptual difficulty are correlated in the present study, as in most studies, precisely because observers know who will be seen. Therefore, one could argue that the expectations, not the level of familiarity per se, determine "the involvement of peri-frontal cognitive areas in familiar face recognition" (cf. Huang et al. (2017) and Ramon & Gobbini (2018) for a discussion).
Related to this aspect and relevant for the analyses is the different number of trials across categories (3x as many unfamiliar face trials vs. each of the familiar ones). How was this dealt with statistically (cf. also stats reported in Figure 2), and were subjects informed about the ratio beforehand? Given the provision of self and personally familiar images, the task could also be considered an n-identity search task (cf. Besson et al., 2017), as subjects match sensory inputs to one of n possible known vs. an unknown number of unfamiliar identities / events. (To illustrate, the effects of expectations can determine the degree to which recovery from neural adaptation is observed across different face-preferential regions using the same task; e.g. Rotshtein et al., 2005, Nat Neurosci vs. Ramon et al., 2010, EJN.)
The authors list "levels of categorization [...], task difficulty [...] and perceptual difficulty [...]" as potentially affecting "the complex interplay of feed-forward and feedback mechanisms in the brain" (l.442). I agree and point towards further relevant papers to be cited that additionally investigate the impact of expectations or "decisional space" on categorical decisions in the healthy as well as impaired brain (e.g., Ramon, 2018, Cogn Neuropsychol; Ramon et al., 2019, Cognition; Ramon et al., 2019, Cogn Neuropsychol).
To summarize, can "accumulation of sensory evidence in the brain across the time course of stimulus presentation" (l.267) and "the strength of incoming perceptual evidence and the familiarity of the face stimulus", which are considered to determine the direction of information processing, be distinguished from the effect of expectations that potentially increases over time? (This effect is naturally non-existent for unfamiliar stimuli, for which no "domination of feed-forward flow of information" was found.)
##Preprint Review
This preprint was reviewed using eLife’s Preprint Review service, which provides public peer reviews of manuscripts posted on bioRxiv for the benefit of the authors, readers, potential readers, and others interested in our assessment of the work. This review applies only to version 2 of the manuscript.
###Summary:
The reviewers appreciated the clever paradigm and the focus on top-down influences during familiar face recognition. However, the reviewers also raised several serious methodological concerns. For example, they noted that the familiarity conditions cannot be easily compared, considering that these conditions differed in multiple ways beyond the level of familiarity (e.g., staged vs supplied photos, one vs many identities).