Inferential eye movement control while following dynamic gaze

Curation statements for this article:
  • Curated by eLife


This article has been Reviewed by the following groups

Abstract

Attending to other people’s gaze is evolutionarily important for making inferences about intentions and actions. Gaze influences covert attention and triggers eye movements. However, we know little about how the brain controls the fine-grained dynamics of eye movements during gaze following. Observers followed people’s gaze shifts in videos during search and we related the observer eye movement dynamics to the time course of gazer head movements extracted by a deep neural network. We show that the observers’ brains use information in the visual periphery to execute predictive saccades that anticipate the information in the gazer’s head direction by 190–350 ms. The brain simultaneously monitors moment-to-moment changes in the gazer’s head velocity to dynamically alter eye movements and re-fixate the gazer (reverse saccades) when the head accelerates before the initiation of the first forward gaze-following saccade. Using saccade-contingent manipulations of the videos, we experimentally show that the reverse saccades are planned concurrently with the first forward gaze-following saccade and have a functional role in reducing subsequent errors fixating on the gaze goal. Together, our findings characterize the inferential and functional nature of social attention’s fine-grained eye movement dynamics.

Article activity feed

  1. eLife assessment

    This important work substantially advances our understanding of how human eye movements are shaped by social cues. Using clever experimental manipulations and innovative artificial intelligence analysis tools, the paper identifies distinctive patterns of saccadic eye movements tracking another person's gaze during dynamic video-scene viewing. This work will be of broad interest to psychologists, biologists, and neuroscientists interested in human social behavior.

  2. Reviewer #1 (Public Review):

    Han and Eckstein asked human participants to follow the gaze of a person and to judge the presence/absence of a target person in videos. In present conditions, the videos contained a gazer and an additional person who served as the gaze goal. In absent conditions, this person was digitally removed from the video. The results show that participants use peripheral information about the most likely gaze goal to predictively execute a saccade towards the gaze goal before the gazer's head is oriented towards the goal. At the same time, foveal information about the head velocity of the gazer is processed, leading to more reverse saccades to the gazer when the head velocity of the gazer is low and/or when the head accelerates before the first saccade to the goal. Further control experiments show that the reverse saccades are effective in reducing the error of the following saccade because additional foveal information about the gazer's head direction is sampled. Predictive saccades are also observed when participants are not instructed to follow the gaze.

    Strengths:

    The study uses very clever experimental manipulations and analysis methods to understand when and where information is sampled for saccade programming. This is especially challenging because natural videos are used to investigate gaze control in an ecologically highly relevant scenario. Compared to previous studies on the sampling of information, which mostly used artificial and static targets, this is a large conceptual and methodological step forward and advances the state of the art. The complex stimulus material is analysed using advanced AI techniques and traditional human annotations. Overall, the study contains a complex and rich data set that was created and analysed with innovative methods, and it will certainly stimulate further research.

    Weaknesses:

    While the study uses clever and sophisticated manipulations to dissect the influence of different types of information on eye movement control, these manipulations inevitably lead to a few limitations of ecological validity, which might contribute to the findings:

    1. Role of expectations: It seems that whenever there was a second person present in the video, it was always the gaze goal. This might influence the gaze dynamics of participants because participants can anticipate that the gazer will look towards the second person. This expectation might allow participants to infer the gaze goal with peripheral vision and reduce the necessity to rely on foveal information about the head direction of the gazer. Some or all of the differences between the present/absent conditions might actually reflect the effect of this expectation.

    2. Absent videos: Absent videos were created by digitally removing the target/distractor person from the video. This is definitely useful to maximize the visual similarity of absent and present videos, but it also might lead to absent videos that do not contain a meaningful gaze goal in the scene. This can be seen in Figure 1e, where the gazer seems to look towards something that is outside of the video frame. This absence of a potential gaze goal might delay saccades and render them more variable, especially in terms of amplitude.

  3. Reviewer #2 (Public Review):

    As described in the manuscript, gaze following is a dynamic process that should be investigated with similarly dynamic stimuli wherever possible. In this case, the authors used videos rich in visual information, which can be considered an appropriate example of such stimuli. By constructing scenarios in which actors gazed toward 1) a target person, 2) a distractor, or 3) nothing, the authors were able to study observers' eye movements in each case. First, they determined a baseline for how observers follow gaze in each of the three conditions, which is an important reference for future studies of this nature. Further, they suggest that eye movements are affected by how gaze following interacts with peripheral information (i.e., gaze-related information from the actor is combined with peripheral information about the presence/absence of a target person). Second, the authors determined that eye movement behavior is affected by gaze information (i.e., changes in the gaze of the principal actor) in an anticipatory manner. This was verified using a DNN approach (using only the gazer's head direction) and then confirmed through human observers' ratings. Lastly, the authors noted the presence of subsequent reverse saccades (in the direction of the gazer and then toward the target), which were shown to play a role in correcting an initial inference based on a slow head velocity of the gazer (confirmed with an SVM approach). While these are important first inquiries into eye movement behavior elicited by gaze following, a few items remain to be elucidated, including what additional peripheral information (besides target/distractor absence and presence) drives eye movements during gaze following. Overall, the dynamic videos used by the authors, in combination with their investigations, provide an important first step toward studying gaze following in more realistic conditions.

  4. Reviewer #3 (Public Review):

    In this work, the authors explored some of the oculomotor mechanisms that humans engage when observing other people looking somewhere. This tendency is generally known as 'gaze following' and is a fundamental behaviour for fluid interactions with both other people and the environment.

    The strengths of this work lie in its analytical approach, which provides a rich perspective on how human eye movements are shaped by social cues. I appreciated the combination of more traditional analyses with more sophisticated approaches such as artificial intelligence.

    At the same time, the complexity of the data analysis could make it difficult to grasp the overall picture that emerges. The task itself should be described in more detail. I also have the feeling that some theoretical aspects of gaze following, and of social attention in general, have been discussed only briefly, leaving more room for technical and formal aspects. For instance, I wonder whether a control condition in which the gazer looks towards a non-social item (such as an object) could be of interest and potentially important to better situate these data within a social dimension.