A neural mechanism for detecting object motion during self-motion

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    This paper will be of broad interest to readers in the field of motion perception. The authors use concurrent psychophysics and single-unit recordings, along with modeling, to investigate how primate cortical area MT uses specific visual signals to make inferences that distinguish between visual motion induced by self-motion and the motion of other objects in the world. The experiments and stimuli are expertly designed and the analyses are careful.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 and Reviewer #3 agreed to share their names with the authors.)

Abstract

Detection of objects that move in a scene is a fundamental computation performed by the visual system. This computation is greatly complicated by observer motion, which causes most objects to move across the retinal image. How the visual system detects scene-relative object motion during self-motion is poorly understood. Human behavioral studies suggest that the visual system may identify local conflicts between motion parallax and binocular disparity cues to depth and may use these signals to detect moving objects. We describe a novel mechanism for performing this computation based on neurons in macaque middle temporal (MT) area with incongruent depth tuning for binocular disparity and motion parallax cues. Neurons with incongruent tuning respond selectively to scene-relative object motion, and their responses are predictive of perceptual decisions when animals are trained to detect a moving object during self-motion. This finding establishes a novel functional role for neurons with incongruent tuning for multiple depth cues.

Article activity feed

  1. Author Response

    Reviewer #1 (Public Review):

    In this paper, the authors investigate the tuning of visual neurons in primate area MT to motion parallax signals and to binocular disparity. Among this class of neurons, some are tuned incongruently to depth using these two cues - that is, neurons can be tuned to more distant objects through motion parallax but closer ones through binocular disparity. Using carefully designed visual stimuli, the authors investigate the tuning properties of these neurons and how they relate to a psychophysical task in which a monkey distinguishes world-frame-moving objects from world-frame-stationary objects during self-motion of the monkey.

    The experiments and stimuli are expertly designed and the analyses are careful.

    We thank the reviewer for their supportive comments and for raising good questions.

    My primary question, in reading this paper, is how much of the psychophysical effect can be attributed to these incongruently tuned neurons, rather than simply to a population of neurons with a relatively wide range of tunings. The analyses and simulations, as presented, do not support the central claim as strongly as they could: that these incongruent neurons in particular facilitate the psychophysical percepts.

    We appreciate the reviewer raising this issue, which led us to dig further into the data.

    Reviewer #3 (Public Review):

    The authors investigated how the visual system solves the important and challenging problem of detecting independently moving objects while the observer undergoes self-motion. The paper focuses on a certain population of neurons in brain area MT ('opposite cells') that exhibit tuning to combinations of motion parallax (i.e., speed and direction) and binocular disparity that would generally not be compatible with the retinal motion created by stationary objects in the environment during self-motion. One example is tuning to fast speeds and far depths through disparity. Such combinations of signals that preferentially activate opposite cells are more likely to arise from an independently moving object than from self-motion relative to a stationary environment, assuming both sources of information are available. The main hypothesis tested in this paper is whether opposite cells could be used as a neural mechanism to detect independently moving objects. Consistent with their tuning properties, the authors found that opposite cells demonstrate stronger activation to moving objects than to stationary objects. More generally, there was an inverse correlation between congruence in motion parallax+disparity tuning and the preference for moving objects. In support of the main hypothesis, an ROC analysis revealed that opposite cells were more effective in discriminating moving from stationary objects through a difference in firing rate when the object was labeled as moving either according to the ground truth or according to the monkey's judgments. The estimates of a linear classifier trained on model fits of the MT data reinforced the authors' findings.

    The investigated topic is very interesting and the work is a valuable contribution to the field. The paper is well written. The experiments were well-designed and controlled. The analyses were appropriate and support the hypothesis.

    We thank the reviewer for their supportive comments.

    The proposed local mechanism has a few limitations, mainly in its scope. First, the proposed local mechanism critically depends on the availability of binocular disparity. Humans are capable of detecting moving objects based on monocular optic flow, even when the moving object is aligned with the motion due to self-motion and varies based on speed alone (Royden & Moore, 2012). This scenario would not engage the proposed mechanism because disparity is not available and thus another mechanism like flow parsing would be needed. Second, while the proposed local mechanism may be more 'economical' (p. 3) than flow parsing, flow parsing addresses more phenomena than moving object detection. For example, flow parsing implicates the estimation of the world-relative direction (Warren & Rushton, 2009; Fajen et al., 2013) and speed (Jörges & Harris, 2021) of independently moving objects. Layton & Fajen (2020) showed that a neural model of flow parsing can be used to detect moving objects in both monocular and binocular optic flow fields. The visual system may require a more 'complicated mechanism' (p. 28) to robustly perform the broader range of tasks, in situations where disparity may or may not be available and informative.

    We have no disagreement with the reviewer regarding these broader issues. In work still to be published, we have found effects of flow parsing in MT, and we consider that to be a different mechanism. The two mechanisms are likely to be complementary, and we certainly did not mean to imply that our findings obviate the need for a flow parsing mechanism. We have revised the text throughout the manuscript to clarify this, and have added text to the Discussion (pp. 17-18) to specifically address the points raised here.

  2. Evaluation Summary:

    This paper will be of broad interest to readers in the field of motion perception. The authors use concurrent psychophysics and single-unit recordings, along with modeling, to investigate how primate cortical area MT uses specific visual signals to make inferences that distinguish between visual motion induced by self-motion and the motion of other objects in the world. The experiments and stimuli are expertly designed and the analyses are careful.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 and Reviewer #3 agreed to share their names with the authors.)

  3. Reviewer #1 (Public Review):

    In this paper, the authors investigate the tuning of visual neurons in primate area MT to motion parallax signals and to binocular disparity. Among this class of neurons, some are tuned incongruently to depth using these two cues - that is, neurons can be tuned to more distant objects through motion parallax but closer ones through binocular disparity. Using carefully designed visual stimuli, the authors investigate the tuning properties of these neurons and how they relate to a psychophysical task in which a monkey distinguishes world-frame-moving objects from world-frame-stationary objects during self-motion of the monkey.
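
    The notion of incongruent depth tuning described above can be illustrated with a minimal sketch. It assumes that congruence is summarized as the Pearson correlation between a neuron's depth-tuning curves for the two cues, sampled at the same depths; this metric, the function name, and the response values are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def depth_tuning_correlation(resp_disparity, resp_parallax):
    """Pearson correlation between two depth-tuning curves.

    Both curves must be sampled at the same simulated depths.
    This is an assumed congruence measure, not necessarily the authors'.
    """
    x = np.asarray(resp_disparity, dtype=float)
    y = np.asarray(resp_parallax, dtype=float)
    return float(np.corrcoef(x, y)[0, 1])

# Hypothetical mean responses (spikes/s) at depths ordered from near to far.
disparity_curve = [40, 30, 20, 12, 8]   # prefers near depths via disparity
parallax_curve = [9, 12, 22, 31, 38]    # prefers far depths via motion parallax

r = depth_tuning_correlation(disparity_curve, parallax_curve)
label = "incongruent" if r < 0 else "congruent"
print(f"correlation = {r:.2f} -> {label}")
```

    Under this assumption, a strongly negative correlation marks a neuron whose depth preferences for the two cues are opposite, and a strongly positive correlation marks a congruent neuron.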

    The experiments and stimuli are expertly designed and the analyses are careful.

    My primary question, in reading this paper, is how much of the psychophysical effect can be attributed to these incongruently tuned neurons, rather than simply to a population of neurons with a relatively wide range of tunings. The analyses and simulations, as presented, do not support the central claim as strongly as they could: that these incongruent neurons in particular facilitate the psychophysical percepts.

  4. Reviewer #2 (Public Review):

    In this study, Kim et al. investigate an unsolved riddle in the neural mechanisms of motion perception: how does the visual system detect dynamic objects in the presence of self-motion? They recorded from single neurons in macaque area MT while animals performed an object motion detection task during self-motion. They find that MT neurons with incongruent depth tuning to binocular disparity and motion parallax cues respond selectively to dynamic objects during self-motion. They further show that the responses of these neurons are predictive of the animal's choice. They conclude that such incongruent neurons might play a novel role in solving the problem of scene-relative object motion.

    The study is elegantly conducted and the paper is very well written. The findings, although correlative, represent an important step forward in our understanding of the neural mechanisms of motion perception.

    I have no major comments.

  5. Reviewer #3 (Public Review):

    The authors investigated how the visual system solves the important and challenging problem of detecting independently moving objects while the observer undergoes self-motion. The paper focuses on a certain population of neurons in brain area MT ('opposite cells') that exhibit tuning to combinations of motion parallax (i.e., speed and direction) and binocular disparity that would generally not be compatible with the retinal motion created by stationary objects in the environment during self-motion. One example is tuning to fast speeds and far depths through disparity. Such combinations of signals that preferentially activate opposite cells are more likely to arise from an independently moving object than from self-motion relative to a stationary environment, assuming both sources of information are available. The main hypothesis tested in this paper is whether opposite cells could be used as a neural mechanism to detect independently moving objects. Consistent with their tuning properties, the authors found that opposite cells demonstrate stronger activation to moving objects than to stationary objects. More generally, there was an inverse correlation between congruence in motion parallax+disparity tuning and the preference for moving objects. In support of the main hypothesis, an ROC analysis revealed that opposite cells were more effective in discriminating moving from stationary objects through a difference in firing rate when the object was labeled as moving either according to the ground truth or according to the monkey's judgments. The estimates of a linear classifier trained on model fits of the MT data reinforced the authors' findings.
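
    The ROC analysis mentioned in this review can be sketched generically. The example below computes the area under the ROC curve for a single neuron from firing rates on trials labeled moving versus stationary; the function name and all firing rates are hypothetical, and the authors' actual procedure may differ in its details.

```python
import numpy as np

def roc_area(rates_moving, rates_stationary):
    """Area under the ROC curve for two sets of firing rates.

    Equals the probability that a randomly chosen 'moving' trial has a
    higher firing rate than a randomly chosen 'stationary' trial, with
    ties counted as one half.
    """
    a = np.asarray(rates_moving, dtype=float)
    b = np.asarray(rates_stationary, dtype=float)
    diff = a[:, None] - b[None, :]  # all pairwise rate differences
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))

# Hypothetical firing rates (spikes/s), invented for illustration only.
moving = [32.0, 41.0, 38.0, 45.0]
stationary = [30.0, 28.0, 35.0, 33.0]

auc = roc_area(moving, stationary)
print(f"ROC area = {auc:.3f}")
```

    A value of 0.5 means the firing rate carries no information about the label, and values approaching 1 mean the neuron fires reliably more for moving objects. The same function applies whether trials are labeled by the ground truth or by the animal's reported choice.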

    The investigated topic is very interesting and the work is a valuable contribution to the field. The paper is well written. The experiments were well-designed and controlled. The analyses were appropriate and support the hypothesis.

    The proposed local mechanism has a few limitations, mainly in its scope. First, the proposed local mechanism critically depends on the availability of binocular disparity. Humans are capable of detecting moving objects based on monocular optic flow, even when the moving object is aligned with the motion due to self-motion and varies based on speed alone (Royden & Moore, 2012). This scenario would not engage the proposed mechanism because disparity is not available and thus another mechanism like flow parsing would be needed. Second, while the proposed local mechanism may be more 'economical' (p. 3) than flow parsing, flow parsing addresses more phenomena than moving object detection. For example, flow parsing implicates the estimation of the world-relative direction (Warren & Rushton, 2009; Fajen et al., 2013) and speed (Jörges & Harris, 2021) of independently moving objects. Layton & Fajen (2020) showed that a neural model of flow parsing can be used to detect moving objects in both monocular and binocular optic flow fields. The visual system may require a more 'complicated mechanism' (p. 28) to robustly perform the broader range of tasks, in situations where disparity may or may not be available and informative.