Contrasting action and posture coding with hierarchical deep neural network models of proprioception

Curation statements for this article:
  • Curated by eLife

    eLife assessment

    This manuscript presents a valuable framework and blueprint for the study, in artificial systems, of the principles and mechanisms that underlie proprioception in biological systems. Using artificial neural networks trained on synthetic hand movement data, the authors present solid, albeit incomplete, evidence that action recognition can explain important features of the mechanisms that underlie proprioception in biological systems. Experiments with architectures trained using losses that, in addition to action, take into account velocity and/or other states, could strengthen the authors' findings.

Abstract

Biological motor control is versatile, efficient, and depends on proprioceptive feedback. Muscles are flexible and undergo continuous changes, requiring distributed adaptive control mechanisms that continuously account for the body’s state. The canonical role of proprioception is representing the body state. We hypothesize that the proprioceptive system could also be critical for high-level tasks such as action recognition. To test this theory, we pursued a task-driven modeling approach, which allowed us to isolate the study of proprioception. We generated a large synthetic dataset of human arm trajectories tracing characters of the Latin alphabet in 3D space, together with muscle activities obtained from a musculoskeletal model and model-based muscle spindle activity. Next, we compared two classes of tasks: trajectory decoding and action recognition, which allowed us to train hierarchical models to decode either the position and velocity of the end-effector of one’s posture or the character (action) identity from the spindle firing patterns. We found that artificial neural networks could robustly solve both tasks, and the networks’ units show tuning properties similar to neurons in the primate somatosensory cortex and the brainstem. Remarkably, we found uniformly distributed directional selective units only with the action-recognition-trained models and not the trajectory-decoding-trained models. This suggests that proprioceptive encoding is additionally associated with higher-level functions such as action recognition and therefore provides new, experimentally testable hypotheses of how proprioception aids in adaptive motor control.

Article activity feed

  1. eLife assessment

    This manuscript presents a valuable framework and blueprint for the study, in artificial systems, of the principles and mechanisms that underlie proprioception in biological systems. Using artificial neural networks trained on synthetic hand movement data, the authors present solid, albeit incomplete, evidence that action recognition can explain important features of the mechanisms that underlie proprioception in biological systems. Experiments with architectures trained using losses that, in addition to action, take into account velocity and/or other states, could strengthen the authors' findings.

  2. Reviewer #1 (Public Review):

    This paper makes two major contributions. First, the authors provide a large synthetic dataset of human arm trajectories tracing the alphabet in 3D space. They also model the musculoskeletal system and the muscle spindles during tracing. This dataset would be very valuable for later studies. I thank the authors for making the effort.

    Second, the authors train various neural networks on two tasks, a character trajectory-decoding task and an action-recognition task, from spindle outputs and find that artificial neural network representations from the action-recognition task better explain what is known about the proprioceptive system. This is potentially an important finding because the authors claim that trajectory decoding is the canonical hypothesis for proprioception's role.

    The authors are very systematic in the methodology with which they did their machine learning and representational analysis. I don't have any major comments on that.

    I have concerns about the main finding, though. While it is true that state estimation is thought to be a major function of proprioception, this state estimation is part of a control loop. If the goal is to refute the canonical hypothesis for proprioception, the authors should actually simulate/train a full control loop. This is likely to change their conclusions because the authors interpret state prediction as the prediction of end-effector coordinates at each time step. However, to run a control system one may need to predict other state variables as well (effector velocities and accelerations, muscle configurations, etc.), and these may change the intermediate-level representations.
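
    To make the point about richer state targets concrete, here is a minimal sketch of how a position-only decoding target could be extended with velocity and acceleration estimates. The function names, the finite-difference scheme, and the sampling interval `dt` are illustrative assumptions and are not taken from the paper.

```python
def finite_difference(samples, dt):
    """Differentiate a sequence of vectors with respect to time.

    Uses central differences in the interior and one-sided differences
    at the two ends. `samples` is a list of equal-length lists.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        span = (hi - lo) * dt
        out.append([(b - a) / span for a, b in zip(samples[lo], samples[hi])])
    return out


def augmented_state(positions, dt):
    """Concatenate position, velocity and acceleration at each time step."""
    velocities = finite_difference(positions, dt)
    accelerations = finite_difference(velocities, dt)
    return [p + v + a for p, v, a in zip(positions, velocities, accelerations)]


# Hypothetical 1-D trajectory x(t) = 2t sampled at dt = 1.
states = augmented_state([[0.0], [2.0], [4.0], [6.0]], dt=1.0)
```

    A decoder trained on such an augmented target would have to represent derivatives as well as position, which is the kind of change the reviewer suggests could alter the intermediate representations.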

  3. Reviewer #2 (Public Review):

    Does our proprioceptive system try to recognize our own actions?

    Proprioception is our sense of the motion and posture of our own body. This sixth sense uses signals from receptors in the joints, tendons, muscles, and skin that measure forces and degrees of extension. These receptors enable us to sense, for example, the posture of our body as we wake from sleep. They also provide feedback signals that help us precisely control our limbs, for example during handwriting.

    Feedback is thought to be essential to motor control, enabling the controller in our brains to rapidly adapt to the unexpected. The unexpected may include changes in the environment (like something pushing our hand that we didn't see coming), changes in our bodies (such as muscle fatigue or injury), and shortcomings of the motor program (such as a lack of precision or a badly planned limb trajectory). Feedback can come from vision and even audition, but proprioception provides an essential additional feedback path that informs us directly about the motion and posture of our limbs, and any forces on them.

    How does feedback control work in the human motor system? I want to write a 'k', but there are forces on my limbs resulting from the friction of chalk on this particular blackboard. Also, my muscles are recovering from tennis practice this morning, and I haven't used chalk on a blackboard in years.

    If the goal is to write a 'k', I have some flexibility. I am committed, not to a precise trajectory, but to a more abstractly defined objective: to write a legible 'k'. This suggests that feedback processing should evaluate to what extent I am succeeding at the action, not at tracing out a particular trajectory. Does what I'm actually doing look like writing a 'k'?

    In a new paper, Sandbrink et al. (pp2022) report on simulations of the human musculoskeletal system and neural network models that suggest that the tuning properties of neurons in the somatosensory cortex (S1) can be explained by assuming that the objective of the proprioceptive system is to recognize the action being performed.

    They used recorded traces of a person writing lower-case letters to simulate the responses of muscle spindles sensing the lengths and velocities of muscles in the human arm, as would be present if the hand were moved passively along these trajectories. The physical simulation uses a 3D model of the human arm with two parameters for the direction of the upper arm and two more for the direction of the lower arm. These four parameters are inferred by inverse kinematics from the hand trajectories tracing each letter in a variety of vertical and horizontal planes. A 3D muscle model then enables the authors to compute the expected spindle responses that reflect the lengths and velocities of 25 relevant upper arm muscles.
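
    The four-parameter arm description above can be illustrated with a forward-kinematics sketch, which is the mapping that inverse kinematics has to invert. This is a simplified illustration and not the authors' musculoskeletal model: the azimuth/elevation convention, the segment lengths, and the function names are all assumptions.

```python
import math


def unit_vector(azimuth, elevation):
    """Unit direction in 3D from two angles given in radians."""
    ce = math.cos(elevation)
    return (ce * math.cos(azimuth), ce * math.sin(azimuth), math.sin(elevation))


def hand_position(upper_dir, lower_dir, upper_len=0.30, lower_len=0.25,
                  shoulder=(0.0, 0.0, 0.0)):
    """Hand position for a two-segment arm.

    upper_dir and lower_dir are (azimuth, elevation) pairs, giving the
    four parameters in total. Segment lengths in metres are placeholder
    values chosen for illustration.
    """
    u = unit_vector(*upper_dir)
    f = unit_vector(*lower_dir)
    return tuple(s + upper_len * a + lower_len * b
                 for s, a, b in zip(shoulder, u, f))


# Arm fully extended along the x axis.
extended = hand_position((0.0, 0.0), (0.0, 0.0))
# Upper arm along x, lower arm pointing straight up.
bent = hand_position((0.0, 0.0), (0.0, math.pi / 2))
```

    Inverse kinematics would search for the four angles whose `hand_position` matches each point of a recorded letter trajectory. With four parameters and a three-dimensional target, the solution is generally not unique, so some additional constraint is needed to pick one.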

    The authors then trained neural network models of proprioceptive processing that took the simulated muscle spindle signals as input. The neural network architectures included one that first integrates information across the muscle spindles and then across time ("spatial-temporal"), one that integrates across muscle spindles and time simultaneously ("spatiotemporal"), and a recurrent long short-term memory (LSTM) model.
    Each architecture was trained on two objectives: to decode the trajectory (i.e. the position of the hand tracing a letter as a function of time) or to recognize the action (i.e. the letter being traced). The two objectives correspond to two hypotheses about the function of proprioceptive processing: To inform the feedback controller about either the current position of the hand or the letter being drawn.
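
    The difference between the two objectives can be made concrete with the loss functions typically used for each kind of task. The sketch below assumes mean squared error for trajectory decoding and softmax cross-entropy for action recognition. These are standard choices, and the review does not state that they are the exact losses used in the paper.

```python
import math


def trajectory_loss(predicted, target):
    """Mean squared error over all time steps and coordinates."""
    total, count = 0.0, 0
    for p_t, y_t in zip(predicted, target):
        for p, y in zip(p_t, y_t):
            total += (p - y) ** 2
            count += 1
    return total / count


def action_loss(logits, label):
    """Softmax cross-entropy for one trial.

    `logits` holds one score per candidate letter and `label` is the
    index of the letter that was traced. The maximum is subtracted
    before exponentiating for numerical stability.
    """
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[label]


# Toy example: two time steps of a 2-D hand position.
mse = trajectory_loss([[0.0, 0.0], [1.0, 1.0]], [[0.0, 0.0], [1.0, 3.0]])
# Toy example: two equally scored letters.
xent = action_loss([0.0, 0.0], label=0)
```

    The trajectory loss penalizes every deviation at every time step, whereas the action loss depends only on a single label per trial, so many different trajectories incur the same loss.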

    The models trained to recognize the action developed tuning more consistent with what is known about the tuning of neurons in the primary somatosensory cortex in primates. In particular, direction tuning with roughly equal numbers of units preferring each direction emerged in the middle layers of the neural network models trained to recognize the action, similar to what has been observed in primate neural recordings. Direction tuning is already present in the muscle-spindle signals, but the spindle signals do not uniformly represent the directions.
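
    One common way to quantify the two properties discussed here is to estimate each unit's preferred direction from its responses and then summarize how evenly the preferred directions cover the circle. The sketch below uses a response-weighted vector sum and the mean resultant length from circular statistics. It is an illustrative assumption and not necessarily the analysis used in the paper.

```python
import math


def preferred_direction(directions, responses):
    """Preferred direction (radians) as the response-weighted vector sum.

    For a cosine-tuned unit sampled at evenly spaced directions, this
    recovers the peak of the tuning curve.
    """
    x = sum(r * math.cos(d) for d, r in zip(directions, responses))
    y = sum(r * math.sin(d) for d, r in zip(directions, responses))
    return math.atan2(y, x)


def resultant_length(angles):
    """Mean resultant length of a set of angles.

    Values near 0 indicate angles spread evenly around the circle;
    values near 1 indicate angles clustered around one direction.
    """
    n = len(angles)
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    return math.hypot(c, s)


# Hypothetical unit: cosine tuning peaking at 60 degrees, 8 test directions.
dirs = [2 * math.pi * k / 8 for k in range(8)]
resp = [1.0 + 0.5 * math.cos(d - math.pi / 3) for d in dirs]
pd = preferred_direction(dirs, resp)
```

    Under this summary, a population whose preferred directions give a resultant length near 0 would be consistent with the roughly uniform distribution described above, while a value near 1 would indicate the kind of clustering attributed to the spindle signals.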

    The task-optimization approach to neural network modeling is inspired by work in vision, where neural networks trained on the task of image classification explained responses to novel images in populations of neurons in the inferior temporal cortex. This result suggested a tentative answer to the why question: Why do inferior temporal neurons exhibit the response profiles and representational geometry they exhibit? Because their function (or one of their functions) is to recognize the objects in the images. Here, similarly, the authors address a why question with task-optimized neural network models: Why do somatosensory cortical neurons exhibit the types of tuning that have been reported in the literature?

    The function of proprioception, of course, is not for the brain to recognize which letter it is trying to write. It already knows that. The function is to sense how the current trajectory - the actual, not the intended one - differs from, say, a legible "k" (if that was the intention), and to map from that difference to a modification vector that will improve the outcome.
    Why is action decoding relevant for performing the action? A key reason may be that the goal is not to produce a fixed trajectory, but to produce a legible 'k'. A legible 'k' is not a single trajectory, but a class of trajectories containing an infinity of viable solutions. If someone nudged my arm while writing, adaptive feedback control should not attempt to return me to the originally intended trajectory, but to a new trajectory that traces the most legible 'k' that is still in the cards, which may be a different style of 'k' than I originally intended.

    The paper contributes a useful data set for training models and a qualitative comparison of models to real neurons in terms of tuning properties. It would be good, in follow-up studies, to directly test to what extent each of the models can quantitatively predict either single-neuron responses or population representational geometries, as has been done in vision, and to perform statistical comparisons between models.
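
    As an illustration of what a comparison of population representational geometries involves, the sketch below builds a representational dissimilarity matrix (RDM) from response patterns and compares two RDMs. The use of correlation distance, the Pearson comparison of upper triangles, and the toy data are assumptions made for brevity. Published analyses often use rank correlations and add inferential statistics on top.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences.

    Assumes neither sequence is constant, which would give zero variance.
    """
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def rdm(patterns):
    """Correlation-distance RDM; one row of `patterns` per condition."""
    n = len(patterns)
    return [[1.0 - pearson(patterns[i], patterns[j]) for j in range(n)]
            for i in range(n)]


def upper_triangle(matrix):
    """Off-diagonal entries above the diagonal, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def rdm_similarity(rdm_a, rdm_b):
    """Agreement between two RDMs over their upper triangles."""
    return pearson(upper_triangle(rdm_a), upper_triangle(rdm_b))


# Hypothetical responses: 3 conditions x 3 units for a model layer,
# and a "neural" population that is a scaled copy of it.
model = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [1.0, 3.0, 2.0]]
neural = [[2.0, 4.0, 6.0], [6.0, 4.0, 2.0], [2.0, 6.0, 4.0]]
score = rdm_similarity(rdm(model), rdm(neural))
```

    Because the RDM abstracts away from individual units, the model layer and the neural population need not have the same number of units, only the same set of conditions.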

    Importantly, this paper develops the idea of combining simulations of body and brain, that is, of the musculoskeletal system and of the processing of control-related signals in the nervous system, which provides a very exciting direction for future research.

    Strengths

    • The paper introduces a highly original research program that marries simulation of the musculoskeletal system and neural network modelling to predict neural representations in the proprioceptive pathway.
    • The authors performed an architecture search and trained multiple instances of different neural network architectures with each of the two objectives.
    • The paper includes comprehensive analyses of the proprioceptive representations from the simulated muscle-spindle signals through the layers of the models. These analyses characterize unit tuning, linear decodability, and representational similarity.
    • The results suggest an explanation for the direction tuning with a roughly uniform distribution of the units' direction preferences that has been reported previously for neurons in the primate primary somatosensory (S1) cortex.
    • If the simulated muscle-spindle data set, models and analysis code were shared along with the published paper, this work could form the basis for quantitative model evaluation and further model development.

    Weaknesses

    • The models are qualitatively evaluated by comparison of model unit tuning to what is known about the tuning of neurons in the somatosensory cortex. Follow-up studies should quantitatively evaluate the models by inferential analyses of their ability to predict measured responses.
    • The two training objectives differ in multiple respects, making it difficult to assess what the necessary requirements are for the emergence of representations similar to primate S1. Decoding the hand position may be too simple, but what about decoding velocity, or trajectory descriptors such as curvature? There may be a middle ground between trajectory decoding and action recognition that also leads to the emergence of tuning properties as found in primate S1.