Theoretical principles explain the structure of the insect head direction circuit

Curation statements for this article:
  • Curated by eLife

    eLife assessment

    This important work suggests that the observed cosine-like activity in the head direction circuit of insects not only subserves vector addition but also minimizes noise in the representation. The authors provide solid evidence using the locust and fruit fly connectomes. The work raises important theoretical questions about the organization of the navigation system and will be of interest to theoretical and experimental researchers studying navigation.

Abstract

To navigate their environment, insects need to keep track of their orientation. Previous work has shown that insects encode their head direction as a sinusoidal activity pattern around a ring of neurons arranged in an eight-column structure. However, it is unclear whether this sinusoidal encoding of head direction is just an evolutionary coincidence or if it offers a particular functional advantage. To address this question, we establish the basic mathematical requirements for direction encoding and show that it can be performed by many circuits, all with different activity patterns. Among these activity patterns, we prove that the sinusoidal one is the most noise-resilient, but only when coupled with a sinusoidal connectivity pattern between the encoding neurons. We compare this predicted optimal connectivity pattern with anatomical data from the head direction circuits of the locust and the fruit fly, finding that our theory agrees with experimental evidence. Furthermore, we demonstrate that our predicted circuit can emerge using Hebbian plasticity, implying that the neural connectivity does not need to be explicitly encoded in the genetic program of the insect but rather can emerge during development. Finally, we illustrate that in our theory, the consistent presence of the eight-column organisation of head direction circuits across multiple insect species is not a chance artefact but instead can be explained by basic evolutionary principles.
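
As a minimal illustration of the encoding scheme described in the abstract (a sketch with hypothetical values, not code from the paper), a sinusoidal population code over eight columns lets vectors be added by simply summing activity patterns, and the summed pattern can be decoded back into a direction and a length:

```python
import numpy as np

N = 8                                    # eight-column ring, as in the insect circuit
phi = 2 * np.pi * np.arange(N) / N       # preferred direction of each column

def encode(theta, length=1.0):
    """Sinusoidal population code for a vector with heading theta and given length."""
    return length * np.cos(theta - phi)

# Summing the activity patterns of two vectors gives the pattern of their vector sum.
a = encode(np.deg2rad(30), 1.0) + encode(np.deg2rad(120), 0.5)

# Decode heading and length from the summed pattern via its first Fourier component.
z = (2 / N) * (a @ np.exp(1j * phi))
print(np.rad2deg(np.angle(z)), np.abs(z))  # direction and magnitude of the vector sum
```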

Article activity feed

  1. Author Response

    Reviewer #1 (Public Review):

    Strengths:

    • The paper is clearly written, and all the conclusions stem from a set of 3 principles: circular topology, rotational symmetry, and noise minimization. The derivations are sound and such rigor by itself is commendable.
    • The authors provide a compelling argument on why evolution might have picked an eight-column circuit for path-integration, which is a great example of how theory can inform our thinking about the organization of neural systems for a specific purpose.
    • The authors provide a self-consistency argument on how cosine-like activity supports cosine-like connectivity with a simple Hebbian rule. However, their framework doesn't answer the question of how this system integrates angular velocity with the correct gain in the absence of allothetic cues to produce a heading estimate (more on that on point 3 below).

    Weaknesses:

    • The authors make simplifying assumptions to arrive at the cosine activity/cosine connectivity circuit. Among those are the linear activation function, and cosine driving activity u. The authors provide justification for the linearization in methods 3.1, however, this ignores the well-established fact that bump amplitude is modulated by angular velocity in the fly head direction system (Turner-Evans et al 2017). In such a case, nonlinearities in the activation function cannot be ignored and would introduce harmonics in the activity.

    We thank the reviewer for pointing out this omission. We added a paragraph at the end of section 4.1 clarifying that transient non-linearity, for instance when the circuit is actively receiving external input, is compatible with our work because we only require linearity on the line attractor, not outside it (lines 407-419).

    “In more intuitive terms, the neurons have a saturating nonlinear activation function where they modulate their gain based on the total activity in the network. If the activity in the network is above the desired level, r, the gain is reduced and the activity decreases, and when the activity of the network is less than the desired level, both the gain and the activity increase. Note that in this scenario transient deviations from the line attractor, which would induce nonlinear behaviour in the circuit dynamics, are tolerable. External inputs, u(t), could transiently modify the shape of the activity, producing activity shapes deviating from what the linear model can accommodate. For example, the shape of the bump attractor could be modified through nonlinearities while the insect attains high angular velocity (Turner-Evans et al., 2017).

    Such nonlinear dynamics do not conflict with the theory developed here, which only requires linearity when the activity is projected onto the circular line attractor. In our framework, the linearity of integration at the circular line attractor is not a computational assumption, but rather it emerges from the principle of symmetry.”
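
    To make the gain-modulation scenario above concrete, the following is a minimal numerical sketch (an illustration with hypothetical parameter values, not the simulation code from the paper) of a ring network with cosine connectivity whose recurrent drive is rescaled towards a target activity level r; a perturbed cosine bump relaxes back onto the circular line attractor:

    ```python
    import numpy as np

    N = 8
    phi = 2 * np.pi * np.arange(N) / N
    W = (2 / N) * np.cos(phi[:, None] - phi[None, :])    # cosine connectivity profile

    r_target = 1.0                                       # desired total activity level (the r above)
    rng = np.random.default_rng(0)
    a = np.cos(phi) + 0.3 * rng.standard_normal(N)       # cosine bump plus a transient perturbation

    for _ in range(200):
        drive = W @ a                                          # recurrent drive (first-harmonic component of a)
        gain = r_target / max(np.linalg.norm(drive), 1e-9)     # gain falls when total activity is high, rises when low
        a += 0.1 * (-a + gain * drive)                         # leaky dynamics relaxing onto the line attractor

    print(np.round(a, 3))   # a cosine-shaped activity bump with norm close to r_target
    ```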

    Furthermore, even though activity has been reported to be cosine-like, in fact in the fruit fly it takes the form of a somewhat concentrated activity bump (~80-100 degrees, Seelig & Jayaraman 2015; Turner-Evans et al 2017), and one has to take into account the smoothing effect of calcium dynamics too which might make the bump appear more cosine-like. So in general, it would be nice to see how the conclusions extend if the driving activity is more square-like, which would also introduce further harmonics.

    We added a cautionary comment on the sinusoidal activity (lines 222-226).

    “We note, however, that data from the fruit fly shows a more concentrated activity bump than what would be expected from a perfect sinusoidal profile (Seelig and Jayaraman, 2015; Turner-Evans et al., 2017), and that calcium imaging (which was used to measure the activity) can introduce biases in the activity measurements (Siegle et al., 2021; Huang et al., 2021). Thus the sinusoidal activity we model is an approximation of the true biological process rather than a perfect description.”

    Overall, it would be interesting to see whether, despite the harmonics introduced by these two factors interacting in the learning rule, Oja's rule can still pick up the "base" frequency and produce sinusoidal weights (as mentioned in methods 3.8). At this point, the examples shown in Figure 5 (tabula rasa and slightly perturbed weights) are quite simple. Such a demonstration would greatly enhance the generality of the results.

    We also extended the self-consistency framework from Oja’s rule to the non-linear case, and found that while Oja’s rule with non-linear neurons would not give a purely sinusoidal weight profile, the secondary harmonics remain small. We added a sentence explaining this in the main text (section 2.4, lines 309-312) and a methods section developing the self-consistency framework for the case of non-linear activations (section 4.7.2).

    “For neurons with a nonlinear activation function, secondary harmonics would emerge, but would remain small under mild assumptions, as shown in Section 4.7.2. Oja’s rule will still cause the weights to converge to approximately sinusoidal connectivity.”
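
    As a small illustration of this point (a simplified sketch with hypothetical parameters that treats the driving activity as given and ignores the recurrent dynamics, not the simulations reported in the paper), applying an Oja-style Hebbian rule to activity drawn from a deliberately squared-off, non-sinusoidal bump still yields a weight profile dominated by the first spatial harmonic, with only small higher harmonics:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    N = 8
    phi = 2 * np.pi * np.arange(N) / N

    def bump(theta, sharpness=3.0):
        """Squared-off (non-sinusoidal) activity bump centred on heading theta."""
        return np.tanh(sharpness * np.cos(theta - phi))

    W = 0.01 * rng.standard_normal((N, N))              # tabula rasa weights
    eta = 0.01
    for _ in range(20000):
        a = bump(rng.uniform(0, 2 * np.pi))             # random heading on each presentation
        W += eta * (np.outer(a, a) - (a @ a) * W)       # Hebbian term with Oja-style decay

    # Average weight as a function of column offset, and its spatial frequency content.
    profile = np.array([np.diag(np.roll(W, -k, axis=1)).mean() for k in range(N)])
    spectrum = np.abs(np.fft.rfft(profile - profile.mean()))
    print(np.round(spectrum / spectrum.max(), 2))       # first harmonic dominates; higher ones stay small
    ```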

    • The match of the theoretical prediction of cosine-like connectivity profiles with the connectivity data is somewhat lacking. In the locust the fit is almost perfect, however, the low net path count combined with the lack of knowledge about synaptic strengths makes this a motivating example in my opinion. In the fruit fly, the fit is not as good, and the function-fitting comparison (Methods Figure 6) is not as convincing. First, some function choices clearly are not a good fit (f1+2, f2). Second, the profile seems to be better fit by a Gaussian or other localized function, however the extra parameter of the Gaussian results in the worst AIC and AICc. To better get at the question of whether the shape of the connectivity profile matches a cosine or a Gaussian, the authors could try for example to fix the width of the Gaussian (e.g. to the variance of the best-fit cosine, which seems to match the data very well even though it wasn't itself fit), and then fit the two other parameters to the data. In that case, no AIC or AICc is needed. And then do the same for a circular distribution, e.g. von Mises.

    We also included fits with von Mises and Gaussian distributions whose width parameters were fixed to match the cosine, as the reviewer suggested. We found that even though these two distributions fit the data slightly better, the difference is very small (2%), probably because of the high variability of the fruit fly connectome data. We also changed the wording to state that the theory is compatible with the experimental data.

    In Methods section 4.6 (lines 568-585), we wrote:

    “As a complementary approach to evaluate the shape of the distribution, we first fit the Gaussian and von Mises distributions to the best fit f = 1 curve. We then freeze the width parameters of the distributions (σ_g for the Gaussian and κ_v for the von Mises) and only optimise the amplitude and vertical offset parameters (β and γ) to fit the data. This approach limits the number of free parameters for the Gaussian and von Mises distributions to two, to match the sinusoid. The results are shown in Methods Fig. 6 and Table 5. Both the fixed-width Gaussian and von Mises distributions are a slightly better fit to the data than the sinusoid, but the differences between the three curves are very small.

    In simplifying the fruit fly connectome data, we assumed all synapses of different types were of equal weight, as no data to the contrary were available. Different synapse types having different strengths could introduce nonlinear distortions between our net synaptic path count and the true synaptic strength, which could in turn make the data a better or worse fit for a sinusoidal compared to a Gaussian profile. As such, we do not consider the 2% relative difference between the f = 1 sinusoid and the fixed-width Gaussian and von Mises distributions to be conclusive.

    Overall, we find that the cosine weights that emerge from our derivations are a very close match for the locust, but less precise for the fly, where other functions fit slightly better. Given the limitations in using the currently available data to provide an exact estimate of synaptic strength (for the locust), and due to the high variability of the synaptic count (for the fruit fly), we consider that our theory is compatible with the observed data.”
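
    For concreteness, the fixed-width fitting procedure described above can be sketched as follows (an illustration on placeholder data with hypothetical values, not the actual connectome analysis; the von Mises case is handled analogously):

    ```python
    import numpy as np
    from scipy.optimize import curve_fit

    rng = np.random.default_rng(0)
    x = np.linspace(-np.pi, np.pi, 9)[:-1]                 # column offsets (toy grid, not the real data)
    y = np.cos(x) + 0.1 * rng.standard_normal(x.size)      # placeholder "net synaptic path counts"

    # Step 1: fit the f = 1 sinusoid; only amplitude (beta) and vertical offset (gamma) are free.
    cosine = lambda x, beta, gamma: beta * np.cos(x) + gamma
    (b_c, g_c), _ = curve_fit(cosine, x, y)

    # Step 2: fit the Gaussian's width (sigma_g) to the best-fit cosine curve itself.
    gauss = lambda x, beta, gamma, sigma: beta * np.exp(-x**2 / (2 * sigma**2)) + gamma
    (_, _, sigma_g), _ = curve_fit(gauss, x, cosine(x, b_c, g_c), p0=[1.0, 0.0, 1.0])

    # Step 3: freeze the width and refit only amplitude and offset to the data,
    # so the Gaussian has the same number of free parameters (two) as the sinusoid.
    gauss_fixed = lambda x, beta, gamma: beta * np.exp(-x**2 / (2 * sigma_g**2)) + gamma
    (b_g, g_g), _ = curve_fit(gauss_fixed, x, y)

    print("cosine residual sum of squares:", np.sum((y - cosine(x, b_c, g_c))**2))
    print("fixed-width Gaussian residual sum of squares:", np.sum((y - gauss_fixed(x, b_g, g_g))**2))
    ```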

    In addition, the theoretical prediction of cosine-like connectivity is not clearly stated in the abstract, introduction, or discussion. As a prediction, I believe it should be front and center, as it might be revisited in the future in light of, e.g., new experimental data.

    We added the explicit prediction in the abstract and the introduction (lines 52-53).

    • I find the authors' claim that Oja's rule suffices to learn the insect head direction circuit (l. 273-5) somewhat misleading/vague. The authors seem to not be learning angular integration here at all. First, it is unclear to me what is the form of u(t). Is it the desired activity in the network at time t given angular velocity? This is different than modelling a population of PEN neurons jointly tuned to head direction and angular velocity, and learning weights so as to integrate angular velocity with the correct gain (Vafidis et al 2022). The learning rule here establishes a self-consistency between sinusoidal weights and activity, however, it does not learn the weights from PEN to EPG neurons so as to perform angular integration. Similar simple Hebbian rules have been used before to learn angular integration (Stringer et al 2002), however, they failed to learn the correct gain. Therefore, the authors should limit the statement that their simpler learning rule is enough to learn the circuit (l. 273-5), making sure to outline differences with the current literature (Vafidis et al 2022).

    We agree and we clarified that we focus only on the self-sustained activity condition. We appended the following text to the first and last paragraphs of section 2.4.

    For the first (lines 279-284): “Our approach follows from previous research which has shown that simple Hebbian learning rules can lead to the emergence of circular line attractors in large neural populations (Stringer et al., 2002), and that a head direction circuit can emerge from a predictive rule (Vafidis et al., 2022). In contrast to these works, we focus only on the self-sustaining nature of the heading integration circuit in insects and show that our proposed sinusoidal connectivity profile can emerge naturally.”

    For the last (lines 317-321): “However, this learning rule only applies to the weights that ensure stable, self-sustaining activity in the network. The network connectivity responsible for correctly integrating angular velocity inputs (given by the PEN to EPG connections in the fly) might require more elements than a purely Hebbian rule (Stringer et al., 2002), such as the addition of a predictive component (Vafidis et al., 2022).”

  2. eLife assessment

    This important work suggests that the observed cosine-like activity in the head direction circuit of insects not only subserves vector addition but also minimizes noise in the representation. The authors provide solid evidence using the locust and fruit fly connectomes. The work raises important theoretical questions about the organization of the navigation system and will be of interest to theoretical and experimental researchers studying navigation.

  3. Reviewer #1 (Public Review):

    Strengths:

    - The paper is clearly written, and all the conclusions stem from a set of 3 principles: circular topology, rotational symmetry, and noise minimization. The derivations are sound and such rigor by itself is commendable.

    - The authors provide a compelling argument on why evolution might have picked an eight-column circuit for path-integration, which is a great example of how theory can inform our thinking about the organization of neural systems for a specific purpose.

    - The authors provide a self-consistency argument on how cosine-like activity supports cosine-like connectivity with a simple Hebbian rule. However, their framework doesn't answer the question of how this system integrates angular velocity with the correct gain in the absence of allothetic cues to produce a heading estimate (more on that on point 3 below).

    Weaknesses:

    - The authors make simplifying assumptions to arrive at the cosine activity/cosine connectivity circuit. Among those are the linear activation function, and cosine driving activity u. The authors provide justification for the linearization in methods 3.1, however, this ignores the well-established fact that bump amplitude is modulated by angular velocity in the fly head direction system (Turner-Evans et al 2017). In such a case, nonlinearities in the activation function cannot be ignored and would introduce harmonics in the activity. Furthermore, even though activity has been reported to be cosine-like, in fact in the fruit fly it takes the form of a somewhat concentrated activity bump (~80-100 degrees, Seelig & Jayaraman 2015; Turner-Evans et al 2017), and one has to take into account the smoothing effect of calcium dynamics too which might make the bump appear more cosine-like. So in general, it would be nice to see how the conclusions extend if the driving activity is more square-like, which would also introduce further harmonics. Overall, it would be interesting to see whether, despite the harmonics introduced by these two factors interacting in the learning rule, Oja's rule can still pick up the "base" frequency and produce sinusoidal weights (as mentioned in methods 3.8). At this point, the examples shown in Figure 5 (tabula rasa and slightly perturbed weights) are quite simple. Such a demonstration would greatly enhance the generality of the results.

    - The match of the theoretical prediction of cosine-like connectivity profiles with the connectivity data is somewhat lacking. In the locust the fit is almost perfect, however, the low net path count combined with the lack of knowledge about synaptic strengths makes this a motivating example in my opinion. In the fruit fly, the fit is not as good, and the function-fitting comparison (Methods Figure 6) is not as convincing. First, some function choices clearly are not a good fit (f1+2, f2). Second, the profile seems to be better fit by a Gaussian or other localized function, however the extra parameter of the Gaussian results in the worst AIC and AICc. To better get at the question of whether the shape of the connectivity profile matches a cosine or a Gaussian, the authors could try for example to fix the width of the Gaussian (e.g. to the variance of the best-fit cosine, which seems to match the data very well even though it wasn't itself fit), and then fit the two other parameters to the data. In that case, no AIC or AICc is needed. And then do the same for a circular distribution, e.g. von Mises. In addition, the theoretical prediction of cosine-like connectivity is not clearly stated in the abstract, introduction, or discussion. As a prediction, I believe it should be front and center, as it might be revisited in the future in light of, e.g., new experimental data.

    - I find the authors' claim that Oja's rule suffices to learn the insect head direction circuit (l. 273-5) somewhat misleading/vague. The authors seem to not be learning angular integration here at all. First, it is unclear to me what is the form of u(t). Is it the desired activity in the network at time t given angular velocity? This is different than modelling a population of PEN neurons jointly tuned to head direction and angular velocity, and learning weights so as to integrate angular velocity with the correct gain (Vafidis et al 2022). The learning rule here establishes a self-consistency between sinusoidal weights and activity, however, it does not learn the weights from PEN to EPG neurons so as to perform angular integration. Similar simple Hebbian rules have been used before to learn angular integration (Stringer et al 2002), however, they failed to learn the correct gain. Therefore, the authors should limit the statement that their simpler learning rule is enough to learn the circuit (l. 273-5), making sure to outline differences with the current literature (Vafidis et al 2022).