Learning excitatory-inhibitory neuronal assemblies in recurrent networks
This article has been Reviewed by the following groups
- Evaluated articles (eLife)
Abstract
In sensory circuits with poor feature topography, stimulus-specific feedback inhibition necessitates carefully tuned synaptic circuitry. Recent experimental data from mouse primary visual cortex (V1) show that synapses between pyramidal neurons and parvalbumin-expressing (PV) inhibitory interneurons tend to be stronger for neurons that respond to similar stimulus features. The mechanism that underlies the formation of such excitatory-inhibitory (E/I) assemblies is unresolved. Here, we show that activity-dependent synaptic plasticity on input and output synapses of PV interneurons generates a circuit structure that is consistent with mouse V1. Using a computational model, we show that both forms of plasticity must act synergistically to form the observed E/I assemblies. Once established, these assemblies produce a stimulus-specific competition between pyramidal neurons. Our model suggests that activity-dependent plasticity can enable inhibitory circuits to actively shape cortical computations.
Article activity feed
Author Response
We thank the reviewers for their thoughtful and constructive comments. We have updated the manuscript to take their suggestions and concerns into account and uploaded a new version to bioRxiv. Detailed replies to the comments can be found below.
Summary: The work detailed here explores a model of recurrent cortical networks and shows that homeostatic synaptic plasticity must be present in connections from excitatory (E) to inhibitory (I) neurons and vice versa to produce the known E/I assemblies found in the cortex. There are some interesting findings about the consequences of assemblies formed in this way: there are stronger synapses between neurons that respond to similar stimuli; excitatory neurons show feature-specific suppression after plasticity; and the inhibitory network does not just provide a general untuned inhibitory signal, but instead sculpts excitatory processing. A major claim in the manuscript that argues for the broad impact of the work is that this is one of only a handful of papers to show how a local approximation rule can instantiate feedback (akin to the back-propagation of error used to train neural networks in machine learning) in a biologically plausible way.
Reviewer #1:
The manuscript investigates the situations in which stimulus-specific assemblies can emerge in a recurrent network of excitatory (E) and inhibitory (I, presumed parvalbumin-positive) neurons. The authors combine 1) Hebbian plasticity of I->E synapses that is proportional to the difference between the E neuron's firing rate and a homeostatic target and 2) plasticity of E->I synapses that is proportional to the difference between the total excitatory input to the I neuron and a homeostatic target. These are sufficient to produce E/I assemblies in a network in which only the excitatory recurrence exhibits tuning at the initial condition. While the full implementation of the plasticity rules, derived from gradient descent on an objective function, would rely on nonlocal weight information, local approximations of the rules still lead to the desired results.
Overall the results make sense and represent a new unsupervised method for generating cell assemblies consisting of both excitatory and inhibitory neurons. Major concerns are that the proposed rule ends up predicting a rather nonstandard form of plasticity for certain synapses, and that the results could be fleshed out more. Also, the strong novelty claimed could be softened or contextualized better, given that other recent papers have shown how to achieve something like backprop in recurrent neural networks (e.g. Murray eLife 2019).
Comments:
- The main text would benefit from greater exposition of the plasticity rule and the distinction between the full expression and the approximation. While the general idea of backpropagation may be familiar to a good number of readers, here it is being used in a nonstandard way (to implement homeostasis), and this should be described more fully, with a few key equations.
Additionally, the point that, for a recurrent network, the proposed rules are only related to gradient descent under the assumption that the network adiabatically follows the stimulus, seems important enough to state in the main text.
Thanks, that's a good point. We modified the relevant portion of the main text as follows (l. 88):
“[…] To that end, we derive synaptic plasticity rules for excitatory input and inhibitory output connections of PV interneurons that are homeostatic for the excitatory population (see Materials & Methods). A stimulus-specific homeostatic control can be seen as a "trivial" supervised learning task, in which the objective is that all pyramidal neurons should learn to fire at a given target rate ρ0 for all stimuli. Hence, a gradient-based optimisation would effectively require a backpropagation of error [Rumelhart et al., 1985] through time [BPTT; Werbos, 1990].
Because backpropagation rules rely on non-local information that might not be available to the respective synapses, their biological plausibility is currently debated [Lillicrap et al., 2020, Sacramento et al., 2018, Guerguiev et al., 2017, Whittington and Bogacz, 2019, Bellec et al., 2020]. However, a local approximation of the full BPTT update can be obtained under the following assumptions: First, we assume that the sensory input to the network changes on a time scale that is slower than the intrinsic time scales in the network. This eliminates the necessity of backpropagating information through time, albeit still through the synapses in the network. This assumption results in what we call the ”gradient-based” rules (Eq. 15 in the Supplementary Materials), which are spatially non-local. Second, we assume that synaptic interactions in the network are sufficiently weak that higher-order synaptic interactions can be neglected. Third and finally, we assume that over the course of learning, the Pyr→PV connections and the PV→Pyr connections become positively correlated [Znamenskiy et al., 2018], such that we can replace PV->Pyr synapses by the reciprocal Pyr->PV synapse in the Pyr->PV learning rule, without rotating the update too far from the true gradient (see Supplementary Materials)."
We also added the learning rules to the main text (l. 108).
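To make the structure of the approximate rules concrete, here is a minimal numpy sketch of the two homeostatic updates as described above (a presynaptic rate multiplied by a postsynaptic homeostatic error); the rates, targets, learning rates and initialization are illustrative placeholders, not the parameters used in the manuscript.

```python
import numpy as np

# Minimal sketch of the two local homeostatic updates (illustrative parameters,
# not the values used in the manuscript).
rng = np.random.default_rng(0)

N_E, N_I = 512, 64
rho0 = 1.0                      # target rate for pyramidal (Pyr) neurons
eta_EI, eta_IE = 1e-3, 1e-3     # learning rates (assumed)

W_IE = rng.lognormal(mean=-2.0, sigma=0.5, size=(N_I, N_E))  # Pyr -> PV
W_EI = rng.lognormal(mean=-2.0, sigma=0.5, size=(N_E, N_I))  # PV -> Pyr

def plasticity_step(r_E, r_I, W_IE, W_EI, theta_I):
    """One update of both rules, given (quasi-)steady-state rates r_E, r_I.

    PV->Pyr: Hebbian in the presynaptic PV rate, gated by the Pyr rate error.
    Pyr->PV: Hebbian in the presynaptic Pyr rate, gated by the deviation of the
             total excitatory input onto the PV cell from a target theta_I.
    """
    # PV -> Pyr (inhibitory output) plasticity
    dW_EI = eta_EI * np.outer(r_E - rho0, r_I)
    # Pyr -> PV (excitatory input) plasticity
    exc_input_I = W_IE @ r_E
    dW_IE = eta_IE * np.outer(exc_input_I - theta_I, r_E)
    # weights stay non-negative (Dale's principle)
    W_EI = np.clip(W_EI + dW_EI, 0.0, None)
    W_IE = np.clip(W_IE + dW_IE, 0.0, None)
    return W_IE, W_EI

# one example update at some placeholder steady-state rates
r_E = rng.exponential(1.0, size=N_E)
r_I = rng.exponential(1.0, size=N_I)
W_IE, W_EI = plasticity_step(r_E, r_I, W_IE, W_EI, theta_I=2.0)
```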
- The paper has a clear and simple message, but not much exploration of that message or elaboration on the results. Figures 2 and 3 do not convey much information, other than the fact that blocking either form of plasticity fails to produce the desired effects. This seems somewhat obvious -- almost by definition one can't have E/I assemblies if E->I or I->E connections are forced to remain random. This point deserves at most one figure, or maybe even just a few panels.
We appreciate that the result that both forms of plasticity are necessary may feel somewhat obvious. However, it may not be as obvious as it appears, because the incoming synapses onto INs follow a long-tailed distribution, like many other synapse types. Randomly sampling from such a distribution could in principle generate sufficient stimulus selectivity to render learning in the E->I connections superfluous (see Litwin-Kumar et al., 2017). That’s why we made sure to initialize the E->I weights such that they show a similar variability as in the data. We now comment on this aspect in the results section (l. 135):
"Having shown that homeostatic plasticity acting on both input and output synapses of interneurons are sufficient to learn E/I assemblies, we now turn to the question of whether both are necessary . To this end, we perform "knock-out" experiments, in which we selectively block synaptic plasticity in either of the synapses. The motivation for these experiments is the observation that the incoming PV synapses follow a long-tailed distribution (Znamenskiy et al., 2018). This could provide a sufficient stimulus selectivity in the PV population for PV->Pyr plasticity alone to achieve a satisfactory E/I balance. A similar reasoning holds for static, but long-tailed outgoing PV synapses. This intuition is supported by result of Litwin-Kumar et al. (2017) that in a population of neurons analogous to our interneurons, the dimensionality of responses in that population can be high for static input synapses, when those are log-normally distributed."
Secondly, we tried to write a manuscript both for fellow modelers (how to self-organize an E/I assembly?) and for our experimental colleagues (what conclusions can we draw from the Znamenskiy data?). In electrophysiological studies, the plasticity of incoming and outgoing synapses of INs has been studied, but largely independently. The insight that those two forms of plasticity should act in synergy is something that we wanted to emphasize, because it could be studied in parallel in paired recordings. Hence the two figures. Looks as if we got only modelers as reviewers ;). Along these lines, we added a short paragraph to the discussion (l. 348):
“Both Pyr->PV and PV->Pyr plasticity have been studied in slice (for reviews, see, Kullmann et al. 2007, Vogels et al. 2013), but mostly in isolation. The idea that the two forms of plasticity should act in synergy suggests that it may be interesting to study both forms in the same system, e.g., in reciprocally connected Pyr-PV pairs.“
- The derived plasticity rule for E->I synapses, which requires modulation of I synapses based on a difference from a target value for the excitatory subcomponent of the input current, does not take a typical form for biologically plausible learning rules (which usually operate on firing rates or voltages, for example). The authors should explore and discuss in more depth this assumption. Is there experimental evidence for it? It seems like it might be a difficult quantity to signal to the synapse in order to guide plasticity. The authors note in the discussion that BCM-type rules fail here -- are there other approaches that would work? What about a more local form of plasticity that involves only the excitatory current local to a dendrite, for example?
We agree that the rule we propose for E->I synapses warrants a more extensive discussion regarding its potential biological implementation. We have added the following paragraph to the manuscript (l. 295):
“A cellular implementation of such a plasticity rule would require the following ingredients: i) a signal that reflects the cell-wide excitatory current, and ii) a mechanism that changes Pyr->PV synapses in response to variations in this signal. On PV interneurons, NMDA receptors are enriched in excitatory feedback relative to feedforward connections [LeRoux et al., 2013]. Intracellular sodium and calcium could hence be a proxy for recurrent excitatory input. In addition, the activation of NMDA receptors has been shown to track intracellular sodium concentration [Yu and Salter, 1998], which at least partially reflects glutamatergic synaptic currents. Due to a lack of spines in PV dendrites, both postsynaptic sodium and calcium are expected to diffuse more broadly in the dendritic arbor [Hu et al., 2014, Kullmann and Lamsa, 2007], and thus might provide a signal for overall dendritic excitatory currents. Depending on how the excitatory inputs are distributed on PV interneuron dendrites [Larkum and Nevian, 2008, Jia et al., 2010, Grienberger et al., 2015], this integration does not need to be cell-wide, but could be local, e.g., to a dendrite, if the local excitatory input is a proxy for the global input.
NMDA receptors at IN excitatory input synapses can mediate Hebbian long-term plasticity [Kullmann and Lamsa, 2007], and blocking excitatory currents can abolish plasticity in those synapses [LeRoux et al., 2013]. Furthermore, NMDAR-dependent plasticity is expressed post-synaptically, and seems to require presynaptic activation [Kullmann and Lamsa, 2007]. Other molecular signals that reflect excitatory activity have been implicated in the homeostatic regulation of synapses onto INs, including Narp and BDNF [Chang et al., 2010, Rutherford et al., 1998, Lamsa et al., 2007]. In summary, we conjecture that PV interneurons and their excitatory inputs have the necessary prerequisites to implement the suggested local Pyr->PV plasticity rule.”
Concerning other potential types of plasticity, we certainly do not expect that the suggested pair of rules is the only one that will work. We have added the following paragraph to the discussion (l. 322):
“We expect that the rules we suggest here are only one set of many that can establish E/I assemblies. Given that the role of the input plasticity in the interneurons is the formation of a stimulus specificity, it is tempting to assume that this could equally well be achieved by classical forms of plasticity like the Bienenstock-Cooper-Munro (BCM) rule [Bienenstock et al., 1982], which is commonly used in models of receptive field formation. However, in our hands, the combination of BCM plasticity in Pyr->PV synapses with homeostatic inhibitory plasticity in the PV->Pyr synapses showed complex dynamics, an analysis of which is beyond the scope of this article. In particular, this combination of rules often did not converge to a steady state, probably for the following reason. BCM rules tend to [...].
We suspect that this instability can also arise for other Hebbian forms of plasticity in interneuron input synapses when they are combined with homeostatic inhibitory plasticity [Vogels et al. 2011] in their output synapses. The underlying reason is that for convergence, the two forms of plasticity need to work synergistically towards the same goal, i.e., the same steady state. For two arbitrary synaptic plasticity rules acting in different sets of synapses, it is likely that they aim for two different overall network configurations. Such competition can easily result in latching dynamics with a continuing turn-over of transiently stable states, in which the form of plasticity that acts more quickly gets to reach its goal transiently, only to be undermined by the other one later [Clopath et al. 2016].”
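For reference, one common textbook form of a BCM-type rule on Pyr->PV input synapses, with a sliding threshold that tracks the squared postsynaptic rate, is sketched below; this is a generic variant for illustration, not necessarily the exact form the authors tested.

```python
import numpy as np

def bcm_step(W_IE, r_E, r_I, theta, eta=1e-4, tau_theta=100.0, dt=1.0):
    """One step of a generic BCM rule on Pyr->PV synapses (illustrative only).

    dW ~ r_I * (r_I - theta) * r_E, with a sliding threshold theta that tracks
    a running average of the squared postsynaptic (PV) rate.
    """
    dW = eta * np.outer(r_I * (r_I - theta), r_E)
    theta = theta + dt / tau_theta * (r_I**2 - theta)
    return np.clip(W_IE + dW, 0.0, None), theta
```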
- Does the initial structure in excitatory recurrence play a role, or is it just there to match the data?
For the results of Fig. 4, the structure of excitatory recurrence is essential, because similarly tuned Pyr neurons should excite each other (absent the E/I assemblies). Without that structure in the Pyr->Pyr connections, the “paradoxical” inhibitory effect we report would not be paradoxical at all. For the results of Figs. 1-3, the excitatory recurrence plays a role only insofar as it permits and reinforces stimulus selectivity in pyramidal neurons. If those synapses were unstructured (and strong), they could disrupt the Pyr selectivity, and there would be nothing to guide the formation of E/I assemblies. We have added the following sentence to the beginning of the results section (l. 77):
“[...] Note that the Pyr->Pyr connections only play a decisive role for the results in Fig. 4, but are present in all simulations for consistency. [...]”
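As an illustration of the kind of tuned Pyr->Pyr recurrence meant here, one could draw excitatory recurrent weights whose mean grows with the similarity of preferred stimuli, as in the sketch below; the kernel and its parameters are hypothetical and are not the connectivity used in the model.

```python
import numpy as np

rng = np.random.default_rng(2)
N_E = 512
pref = rng.uniform(0.0, np.pi, size=N_E)        # preferred orientations

# Mean weight grows with tuning similarity (von-Mises-like kernel, assumed).
similarity = np.cos(2.0 * (pref[:, None] - pref[None, :]))
W_EE = 0.01 * np.exp(1.5 * similarity)
np.fill_diagonal(W_EE, 0.0)                     # no autapses
```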
Reviewer #2:
In this work, the authors simulated a rate-based recurrent network with 512 excitatory and 64 inhibitory neurons. The authors use this model to investigate which forms of synaptic plasticity are needed to reproduce the stimulus-specific interactions observed between pyramidal neurons and parvalbumin-expressing (PV) interneurons in mouse V1. They showed that when homeostatic synaptic plasticity acts both from excitatory to inhibitory neurons and reciprocally from inhibitory to excitatory neurons in the simulated networks, the emergent E/I assemblies are qualitatively similar to those observed in mouse V1, e.g., there are stronger synapses between neurons responding to similar stimuli. They also identified that synaptic plasticity must be present in both directions (from pyramidal neurons to PV neurons and vice versa) to produce such E/I assemblies. Furthermore, they identified that these E/I assemblies enable the excitatory population in their simulations to show feature-specific suppression. Therefore, the authors claim that they found evidence that these inhibitory circuits do not provide a "blanket of inhibition", but rather a specific, activity-dependent sculpting of the excitatory response. They also claim that the learning rule they developed in this model shows for the first time how a local approximation rule can instantiate feedback alignment in their network, which is a method for achieving an approximation to a backpropagation-like learning rule in realistic neural networks.
We thank you for your thorough evaluation of the role of feedback alignment (FA) in our model. While we address the individual comments point by point below, we feel that we may have misled this reviewer regarding the focus of the article. The core novelty of this work lies in elucidating potential mechanisms underlying the experimentally observed E/I neuronal assemblies in mouse V1, and furthermore in proposing plasticity rules that can achieve such E/I assemblies. That they do so via a mechanism akin to feedback alignment is mentioned relatively briefly in the manuscript, and is merely offered as a mechanistic explanation for how inhibitory currents are ultimately balanced with excitation. We are fully aware of the fact that the suggested rules are by no means a local approximation of the full BPTT problem in RNNs, but feel that the reviewer read our paper primarily as a contribution to this very interesting literature (which, in our view, it is not).
Major points:
- The authors claim that their synaptic plasticity rule implements a recurrent variant of feedback alignment. Namely, "When we compare the weight updates the approximate rules perform to the updates that would occur using the gradient rule, the weight updates of the local approximations align to those of the gradient rules over learning". They also claim that this is the first time feedback alignment is demonstrated in a recurrent network. It seems that the weight replacement in this synaptic plasticity rule is uniquely motivated by E/I balance, but the feedback alignment in [Lillicrap et al., 2016] is much more general. Thus, the precise connections between feedback alignment and this work remain a bit unclear.
We had hoped that our claims in the manuscript were phrased sufficiently carefully, and regret that the reviewer was led to believe that our goal was to provide a general solution to biological backprop in recurrent networks. Of course, the problem we are tackling is not the full backprop problem, and we do not expect that the approximation holds for general tasks. It clearly won't, given that it effectively relies on a truncation after two time steps and makes a stationarity assumption. Still, we felt that it would have been a lost opportunity not to discuss the relation to feedback alignment, because any approximation warrants a justification, and for the replacement of I->E weights by E->I weights, feedback alignment readily provides one. We now discuss the assumptions underlying the local approximation more extensively in the main paper (see reply to Reviewer 1, comment 1).
We also added a discussion to the section in the supplementary material, where the local approximations are derived (l. 760):
“Overall, the local approximation of the learning rule relies on three assumptions: Slowly varying inputs, weak synaptic weights and alignment of input and output synapses of the interneurons. These assumptions clearly limit the applicability of the learning rules for other learning tasks. In particular, the learning rules will not allow the network to learn temporal sequences.”
It would be good if the following things about this major claim of the manuscript could be expanded and/or clarified:
i) In Fig S3 (upper, right vs. left), it is surprising that the Pyr->PV knock-out seems to produce a better alignment in PV->Pyr. Comparing the upper right of Fig S3 and the bottom figure of Fig 1g, it seems that the Pyr->PV knock-out performs equally well with a local approximation for the output connections of PV interneurons. Is this a special condition in this model that results in the emergence of the overall feedback alignment?
The 0-th order approximation of I->E plasticity is, by itself, relatively good at following the full gradient for those synapses (because I->E synapses have virtually unmediated control over Pyr neuron activity). When E->I plasticity is also present, we believe that the higher variance in angle to the gradient (for I->E updates) may be due to perturbations introduced by the E->I updates. Each update to one weight matrix changes the gradient for the other, but this is ultimately what brings them into alignment with one another. Because this is a very technical point, we prefer not to discuss this at length in the manuscript. The more important point is summarized in the two bottom figures, which demonstrate that the gradients on the E->I synapses only align within 90 degrees when both synapse types are plastic.
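For completeness, the alignment referred to here can be quantified as the angle between the flattened approximate update and the flattened gradient update; a minimal helper is sketched below (the names are ours, not taken from the manuscript).

```python
import numpy as np

def update_angle(dW_approx, dW_gradient):
    """Angle (degrees) between two weight updates, treated as flat vectors."""
    a, b = dW_approx.ravel(), dW_gradient.ravel()
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))
```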
ii) In the feedback alignment paper [Lillicrap et al., 2016], those authors introduce a "Random Feedback Weights Support"; this uses a random matrix, B, to replace the transpose of the backpropagation weight matrix. Here, the alignment seems to be based on the intuition that "The excitatory input connections onto the interneurons serve as a proxy for the transpose of the output connections," and "the task of balancing excitation by feedback inhibition favours symmetric connection." It seems synaptic plasticity here is mechanistically different; it is only similar to the feedback alignment [Lillicrap et al., 2016] because both reach a final balanced state. Please clarify how the results here are to be interpreted as an instantiation of feedback alignment - whether it is simply that the end state is similar, or if the mechanism is thought to be more deeply connected.
We believe that the mechanisms are indeed more deeply connected, as supported by the fact that the gradients align early on during learning. We added an extended discussion to the supplementary material (l. 744):
“In feedback alignment, the matrix that backpropagates the errors is replaced by a random matrix B. Here, we instead use the feedforward weights in the layer below. Similar to the extension to feedback alignment of Akrout et al. [2019], those weights are themselves plastic. However, we believe that the underlying mechanism of feedback alignment still holds. The representation in the hidden layer (the interneurons) changes as if the weights to the output layer (the Pyr neurons) were equal to the weights they are replaced with (here, the input weights to the PV neurons). To exploit this representation, the weights to the output layer then align to the replacement weights, justifying the replacement post-hoc (Fig. 1G).”
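To illustrate the mechanism outside our specific model, a toy feedback-alignment experiment on a one-hidden-layer regression task is sketched below: the error is propagated through a fixed random matrix B instead of the transpose of the output weights, and the output weights nevertheless come to align with B^T over training. The task, dimensions and parameters are arbitrary choices for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
n_in, n_hid, n_out, n_samples = 20, 30, 10, 200

X = rng.normal(size=(n_samples, n_in))
T = rng.normal(size=(n_samples, n_out))          # arbitrary regression targets

W1 = rng.normal(scale=0.1, size=(n_in, n_hid))
W2 = rng.normal(scale=0.1, size=(n_hid, n_out))
B = rng.normal(scale=0.1, size=(n_out, n_hid))   # fixed random feedback matrix

eta = 0.01
for _ in range(500):
    H = np.tanh(X @ W1)
    Y = H @ W2
    E = Y - T                                    # output error
    # feedback alignment: propagate the error through B instead of W2.T
    dH = (E @ B) * (1.0 - H**2)
    W2 -= eta * H.T @ E / n_samples
    W1 -= eta * X.T @ dH / n_samples

# W2 tends to align with B.T during learning
cos = np.sum(W2 * B.T) / (np.linalg.norm(W2) * np.linalg.norm(B))
print(f"cosine between W2 and B^T: {cos:.2f}")   # typically clearly positive
```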
iii) The feedback alignment of [Lillicrap et al., 2016] works when the weight matrix has its entries near zero (e^T W B e > 0). Are there any analogous conditions for the synaptic plasticity rule to succeed?
Yes, the condition is very similar. We have added a corresponding discussion to the supplementary material (l. 753):
“Note that the condition for feedback alignment to provide an update in the appropriate direction (e^T B^T W e > 0, where e denotes the error, W the weights in the second layer, and B the random feedback matrix) reduces to the condition that W^EI W^IE is positive definite (assuming the errors are full rank). One way of assuring this is a sufficiently positive diagonal of this matrix product, i.e., a sufficiently high correlation between the incoming and outgoing synapses of the interneurons. A positive correlation of these weights is one of the observations of Znamenskiy et al. (2018) and also a result of learning in our model.
While such a positive correlation is not necessarily present for all learning tasks or network models, we speculate that it will be for the task of learning an E/I balance in a Dalean network.”
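Numerically, this condition can be checked by testing whether the symmetric part of the product of the interneurons' output and input weight matrices is positive definite; a small helper (our naming, assuming W_EI for PV->Pyr weights and W_IE for Pyr->PV weights) is sketched below.

```python
import numpy as np

def alignment_condition_holds(W_EI, W_IE, tol=1e-9):
    """Check whether x^T (W_EI @ W_IE) x > 0 for all x != 0, i.e. whether the
    symmetric part of W_EI @ W_IE is positive definite."""
    M = W_EI @ W_IE
    eigvals = np.linalg.eigvalsh(0.5 * (M + M.T))
    return bool(eigvals.min() > tol)
```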
iv) In the supplementary material, the local approximation rule is developed using a 0th-order truncation of Eqs. 15a and 15b. It is noted that "If synapses are sufficiently weak ..., this approximation can be substituted into Eq. 15a and yields an equation that resembles a backpropagation rule in a feedforward network (E -> I -> E) with one hidden layer -- the interneurons." It would be helpful if the authors could discuss how this learning rule works in a general recurrent network, or if it will work for any network with sufficiently weak synapses.
We now discuss the assumptions and their consequences more extensively, see reply to reviewer 1, comment 1.
v) This synaptic plasticity rule seems to be closely related to other local approximations of backpropagation in recurrent neural networks: e-prop (Bellec et al., 2020, https://www.nature.com/articles/s41467-020-17236-y) and broadcast alignment (Nøkland, 2016; Samadi et al., 2017). These previous papers do not consider E/I balance in their approximations, but is E/I balance necessary for successful local approximation to these rules?
We are not sure if we fully understand the comment. We do not expect that E/I balance is necessary for other biologically plausible approximations of BPTT. We merely suggest that for the task of learning E/I balance, the presented local approximation is valid.
- In the discussion, it reads as if the BCM rule cannot apply to this recurrent network because of the limited number of interneurons in the simulation ("parts of stimulus space are not represented by any interneurons"). Is this a limitation of the size of the model? Would scaling up the simulation change how applicable the BCM learning rule is? It would be helpful if the authors offer a more detailed discussion on why some forms of plasticity in interneurons fail to produce stimulus specificity.
Increasing the size of the model would help only if it increased the redundancy in the Pyr population response. Otherwise, the problem can only be solved by changing the E-to-I ratio.
We feel that an exhaustive discussion of the dynamics of BCM in our network is beyond the scope of the paper, particularly because BCM comes in a broad variety (weight normalisation, weight limits, exact form of the sliding threshold?) and the exact behavior depends on various parameter choices. Similarly, we preferred to limit the discussion of other Hebbian rules, because it would be somewhat arbitrary which rules to discuss. Instead we added the following more abstract arguments to the discussion section (l. 322):
“We expect that the rules we suggest here are only one set of many that can establish E/I assemblies. Given that the role of the input plasticity in the interneurons is the formation of a stimulus specificity, it is tempting to assume that this could equally well be achieved by classical forms of plasticity like the Bienenstock-Cooper-Munro (BCM) rule [Bienenstock et al., 1982], which is commonly used in models of receptive field formation. However, in our hands, the combination of BCM plasticity in Pyr->PV synapses with homeostatic inhibitory plasticity in the PV->Pyr synapses showed complex dynamics, an analysis of which is beyond the scope of this article. In particular, this combination of rules often did not converge to a steady state, probably for the following reason. [...]
We suspect that this instability can also arise for other Hebbian forms of plasticity in interneuron input synapses when they are combined with homeostatic inhibitory plasticity (Vogels et al., 2011) in their output synapses. The underlying reason is that for convergence, the two forms of plasticity need to work synergistically towards the same goal, i.e., the same steady state. For two arbitrary synaptic plasticity rules acting in different sets of synapses, it is likely that they aim for two different overall network configurations. Such competition can easily result in dynamics with a continuing turn-over of transiently stable states, in which the form of plasticity that acts more quickly gets to reach its goal transiently, only to be undermined by the other one later.”
Minor comments:
- Section 1 of the Results is confusing. The authors jump back and forth between emphasizing the emergence of E/I assemblies and connecting the local approximation rule to general feedback alignment. It would be helpful if the authors reorganized this section: maybe discuss the E/I assemblies first (with Figure 1), then go on to discuss why it is important to compare this synaptic plastic rule with feedback alignment.
We have extended the explanation of the plasticity rules [l. 108] and hope that this section is now more accessible.
- Although the authors claim that there exists a significant change after PV->Pyr knockout (Fig 2b), the current presentation of this result is confusing: how many neurons change their responses? (Reading directly from the distributional difference, it seems that the gray and blue distributions only differ by about 5-8 neurons).
The change is admittedly modest, but significant.
- Effect sizes instead of p-values should be quoted and used throughout, because the large data size of the simulations seems to make even the smallest correlations significant.
We used p-values to remain consistent with the article of Znamenskiy et al. Please note that we took care to sample a comparable number of synapses from the network as in Znamenskiy et al., to keep the p-values comparable. If we had sampled all synapses from the network, significance would indeed be trivial.