Presynaptic stochasticity improves energy efficiency and helps alleviate the stability-plasticity dilemma

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    In large nervous systems, such as the mammalian cortex, excitatory synapses are stochastic, and the probability of neurotransmitter release can be modulated by plasticity and neural activity. This paper presents a simple, biologically plausible mechanism that regulates the probability of release during learning. Using network simulations, the authors show that this can result in more energy-efficient processing of learned stimuli by enhancing the reliability of important connections while lowering the expected rates of transmission at less important synapses.

    This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #2 agreed to share their name with the authors.

Abstract

When an action potential arrives at a synapse, there is a large probability that no neurotransmitter is released. Surprisingly, simple computational models suggest that these synaptic failures enable information processing at lower metabolic costs. However, these models only consider information transmission at single synapses, ignoring the remainder of the neural network as well as its overall computational goal. Here, we investigate how synaptic failures affect the energy efficiency of models of entire neural networks that solve a goal-driven task. We find that presynaptic stochasticity and plasticity improve energy efficiency, and we show that the network allocates most energy to a sparse subset of important synapses. We demonstrate that stabilising these synapses helps to alleviate the stability-plasticity dilemma, thus connecting a presynaptic notion of importance to a computational role in lifelong learning. Overall, our findings present a set of hypotheses for how presynaptic plasticity and stochasticity contribute to sparsity, energy efficiency and improved trade-offs in the stability-plasticity dilemma.

Article activity feed

  1. Author Response:

    Reviewer #1 (Public Review):

    'Presynaptic Stochasticity Improves Energy Efficiency and Alleviates the Stability-Plasticity Dilemma' by Schug et al. moves energy-efficiency questions about stochastic synaptic transmission, previously asked at the level of the single synapse and the single cell, to the network level. This is important since local advantages in energy cost may have unknown consequences at larger scales, and stochastic synapses may confer an unknown advantage in learning paradigms at the network level.

    I have some concerns regarding this work:

    (1) The considerations are made in one or two particular network architectures with a single parameter combination. The generality of the conclusions is not established, and there is no reason to believe that the observations made here will hold for other network architectures or even different parameters. In this way, the current manuscript seems to describe the beginning of a project that hasn't really been worked through.

    We agree that it is an important concern whether our findings result from overfitting certain parameters to specific networks and tasks. We took considerable care to ensure the robustness of the results and to avoid overfitting parameters to specific tasks. This information was not easily accessible in the original manuscript, and we have made corresponding changes to address this issue; see the subsections "Metaplasticity Parameters" and "Model Robustness" in the Materials and Methods. In addition, we would like to point out that, on top of the standard rate-based neural network models used for the main experiments, we test our presynaptic learning rule on a standard perceptron model, where we found qualitatively matching results. These results are complemented by a theoretical analysis of our learning rule, which further suggests robustness.

    (2) Additionally, the network architectures used here are rather artificial (multilayer perceptrons) and come from machine learning. Linking a physical measure in a biological system (the metabolic cost) to task solving in a machine-learning setting that has no biological counterpart seems far-fetched and would not be the first thing on my mind when studying information transmission in biological neuronal networks.

    We decided to choose models and metrics as simple as possible that allowed us to isolate the effect of presynaptic stochasticity and plasticity on neuronal networks in goal-driven tasks. We believe that the rate-based neural network models we mainly study present a parsimonious choice for approaching the question at hand. Regarding the link between physiological measures and our model, we point out that, in rate-based models, firing rate is a common proxy for metabolic cost (see e.g. Levy & Baxter, 1996). This is one of the measures we use; see Figure 6(b). In addition, some of our results are evidence for improved metabolic efficiency even without a one-to-one match between model and biological networks. For example, increased sparsity would most likely imply improved metabolic efficiency in biological neural networks as well.
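
    To make this proxy concrete, the sketch below computes two such cost measures for a single hypothetical layer: the total presynaptic firing rate (the proxy of Levy & Baxter, 1996) and a synapse-level refinement in which expected vesicle releases scale with rate times release probability. All sizes and values are illustrative and not taken from the paper's code.

    ```python
    import numpy as np

    # Minimal sketch of rate-based metabolic cost proxies for one layer.
    # All sizes, rates and release probabilities are hypothetical.
    rng = np.random.default_rng(0)

    n_pre, n_post = 50, 20
    r = rng.uniform(0.0, 1.0, size=n_pre)             # presynaptic firing rates
    P = rng.uniform(0.05, 0.9, size=(n_post, n_pre))  # per-synapse release probabilities

    # Firing-rate proxy: total presynaptic activity of the layer.
    rate_cost = r.sum()

    # Synapse-level refinement: expected number of vesicle releases,
    # i.e. presynaptic rate times release probability, summed over synapses.
    expected_releases = (P * r).sum()

    print(f"rate cost: {rate_cost:.2f}, expected releases: {expected_releases:.2f}")
    ```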

    (3) A lot of different measures of network efficiency are briefly addressed but not dissected properly. A more fundamental understanding of why and when stochastic synapses in a network might be useful is missing and remains largely unexplored apart from some select manipulations.

    We focus on one measure of efficiency, namely the ratio of mutual information to metabolic cost. This is a natural measure which has been employed in prior work. Subsequently, we provide detailed explanations of how the proposed mechanism operates. For example, sparsity is a natural, biologically relevant lens onto our network, as are the lesion experiments and the theoretical analysis. We believe that presenting different views strengthens rather than weakens the evidence.
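
    To illustrate this measure, the following sketch sweeps the release probability of a single binary synapse and computes the ratio of mutual information to a metabolic cost proxy, in the spirit of the single-synapse analyses referenced above. The spike probability, the fixed per-spike cost and the channel model are assumptions made for illustration only, not the paper's actual setup.

    ```python
    import numpy as np

    def h2(q):
        """Binary entropy in bits, clipped so that h2(0) = h2(1) = 0."""
        q = np.clip(q, 1e-12, 1 - 1e-12)
        return -(q * np.log2(q) + (1 - q) * np.log2(1 - q))

    s = 0.5                         # hypothetical presynaptic spike probability
    p = np.linspace(0.05, 1.0, 20)  # release probabilities to compare

    # Mutual information between spike X and release Y for a failure-prone
    # synapse: I(X;Y) = H(Y) - H(Y|X), with P(Y=1) = s*p and H(Y|X=1) = h2(p).
    info = h2(s * p) - s * h2(p)

    # Metabolic cost proxy: expected releases plus a small fixed cost per spike.
    cost = s * p + 0.05 * s

    efficiency = info / cost
    print(f"most efficient release probability (toy model): {p[np.argmax(efficiency)]:.2f}")
    ```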

  2. Reviewer #1 (Public Review):

    'Presynaptic Stochasticity Improves Energy Efficiency and Alleviates the Stability-Plasticity Dilemma' by Schug et al. moves energy-efficiency questions about stochastic synaptic transmission, previously asked at the level of the single synapse and the single cell, to the network level. This is important since local advantages in energy cost may have unknown consequences at larger scales, and stochastic synapses may confer an unknown advantage in learning paradigms at the network level.

    I have some concerns regarding this work:

    (1) The considerations are made in one or two particular network architectures with a single parameter combination. The generality of the conclusions is not established, and there is no reason to believe that the observations made here will hold for other network architectures or even different parameters. In this way, the current manuscript seems to describe the beginning of a project that hasn't really been worked through.

    (2) Additionally, the network architectures used here are rather artificial (multilayer perceptrons) and come from machine learning. Linking a physical measure in a biological system (the metabolic cost) to task solving in a machine-learning setting that has no biological counterpart seems far-fetched and would not be the first thing on my mind when studying information transmission in biological neuronal networks.

    (3) A lot of different measures of network efficiency are briefly addressed but not dissected properly. A more fundamental understanding of why and when stochastic synapses in a network might be useful is missing and remains largely unexplored apart from some select manipulations.

    To sum up, I think that the question is interesting, but the work is still premature.

    Otherwise, the paper is very well written and places this work nicely in the context of the existing literature.

  3. Reviewer #2 (Public Review):

    In this study, every synapse is described by a probability of release (p) and a synaptic strength (m). The learning dynamics for the expected synaptic strength (p*m) follow a classical gradient-based rule, while the learning rule for p is such that important synapses (in the Fisher information sense) increase p and the others decrease it.
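
    To make this description concrete, here is a minimal sketch of such a two-variable synapse update. It assumes the squared gradient as the importance proxy (a diagonal Fisher approximation) and a sign-based update for p; the constants, the threshold g_lim and the random stand-in gradients are hypothetical, and the sketch paraphrases the review's description rather than reproducing the authors' exact rule.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    n = 8
    p = np.full(n, 0.5)          # release probabilities, one per synapse
    m = rng.normal(0.0, 1.0, n)  # synaptic strengths

    lr, lr_p, g_lim = 0.1, 0.05, 0.5  # hypothetical learning rates and threshold

    for step in range(100):
        g = rng.normal(0.0, 1.0, n)  # stand-in for the task gradient dL/d(p*m)

        # Classical gradient rule on the expected strength w = p*m.
        w = p * m - lr * g

        # Importance proxy: squared gradient. Synapses with g^2 above the
        # threshold increase p, the remaining ones decrease it.
        p = np.clip(p + lr_p * np.sign(g**2 - g_lim**2), 0.05, 1.0)

        m = w / p  # re-express the updated expected strength through p and m

    print("final release probabilities:", np.round(p, 2))
    ```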

    Strengths

    One of the biggest strengths of the proposed learning rule is its simplicity. It is remarkable that a simple learning rule for the probability of release (in addition to the classical gradient rule for the expected synaptic strength) can help in continual learning problems. In some sense, the authors leverage the fact that synapses are described by two variables in order to keep important memories for a longer time. The simplicity of the learning rule makes it very attractive for practical implementations.

    Another strength of the paper is that the learning rule for p has a relationship to the Fisher information, thereby giving a (partial) justification for this rule.

    Finally, the energetic considerations are very welcome in a field where those aspects are too often neglected.

    Weaknesses

    Probably the biggest weakness of the manuscript is the lack of biological evidence for the proposed learning rule.

    In my opinion, the second weakness of the paper is that the learning rule for p is more of a heuristic than a principled derivation. I understand the benefit of its simplicity (as argued above), but the price to pay is that we have no guarantee that P(g_i^2 > g_lim^2) always depends monotonically on E(g_i^2).
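
    This caveat is easy to exhibit numerically: two gradient distributions can have their second moments E(g^2) ordered one way and their exceedance probabilities P(g^2 > g_lim^2) ordered the other way. The toy Monte Carlo check below uses hypothetical distributions chosen only to display this non-monotonicity; nothing here comes from the paper itself.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    g_lim2 = 0.25   # hypothetical threshold g_lim^2
    n = 100_000

    # Synapse A: moderate gradients that always just exceed the threshold.
    g_a = rng.choice([-0.6, 0.6], size=n)

    # Synapse B: mostly near-zero gradients with rare, very large values.
    g_b = rng.choice([0.0, 3.0], size=n, p=[0.9, 0.1])

    for name, g in [("A", g_a), ("B", g_b)]:
        print(f"{name}: E[g^2] = {np.mean(g**2):.2f}, "
              f"P(g^2 > g_lim^2) = {np.mean(g**2 > g_lim2):.2f}")

    # A: E[g^2] = 0.36 with exceedance probability 1.00; B: E[g^2] = 0.90
    # with exceedance probability ~0.10. So the exceedance probability is
    # not a monotone function of the second moment in general.
    ```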

    Overall, I find the paper well written and the main claims are well supported by the data/analysis.