Unsupervised Bayesian Ising Approximation for decoding neural activity and other biological dictionaries

Damián G Hernández
Samuel J Sober
Ilya Nemenman

Curated by eLife

Evaluation Summary:

Hernandez et al use an elegant mathematical framework to build a novel tool for extracting unusually frequent (or infrequent) patterns in multidimensional biological data when only a small number of measurements are available. This is a common problem in many biological settings, so the tool could potentially be used to answer a wide range of statistically hard questions. As a first demonstration of its use, the authors show that the new tool can be used to reveal novel properties about neural responses in zebra finches during song generation.

(This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. The reviewers remained anonymous to the authors.)

Abstract

The problem of deciphering how low-level patterns (action potentials in the brain, amino acids in a protein, etc.) drive high-level biological features (sensorimotor behavior, enzymatic function) represents the central challenge of quantitative biology. The lack of general methods for doing so from the size of datasets that can be collected experimentally severely limits our understanding of the biological world. For example, in neuroscience, some sensory and motor codes have been shown to consist of precisely timed multi-spike patterns. However, the combinatorial complexity of such pattern codes has precluded development of methods for their comprehensive analysis. Thus, just as it is hard to predict a protein’s function based on its sequence, we still do not understand how to accurately predict an organism’s behavior based on neural activity. Here, we introduce the unsupervised Bayesian Ising Approximation (uBIA) for solving this class of problems. We demonstrate its utility in an application to neural data, detecting precisely timed spike patterns that code for specific motor behaviors in a songbird vocal system. In data recorded during singing from neurons in a vocal control region, our method detects such codewords with an arbitrary number of spikes, does so from small data sets, and accounts for dependencies in occurrences of codewords. Detecting such comprehensive motor control dictionaries can improve our understanding of skilled motor control and the neural bases of sensorimotor learning in animals. To further illustrate the utility of uBIA, we used it to identify the distinct sets of activity patterns that encode vocal motor exploration versus typical song production. Crucially, our method can be used not only for analysis of neural systems, but also for understanding the structure of correlations in other biological and nonbiological datasets.

Version published to 10.7554/elife.68192 on eLife
Mar 22, 2022
eLife
Nov 1, 2021
Author Response:

Reviewer #1 (Public Review):

This work presents a new Bayesian method for detecting those patterns of neural responses that are connected to behavioral output and are worth investigating further. The manuscript contains the derivation of the approach and its test on synthetic and real neural data.

The derivation should be improved by providing additional steps. For example, it was not clear how Eq. 5 was derived and why the double derivative with respect to parameters theta_mu and theta_nu is present (these terms appear to be missing in the definition of log-likelihood).

Thank you for the suggestion. We have added steps in the derivation leading to the Ising equation for the indicator variables, now in Eq. (8). These intermediate steps correspond to the two main approximations of the BIA method, namely, the saddle point approximation for the posterior (Eq. (5)) and the Taylor expansion in the inverse regularization strength (Eq. (6)). We hope that these changes improve the readability of the derivation.

Parameter M should be more clearly defined as the number of samples. It is briefly mentioned on line 170, but it was difficult to connect this to equation (7) and those following that use M explicitly.

We thank the Reviewer for the suggestion. We have clarified the definition of M just after Eq. 11.

Is it possible to include multiple binary quantifications of behavior, similarly to how words are constructed from neural spike trains? For example, one can envision describing a particular song segment with respect to multiple binary features simultaneously.

We explicitly examine this question in “Dictionaries for exploratory vs. typical behaviors” and the corresponding Figure 6, which repeats our analysis for different binary discretizations of our behavioral data.
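As a rough illustration of what "different binary discretizations" of a behavior could look like, the sketch below thresholds one continuous feature at two quantiles. Everything here is an assumption made for illustration: the synthetic `behavior` array, the `binarize` helper, and the chosen quantiles are hypothetical and are not taken from the manuscript.

```python
import numpy as np

def binarize(feature: np.ndarray, quantile: float) -> np.ndarray:
    """Return 1 where the feature exceeds the given quantile, else 0."""
    threshold = np.quantile(feature, quantile)
    return (feature > threshold).astype(int)

rng = np.random.default_rng(1)
# Synthetic stand-in for a continuous behavioral feature (hypothetical data)
behavior = rng.normal(loc=0.0, scale=1.0, size=200)

# Two alternative binary labelings of the same behavior (quantiles are arbitrary)
labels = {q: binarize(behavior, q) for q in (0.5, 0.8)}
for q, y in labels.items():
    print(q, y.mean())  # fraction of samples labeled 1 under each discretization
```

Each labeling could then be analyzed separately, which is one simple way to probe how sensitive a result is to the choice of threshold.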

Reviewer #2 (Public Review):

Summary:

Hernandez et al. propose a new statistical tool for identifying codewords in multivariate binary data (for instance, neural activity patterns) from a small number of measurements. They demonstrate the utility of the approach by analyzing the statistical structure of neural responses in songbirds and how they change in different contexts (during exploration vs. typical song production).

Strengths:

The approach is innovative in that it takes advantage of clever tools from sparse linear regression, in particular a method termed the Bayesian Ising Approximation (BIA), to identify codewords individually by comparing to a null model that assumes independence across dimensions, rather than directly estimating a model of their joint statistics. This has the advantage of yielding a very flexible model, with very few assumptions about the statistical structure of the data, that is applicable to a range of dataset sizes; the more data are available, the more of the underlying structure can be revealed.

The strong mathematical foundations provide clear bounds on the data regimes in which the approximation is theoretically well justified, and reasons to expect that the estimated models are minimal and interpretable.

The numerical estimation procedures are fast and computationally efficient (for a reasonably sized neural dataset, they can be run on a regular laptop).

The code is available on GitHub for quick community dissemination.

Application to identification of behaviorally relevant patterns of co-activity goes beyond previous Ising-based models used in neuroscience.

When applied to songbird data, it reveals that the variability in neural responses during exploration has much more structure than previously thought.
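The independence null model mentioned among the strengths can be illustrated with a minimal sketch. This is not the uBIA algorithm itself; it only compares how often a candidate word (a set of co-active units) is observed against the count expected if units were independent. The synthetic data, the planted word, and the helper `word_counts` are all hypothetical.

```python
from typing import Sequence, Tuple

import numpy as np

def word_counts(data: np.ndarray, units: Sequence[int]) -> Tuple[int, float]:
    """Observed vs. independence-expected count of samples where all `units` are active.

    data: (M, N) binary array, M samples by N units (e.g. time bins).
    """
    idx = list(units)
    M = data.shape[0]
    p = data.mean(axis=0)                        # marginal activation probability per unit
    observed = int(data[:, idx].all(axis=1).sum())
    expected = float(M * np.prod(p[idx]))        # null model: units are independent
    return observed, expected

rng = np.random.default_rng(0)
M, N = 400, 10
data = (rng.random((M, N)) < 0.2).astype(int)    # independent background activity

# Plant a hypothetical over-represented word: units 1, 4 and 7 co-active in 40 samples
planted = rng.choice(M, size=40, replace=False)
data[np.ix_(planted, [1, 4, 7])] = 1

obs, exp = word_counts(data, (1, 4, 7))
print(obs, round(exp, 1))  # observed count is well above the independence expectation
```

A raw comparison like this ignores overlaps between words and the multiple-comparisons problem, which is the kind of dependency the reviewed method is described as accounting for.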

Weaknesses:

Although the paper is written as a methods paper, emphasizing the technical contributions and promising wide applicability to a range of different types of datasets, the numerical validation of the method is very much restricted to the statistical regime of the songbird dataset. From the perspective of a potential future user of the tool, it is less clear how the method would behave on different datasets, and what needs to happen in practice to adapt the tool to data with different statistics.

We have edited the second half of the abstract and a few sentences in the Introduction (see latexdiff file) to make it clear that our main applications to date have been to songbird data.

The numerical comparison to other existing methods is minimal.

We have argued in our previous submission that there really are no other methods to compare to that are designed to work in a regime similar to that of uBIA. It seemed to us that it would be unfair to run other methods on our datasets, see them not work well (as expected, because they make assumptions that are invalid in our regime), and then claim success. However, since the concern has been raised again, we really have to address it. To do this, we added a section in the Online Methods, “Direct application of MaxEnt methods to synthetic and experimental data”, in which we compare uBIA to the relevant interactions model of Ganmor et al., with which uBIA has the highest similarity. The results are as expected: a method not designed for our data regime fails. We emphasize here again that the relative superiority of uBIA on these data should not be taken as a slight directed at other methods, but rather as an indication that, to cover different data regimes, multiple methods should be combined. We emphasized this in the “Overview of prior related methods in the literature” supplemental section.

The songbird analysis already reveals some challenges with respect to interpretability: in particular it is not clear how much information about the underlying neural processes can be revealed by summary statistics generated by the method, such as the number of codewords and their length distribution.

The reviewer is correct that our analysis of the songbird data raises a number of important questions for future studies. Although these remain to be answered, we emphasize that before the biological interpretation of over/underrepresented neural patterns can be attempted, such patterns must first be identified. uBIA therefore represents a crucial advance in our ability to address these questions.

Most conclusions are reasonably supported by the data. The analysis of the irreducibility of the codewords has insufficient support based on the numerical simulations. Moreover, the generality of the tool and comparison to other methods are discussed in almost entirely theoretical terms, which makes the claim on immediate utility for other datasets less convincing, especially outside the neuroscience community.

We hope that the addition of the new comparison figure partially alleviates these concerns. Additionally, we point out that 3rd- and 4th-order words are long by the standards of existing methods, most of which deal with just pairs, as illustrated in the new Figure 7. Indeed, it is not easy to fit an N = 20 Ising model with 4th-order terms, because there are 20 × 19 × 18 × 17/(4 × 3 × 2 × 1) = 4845 such terms in the model. These cannot be fit from just a few hundred samples, which is precisely why the Ganmor model fails in this case (Fig. 7).
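The count of 4th-order terms quoted in this response is the binomial coefficient "20 choose 4", which can be checked in a few lines. The helper name `n_terms` is ours, and the cumulative total over orders 1 to 4 is an added illustration rather than a figure from the response.

```python
from math import comb

def n_terms(n_units: int, order: int) -> int:
    """Number of distinct interaction terms of exactly `order` among `n_units` binary units."""
    return comb(n_units, order)

N = 20
fourth_order = n_terms(N, 4)
up_to_fourth = sum(n_terms(N, k) for k in range(1, 5))

print(fourth_order)   # 4845
print(up_to_fourth)   # 20 + 190 + 1140 + 4845 = 6195
```

With only a few hundred samples, the number of parameters exceeds the number of observations by more than an order of magnitude, which is the point the authors make about fitting such a model directly.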

Nonetheless, the idea is quite interesting and likely of broad interest for theorists interested in the development of unsupervised statistical tools for neural data analysis, with practical applicability for a range of modern systems neuroscience experiments that involve task specific ensembles as the building block of circuit computation.

Reviewer #1 (Public Review):

This work presents a new Bayesian method for detecting those patterns of neural responses that are connected to behavioral output and are worth investigating further. The manuscript contains the derivation of the approach and its test on synthetic and real neural data.

The derivation should be improved by providing additional steps. For example, it was not clear how Eq. 5 was derived and why the double derivative with respect to parameters theta_mu and theta_nu is present (these terms appear to be missing in the definition of log-likelihood).

Parameter M should be more clearly defined as the number of samples. It is briefly mentioned on line 170, but it was difficult to connect this to equation (7) and those following that use M explicitly.

Is it possible to include multiple binary quantifications of behavior, similarly to how words are constructed from neural spike trains? For example, one can envision describing a particular song segment with respect to multiple binary features simultaneously.

Less emphasis should be made in the abstract on applications outside of neuroscience, because these are not studied here explicitly. It is fine as part of the discussion, but in its present form the abstract suggests a stronger connection to protein sequences than is actually in the manuscript.

Reviewer #2 (Public Review):

Summary:

Hernandez et al. propose a new statistical tool for identifying codewords in multivariate binary data (for instance, neural activity patterns) from a small number of measurements. They demonstrate the utility of the approach by analyzing the statistical structure of neural responses in songbirds and how they change in different contexts (during exploration vs. typical song production).

Strengths:

- The approach is innovative in that it takes advantage of clever tools from sparse linear regression, in particular a method termed the Bayesian Ising Approximation (BIA), to identify codewords individually by comparing to a null model that assumes independence across dimensions, rather than directly estimating a model of their joint statistics. This has the advantage of yielding a very flexible model, with very few assumptions about the statistical structure of the data, that is applicable to a range of dataset sizes; the more data are available, the more of the underlying structure can be revealed.
- The strong mathematical foundations provide clear bounds on the data regimes in which the approximation is theoretically well justified, and reasons to expect that the estimated models are minimal and interpretable.
- The numerical estimation procedures are fast and computationally efficient (for a reasonably sized neural dataset, they can be run on a regular laptop).
- The code is available on GitHub for quick community dissemination.
- Application to identification of behaviorally relevant patterns of co-activity goes beyond previous Ising-based models used in neuroscience.
- When applied to songbird data, it reveals that the variability in neural responses during exploration has much more structure than previously thought.

Weaknesses:

- Although the paper is written as a methods paper, emphasizing the technical contributions and promising wide applicability to a range of different types of datasets, the numerical validation of the method is very much restricted to the statistical regime of the songbird dataset. From the perspective of a potential future user of the tool, it is less clear how the method would behave on different datasets, and what needs to happen in practice to adapt the tool to data with different statistics.
- The numerical comparison to other existing methods is minimal.
- The songbird analysis already reveals some challenges with respect to interpretability: in particular it is not clear how much information about the underlying neural processes can be revealed by summary statistics generated by the method, such as the number of codewords and their length distribution.

Most conclusions are reasonably supported by the data. The analysis of the irreducibility of the codewords has insufficient support based on the numerical simulations. Moreover, the generality of the tool and comparison to other methods are discussed in almost entirely theoretical terms, which makes the claim on immediate utility for other datasets less convincing, especially outside the neuroscience community.

Nonetheless, the idea is quite interesting and likely of broad interest for theorists interested in the development of unsupervised statistical tools for neural data analysis, with practical applicability for a range of modern systems neuroscience experiments that involve task specific ensembles as the building block of circuit computation.

Version published to 10.1101/849034 on bioRxiv
Nov 20, 2019