Domain-adaptive matching bridges synthetic and in vivo neural dynamics for neural circuit connectivity inference
Curation statements for this article: Curated by eLife
eLife Assessment
This article reports an algorithm for inferring the presence of synaptic connections between neurons based on naturally occurring spiking activity of a neuronal network. One key improvement is to combine self-supervised and synthetic approaches to learn to focus on features that generalize to the conditions of the observed network. This valuable contribution is currently supported by incomplete evidence.
This article has been reviewed by the following groups
Listed in: Evaluated articles (eLife)
Abstract
Accurately inferring neural circuit connectivity from in vivo recordings is essential for understanding the computations that support behavior and cognition. However, current deep learning approaches are limited by incomplete observability and the lack of ground-truth labels in real experiments. Consequently, models are often trained on synthetic data, which leads to the well-known “model mismatch” problem when simulated dynamics diverge from true neural activity. To overcome these challenges, we present Deep Domain-Adaptive Matching (DeepDAM), a training framework that adaptively matches synthetic and in vivo data domains for neural connectivity inference. Specifically, DeepDAM fine-tunes deep neural networks on a combined dataset of synthetic simulations and unlabeled in vivo recordings, aligning the model’s feature representations with real neural dynamics to mitigate model mismatch. We demonstrate this approach in rodent hippocampal CA1 circuits as a proof-of-concept, achieving near-perfect connectivity inference performance (Matthews correlation coefficient ∼0.97–1.0) and substantially surpassing classical methods (∼0.6–0.7). We further demonstrate robustness across multiple recording conditions within this hippocampal dataset. Additionally, to illustrate its broader applicability, we extend the framework to two distinct systems without altering the core methodology: a stomatogastric microcircuit in Cancer borealis (ex vivo) and single-neuron intracellular recordings in mouse, where DeepDAM significantly improves efficiency and accuracy over standard approaches. By effectively leveraging synthetic data for in vivo and ex vivo analysis, DeepDAM offers a generalizable strategy for overcoming model mismatch and represents a critical step towards data-driven reconstruction of functional neural circuits.
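The abstract quantifies performance with the Matthews correlation coefficient (MCC, ∼0.97–1.0 for DeepDAM versus ∼0.6–0.7 for classical methods). As a minimal illustration of what this metric measures for binary connectivity labels (connected vs. not connected), here is a self-contained sketch; the function and variable names are illustrative, not taken from the paper's code.

```python
# Minimal sketch: Matthews correlation coefficient for binary
# connectivity labels (1 = synapse present, 0 = absent).
import math

def mcc(y_true, y_pred):
    """MCC over paired binary label lists; returns a value in [-1, 1]."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention: undefined (all-one-class) cases are reported as 0.
    return (tp * tn - fp * fn) / denom if denom else 0.0

print(mcc([1, 0, 1, 0], [1, 0, 1, 0]))  # perfect prediction -> 1.0
```

Unlike accuracy, MCC stays near zero for chance-level predictions even when connected pairs are rare, which is why it is a common choice for the heavily class-imbalanced connectivity-inference setting.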
Article activity feed
-
Reviewer #1 (Public review):
Summary:
The authors propose a new method to infer connectivity from spike trains whose main novelty lies in its approach to mitigating the problem of model mismatch. The latter arises when the inference algorithm is trained on, or based on, a model that does not accurately describe the data. They propose combining domain adaptation with a deep neural network in an architecture called DeepDAM. They apply DeepDAM to an in vivo ground-truth dataset previously recorded in mouse CA1, show that it performs better than methods without domain adaptation, and evaluate its robustness. Finally, they show that their approach can also be applied to a different problem, i.e., inferring biophysical properties of individual neurons.
Strengths:
(1) The problem of inferring connectivity from extracellular recording is a very timely one: as the yield of silicon probes steadily increases, the number of simultaneously recorded pairs does so quadratically, drastically increasing the possibility of detecting connected pairs.
(2) Using domain adaptation to address model mismatch is a clever idea, and the way the authors introduced it into the larger architecture seems sensible.
(3) The authors clearly put a great effort into trying to communicate the intuitions to the reader.
Weaknesses:
(1) The validation of the approach is incomplete: due to its very limited size, the single ground-truth dataset considered does not provide a sufficient basis to draw a strong conclusion. While the authors correctly note that this is the only dataset of its kind, the value of this validation is limited compared to what could be done by carefully designing in silico experiments.
(2) Surprisingly, the authors fail to compare their method to the approach originally proposed for the data they validate on (English et al., 2017).
(3) The authors make a commendable effort to study the method's robustness by pushing the limits of the dataset. However, the logic of the robustness analysis is often unclear, and once again, the limited size of the dataset poses major limitations to the authors.
(4) The lack of details concerning both the approach and the validation makes it challenging for the reader to establish the technical soundness of the study.
Although in the current form this study does not provide enough basis to judge the impact of DeepDAM in the broader neuroscience community, it nevertheless puts forward a valuable and novel idea: using domain adaptation to mitigate the problem of model mismatch. This approach might be leveraged in future studies and methods to infer connectivity.
-
Reviewer #2 (Public review):
The article is very well written, and the new methodology is presented with care. I particularly appreciated the step-by-step rationale for establishing the approach, such as the relationship between K-means centers and the various parameters. This text is conveniently supported by the flow charts and t-SNE plots. Importantly, I thought the choice of state-of-the-art method was appropriate and the choice of dataset adequate, which together convinced me of the large improvement reported. I thought that the crossmodal feature-engineering solution proposed was elegant and seems exportable to other fields. Here are a few notes.
While the validation dataset was well chosen and of high quality, it remains a single dataset, and it remains a non-recurrent network. The authors acknowledge this in the discussion, but I wanted to chime in to say that for the method to be more than convincing, it would need to be tested on more datasets. It should be acknowledged that the problem becomes more complicated in a recurrent excitatory network, and thus the method may not work as well in the cortex or in CA3. While the method is shown to work on this particular dataset (plus the two others at the end), I was left wondering when the method breaks. And it should break if the models are sufficiently mismatched. Such a question can be addressed using synthetic-to-synthetic experiments. This was an important intuition, and an important check on the general nature of the method, that I was missing.
While the choice of state-of-the-art baseline is good in my opinion, I was looking for comments on the methods prior to it. For instance, methods based on GLMs have been used by the Pillow, Paninski, and Truccolo groups. I could not find a decent discussion of these methods in the main text and thought that both their acknowledgement and a rationale for dismissing them were missing.
While most of the text was very clear, I thought that page 11 was odd and missing much in terms of introductions. Foremost is the introduction of the dataset, which is never really done. Page 11 refers to 'this dataset', while the previous sentence was saying that having such a dataset would be important and is challenging. The dataset needs to be properly described: what's the method for labeling, what's the brain area, what were the spike recording methodologies, what is meant by two labeling methodologies, what do we know about the idiosyncrasies of the particular network the recording came from (like CA1 is non-recurrent, so which connections)? I was surprised to see 'English et al.' cited in text only on page 13 since their data has been hailed from the beginning.
Further elements that needed definition are Nsyn and i, which were not defined in the context of Equations 2-3: I was not sure if i referred to different samples or different variants of the synthetic model. I also would have preferred having the function f defined earlier, as it is defined for Equation 3 but already appears in Equation 2.
When the loss functions are described, it would be important to define 'data' and 'labels' here. This machine learning jargon has a concrete interpretation in this context, and making this concrete would be very important for the readership.
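To make the reviewer's point concrete, here is a minimal, hypothetical sketch of the two loss terms such a framework typically combines. This is an assumption about the general structure of domain-adaptive training, not the paper's actual objective: "data" would be spike-train features, "labels" the known connectivity of synthetic pairs, and the alignment term here is a simple mean-feature-distance stand-in for whatever domain-matching objective DeepDAM actually uses.

```python
# Hypothetical sketch of a domain-adaptive training objective:
# a supervised term on labeled synthetic pairs plus an unsupervised
# feature-alignment term on unlabeled in vivo recordings.
import numpy as np

def supervised_loss(pred_prob, labels):
    # Binary cross-entropy on synthetic pairs with known labels
    # (1 = connected, 0 = not connected).
    eps = 1e-9
    return -np.mean(labels * np.log(pred_prob + eps)
                    + (1 - labels) * np.log(1 - pred_prob + eps))

def alignment_loss(feat_synth, feat_invivo):
    # Penalize the squared distance between mean feature vectors of the
    # synthetic and in vivo domains (a crude proxy for domain matching).
    return float(np.sum((feat_synth.mean(axis=0)
                         - feat_invivo.mean(axis=0)) ** 2))

def total_loss(pred_prob, labels, feat_synth, feat_invivo, lam=1.0):
    # lam trades off fitting synthetic labels against aligning domains.
    return supervised_loss(pred_prob, labels) \
        + lam * alignment_loss(feat_synth, feat_invivo)
```

Spelling out which arrays play the role of "data" and which of "labels", as the reviewer asks, would indeed make the machine-learning jargon concrete for a neuroscience readership.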
While I appreciated that there was a section on robustness, I did not find that the features studied were the most important. In this context, I was surprised that the other datasets were relegated to supplementary, as these appeared more relevant.
Some of the figures have text that is too small. In particular, Figure 2 has text that is way too small. It seemed to me that the pseudocode could stand alone, and the screenshot of the equations did not need to be repeated in a figure, especially if their size becomes so small that we cannot even read them.