Domain-adaptive matching bridges synthetic and in vivo neural dynamics for neural circuit connectivity inference

Curation statements for this article:
  • Curated by eLife



Abstract

Accurately inferring neural circuit connectivity from in vivo recordings is essential for understanding the computations that support behavior and cognition. However, current deep learning approaches are limited by incomplete observability and the lack of ground-truth labels in real experiments. Consequently, models are often trained on synthetic data, which leads to the well-known “model mismatch” problem when simulated dynamics diverge from true neural activity. To overcome these challenges, we present Deep Domain-Adaptive Matching (DeepDAM), a training framework that adaptively matches synthetic and in vivo data domains for neural connectivity inference. Specifically, DeepDAM fine-tunes deep neural networks on a combined dataset of synthetic simulations and unlabeled in vivo recordings, aligning the model’s feature representations with real neural dynamics to mitigate model mismatch. We demonstrate this approach in rodent hippocampal CA1 circuits as a proof-of-concept, achieving near-perfect connectivity inference performance (Matthews correlation coefficient ∼0.97–1.0) and substantially surpassing classical methods (∼0.6–0.7). We further demonstrate robustness across multiple recording conditions within this hippocampal dataset. Additionally, to illustrate its broader applicability, we extend the framework to two distinct systems without altering the core methodology: a stomatogastric microcircuit in Cancer borealis (ex vivo) and single-neuron intracellular recordings in mouse, where DeepDAM significantly improves efficiency and accuracy over standard approaches. By effectively leveraging synthetic data for in vivo and ex vivo analysis, DeepDAM offers a generalizable strategy for overcoming model mismatch and represents a critical step towards data-driven reconstruction of functional neural circuits.
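
To make the domain-matching idea concrete, the following is a minimal sketch of domain-adaptive fine-tuning in the spirit described above: a feature extractor is optimized with a supervised loss on labeled synthetic data plus an unsupervised alignment term, here an RBF-kernel maximum mean discrepancy (MMD), that pulls the synthetic and in vivo feature distributions together. The network sizes, loss weighting, and placeholder data below are illustrative assumptions, not the architecture or objective actually used by DeepDAM.

```python
# Minimal sketch of domain-adaptive fine-tuning (illustrative only, not DeepDAM's
# actual architecture or losses): supervised loss on labeled synthetic data plus
# an MMD term aligning synthetic and unlabeled in vivo feature distributions.
import torch
import torch.nn as nn

def rbf_mmd(x, y, sigma=1.0):
    """Biased estimate of squared maximum mean discrepancy with an RBF kernel."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

feature_net = nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, 32))
classifier = nn.Linear(32, 2)  # connected vs. not connected
opt = torch.optim.Adam(
    list(feature_net.parameters()) + list(classifier.parameters()), lr=1e-3
)

syn_x = torch.randn(256, 100)        # placeholder labeled synthetic features
syn_y = torch.randint(0, 2, (256,))  # placeholder ground-truth connectivity labels
vivo_x = torch.randn(256, 100)       # placeholder unlabeled in vivo features

for step in range(200):
    feat_syn, feat_vivo = feature_net(syn_x), feature_net(vivo_x)
    loss = nn.functional.cross_entropy(classifier(feat_syn), syn_y) \
           + 1.0 * rbf_mmd(feat_syn, feat_vivo)  # alignment weight is an assumption
    opt.zero_grad()
    loss.backward()
    opt.step()
```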

Article activity feed

  1. Author response:

    General Response

    We thank the reviewers for their positive assessment of our work and for acknowledging the timeliness of the problem and the novelty of using domain adaptation to address model mismatch. We appreciate the constructive feedback regarding validation and clarity. In the revised manuscript, we will address these points as follows:

    (1) Systematic Validation: We will design and perform systematic in silico experiments to evaluate the method beyond the single in vivo dataset, including robustness tests regarding recording length and network synchrony.

    (2) Recurrent Networks & Failure Analysis: We will test our method on synthetic datasets generated from highly recurrent networks and analyze exactly when the method breaks as a function of mismatch magnitude.

    (3) Method Comparisons: We will report the Matthews Correlation Coefficient (MCC) for the approach by English et al. (2017) and expand our comparison and discussion of GLM-based methods.

    (4) Clarifications: We will rigorously define the dataset details (labeling, recording methodology), mathematical notation, and machine learning terminology ('data', 'labels').

    (5) Discussion of Limitations: We will explicitly discuss the challenges and limitations inherent in generalizing to more recurrently connected regions.

    Below are our more detailed responses:

    Public Reviews:

    Reviewer #1 (Public review):

    Weaknesses:

    (1) The validation of the approach is incomplete: due to its very limited size, the single ground-truth dataset considered does not provide a sufficient basis to draw a strong conclusion. While the authors correctly note that this is the only dataset of its kind, the value of this validation is limited compared to what could be done by carefully designing in silico experiments.

    We thank the reviewer for acknowledging the scarcity of suitable in vivo ground-truth datasets and the limitations this poses. We agree that additional validation is necessary to draw strong conclusions. In the revised manuscript, we will systematically design and perform in silico experiments for evaluations beyond the single in vivo dataset.

    (2) Surprisingly, the authors fail to compare their method to the approach originally proposed for the data they validate on (English et al., 2017).

    We agree that this is an essential comparison. We will report the Matthews Correlation Coefficient (MCC) result of the approach by English et al. (2017) on the spontaneous period of the recording.
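
    For reference, the MCC reported in these comparisons is the standard binary-classification metric computed from the confusion-matrix counts (true/false positives and negatives) over candidate neuron pairs:

    $$ \mathrm{MCC} = \frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP+FP)\,(TP+FN)\,(TN+FP)\,(TN+FN)}} $$

    It ranges from -1 to 1, with 1 indicating perfect detection and 0 chance-level performance, and it remains informative when connected pairs are rare, which is the typical regime in connectivity inference.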

    (3) The authors make a commendable effort to study the method's robustness by pushing the limits of the dataset. However, the logic of the robustness analysis is often unclear, and once again, the limited size of the dataset poses major limitations to the authors.

    We appreciate the reviewer recognizing our initial efforts to evaluate robustness. In our original draft, we tested the effects of recording length and network model choice, and analyzed failure cases. However, we agree that the limited real data restricts the scope of these tests. To address this, we will perform more systematic robustness tests on the newly generated synthetic datasets in the revised version, allowing us to evaluate performance under a wider range of conditions.

    (4) The lack of details concerning both the approach and the validation makes it challenging for the reader to establish the technical soundness of the study.

    We will revise the manuscript thoroughly to better present the methodology of our framework and the validation pipelines. We will ensure that the figures and text clearly articulate the technical details required to assess the soundness of the study.

    Although in the current form this study does not provide enough basis to judge the impact of DeepDAM in the broader neuroscience community, it nevertheless puts forward a valuable and novel idea: using domain adaptation to mitigate the problem of model mismatch. This approach might be leveraged in future studies and methods to infer connectivity.

    We thank the reviewer again for acknowledging the novelty and importance of our work.

    Reviewer #2 (Public review):

    While the validation data set was well chosen and of high quality, it remains a single dataset and also remains a non-recurrent network. The authors acknowledge this in the discussion, but I wanted to chime in to say that for the method to be more than convincing, it would need to have been tested on more datasets. It should be acknowledged that the problem becomes more complicated in a recurrent excitatory network, and thus the method may not work as well in the cortex or in CA3.

    We will carefully revise our text to specifically discuss this limitation and the challenges inherent in generalizing to more recurrently connected regions. Furthermore, to empirically address this concern, we will test our method extensively on synthetic datasets generated from highly recurrent networks to quantify performance in these regimes.

    While the method is shown to work on this particular dataset (plus the two others at the end), I was left wondering when the method breaks. And it should break if the models are sufficiently mismatched. Such a question can be addressed using synthetic-synthetic models. This was an important intuition, and an important check on the general nature of the method, that I felt was missing.

    We thank the reviewer for this insight regarding the general nature of the method. While we previously analyzed failure cases regarding strong covariation and low spike counts, we agree that a systematic analysis of mismatch magnitude is missing. Building on our planned experiments with synthetic data, we will analyze and discuss exactly when the method breaks as a function of the mismatch magnitude between datasets.
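
    As a purely illustrative toy, and not the planned experiments themselves, this kind of synthetic-synthetic analysis can be sketched as follows: a classifier is trained on one synthetic domain and evaluated on test domains whose feature distribution is shifted by an increasing offset, with the MCC tracked as a function of that mismatch magnitude. The Gaussian features, feature dimensionality, and shift values below are arbitrary assumptions.

    ```python
    # Toy sweep of inference performance versus train/test mismatch magnitude.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import matthews_corrcoef

    rng = np.random.default_rng(1)

    def sample(n, shift=0.0):
        y = rng.integers(0, 2, n)  # 0 = unconnected pair, 1 = connected pair
        x = rng.normal(loc=y[:, None] * 1.5 + shift, scale=1.0, size=(n, 5))
        return x, y

    x_train, y_train = sample(2000)
    clf = LogisticRegression().fit(x_train, y_train)

    for shift in [0.0, 0.5, 1.0, 2.0, 4.0]:  # increasing domain mismatch
        x_test, y_test = sample(2000, shift=shift)
        mcc = matthews_corrcoef(y_test, clf.predict(x_test))
        print(f"mismatch shift {shift:.1f}: MCC = {mcc:.2f}")
    ```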

    While the choice of state-of-the-art is good in my opinion, I was looking for comments on the methods prior to that. For instance, methods such as those based on GLMs have been used by the Pillow, Paninski, and Truccolo groups. I could not find a decent discussion of these methods in the main text and thought that both their acknowledgement and the rationale for dismissing them were missing.

    As the reviewer noted, we extensively compared our method with a GLM-based method (GLMCC) and with CoNNECT, whose superiority over other GLM-based methods, such as the extended GLM method (Ren et al., 2020, J Neurophysiol), has already been demonstrated in their papers (Endo et al., Sci Rep, 2021). However, we acknowledge that the discussion of the broader GLM literature was insufficient. To make the comparison more thorough, we will conduct comparisons with additional GLM-based methods and include a detailed discussion of these approaches.

    Endo, D., Kobayashi, R., Bartolo, R., Averbeck, B. B., Sugase-Miyamoto, Y., Hayashi, K., ... & Shinomoto, S. (2021). A convolutional neural network for estimating synaptic connectivity from spike trains. Scientific Reports, 11(1), 12087.

    Ren, N., Ito, S., Hafizi, H., Beggs, J. M., & Stevenson, I. H. (2020). Model-based detection of putative synaptic connections from spike recordings with latency and type constraints. Journal of Neurophysiology, 124(6), 1588-1604.
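
    For readers unfamiliar with this family of methods, the sketch below illustrates the generic idea behind GLM-based coupling estimation in the spirit of the Pillow, Paninski, and Truccolo line of work; it is not GLMCC, CoNNECT, or the extended GLM of Ren et al. The postsynaptic neuron's binned spike counts are regressed onto lagged presynaptic counts with a Poisson GLM, and a positive short-latency coupling weight is read as evidence for an excitatory connection. The simulated spike trains, bin size, and regularization below are placeholder assumptions.

    ```python
    # Generic Poisson-GLM coupling estimate between a putative pre/post pair
    # (illustrative placeholder data; not any specific published method).
    import numpy as np
    from sklearn.linear_model import PoissonRegressor

    rng = np.random.default_rng(0)
    n_bins, n_lags = 60_000, 10  # 60 s at 1 ms bins, 10 ms of presynaptic history

    pre = rng.poisson(0.02, n_bins)       # presynaptic spike counts per bin
    rate = 0.01 + 0.05 * np.roll(pre, 2)  # postsynaptic rate boosted ~2 ms after pre spikes
    post = rng.poisson(rate)

    # Design matrix: presynaptic counts at lags 1..n_lags bins (drop wrapped-around rows)
    X = np.column_stack([np.roll(pre, lag) for lag in range(1, n_lags + 1)])[n_lags:]
    y = post[n_lags:]

    glm = PoissonRegressor(alpha=1e-3, max_iter=1000).fit(X, y)
    print("coupling filter (weight per lag):", np.round(glm.coef_, 3))
    ```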

    While most of the text was very clear, I thought that page 11 was odd and missing much in terms of introductions. Foremost is the introduction of the dataset, which is never really done. Page 11 refers to 'this dataset', while the previous sentence was saying that having such a dataset would be important and is challenging. The dataset needs to be properly described: what's the method for labeling, what's the brain area, what were the spike recording methodologies, what is meant by two labeling methodologies, what do we know about the idiosyncrasies of the particular network the recording came from (like CA1 is non-recurrent, so which connections)? I was surprised to see 'English et al.' cited in text only on page 13 since their data has been hailed from the beginning.

    Further elements that needed definition are Nsyn and i, which were not defined in the context of Equations 2-3: I was not sure if it referred to different samples or different variants of the synthetic model. I also would have preferred having the function f defined earlier, as it is defined for Equation 3, but appears in Equation 2.

    When the loss functions are described, it would be important to define 'data' and 'labels' here. This machine learning jargon has a concrete interpretation in this context, and making this concrete would be very important for the readership.

    We thank the reviewer for these constructive comments on the writing. We will clarify the introduction of the dataset (labeling method, brain area, recording methodology) and ensure all mathematical terms (such as Nsyn, i, and function f) and machine learning terminology (definitions of 'data' and 'labels' in this context) are rigorously defined upon first use in the revised manuscript.

    While I appreciated that there was a section on robustness, I did not find that the features studied were the most important. In this context, I was surprised that the other datasets were relegated to supplementary, as these appeared more relevant.

    Robustness is an important aspect of our framework for demonstrating its applicability to real experimental scenarios. We specifically analyzed how synchrony between neurons, the number of recorded spikes, and the choice of network model influence the performance of our method. We also agree, however, that these analyses are limited by the single dataset we evaluated on. Therefore, we will test the robustness of our method more systematically on synthetic datasets.

    With the more extensive analyses on synthetic datasets, we believe that the results on inferring biophysical properties of single-neuron and microcircuit models should remain in the supplement, so that the main figures focus purely on synaptic connectivity inference.

    Some of the figures have text that is too small. In particular, Figure 2 has text that is way too small. It seemed to me that the pseudo code could stand alone, and the screenshot of the equations did not need to be repeated in a figure, especially if their size becomes so small that we can't even read them.

    We will remove the pseudo-code and equations from Figure 2 to improve readability. The pseudo-code will be presented as a distinct box in the main text.

  2. eLife Assessment

    This article reports an algorithm for inferring the presence of synaptic connection between neurons based on naturally occurring spiking activity of a neuronal network. One key improvement is to combine self-supervised and synthetic approaches to learn to focus on features that generalize to the conditions of the observed network. This valuable contribution is currently supported by incomplete evidence.

  3. Reviewer #1 (Public review):

    Summary:

    The authors proposed a new method to infer connectivity from spike trains whose main novelty relies on their approach to mitigate the problem of model mismatch. The latter arises when the inference algorithm is trained on, or based on, a model that does not accurately describe the data. They propose combining domain adaptation with a deep neural network in an architecture called DeepDAM. They apply DeepDAM to an in vivo ground-truth dataset previously recorded in mouse CA1, show that it performs better than methods without domain adaptation, and evaluate its robustness. Finally, they show that their approach can also be applied to a different problem, i.e., inferring biophysical properties of individual neurons.

    Strengths:

    (1) The problem of inferring connectivity from extracellular recording is a very timely one: as the yield of silicon probes steadily increases, the number of simultaneously recorded pairs does so quadratically, drastically increasing the possibility of detecting connected pairs.

    (2) Using domain adaptation to address model mismatch is a clever idea, and the way the authors introduced it into the larger architecture seems sensible.

    (3) The authors clearly put a great effort into trying to communicate the intuitions to the reader.

    Weaknesses:

    (1) The validation of the approach is incomplete: due to its very limited size, the single ground-truth dataset considered does not provide a sufficient basis to draw a strong conclusion. While the authors correctly note that this is the only dataset of its kind, the value of this validation is limited compared to what could be done by carefully designing in silico experiments.

    (2) Surprisingly, the authors fail to compare their method to the approach originally proposed for the data they validate on (English et al., 2017).

    (3) The authors make a commendable effort to study the method's robustness by pushing the limits of the dataset. However, the logic of the robustness analysis is often unclear, and once again, the limited size of the dataset poses major limitations to the authors.

    (4) The lack of details concerning both the approach and the validation makes it challenging for the reader to establish the technical soundness of the study.

    Although in the current form this study does not provide enough basis to judge the impact of DeepDAM in the broader neuroscience community, it nevertheless puts forward a valuable and novel idea: using domain adaptation to mitigate the problem of model mismatch. This approach might be leveraged in future studies and methods to infer connectivity.

  4. Reviewer #2 (Public review):

    The article is very well written, and the new methodology is presented with care. I particularly appreciated the step-by-step rationale for establishing the approach, such as the relationship between K-means centers and the various parameters. This text is conveniently supported by the flow charts and t-SNE plots. Importantly, I thought the choice of state-of-the-art method was appropriate and the choice of dataset adequate, which together convinced me of the large improvement reported. I thought that the crossmodal feature-engineering solution proposed was elegant and seems exportable to other fields. Here are a few notes.

    While the validation data set was well chosen and of high quality, it remains a single dataset and also remains a non-recurrent network. The authors acknowledge this in the discussion, but I wanted to chime in to say that for the method to be more than convincing, it would need to have been tested on more datasets. It should be acknowledged that the problem becomes more complicated in a recurrent excitatory network, and thus the method may not work as well in the cortex or in CA3.

    While the method is shown to work on this particular dataset (plus the two others at the end), I was left wondering when the method breaks. And it should break if the models are sufficiently mismatched. Such a question can be addressed using synthetic-synthetic models. This was an important intuition, and an important check on the general nature of the method, that I felt was missing.

    While the choice of state-of-the-art is good in my opinion, I was looking for comments on the methods prior to that. For instance, methods such as those based on GLMs have been used by the Pillow, Paninski, and Truccolo groups. I could not find a decent discussion of these methods in the main text and thought that both their acknowledgement and the rationale for dismissing them were missing.

    While most of the text was very clear, I thought that page 11 was odd and missing much in terms of introductions. Foremost is the introduction of the dataset, which is never really done. Page 11 refers to 'this dataset', while the previous sentence was saying that having such a dataset would be important and is challenging. The dataset needs to be properly described: what's the method for labeling, what's the brain area, what were the spike recording methodologies, what is meant by two labeling methodologies, what do we know about the idiosyncrasies of the particular network the recording came from (like CA1 is non-recurrent, so which connections)? I was surprised to see 'English et al.' cited in text only on page 13 since their data has been hailed from the beginning.

    Further elements that needed definition are Nsyn and i, which were not defined in the context of Equations 2-3: I was not sure if it referred to different samples or different variants of the synthetic model. I also would have preferred having the function f defined earlier, as it is defined for Equation 3, but appears in Equation 2.

    When the loss functions are described, it would be important to define 'data' and 'labels' here. This machine learning jargon has a concrete interpretation in this context, and making this concrete would be very important for the readership.

    While I appreciated that there was a section on robustness, I did not find that the features studied were the most important. In this context, I was surprised that the other datasets were relegated to supplementary, as these appeared more relevant.

    Some of the figures have text that is too small. In particular, Figure 2 has text that is way too small. It seemed to me that the pseudo code could stand alone, and the screenshot of the equations did not need to be repeated in a figure, especially if their size becomes so small that we can't even read them.