Modelling multicellular coordination by bridging cell-cell communication and intracellular regulation through multilayer networks

This article has been Reviewed by the following groups

Read the full article See related articles

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

Log in to save this article

Abstract

In multicellular organisms, cells with various roles and locations coordinate to provide systemic and cohesive response to perturbations. These complex behaviors emerge from a complex interplay between intracellular regulation and intercellular signals that mediate cell-cell communication. While single-cell technologies opened the possibility of studying both, most methods focus solely on one of these aspects. Thus, they are only able to partially recover in vivo and multicellular behaviors.

We here introduce ReCoN (REconstruction of multicellular COordination Networks from single-cell data), a framework combining intracellular gene regulation and cell-cell communication to provide insights into multicellular coordination from single-cell data. First, ReCoN infers from single-cell data a heterogeneous multilayer network containing both cell-type-specific intracellular subnetworks and ligand-receptor interactions. Through random walk with restart explorations, ReCoN then infers the response of each cell type to both intra- and extracellular perturbations, such as a gene knock-out or a cytokine, respectively.

ReCoN was evaluated on predicting the in vivo response of immune cell-types to different cytokines and on recovering cardiac cell-type response in heart failure. It highlighted the role of indirect effects, where cells emit secondary messengers in response to the initial perturbation to coordinate multicellular transcriptomic responses. Additionally, ReCoN predicted distinct fibroblast states emerging in different microenvironments reconstructed from spatial data. ReCoN provides an interpretable modeling framework for multicellular systems that allows for the simulation of perturbations, including the assessment of the cellular selectivity of these treatments in vivo . Ultimately, it can help design patient-specific molecular therapies.

Article activity feed

  1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

    Learn more at Review Commons


    Reply to the reviewers

    Manuscript number: RC-2026-03407

    Corresponding author(s): Laura Cantini, Julio Saez-Rodriguez

    [The "revision plan" should delineate the revisions that authors intend to carry out in response to the points raised by the referees. It also provides the authors with the opportunity to explain their view of the paper and of the referee reports.

    The document is important for the editors of affiliate journals when they make a first decision on the transferred manuscript. It will also be useful to readers of the reprint and help them to obtain a balanced view of the paper.

    If you wish to submit a full revision, please use our "Full Revision" template. It is important to use the appropriate template to clearly inform the editors of your intentions.]

    1. General Statements [optional]

    This section is optional. Insert here any general statements you wish to make about the goal of the study or about the reviews.

    We thank both reviewers for their thorough and constructive evaluation of our manuscript.

    Reviewer 1 highlighted that the manuscript would benefit from 1) a stronger positioning of ReCoN within the existing literature on multicellular modelling and network exploration, 2) a justification of our methodological choices, including the use of Random Walk with Restart (RWR), 3) the choice of input datasets for GRN inference and an assessment of the robustness of ReCoN's predictions to noise in these networks, 4) a more systematic exploration of ReCoN's parameter space (restart probability, layer transition probabilities, filtering thresholds).

    Reviewer 2 raised concerns about 1) the generalisability of the α parameter value (by default, 0.8) across independent datasets, 2) the expected contribution of the indirect effect in prediction performances, 3) the robustness of GRN across datasets and systems, and 4) the need for more quantitative validation in the spatial/microenvironment showcase. They also pointed out an unsupported claim regarding gene knockout prediction in the abstract.

    Several clarifications on figures, methods, and writing were also requested by both reviewers.

    As the main addition to the manuscript, we propose a new showcase based on the recently published Human Cytokine Dictionary (Oesinghaus et al., 2025). This showcase will simultaneously address several reviewer concerns by allowing us to 1) test the robustness and performance of α = 0.8 in an independent dataset, 2) evaluate the impact of different GRN inference methods (HuMMuS, SCENIC+, CellOracle, GRNBoost2) and noise on ReCoN's predictions..

    We will conduct a systematic parameter exploration on the Heart Atlas showcase, covering restart probability and inter-layer transition probabilities. We will additionally strengthen the validation of the microenvironment showcase by providing additional comparison to matched single-cell fibroblast data.

    Regarding the manuscript, we will substantially expand the discussion to better contextualise ReCoN within existing multicellular modelling approaches and the methods to justify our methodological choices (RWR/MultiXrank, dataset selection). We will remove the unsupported gene knockout claim from the abstract and reframe it as a future direction. In addition, we will clarify the distinction between ReCoN variants and rename them for clarity in the results section 1.2., improve figure legends. Finally, we will also work on the tool's documentation, including new tutorials on using spatial data and on running ReCoN with scRNA-seq-only GRN inference.

    We believe these revisions will substantially strengthen the manuscript and address the reviewers' concerns regarding method's robustness, generalisation, and contextualisation.

    2. Description of the planned revisions

    Reviewers' comments are in blue

    Authors' answers are in black

    Proposed text modifications are in green

    Reviewer #1

    R1.1. This is a very well-written paper; the methods used are adequate, and the use cases are relevant and broad, exploiting state-of-the-art datasets and tools.

    The author's claims are mostly justified. The authors could make an effort to more explicitly cite other efforts in similar directions. The claim 'We envision ReCoN as an extension to prior multicellular modelling, offering an interesting compromise between prediction of cell type responses and understanding of their molecular coordination.' is very general and could be better substantiated. In fact, the authors do not really give examples of alternative approaches to study systems of interacting cells, other than mechanistic agent-based models, which are clearly very different.

    Response:

    We thank the reviewer for pointing out the lack of contextualisation for ReCoN in this closing discussion.

    *We wanted to remind that ReCoN builds notably on multicellular factor decomposition methods. We also want to emphasise *the interest in completing cell communication methods that describe the big picture in multicellular interactions.

    *We proposed to **explicitly state these two points with such rephrasing: *

    Network-based representations of multicellular systems have been an active field for many years, from early conceptual cytokine networks (Frankenstein, Alon, and Cohen 2006) to curated ligand-receptor cascades of hematopoietic tissue (Kirouac et al. 2010, Qiao et al. 2014). In parallel, and from bulk RNA-seq, the consideration of tissue specificities in GRN inference has been another way to consider the importance of the context in molecular mechanisms reconstruction (Sonawane et al. 2017). Single-cell analysis allowed decomposing tissue composition and quantifying gene expression, opening the possibility of scaling the inference of these networks and the inference of multicellular mechanisms in general, to large sets of molecules. Several methods have been developed to recover multicellularity. A first direction extends ligand-receptor interaction inference into the receiver cell response through curated signalling cascades, yielding ligand to target cascades (Browaeys, Saelens, and Saeys 2020, Jin et al. 2021, Zhang et al. 2021, Yan et al. 2025). A second direction leverages spatial context through explainable multi-view models that decompose marker variation in both intra- and intercellular contributions (Arnol et al. 2019, Tanevski et al. 2022), without considering the mediating cascades. Finally, the more recent family of multicellular factor decomposition methods focuses on the coordinated aspect of cellular programs rather than on the mechanisms. ReCoN's methodology proposes a network-based approach based on single-cell data and the philosophy of this last group of methods. Indeed, ReCoN aims to retrieve links between molecular drivers and such coordinated multicellular programs by bridging and exploring CCC inference and GRN modelling (Badia-i-Mompel et al. 2023) within large and coherent heterogeneous multilayer network.

    Arnol D, Schapiro D, Bodenmiller B et al. Modeling Cell-Cell Interactions from Spatial Molecular Data with Spatial Variance Component Analysis. Cell Rep 2019;29(1):202-211.e6. https://doi.org/10.1016/j.celrep.2019.08.077.

    Badia-i-Mompel P, Casals-Franch R, Wessels L et al. Comparison and evaluation of methods to infer gene regulatory networks from multimodal single-cell data. Preprint, bioRxiv, 21 Dec. 2024, 2024.12.20.629764. https://doi.org/10.1101/2024.12.20.629764.

    Badia-i-Mompel P, Wessels L, Müller-Dott S et al. Gene regulatory network inference in the era of single-cell multi-omics. Nat Rev Genet 2023;24(11):739-54. https://doi.org/10.1038/s41576-023-00618-5.

    Browaeys R, Saelens W, Saeys Y. NicheNet: modeling intercellular communication by linking ligands to target genes. Nat Methods 2020;17(2):159-62. https://doi.org/10.1038/s41592-019-0667-5.

    Frankenstein Z, Alon U, Cohen IR. The immune-body cytokine network defines a social architecture of cell interactions. Biol Direct 2006;1(1):32. https://doi.org/10.1186/1745-6150-1-32.

    Jin S, Guerrero-Juarez CF, Zhang L et al. Inference and analysis of cell-cell communication using CellChat. Nat Commun 2021;12(1):1088. https://doi.org/10.1038/s41467-021-21246-9.

    Kirouac DC, Ito C, Csaszar E et al. Dynamic interaction networks in a hierarchically organized tissue. Mol Syst Biol 2010;6(1):MSB201071. https://doi.org/10.1038/msb.2010.71.

    Oesinghaus L, Becker S, Vornholz L et al. A single-cell cytokine dictionary of human peripheral blood. Preprint, bioRxiv, 15 Dec. 2025, 2025.12.12.693897. https://doi.org/10.64898/2025.12.12.693897.

    Qiao W, Wang W, Laurenti E et al. Intercellular network structure and regulatory motifs in the human hematopoietic system. Mol Syst Biol 2014;10(7):MSB145141. https://doi.org/10.15252/msb.20145141.

    Radig J, Droit R, Doncevic D et al. Tracking biological hallucinations in single-cell perturbation predictions using scArchon, a comprehensive benchmarking platform. Preprint, bioRxiv, 27 June 2025, 2025.06.23.661046. https://doi.org/10.1101/2025.06.23.661046.

    Sonawane AR, Platig J, Fagny M et al. Understanding Tissue-Specific Gene Regulation. Cell Rep 2017;21(4):1077-88. https://doi.org/10.1016/j.celrep.2017.10.001.

    Tanevski J, Flores ROR, Gabor A et al. Explainable multiview framework for dissecting spatial relationships from highly multiplexed data. Genome Biol 2022;23(1):97. https://doi.org/10.1186/s13059-022-02663-5.

    Yan L, Cheng J, Nie Q et al. Dissecting multilayer cell-cell communications with signaling feedback loops from spatial transcriptomics data. Genome Res published online 12 May 2025. https://doi.org/10.1101/gr.279857.124.

    Zhang Y, Liu T, Hu X et al. CellCall: integrating paired ligand-receptor and transcription factor activities for cell-cell communication. Nucleic Acids Res 2021;49(15):8520-34. https://doi.org/10.1093/nar/gkab638.

    R1.2. Moreover, the exploration of the multilayer networks with RWR is a very reasonable choice but could there be other approaches? I think the authors could discuss this issue to briefly support their choice of this method.

    Response:

    *It is a very relevant comment, as this choice has not been *discussed in the paper; we propose extending the method section about ReCoN's networks exploration with a justification about this choice.

    There is currently a limited set of network exploration methods that have been implemented for multilayer networks. It includes notably pymnet (Nurmi et al., 2024), natively adapted to heterogenous multilayer networks, and multinet (Bagavathi et al., 2019) and muxviz (De Domenico et al., 2015), initially developed for multiplexed networks (e.g. social network where the same set of nodes is present in each layer) but adaptable to more complex multilayer networks. However, to our knowledge, only MultiXrank proposes a robust measurement of proximity between each pair of nodes.

    Indeed, pymnet does not propose implementation for pairwise distance, similarly for muxViz*, which focuses on community and motif detection. Multi-net does propose pairwise distance based on shortest paths, but implements it only for nodes of the same multiplex (e.g. in our network, it would only be two genes, or two receptors, respectively). *https://www.rdocumentation.org/packages/multinet/versions/4.3.2/topics/multinet.distance

    We provide the additional justification for choosing RWR and MultiXrank over a reimplementation of another method or an extension of another method.

    1. *The total complexity of the RWR is O(δm) - when the number of nodes is negligible compared to the number of edges, with m the number of edges and δ the number of iterations in the walk (Baptista et al., 2022 - Supp Notes 2.A; Jin W. et al, 2019). This linear increase with the number of edges is particularly interesting for large networks, such as ReCoN ones that can contain several million edges. The number of iteration δ and the computational time increases inversely to the restart probability, which is an important factor to keep this probability high. *
    1. *MultiXrank is particularly interesting for its flexibility as it allows to easily attribute different weights to the different layers and to precise the direction of the exploration easily. *
    1. It also produces deterministic results by prolonging exploration until convergence.
    1. Additionally, in the context of ReCoN, the indirect effect of each cell is run independently. We previously extended the implementation of multiXrank for running RWR in parallel in a previous work (Trimbour et al., 2024), making it already adapted for optimising ReCoN's explorations.

    For all these reasons MultiXRank implementation seemed to be the best choice for robust and efficient exploration of ReCoN's HMLN.

    *Bagavathi, A., Krishnan, S. (2019). Multi-Net: A Scalable Multiplex Network Embedding Framework. In: Aiello, L., Cherifi, C., Cherifi, H., Lambiotte, R., Lió, P., Rocha, L. (eds) Complex Networks and Their Applications VII. COMPLEX NETWORKS 2018. Studies in Computational Intelligence, vol 813. Springer, Cham. *https://doi.org/10.1007/978-3-030-05414-4_10

    Manlio De Domenico, Mason A. Porter, Alex Arenas, MuxViz: a tool for multilayer analysis and visualization of networks, Journal of Complex Networks, Volume 3, Issue 2, June 2015, Pages 159-176, https://doi.org/10.1093/comnet/cnu038

    *Nurmi et al., (2024). pymnet: A Python Library for Multilayer Networks. Journal of Open Source Software, 9(99), 6930, *https://doi.org/10.21105/joss.06930

    Jin, Woojeong, Jinhong Jung, and U. Kang. "Supervised and extended restart in random walks for ranking and link prediction in networks." PloS one 14.3 (2019): e0213857

    R1.3. Generally the discussion should provide the reader the context in the existing literature in which the work can be set, detailing its impact. I think this could be improved.

    Response:

    We hope that the correction on the context proposed for comment R1.1 offers a first clarification on the context in the literature.

    We also propose to extend the description of ReCoN's impact with the following sentences in the discussion: "Unlike purely data-driven approaches, ReCoN contextualizes prior knowledge balancing both robustness through literature data, and specificity through new measurements. This mechanistic approach opens new possibilities for understanding how cellular coordination shapes tissue-level responses and for designing targeted molecular interventions."

    R1.4. Regarding the choice of datasets, it is clear that the method is quite demanding, requiring single cell and different omics to build the model, in addition to the expression dataset that is used as a use case. This inevitably leads to using a mix of datasets.

    For example in the mouse experiments the gene regulatory network was inferred from both a lymph node scRNA-seq dataset and a splenic scATAC-seq dataset, presumably due to the lack of multiome data in this setting. However the cell-cell communication network was inferred from the control case of the Immune Dictionary. Why can't the authors use the control data also for inferring GRNs?

    Is atac-seq really necessary in the inference of the GRN? What is the impact of the fact that lymph node and spleen samples might be different?

    :

    *Is it a very **interesting comment, and we propose to add both 1) an explanation about our dataset choice to generate the GRN as a Supplementary text, and 2) a new experiment about the effect of GRNs built from multi-omics and scRNA-seq alone. *

    1. Dataset choice

    We decided to infer a GRN using multiomics data, as these methods seem to perform better and are becoming the state of the art (Badia-i-Mompel et al. 2023, Trimbour, Deutschmann, and Cantini 2024, Yuan and Duren 2025).

    *As scATAC-seq data was not produced *for the Mouse Immune dictionary, we tried to find an external dataset, used HuMMuS, the method we previously developed, as it is also based on RWR and performs well on unpaired data.

    scATAC-seq

    Our first criteria was to match the mouse model used in the immune dictionary dataset, which reduced importantly the number of multicellular immune cell datasets available. We extended our research to a splenic dataset, as spleen is itself classified as a high specialised lymphatic structure, (check) and contains notably the same cell types than classical lymph nodes.

    scRNA-seq

    While we could technically use the control mice of the Immune Dictionary single-cel RNA-seq data with the spleen scATAC-seq data, the Immune Dictionary only provides 100 or less cells for each cell types per stimulation, which would results in a low number of cells. As GRN quality seems to depend a lot on the number of cell used, we favoured choosing a larger dataset.

    Our choice to use single-cell multiomics methods was driven by the novelty of these methods over scRNA-seq based ones, the performance improvement that they seemed to offer in several benchmarkings, and the will of developing a pipeline integrating the most complete data available for contextualization (Badia-i-Mompel et al. 2024).

    1. GRN impact over the Human Immune Dictionary

    While it does not relate directly to this showcase, we will also add a new dataset analysis, detailed in the the comment R1.12. In the Human Cytokine Dictionary showcase,, we propose exploring the effect of choosing different GRNs, built from external multi-omics data or from the control scRNA-seq data of the dataset itself. We hope it can partially help users to decide in general wether to use external datasets of higher quality or sample-specific datasets.

    Finally, we propose to add in the documentation of the tool, a section showing how to use ReCoN with only scRNA-seq for the GRN inference, and the performance of different GRNs for the Human Cytokine Dictionary dataset directly in the paper.

    R1.5. The code is very clear, we were able to install and run it and it is quite well-documented. However, a few more details should be given in the text regarding how the evaluation of the performance is carried out.

    For example: If I understand correctly, when predicting the impact of cytokine perturbations the ReCoN predictions of genes impacted are compared to differentially expressed genes identified through traditional DEG analysis. What is compared is the ranking of these genes from ReCoN with the ranking provided by DEseq2. There is no description of how this comparison of ranking gives rise to AUROC values. Also, is it just the ranking that is predicted or can they also estimate how well they can predict the effect size?

    Response:

    We are thankful for pointing out the unclear technical details. DEG results were binarised, to obtain the list of differentially genes using the thresholds indicated in the section 4.4.4. We considered a gene as perturbed in each cytokine treatment if the comparison of control and treated cells had a t-test p-value below 0.1 and if the log-fold change was above 1.

    *The second, and more general point of the reviewers, ReCoN scores should be considered to provide ranking on the possible regulations, but cannot be considered proportional to the effect size. *As they are represent a likelihood more than a score, the binarisation should be the most appropriate transformation for the validation

    *Moreover, as the scores can be seen as the probability to end up the exploration on each node, they are always summing to one. This also prevents interpreting the scores as the amplitude of change. As an illustration example: if a receptor regulates three genes identically, they would (hopefully) all be having a score of (1 - R)/3, R being the restart probability in ReCoN, whether their expression doubles or is multiplied by 10. *

    While it can legitimately be seen as a downside, we believe it is similar in practice to most methods inferring GRN methods in practice, where trying to predict the true amplitude of gene perturbations usually results in very low performances (Badia-i-Mompel et al. 2024).

    We propose changes related to this comment.

    1. We would modify the section 4.4.4. of the method with the following paragraph to explicit that it consists in a binary selection: "For each cytokine-cell type pair, differentially expressed genes were binarised: genes passing the significance thresholds (FDR P-val 1) were labelled as positives, and all remaining genes as negatives. ReCoN scores were then used to rank all genes, and AUROC values were computed from this ranking against the binary labels."
    1. We will also include a section "ReCoN scores interpretation" on the documentation website, as score interpretation precisions will be particularly useful for users.

    R1.6. When describing the use cases, I think a bit more detail would help.

    For example 'To identify the cell-type-specific genes associated with HF, we used the MOFAcell scores of the multicellular factor 1 (MCP1) reported in ReHeat236' I supposed the explanation is on the dataset but for the sake of clarity it would be good to expand this sentence to give at least an idea of the approach.

    Response:

    We completely agree that more explanations should be provided, to avoid for the reader having to switching between articles to understand the concepts behind this showcase. As suggested by the reviewer, we propose a general description of the approach with the short paragraph, and to remove the term "loading":

    "In the ReHeat2 study, the first multicellular factor (MCP1) was associated with heart failure. We used the gene loadings of MCP1 as a proxy for the cell-type-specific transcriptomic changes associated with heart failure, ranking genes by their absolute loading values."

    We also propose to complete the method section: "MOFAcell is a multicellular factor analysis method that decomposes multi-sample single-cell data into latent factors representing coordinated gene expression patterns across cell types. Each factor is characterised by cell-type-specific gene scores, reflecting their individual contribution to the coordinated program. In this showcase, we use the first multicellular program (MCP1), as it was associated with heart failure"

    R1.7. Regarding the calculation of the R matrix from the NichNet matrices L and G, I gather that the R matrix is calculated once and is thus fully data-independent and available just like the L and G matrices from NichNet. This was not very clear in the tutorials.

    Response:

    We are very thankful for the reviewers' involvement in testing the tools itself and its documentation. First, we propose a new website page explaining the pre-computed resources available for receptor - gene links, and added a descriptive paragraph in the tutorial themselves.

    *Second, we notice a typo in the equation, where it should actually be L = R * G with the current definition. We corrected it in the next version, and precised that R is fully data independent and solely inferred from prior knowledge. *

    R1.8. Also, this might just be a typo in the tutorial: 'The default α = 0.8 gives more weight to direct effects, which has been empirically validated. You can adjust this based on your biological question." I believe the manuscript says alpha>0.5 refers to indirect effects dominating.

    Response:

    *We corrected the saying in the tutorials. Indeed, a high alpha *represents a stronger indirect effect. Additionally, a similar typo was in the first equation of the paper, we are correcting it too.

    R1.9. Same for the pre-processing of the spatial data for the third use case, a little more details on how this was done would help the users and readers.

    Response:

    We propose adding a specific section about the spatial pre-processing and analysis in the methods.

    *We are also adding a tutorial on spatial data. Since spatial data *processing is computationally intensive without GPUs, we will also provide the data already processed, in order to allow anyone to test this tutorial too.

    R1.10. I don't see issues with the statistical power of the analysis.

    Rather, I think the authors should provide some examination of the parameter space for their model. Whereas ana analysis of the impact of the Alpha parameter is provided, I believe there are several more parameters that have a crucial impact and choices for their values should be discussed.

    For example 'In the GRN reconstruction only the links with a score above 1.5e-7 were retained in ReCoN's gene regulatory layer. How was this chosen?

    We have identified the following parameters that are somehow justified but could be explored to have a better feel for how they impact the results

    Restart probability: How often the walker goes back to the starting seed/molecule

    Layer transition probability: How often the walker stays in the same layer - different cell? - different layers? Gamma

    Node transition within a layer: How often one jumps to a different layer

    Response:

    This is a very valid point raised by the reviewer about parameters explorations.

    We focused on exploring the alpha (direct/indirect effect) parameter, as its value was the incertitude when designing the model.

    We would like to address this comment by adding new explorations for the restart probability and the transition probability between layers. The probability to transition between specific nodes inside a layer directly depends itself on 1) the restart probability, 2) the transition probabilities, and 3) the weights of the edges, that are determined before and independently to ReCoN's exploration.

    The Heart Atlas showcase allows to evaluate each set of parameters in around 10 min instead of 10h for the Immune Dictionary. We thus propose to evaluate restart probability and layer transition probabilities on the data of this showcase.

    • *We would explore *the restart probability of 0.1 * N, with N between 1 and 9.
    • For transitions probabilities we propose varying GRN, receptor, and cell communication importance with the following configurations: - Staying in CCC probabilities (- not jumping to receptor layer) among (0.1, 0.3, 0.5, 0.7, 0.9), staying in receptor layer (- not jumping to GRN) of (0.25, 0.5, 0.75), staying in GRN layer (- not jumping to CCC) of (0.25, 0.5, 0.75). It would result in 9 intracellular variations combined with 5 intercellular variations.

    *We envision an evaluation by measuring the correlation between the results of the different configurations, and the time before convergence of the results, as it could potentially *increase drastically when decreasing the restart probability. If correlations below 0.9 are observed between some results, we will compare their absolute performances.

    We* would include the** figures related to these explorations in the supplementary data. We would highlight the main findings in the method section dedicated to the random walk with restart. Finally, we would briefly describe the parameter exploration design in the first section of the results, for curious readers who would like to verify parameter choice before reading the showcases.*

    R1.11. Weighting parameters: How much weight for direct or indirect effect to account for the combined effect - alpha - this is the only one that is explicitly explored.

    Response:

    We are very thankful for this comment, and we decided to modify our tutorial guidelines to make this choice more intuitive and general.

    Indeed, 1.5e-7 would hardly make sense for most methods, which would not produce such low scores. We now propose to select the first 2 million connections of GRNs, in order to keep a complete or a large portion of the network if other methods than HuMMuS are applied.

    *In our case, 1.5e-7 was empirically determined from the distribution of HuMMuS scores, to keep *the 2 million top connections as HuMMuS networks are generally almost fully connected, which is a particularity for classical GRN inference methods, and keeping it entirely would make exploration time much longer.

    R1.12. Finally, this might be considered OPTIONAL but would greatly improve the work in our opinion:

    The method crucially depends on the networks that are used in the different layers and to connect layers and cell types. As we know, biological data is noisy and incomplete (FP and FN) at each level and in each datatype. It would be really useful to estimate what is the robustness of the results to this noise. Particularly, from personal experience, we think the GRNs reconstructed from data are often almost fully connected and it is exceedingly difficult to validate them in specific contexts. This means that some 'errors' are likely to be present.

    Since several methods exist for inferring GRNs one could simply compare the results using different methods for this part of the network.

    A related point involves the characteristics of the RWR algorithm, that will be quite impacted by the presence of hubs in these networks (either in single layers or across several) that is likely to impact the exploration. If proteins that are hub are effectively important, that is not a problem, but in some layers, for example, the receptor-receptor layer that presumably will contain PPIs, there might be biases in hubs being just better studied proteins, and these hubs might have an 'unjustified' weight in the walks.

    One potential approach to assess the robustness of the method to these issues could be an empirical one that just randomly perturbs the networks in ReCoN to see to what extent similar predictions are achieved.

    *Response: *

    *We are thankful for this relevant comment on GRN and prediction stability, and would like to take it as an opportunity to *support the hypothesis that different GRN methods can be used in ReCoN.

    When developing our previous HMLN-based tool, HuMMuS (Trimbour et al. 2024 - Supp Figure 6), we observed that its multilayer structure provided more robust results than individual layers. We would like to reproduce such an analysis, verifying that ReCoN results have less variability than the GRN layers individually.

    We propose to integrate a* new showcase on the Human Cytokine Dictionary (Oesinghaus et al. 2025), trying to predict cytokine downstream effects similarly to the Mouse Immune Dictionary showcase.*

    This showcase would be useful to confirm the contribution of the indirect effect and test the impact of different GRN on the results.

    We would generate different GRN with several other GRNs methods: SCENIC+, CellOracle, and GRNBoost2 - the latest using only the scRNA-seq of the control samples in the Human Cytokine Dictionary.

    *The GRN methods produce generally output with very low overlap (Badia-i-Mompel et al. 2024). *

    *If we observe high correlations between the ReCoN predictions associated with the different GRNS, it would provide already a validation of ReCoN's robustness to GRN noise. *

    If lower correlations between ReCoN's predictions are obtained, we will add a specific permutation experience over the HuMMuS GRN, creating different level of artificial noise and assessing more precisely the robustness of ReCoN to GRN stochasticity.

    *Regarding PPI hub justification, our **applications did not use receptor PPI and are not affected by bias at this level in the showcases. This bias could specifically be present in the receptor-gene links, as we derive it from the ligand-gene connections of Nichenet which was itself partially based on prior knowledge. It is thus possible that some receptor are reached more often due to this bias and not a stronger effect. It seems however, hard to control in this context, as ReCoN currently relies on this prior knowledge. Currently, we hope that the combination of personalised, literature-agnostic GRN with literature-based receptor - gene can provide an interesting trade-off. In future development, we could imagine a receptor-gene network based solely on perturbations, but it would require controlling also the bias of ligand - receptor binding couples, which limits even the use of ligand-based experience. *

    We propose adding a short point in the discussion about hub effects from RWR-based methods.

    R1.13. Please add page numbers.

    *Response: *

    *We *will add the page numbers.

    R1.14. Figures are nice and clear.

    Some specific minor points are listed here below.

    Define hMLN on first appearance fig1 caption (no page numbers..

    2nd appearance heterogeneous multilayer structure (HMLN) ...

    Response:

    *We updated the legend of the figure to include the *definition of the acronym, as it arrives before first text occurrence. (Or define at both positions ?)

    R1.15. Bi_j not so clear to what it refers when first mentioned

    Response:

    *Bi_j represents a weight that can be attributed to favour some cell-to-cell transitions. It is usually not necessary to use them.

    *It is of interest notably to model 1) known spatial patterns in situ and hypothesis/design where cell types favour some connections. *

    E.g.: for modelling the skin, a user might notably want to increase connections between epidermic and dermic cells, and between dermic and hypodermic cells.

    We propose a new explanation of Bi_j to both explain it's meaning in the modelling, and illustrates situations for using it: "The coefficient B_{i,j} modulates the influence of cell type i on cell type j in the indirect effect computation. By default, all B_{i,j} are set to one, weighting each cell type's contribution equally per cell. However, it can be adjusted to encode additional biological knowledge, such as spatial proximity between cell types or known cooperation patterns. For instance, when modelling the skin, a user might increase B_{i,j} between epidermal and dermal cells, and between dermal and hypodermal cells, to reflect their spatial organisation."

    R1.16. personalized interaction specificity. - maybe better word than personalised (contextualised?)

    Response:

    We agree that contextualised explicits better the meaning behind this model. Personalised might notably lead to expect patient-specific data, which is not the case here.

    We propose to rephrase all the model names to : Receptor-matrix, ReCoN-no-CCC, ReCoN-no-context, ReCoN-complete.

    R1.17. ReCoN-genetic and ReCoN, ( generic?)

    Response:

    We will correct this typo.

    R1.18. responses. It is expected to observe common behaviors in-between cell-type, that the GRN

    and the generic CCC network already contribute captures.

    • not very clear

    Response:

    We aimed here to provide an explanation to the already good performance of the "ReCoN-no-context" (or its name updated according to comment R1.16), which could be surprising as no cell-type specific information is used. The explanation proposed is the good prediction of several properties shared by all immune cell types, such as similar metabolic pathways, despite their specific roles. If we adopt a quantitative view on their transcriptome like in this showcase, it can be expected that the cell type responses are relatively well predicted through the common properties only.

    As this is a very relevant comment, and that several comments pre-submission we received were also related to this result, we would like to keep an explanatory sentence.

    R1.19. Figure 2b the icon of cells with double arrows might suggest phenotype shift when instead this is just communication

    Response:

    (left side) We are very thankful for paying attention to the details of the paper and fully agree with this analysis. We propose to represent ligand emission instead of arrows, reusing the convention of the Figure 1.

    R1.20. eTACs explain acronym and what they are

    Response:

    We update the first occurrence of eTACS to extrathymic Aire-expressing cells (eTACS).

    R1.21. Due to very few genes being differentially

    expressed, only cDC1 was conserved and evaluated for IL22,

    Not so clear

    Response:

    As we are commenting on IL22 stimulation results, we reorganised the sentence to make it less convoluted: "For IL22 stimulation, only cDC1 presented enough genes being differentially expressed."

    R1.22. In this showcase (not very clear, use case?)

    Response:

    *We **perceive "use case" as describing a type of use for the method, while a show case is a specific example of a use case. *We thus find showcase more appropriate here. We will however go over all use of the word, to be sure it is only used for the precise examples we provided, and not to describe "use cases".

    R1.23. different fibroblast specializations - maybe phenotypes?

    Response:

      • It is a very good suggestion, as specialisation would involve functional aspects (that we can't really be sure of), and a chronological evolution*
    1. Phenotype generally includes numerous properties, such as morphology, that we cannot validate here. We think the use of phenotype might be stronger than specialisation here. To simplify, phenotype can work, to be more precise: transcriptomic specialisation? I am honestly not sure of the best change here.

    R1.24. Figure 4b

    1. b) Schematic view of the deconvolution process and cell type-specific count inference from the spatial niches.

    Not so clear what the heatmap shows, rows and columns

    Spots heatmap : label niche on rectangles in cols

    And each col is a spot

    Rows are cell types or cells?

    In the cell types x spot

    Response:

    *This figure can indeed *benefit strongly from legend modifications. On both matrix, lines represent the genes, while columns represent the spot / individual cells deconvoluted per spots

    1. We would annotate the niche legend (here the colour surroundings) by a symbolic drawing instead of writing it on the matrix

    Legend "genes" on the first matrix

    Write deconvolution ON the figure directly

    R1.25. Cell2location. Add reference, maybe explain basic functionality?

    Response:

    *Cell2location was *not referenced in the results section, and was only referenced in the section 4.6.2 of the methods, as the 72th citation. We corrected this oversight, and propose 1) a brief explanation of deconvolution right before, 2) a brief explanation of Cell2location particularity in inferring individual cell profiles - which is not common in spatial deconvolution.

    R1.26. reconstructing different patients, tissues, and microenvironments to predict

    context-specific molecular treatments.

    Unclear

    fibrosis in different - at

    molecular levels

    Response:

    *We *will modify this section title according to the reviewer's citation and the different reformulation.

    R1.27. Figure 5d myeloid and endothelial colour code inversed from 5 BC

    Response:

    The legends are individually correct, but there is no reason to not make them coherent across panels. We will update the legend of the panel 5.d..

    R1.28. 5d indicate important pathways in organe should not change the colour of the nodes (purple=common, blue or green specific). Use border colour maybe?

    Response:

    *We had *forgotten to precise the colour code of this panel, where the choice of orange highlighted here the gene set related to molecular pathways instead of functional annotations. As the name already explicits pathway, we now think that the orange background is redundant informations and may create some confusion. We thus would like to update Wnt and TNFA pathways backgrounds to ___ (more enriched in cell type), and purple (significantly enriched in all cell types).

    R1.29. 5e is not a venn diagram

    1. e) Venn diagram showing the overlap between transcription factors (TFs) predicted by ReCoN (green) and those previously

    implicated in fibrosis (orange) or cardiac diseases (violet). Only the top 10 TFs were annotated from literature

    sources; full sizes of fibrosis- and cardiac disease-related receptor sets can therefore not be represented.

    1. f) also not a venn diagram e/f now in supp

    the "NABA ECM collagens" gene set. Nodes are

    grouped by molecular type (e.g., transcription factors, receptors, ligands), and links represent the weighted,

    direct regulatory interactions present in the ReCoN-constructed

    Response:

    *As the diagrams do not indicate the total number of receptor/TF that are in the literature, it cannot be Venn diagrams. We updated the legend to :Venn diagram showing the *Overlapp between [...]

    As we reorganised the paper, these plots are now only in supplementary; we removed the duplicate occurrence in the figure 5 legend.

    R1.30. Why Sankey plot? Normally sankey plot represents flow (of regions changing from 1 state to another) but here this is just a weighted network?

    No communication from firbos back to other cell types? No communication between ventricular/myeloid/lymphoid?

    Response:

    *We are thankful for this useful feedback which helped us realising *interesting details were missing from the paragraph.

    *This is only intended for visualising regulatory cascade, so users have to decide on one receiving cell, a set of target genes, and sending cells. It includes a specific subset of regulatory cells, and only their interactions with the target cells. Here, we illustrated the regulation of some ECM genes produced by fibroblast. *

    *Sankey Diagram might indeed not be the clearest representation, as we are not modelling the all diffusion, and not a flow per se. We propose to replace by another representation that we hope *will be more intuitive for biologists (and more aesthetic), such as illustrated below:

    R1.31. as a extension to - an

    underrepresented in the current. - current framework?

    Response:

    framework works* perfectly to fill the missing word in the sentence*

    R1.32. However, it can't represent more - cannot

    Borrowing representation from hypergraphs, which introduces

    The network exploration implementation of ReCoN also present some limitations.

    limitations. While random walks

    with restarts offer a stable and fast exploration workflow for multilayer networks, it

    currently only considers positive weights to predict regulation strengths. It involves that the

    nature of the regulation, as activation or inhibition, has to be identified a posteriori.

    • check concordance/grammar

    Response:

    We will update the raised grammatical errors

    R1.33. Only the nodes that are included in one of the layers are present in the

    final results, ignoring the ones present only in bipartites.

    Unclear

    Response:

    Layers and bipartites are treated differently by the algorithm, and layer presence is necessary to appear in the results.

    In practice, it just means that receptors/ligands not paired in the CCC, or genes not regulated by any TF in the GRN, won't appear.

    *We propose clarifying *with this second explanation

    "In practice, a node must have at least one connection in its layer to appear in the final results. It thus means that receptors or ligands absent from the CCC network and genes not targeted by any transcription factor in the GRN will not receive a score from the random walk exploration."

    R1.34. a scATAC - an

    Barsi et al is published https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013188

    Response:

    We updated the reference with the published article.

    R1.35. effects, allowing for modulating in a second

    time their contribution. - word order

    Response:

    We propose to formulate "allowing in a second time to modulate their contribution"

    R1.36. others. However, it is possible to adjust the Beta coefficient to

    represent it based on the available information for each dataset.

    Represent- adjust?

    Response:

    We agree with the reviewer's suggestion to use adjust.

    R1.37. We use the latter to compare the different models. - what is the latter?

    Response:

    The latter referred to the 25 cytokines of the Immune Dictionary which had at least one connection in the inferred cell communication network with CellPhoneDB. We propose clarifying this formulation to "..."

    R1.38. It resulted in the scRNA-seq in 1,789 cells with 13,167

    genes, and for the scATAC-seq in 3,759 cells with 254,545 regions.

    Check english

    Response:

    *We propose replacing this sentence by the following: *"It resulted in a scRNAseq dataset of 1,789 cells with 13,167 genes, and a scATACseq dataset of 3,759 cells with 254,545 regions."

    R1.39. GRETA pipeline.- reference

    Response:

    We added the citation to the paper of the GRETA pipeline in the section 4.5 of the methods: "Badia-i-Mompel et al., 2026"

    R1.40. We kept all the cells whose annotations through unsupervised clustering,

    followed by marker gene annotations, through scANVI were coherent.

    Word order

    Response:

    We propose the following reformulation to correct the sentence: "We kept all cells whose annotations were coherent between unsupervised clustering with marker-gene labelling and scANVI-based label transfer"

    R1.41. In parallel, pairs of ligands and receptors with both associated with scores above

    an absolute gene loading of 0.1 were considered potential driver interactions in HF.

    Unclear

    Response:

    In the MOFAcell results, factors correspond to linear combination of genes that explain a large part of the data variance; the contribution of each gene is called loading. We chose the factor that classified the best patient with and without fibrosis, and kept all the top genes, all of those with a score above 0.1.

    We propose reformulating this sentence as the word "loading" could overcomplicate here for most readers: "To identify the ligand and receptors driving heart failure, we considered all of those with an absolute contribution to the multicellular factor of 0.1."

    R1.42. gseapy Python - reference?

    Response:

    The gseapy package was indeed not cited, we now include the citation : "Zhuoqing Fang, Xinyuan Liu, Gary Peltz, GSEApy: a comprehensive package for performing gene set enrichment analysis in Python, Bioinformatics, 2022;, btac757, https://doi.org/10.1093/bioinformatics/btac757"

    R1.43. and to calculate average for each spatial context the average cell type expression.

    Unclear

    Response:

    we propose to reformulate the sentence to: "These cell-type-spot profiles were used later for each spatial context to create a specific cell-cell communication networks and to calculate cell type average expressions."

    R1.44. We only used the loadings of all cell

    types but the fibroblasts to consider the effect of the sole environment.

    Unclear

    Response:

    we propose to use "APART from the fibroblast" to clarify the sentence and "to ONLY consider the environment effect".

    R1.45. We realised a downstream - performed

    Response:

    We fully agree with the reviewer's suggestion.

    R1.46. The profiles inferred by ReCoN were first very correlated in all three contexts. - unclear

    Response:

    The sentence was missing clarity and deserved being rephrased. We propose: "When looking at the absolute scores of ReCoN in all three contexts, results were initially highly correlated. To focus on context-specific differences, enrichments were performed using the log-ratio of each context profile over the mean of the other profiles."

    R1.47. Potentially the closest results are models that can predict the effect of perturbations on cell line cultures. Several approaches in the literature employ either transformers or optimal transport to predict the effect of perturbations in single cell datasets. One of the main issues is an underlying necessary assumption that the perturbation effect will be larger than the heterogeneity (in cell lines for example), which becomes increasingly difficult when considering in-vivo experiments. ReCoN obviously goes beyond this by considering explicitly the presence of different cell types but distinctions of cell types are sometimes quite arbitrary and potentially application of ReCoN to some of the in-vitro culture datasets, even on cell lines, could be a way to test its performance and benchmark it against other methods.

    The main bottleneck in the application of this framework to 'personalisation' of therapies, mentioned even in the abstract as a potential future goal for such an approach, will be the lack of data. This approach requires single cell level descriptions of the system at hand, plus additional datasets to build the model structure. To a certain extent, public data of related tissues/contexts can be used, but it will be necessary to test the dependence of performance on coherence of the input data to develop sufficient trust to use it for new predictions, especially in a medical field.

    We thank the reviewer for these reflections, which raise several distinct points that we would like to add in the discussion.

    Cell line perturbation is indeed a close and active field of research, with notably numerous models based on optimal transport and VAE and relevant benchmarks(Radig et al. 2025)*. In our view, ReCoN tries to take a complementary angle, by both focusing on the environment effect and using a network-driven approach providing explainability. *

    These perturbation methods are typically benchmarked on single cell line screenings, where cell-cell communication is highly limited or absent by design, while ReCoN is specifically designed to exploit multiple cell types interactions. Furthermore, ReCoN relies on a network that aims to provide only explainable hypotheses and molecular cascades. They also typically learn from different data, as ReCoN only uses single-cell data and best perturbation prediction methods learn from a subset of perturbation experiments.

    Exploring the performance of ReCoN in perturbation predictions would require designing extensive comparisons with the state-of-the-art taking into account all these nuances which we believe goes outside of the scope of the present study. It however still raises a fundamental question for the development of the next methods and the need to assess whether the perturbation effect is actually larger than the heterogeneity, and we propose to extend the discussion to cover these aspects.

    Secondly, this comment raised a point about cell type definition, which can be a hard task and sometimes a wrong description of cells heterogeneity. We note that even if ReCoN relies on grouping cells in some way, it does not impose any particular cell type ontology: users can define their own cell types or cell states, since the CCC layer is typically inferred from single-cell RNA-seq alone and does not require canonical cell-type annotations. This flexibility allows ReCoN to accommodate finer or coarser groupings depending on the biological question. We do not propose a framework to take into account diversity in other ways than homogeneous clusters of cells, but we think that it constitutes an interesting future development of ReCoN or new multicellular modelling methods.

    Lastly, we fully agree that an important limitation for ReCoN's use is data availability and generation, which was also a limitation when identifying datasets for the manuscript's applications. We hope that the development of open source atlases will make it easier to leverage tissue-specific prior knowledge and increase potential application, prediction performances, and trust in ReCoN results.

    In conclusion, we propose to state in the discussion two new points:

    *1) extending multicellular perturbations (including gene knock-out) to conditions where cell types cannot be defined prior to the analysis, or are more to consider across a spectrum, will be an interesting future direction. *

    2) there is new a need for broad benchmarks covering both multicellular and single-cell line tasks to evaluate the trade-off between accounting for cell heterogeneity and overall prediction accuracy.

    Radig, J., Droit, R., Doncevic, D. et al. scArchon: a scalable benchmarking framework for assessing single-cell perturbation models. Genome Biol 27, 162 (2026). https://doi.org/10.1186/s13059-026-04104-z

    R1.48. The authors could comment on how their method compares to others that do not require single cell level information. Despite clear differences, it might be important to show the advantage of using this more complex approach that requires data that is less available. Given the ease with which bulk profiles can be constructed from single cell data, it might be possible to compare the approaches directly. For example, see

    1. Wang, S. Patkar, J.S. Lee, E.M. Gertz, W. Robinson, F. Schischlik, D.R. Crawford, A.A. Schäffer, E. Ruppin Deconvolving Clinically Relevant Cellular Immune Cross-talk from Bulk Gene Expression Using CODEFACS and LIRICS Stratifies Patients with Melanoma to Anti-PD-1 Therapy

    Mike van Santvoort, Óscar Lapuente-Santana, Maria Zopoglou, Constantin Zackl, Francesca Finotello, Pim van der Hoorn, Federica Eduati,

    Mathematically mapping the network of cells in the tumor microenvironment,

    Cell Reports Methods 2025

    We propose to extend the discussion with additional methods, notably from before single-cell technology developments. We did not plan to include this two specific methods, as to our knowledge, they don't provide output directly comparable to ReCoN's purpose.

    • The first work proposes to deconvolute the bulk RNA-seq profile into cell-type-specific expression profiles. It is an interesting reference, as it could allow applying ReCoN even to bulk RNA-seq, but they do not provide comparable results, as their final task corresponds to inferring the ligand-receptor interactions, without providing downstream molecular mechanisms.
    • The second method proposed in this paper, RaCInG builds cell-to-cell networks for individual patients. They do not explore the molecular interactions inside the cells themselves, which could be used to build personalised ReCoN's model but seem to be more a prior to recent CCC than ReCoN itself.

    Reviewer #2

    R2.1. It is not clear how well it performs in independent validations. Authors showed that it can predict the effect of cytokine perturbations in the immune dictionary by selecting an optimal alpha. Authors should validate that using the same alpha value of 0.8, it is possible to accurately predict the effect of cytokine perturbations in independent datasets. This is particularly concerning for cytokine-cell type pairs where the optimal alpha is not known. Therefore, the potential utility of Recon to estimate the effect of multicellular perturbations is not well established.

    Response:

    *The reviewers raised a very relevant point by pointing out that the alpha coefficient might vary between datasets. *

    *The value of 0.8 was chosen because it produced the best results in two independent datasets, the immune dictionary and the heart failure showcases. We could here observe some cross-dictionary reproducibility. To complete these findings, we will also verify that 0.8 provides the best performance in a new showcase: the Human Cytokine Dictionary *(Oesinghaus et al. 2025)

    We tried to contrast this choice by opening on the need to confirm the importance of the indirect effect. We propose to add a sentence explicitly commenting on the impact of these new findings on the alpha coefficient and its robustness value.

    It is also accurate to say that ReCoN cannot currently estimate the alpha parameter autonomously. We proposed this default value as it worked on both datasets, but it is possible that no default value could fit them all. The value of alpha is currently a default value, but users are completely free in the current implementation of ReCoN to modify its value depending on their needs

    If it was not the case, one option could be to fit its value using similar prior perturbations, when such data is available. For example, perturbing one or a few cytokines, a user could choose the value that explained the best the gene expression responses.

    R2.2. Authors claimed that optimal alpha value of 0.8 implies the dominance of indirect effect. But in contrast to this claim, the performance across cytokine-celltype pair only increased from 0.72 to 0.76, which seem to imply that indirect effects do not add much.

    *Response: *

    The range of performance improvement is an interesting point to discuss for us, as it roughly doubles the computational time and consequently a trade-off between resource usage and this improvement.

    While the average improvement from combining the direct and indirect effects observed on the first showcase was around 5%, it reached more than 10% in some cell types. We consider that it still corresponds to an interesting improvement for the current task. Indeed, it here "only" incorporates the coordination of immune cells to a cytokine stimulation, which should not necessarily change their profile drastically compared to isolated exposition.

    R2.3. How does the cell-type specific effects prediction perform by just considering the intracellular layers? The authors constructed multiple variants of ReCoN to estimate unicellular and multicellular effects. How is the variant ReCoN-grn different from full ReCoN where gamma is set to zero.

    *Response: *

    We are thankful for this comment, which will help to restructure the section 2.2.

    As the ReCoN-GRN differs from the full ReCoN model, even with a gamma value of 0, as the latest include ligand-to-receptor weights. However, the ReCoN-GRN would correspond to the ReCoN-generic with an alpha of 0, which does not weight ligand-to-receptor links.

    We propose to clarify this detail in the section 2.2.2 by adding after the introduction of the ReCoN-generic model the sentence: "Note that ReCoN-grn corresponds to the ReCoN-generic model with alpha set to zero, where no indirect effects are considered. It differs from the full ReCoN model with alpha set to zero, which still includes ligand-to-receptor weights through the receptor-gene bipartite network."

    R2.4. In section 2.2, authors assert that if matching datasets are not available, GRN layer can be extracted from other datasets. How well does the GRN layer from one system generalizes to the other system in terms of perturbation prediction?

    *Response: *

    It is, of course, a complex question, as it probably strongly depends on the studied system. However, we believe while it is important to consider similar systems, using the same samples for the cell-communication and the GRN layer is not necessary.

    The first showcase that we propose explores exactly this case. We built the GRN from two unpaired datasets, and the cell communication from a third one. It provided convincing performances, justifying our earlier claim. It is additionally something done in most methods contextualising prior knowledge, which usually comes from other samples and sometimes even other organs (Browaeys, Saelens, and Saeys 2020, Jin et al. 2021, Badia-i-Mompel et al. 2023).

    To provide additional insights, we will run the new Human Cytokine Dictionary showcase using both 1) multiomics methods on external PBMC datasets, and 2) a single-cell RNA-seq only method on the Human Dictionary directly. We will then be able to show performances using both data and corresponding methods.

    To justify more clearly our claim according to reviewer's comment, we propose highlighting in the showcase itself this justification: ".... this showcase highlights the possibility to combine networks obtained from distinct datasets...".

    Related to combining datasets, we propose to clarify the reasons behind our choices for the Immune Dictionary showcase with the additional supplementary text proposed in response to the comment R1.4.

    Badia-i-Mompel P, Wessels L, Müller-Dott S et al. Gene regulatory network inference in the era of single-cell multi-omics. Nat Rev Genet 2023;24(11):739-54. https://doi.org/10.1038/s41576-023-00618-5.

    Browaeys R, Saelens W, Saeys Y. NicheNet: modeling intercellular communication by linking ligands to target genes. Nat Methods 2020;17(2):159-62. https://doi.org/10.1038/s41592-019-0667-5.

    Jin S, Guerrero-Juarez CF, Zhang L et al. Inference and analysis of cell-cell communication using CellChat. Nat Commun 2021;12(1):1088. https://doi.org/10.1038/s41467-021-21246-9.

    R2.5. In the abstract, authors claimed that ReCoN can predict the effect of gene knockouts. But authors did not show any application or validation to support this claim.

    Response:

    We indeed had no showcase that could explicitly measure the performance of ReCoN directly for gene knockout, while the possible application was introduced in the abstract.

    We believe that ReCoN could be used in the future to infer such perturbations, but we fully agree that this claim cannot be presented without justification.

    We propose to remove the introduction of gene-knockout there, and to introduce it in the discussion opening instead, specifying that it will require specific experience and constitutes a possible future extension of the work.*

    R2.6. The communication between cells might be dependent on their spatial proximity. Is it possible to construct the CCC layer by incorporating the context-matched spatial data? How would that affect the performance of multicellular response prediction?

    Response:

    *This is a very interesting comment as numerous methods using spatial transcriptomic data have been published recently. *

    In the current formulation, the beta coefficient Bi_j modulates the impact of the cell type i on the cell type j. If the spatial transcriptomic data can inform on the proximity between cell types, and its overall impact on their communication, users could enforce more communication between some.

    However, as ReCoN is a cell-type centric model, adding spatial information can only be done at a general scale, or by modelling independently spatial regions such as presented in the Microenvironments heart infarction showcase. It means that ReCoN cannot beneficiate from the potential of spatial transcriptomic as much as models representing the tissue structure.

    R2.7. In the fibroblast application in Fig 4d, based on the cardiac cell types expression in region type, they are predicting fibroblast gene expression. Wouldn't the most direct benchmarking be comparison with observed fibroblast expression from the ST (after deconvolution perhaps)?

    Response:

    This was a helpful comment to guide the restructuration of the microenvironment heart infarction showcase, as we believe the whole showcase objective was not formulated clearly enough.

    We aim at modelling the impact of the environment on the transcriptome. As the complete transcriptome of a cell results from numerous interacting variables, we believe that comparing the correlation between ReCoN's scores and the transcriptome would not evaluate the prediction of the environment impact.

    For this reason, we wanted to compare the results to the specific differences from the microenvironment. We focused on gene set enrichment that seemed less noisy for such a comparative experiment, in particular from Visium10X data that has a particularly high dropout rate.

    We propose to strengthen the validation by providing molecular insights into the three groups of cells studied.

    The spatial data themselves are bulk, adding a layer of noise over the small number of genes captured by Visium. Instead of a correlation with the deconvoluted spots, we have equivalent single-cell RNA-seq fibroblast data annotated in the same study, which matches the three modelled niches. We propose to conduct a differential expression here and try to compute a correlation between these groups and ReCoN scores, providing a quantitative analysis.

    If the correlation was low because of the noise in the data (notably leading to the permutation of individual gene orders even if overall biological signals and gene set orders are conserved), we will additionally do a pathway enrichment over this data, enriching also the qualitative validation.

    R2.8. Section 2.6 Besides the cytokine section, it is difficult to assess the added value of this approach. Likely there is a lot of valuable findings here but difficult to say because the assessment is very qualitative.

    Response:

    One of the challenges around this work was to find relevant dataset to evaluate ReCoN. We tried to complete the direct quantitative evaluation from the Immune Dictionary with another quantitive evaluation from the heart atlas multicellular programs, despite a much less direct validation.

    We hope that the production of new perturbation experiments over multicellular datasets, especially cell-type targeted perturbations, will provide more opportunities to validate the different findings and claim from our current manuscript.

    On a similar note, no method seemed proposing similar predictions to be compared to. It led to the use of Nichenet score and the current decomposition of the ReCoN model in the section 2.2.1 to evaluate the contribution of the model.

    R2.9. The article is dense and writing should be reorganized for better readability.

    Minor issues -

    No p-values in figures.

    *Response: *

    We agree that integrating values directly in the panels would make the reading of the figure easier. We would like to introduce the p-values in the panels 2d, 2e, 2f, 2g. We had forgot to indicate in the legend of the panel 4.d that all bold scores were associated with a p-value *

    R2.10. Typo - ReCoN-genetic should be - ReCoN-generic.

    Response:

    We are thankful for noticing the typo and corrected it in the new version.

    R2.11. Authors may consider adding figures to describe their results on balance between direct and indirect effects in section 2.2.2.

    Response:

    Depending on the new findings on the indirect effect iterations, we propose adding an additional panel on their combination or a supplementary figure.

    R2.12. Redundancy in the following two lines -

    o While these approaches effectively describe what tissue-wide programs are coordinated, they generally offer limited insight into the molecular mechanisms that establish or regulate these programs.

    o Despite their ability to identify coordinated tissue-wide programs, multicellular program analyses typically offer limited insight into the underlying molecular mechanisms that orchestrate these programs.

    Response:

    We propose in the version of the manuscript to remove the first sentence. In our opinion, starting the next paragraph by this clarification seems more helpful to guide the reader than having it at the end of the previous one.

    3. Description of the revisions that have already been incorporated in the transferred manuscript

    Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.

    4. Description of analyses that authors prefer not to carry out

    Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.

    R2.13. The direct and indirect effects are treated in two separate steps. In reality of course these effects are operating simultaneously. I wonder if this could be better modelled by iterating through the two steps. It might be worthwhile

    trying to see if that improves the performance.

    We thank the reviewer for this interesting idea, and propose to add a supplementary text to present the result of this discussion to the readers.

    The direct effect is supposed to be measurable from the first iteration only, as we try to represent the effect of direct receptor binding. Regarding the indirect effect, iterations could be done to model the indirect effect, which could represent more distant effect in time.

    On an algorithmic note, the indirect effect already allow several "iterations" of this effect, as each random walk can loop between all cell types until restart. However, it does not allow to control the weight of the different successive transition. In practice, with a high restart probability, an extreme weight is given to the first "iteration" over the second, as there is three layers to cross to explore the next cell.

    First, we propose clarifying this section of the manuscript, to explain the depth of the indirect effect explorations.

    Biologically, it is highly possible that these iterations have an important role to explain the complete reaction of the cells. However, we believe that it hits a major limitation of our modelling, and RWR based exploration in general, as it goes against the enforcement of restarts.

    We aim to represent pairwise measurements, representing the impact of one node on another. But random walks without restart are not naturally well fitted to this problem, as they naturally converge to a stationary distribution ((László, Lov, and Erdos 1996)). In the case of ReCoN, it means that each gene and receptor, if we pushed the exploration indefinitely, would have the same probability to end up on each node of the system.

    The restart mitigates this impact and enforces the impacts of the seeds by ensuring that the walkers stay close to the seed. (Tong, Faloutsos, and Pan 2006). By iterating successively from the new distribution obtained from the RWR, we would go against this important probability and progressively converge toward the stationary distribution from classical random walks.

    So we completely share the opinion of the reviewer that the iterative nature of the indirect effect should be explored too, but we don't believe that ReCoN can model them accurately. We hope that new exploration methods will be able to decipher the importance of these iterations, once additional arguments have been gathered to justify the global interest of considering the indirect effect.

    Bibliography:

    László L, Lov L, Erdos O. Random Walks on Graphs: A Survey. 1 Jan. 1996:1-46.

    Tong H, Faloutsos C, Pan J yu. Fast Random Walk with Restart and Its Applications. Sixth Int Conf Data Min ICDM06 Dec. 2006:613-22. https://doi.org/10.1109/ICDM.2006.70.

  2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Learn more at Review Commons


    Referee #2

    Evidence, reproducibility and clarity

    Summary - This is an interesting paper where the authors predict the multicellular response to the molecular perturbations. The idea is somewhat novel and offers a conceptual enhancement by modelling the multicellular response as collective outcome of cell intrinsic gene regulatory changes coupled with cell-cell communication by using a simple network diffusion-based approach. We have a few comments to help strengthen the work.

    • It is not clear how well it performs in independent validations. Authors showed that it can predict the effect of cytokine perturbations in the immune dictionary by selecting an optimal alpha. Authors should validate that using the same alpha value of 0.8, it is possible to accurately predict the effect of cytokine perturbations in independent datasets. This is particularly concerning for cytokine-cell type pairs where the optimal alpha is not known. Therefore, the potential utility of Recon to estimate the effect of multicellular perturbations is not well established.
    • Authors claimed that optimal alpha value of 0.8 implies the dominance of indirect effect. But in contrast to this claim, the performance across cytokine-celltype pair only increased from 0.72 to 0.76, which seem to imply that indirect effects do not add much.
    • How does the cell-type specific effects prediction perform by just considering the intracellular layers? The authors constructed multiple variants of ReCoN to estimate unicellular and multicellular effects. How is the variant ReCoN-grn different from full ReCoN where gamma is set to zero.
    • In section 2.2, authors assert that if matching datasets are not available, GRN layer can be extracted from other datasets. How well does the GRN layer from one system generalizes to the other system in terms of perturbation prediction?
    • In the abstract, authors claimed that ReCoN can predict the effect of gene knockouts. But authors did not show any application or validation to support this claim.
    • The communication between cells might be dependent on their spatial proximity. Is it possible to construct the CCC layer by incorporating the context matched spatial data? How would that affect the performance of multicellular response prediction?
    • The direct and indirect effects are treated in two separate steps. In reality of course these effects are operating simultaneously. I wonder if this could be better modelled by iterating through the two steps. It might be worthwhile trying to see if that improves the performance.
    • In the fibroblast application in Fig 4d, based on the cardiac cell types expression in region type, they are predicting fibroblast gene expression. Wouldn't the most direct benchmarking be comparison with observed fibroblast expression from the ST (after deconvolution perhaps)?
    • Section 2.6 Besides the cytokine section, it is difficult to assess the added value of this approach. Likely there is a lot of valuable findings here but difficult to say because the assessment is very qualitative.
    • The article is dense and writing should be reorganized for better readability.

    Minor issues

    • No p-values in figures.
    • Typo - ReCoN-genetic should be - ReCoN-generic.
    • Authors may consider adding figures to describe their results on balance between direct and indirect effects in section 2.2.2.
    • Redundancy in the following two lines -
      • While these approaches effectively describe what tissue-wide programs are coordinated, they generally offer limited insight into the molecular mechanisms that establish or regulate these programs.
      • Despite their ability to identify coordinated tissue-wide programs, multicellular program analyses typically offer limited insight into the underlying molecular mechanisms that orchestrate these programs.

    Significance

    This is an interesting paper where the authors predict the multicellular response to the molecular perturbations. The idea is somewhat novel and offers a conceptual enhancement by modelling the multicellular response as collective outcome of cell intrinsic gene regulatory changes coupled with cell-cell communication by using a simple network diffusion-based approach.

  3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Learn more at Review Commons


    Referee #1

    Evidence, reproducibility and clarity

    The authors propose an approach to model complex regulatory processes in tissue or cell collections in specific environments taking into account intra- cellular regulatory processes at multiple levels and inter-cellular communication, importantly offering a chance to estimate the importance of indirect effects of perturbations on one cell type via processes in other cell types. Increasingly more complete models allow testing the impact of each component and of integrating data as context-specific information versus general prior knowledge. 3 main use cases are provided exploiting public datases: prediction of the effect of specific in-vivo cytokine perturbations on mouse lymph node tissues Healthy and disease myocardium in a heart failure multiome dataset Myocardial infarction spatial transcriptomics to identify how different cellular neighbourhoods are related to fibroblast phenotype and fibrosis The main framework is an extension of their previous HuMMus framework to investigate multilayer networks of regulation within a single cell type to also consider inter-cellular interactions, thus including i) tf-target GRN, ii) receptor a receptor layer based on PPI, and cell-cell communication based on LR interactions. These complex networks are then explored within the framework of Random Walk with Restart, which allows to establish 'interaction weights' between different nodes in the network, based on repeated simulations of spreading on the network that thus produce scores of proximity between network nodes, across possible paths. In this study first RWR that only allow intra-cell type walks are performed to calculate direct interaction of perturbation on node states, then RWRs across layers are also enabled, to calculate the importance of inter-cell interactions (via coeff gamma). The importance of each cell type is given by another coeff B that can either correspond to cell type proportions or spatial proximity of cell pairs and finally the scores of within and inter-cell interactions are weighted with a coefficient alpha.

    The central contribution that allows coupling of intra with inter-cellular interactions is the establishment of receptor-gene links. Instead of inferring it from data, they propose to express the receptor-gene matrix as: R = L ⋅ G taking ligand-receptor (L) and ligand-gene (G) adjacency matrices from NicheNet and using NNLS to compute R.

    Generally, for all these cases, comparison between performance in inferring the effect of perturbation or the upstream regulators or downstream targets are provided with assessment of AUROC/AUPRC values.

    • Are the claims and the conclusions supported by the data or do they require additional experiments or analyses to support them?

    This is a very well-written paper, the methods used are adequate and the use cases are relevant and broad, exploiting state of the art datasets and tools.

    The author's claims are mostly justified. The authors could make an effort to more explicitly cite other efforts in similar directions. The claim 'We envision ReCoN as a extension to prior multicellular modelling, offering an interesting compromise between prediction of cell type responses and understanding of their molecular coordination.' is very general and could be better substantiated. In fact, the authors do not really give examples of alternative approaches to study systems of interacting cells, other than mechanistic agent based models, that clearly are very different. Moreover, the exploration of the multilayer networks with RWR is a very reasonable choice but could there be other approaches? I think the authors could discuss this issue to briefly support their choice of this method.

    Generally the discussion should provide the reader the context in the existing literature in which the work can be set, detailing its impact. I think this could be improved.

    Regarding the choice of datasets, it is clear that the method is quite demanding, requiring single cell and different omics to build the model, in addition to the expression dataset that is used as a use case. This inevitably leads to using a mix of datasets. For example in the mouse experiments the gene regulatory network was inferred from both a lymph node scRNA-seq dataset and a splenic scATAC-seq dataset, presumably due to the lack of multiome data in this setting. However the cell-cell communication network was inferred from the control case of the Immune Dictionary. Why can't the authors use the control data also for inferring GRNs? Is atac-seq really necessary in the inference of the GRN? What is the impact of the fact that lymph node and spleen samples might be different?

    '

    • Please request additional experiments only if they are essential for the conclusions. Alternatively, ask the authors to qualify their claims as preliminary or speculative, or to remove them altogether.

    • If you have constructive further reaching suggestions that could significantly improve the study but would open new lines of investigations, please label them as "OPTIONAL".

    • Are the suggested experiments realistic in terms of time and resources? It would help if you could add an estimated time investment for substantial experiments.

    • Are the data and the methods presented in such a way that they can be reproduced? The code is very clear, we were able to install and run it and it is quite well-documented. However, a few more details should be given in the text regarding how the evaluation of the performance is carried out. For example: If I understand correctly, when predicting the impact of cytokine perturbations the ReCoN predictions of genes impacted are compared to differentially expressed genes identified through traditional DEG analysis. What is compared is the ranking of these genes from ReCoN with the ranking provided by DEseq2. There is no description of how this comparison of ranking gives rise to AUROC values. Also, is it just the ranking that is predicted or can they also estimate how well they can predict the effect size?

    When describing the use cases, I think a bit more detail would help. For example 'To identify the cell-type-specific genes associated with HF, we used the MOFAcell scores of the multicellular factor 1 (MCP1) reported in ReHeat236' I supposed the explanation is on the dataset but for the sake of clarity it would be good to expand this sentence to give at least an idea of the approach.

    Regarding the calculation of the R matrix from the NichNet matrices L and G, I gather that the R matrix is calculated once and is thus fully data-independent and available just like the L and G matrices from NichNet. This was not very clear in the tutorials.

    Also, this might just be a typo in the tutorial: 'The default α = 0.8 gives more weight to direct effects, which has been empirically validated. You can adjust this based on your biological question." I believe the manuscript says alpha>0.5 refers to indirect effects dominating.

    Same for the pre-processing of the spatial data for the third use case, a little more details on how this was done would help the users and readers.

    • Are the experiments adequately replicated and statistical analysis adequate? I don't see issues with the statistical power of the analysis. Rather, I think the authors should provide some examination of the parameter space for their model. Whereas ana analysis of the impact of the Alpha parameter is provided, I believe there are several more parameters that have a crucial impact and choices for their values should be discussed.

    For example 'In the GRN reconstruction only the links with a score above 1.5e-7 were retained in ReCoN's gene regulatory layer. How was this chosen?

    We have identified the following parameters that are somehow justified but could be explored to have a better feel for how they impact the results

    Restart probability: How often the walker goes back to the starting seed/molecule Layer transition probability: How often the walker stays in the same layer - different cell? - different layers? Gamma Node transition within a layer: How often one jumps to a different layer Weighting parameters: How much weight for direct or indirect effect to account for the combined effect - alpha - this is the only one that is explicitly explored.

    Finally, this might be considered OPTIONAL but would greatly improve the work in our opinion: The method crucially depends on the networks that are used in the different layers and to connect layers and cell types. As we know, biological data is noisy and incomplete (FP and FN) at each level and in each datatype. It would be really useful to estimate what is the robustness of the results to this noise. Particularly, from personal experience, we think the GRNs reconstructed from data are often almost fully connected and it is exceedingly difficult to validate them in specific contexts. This means that some 'errors' are likely to be present. Since several methods exist for inferring GRNs one could simply compare the results using different methods for this part of the network. A related point involves the characteristics of the RWR algorithm, that will be quite impacted by the presence of hubs in these networks (either in single layers or across several) that is likely to impact the exploration. If proteins that are hub are effectively important, that is not a problem, but in some layers, for example, the receptor-receptor layer that presumably will contain PPIs, there might be biases in hubs being just better studied proteins, and these hubs might have an 'unjustified' weight in the walks. One potential approach to assess the robustness of the method to these issues could be an empirical one that just randomly perturbs the networks in ReCoN to see to what extent similar predictions are achieved.

    Minor comments:

    • Specific experimental issues that are easily addressable.
    • Are prior studies referenced appropriately?
    • Are the text and figures clear and accurate?
    • Do you have suggestions that would help the authors improve the presentation of their data and conclusions?

    Please add page numbers. Figures are nice and clear. Some specific minor points are listed here below.

    Define hMLN on first appearance fig1 caption (no page numbers..;) 2nd appearance heterogeneous multilayer structure (HMLN) ... Bi_j not so clear to what it refers when first mentioned personalized interaction specificity. - maybe better word than personalised (contextualised?) ReCoN-genetic and ReCoN, ( generic?) responses. It is expected to observe common behaviors in-between cell-type, that the GRN and the generic CCC network already contribute captures.

    • not very clear

    Figure 2b the icon of cells with double arrows might suggest phenotype shift when instead this is just communication eTACs explain acronym and what they are Due to very few genes being differentially expressed, only cDC1 was conserved and evaluated for IL22, Not so clear In this showcase (not very clear, use case?) different fibroblast specializations - maybe phenotypes?

    Figure 4b b) Schematic view of the deconvolution process and cell type-specific count inference from the spatial niches. Not so clear what the heatmap shows, rows and columns Spots heatmap : label niche on rectangles in cols And each col is a spot Rows are cell types or cells? In the cell types x spot

    Cell2location. Add reference, maybe explain basic functionality?

    reconstructing different patients, tissues, and microenvironments to predict context-specific molecular treatments. Unclear fibrosis in different - at molecular levels

    Figure 5d myeloid and endothelial colour code inversed from 5 BC 5d indicate important pathways in organe should not change the colour of the nodes (purple=common, blue or green specific). Use border colour maybe? 5e is not a venn diagram e) Venn diagram showing the overlap between transcription factors (TFs) predicted by ReCoN (green) and those previously implicated in fibrosis (orange) or cardiac diseases (violet). Only the top 10 TFs were annotated from literature sources; full sizes of fibrosis- and cardiac disease-related receptor sets can therefore not be represented. f) also not a venn diagram e/f now in supp the "NABA ECM collagens" gene set. Nodes are grouped by molecular type (e.g., transcription factors, receptors, ligands), and links represent the weighted, direct regulatory interactions present in the ReCoN-constructed

    Why Sankey plot? Normally sankey plot represents flow (of regions changing from 1 state to another) but here this is just a weighted network? No communication from firbos back to other cell types? No communication between ventricular/myeloid/lymphoid?

    as a extension to - an underrepresented in the current. - current framework? However, it can't represent more - cannot Borrowing representation from hypergraphs, which introduces The network exploration implementation of ReCoN also present some limitations. limitations. While random walks with restarts offer a stable and fast exploration workflow for multilayer networks, it currently only considers positive weights to predict regulation strengths. It involves that the nature of the regulation, as activation or inhibition, has to be identified a posteriori.

    • check concordance/grammar

    Only the nodes that are included in one of the layers are present in the final results, ignoring the ones present only in bipartites. Unclear a scATAC - an Barsi et al is published https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013188 effects, allowing for modulating in a second time their contribution. - word order

    others. However, it is possible to adjust the Beta coefficient to represent it based on the available information for each dataset. Represent- adjust?

    We use the latter to compare the different models. - what is the latter?

    It resulted in the scRNA-seq in 1,789 cells with 13,167 genes, and for the scATAC-seq in 3,759 cells with 254,545 regions. Check english GRETA pipeline.- reference

    We kept all the cells whose annotations through unsupervised clustering, followed by marker gene annotations, through scANVI were coherent. Word order In parallel, pairs of ligands and receptors with both associated with scores above an absolute gene loading of 0.1 were considered potential driver interactions in HF. Unclear gseapy Python - reference?

    and to calculate average for each spatial context the average cell type expression. Unclear

    We only used the loadings of all cell types but the fibroblasts to consider the effect of the sole environment. Unclear We realised a downstream - performed

    The profiles inferred by ReCoN were first very correlated in all three contexts. - unclear

    Significance

    Provide contextual information to readers (editors and researchers) about the novelty of the study, its value for the field and the communities that might be interested.

    This is a very timely paper, dealing with an important gap in the literature. It is not an entirely new framework, but it integrates different existing approaches to solve a complex issue in a creative way. To my knowledge, it is the first attempt to consider and formalise regulation processes involving both intra- and inter-cellular interactions. The results support the importance of distinguishing the different paths that can relate the impact of a perturbation to specific genes/functions in different cells and their overall ecosystem.

    General assessment: provide a summary of the strengths and limitations of the study. What are the strongest and most important aspects? What aspects of the study should be improved or could be developed?

    The tool offers a combination of approaches, providing a coherent framework. The code is well documented and functional. The use cases are quite compelling. Sadly, the only type of validation possible involves confirmation of known facts from the literature, which makes it hard to evaluate the full impact of some of the predictions. I think the details of how the method works and especially how the performance was evaluated could be expanded and an assessment of how different parameters and choices impact the results would also be very helpful. An effort to compare the presented variations of the method to some other approach would be very welcome, but I am finding it hard to identify what an alternative approach could be comparable.

    Advance: compare the study to the closest related results in the literature or highlight results reported for the first time to your knowledge; does the study extend the knowledge in the field and in which way? Describe the nature of the advance and the resulting insights (for example: conceptual, technical, clinical, mechanistic, functional,...).

    Potentially the closest results are models that can predict the effect of perturbations on cell line cultures. Several approaches in the literature employ either transformers or optimal transport to predict the effect of perturbations in single cell datasets. One of the main issues is an underlying necessary assumption that the perturbation effect will be larger than the heterogeneity (in cell lines for example), which becomes increasingly difficult when considering in-vivo experiments. ReCoN obviously goes beyond this by considering explicitly the presence of different cell types but distinctions of cell types are sometimes quite arbitrary and potentially application of ReCoN to some of the in-vitro culture datasets, even on cell lines, could be a way to test its performance and benchmark it against other methods. The main bottleneck in the application of this framework to 'personalisation' of therapies, mentioned even in the abstract as a potential future goal for such an approach, will be the lack of data. This approach requires single cell level descriptions of the system at hand, plus additional datasets to build the model structure. To a certain extent, public data of related tissues/contexts can be used, but it will be necessary to test the dependence of performance on coherence of the input data to develop sufficient trust to use it for new predictions, especially in a medical field.

    The authors could comment on how their method compares to others that do not require single cell level information. Despite clear differences, it might be important to show the advantage of using this more complex approach that requires data that is less available. Given the ease with which bulk profiles can be constructed from single cell data, it might be possible to compare the approaches directly. For example, see K. Wang, S. Patkar, J.S. Lee, E.M. Gertz, W. Robinson, F. Schischlik, D.R. Crawford, A.A. Schäffer, E. Ruppin Deconvolving Clinically Relevant Cellular Immune Cross-talk from Bulk Gene Expression Using CODEFACS and LIRICS Stratifies Patients with Melanoma to Anti-PD-1 Therapy

    Mike van Santvoort, Óscar Lapuente-Santana, Maria Zopoglou, Constantin Zackl, Francesca Finotello, Pim van der Hoorn, Federica Eduati, Mathematically mapping the network of cells in the tumor microenvironment, Cell Reports Methods 2025

    Audience: describe the type of audience ("specialized", "broad", "basic research", "translational/clinical", etc...) that will be interested or influenced by this research; how will this research be used by others; will it be of interest beyond the specific field?

    Broad interest to biomedical researchers and also biologists in other fields. While the method allows advances in basic research on biological process regulation, a clear clinical application can be envisaged in immuno-oncology for example/ immunology and even general molecular medicine.

    Please define your field of expertise with a few keywords to help the authors contextualize your point of view. Indicate if there are any parts of the paper that you do not have sufficient expertise to evaluate.

    I am a computational biologist with expertise in network models, regulatory networks, agent-based models and especially familiar with the tumour microenvironment and processes therein. I can more or less appreciate the meaningfulness of the biological findings related to the mouse lymphnode example. I am much less of an expert on heart tissue modeling, heart failure, fibrosis etc, required to fully comprehend the impact of the second and third use cases.