Shared and organ-specific gene expression programs of fibrotic diseases

Abstract

Fibrotic scarring is a common response to tissue injury. Repeated or severe insults can cause fibrosis, leading to excessive extracellular matrix deposition and a substantial clinical risk of organ dysfunction. Despite its high prevalence, few therapeutic options exist, and fibrotic diseases collectively represent a major global health burden. Fibrotic diseases affect virtually all organs, yet they have been explored mainly in isolation for each organ. As a result, proposed shared fibrotic mechanisms are often based on indirect comparisons between independent datasets rather than on a unified, systematic, cross-organ meta-analysis. To overcome this gap, we conducted a large-scale meta-analysis of single-cell transcriptomic data from healthy and fibrotic human tissues to identify both shared and organ-specific transcriptomic profiles. We constructed a single-cell fibrosis atlas of over five million cells from 20 studies, covering more than 25 disease etiologies affecting the heart, liver, kidney, and lung. Through systematic comparison of these datasets, we identified organ-specific as well as cross-organ fibrosis-associated gene expression profiles in major cell types and defined disease fibroblast subpopulations with excessive extracellular matrix production. These analyses revealed a conserved fibrotic response shared across tissues. Our analysis spans global comparisons of fibrosis-associated changes in cellular composition and predictive disease signatures to detailed examinations of individual genes, transcription factors, and intercellular communication patterns observed in fibrotic diseases across organs. We provide our cross-organ integration as a user-friendly open resource for investigating fibrotic diseases across organs. This resource will enable an accelerated discovery of disease mechanisms and faster development of broadly effective antifibrotic strategies in the future.

Article activity feed

  1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.


    Reply to the reviewers

    General Statements

    Thank you for providing an assessment of our manuscript. Below, we outline our revision plan. The revisions address four main areas: the relationship between the identified molecular signatures and fibrosis severity or disease etiology; the criteria used to identify disease-associated fibroblasts; the interpretation of the genes and biological processes highlighted by our analyses; and the broader biological insights supported by the study.

    As part of the revisions implemented, we have:

    - Associated organ-specific fibrotic molecular signatures with the fibrosis severity scores available in the clinical metadata, helping to relate the identified transcriptional patterns to biologically meaningful aspects of fibrosis.
    - Extended supplementary figures that more clearly present the decision-making process used to identify fibroblast subpopulations associated with fibrosis.
    - Revised the methods, figures, legends, and captions in response to the reviewers' suggestions to improve clarity.
    - Expanded the discussion of the results by incorporating the literature suggested by the reviewers, thereby providing additional context for the identified fibrotic signatures.
    - Extended our spatial analysis using a more robust identification of fibrotic regions.

    We plan to:

    - Extend our cell-cell communication and spatial analysis using deconvolution methods.
    - Provide comparisons between our unsupervised multicellular factor analysis of multiple studies and our supervised fibrotic signatures to ensure coherence between analyses.
    - Perform additional comparisons between specific pairs of organs and additional cell types, instead of focusing solely on the comparison of all organs simultaneously.
    - Expand the results and discussion to clarify the relevance and limitations of our study.

    We believe these revisions will strengthen our resource manuscript and help us provide a robust and reliable description of fibrotic processes across organs.

    Description of the planned revisions

    Reviewer #1

    Reviewer #1, major comment 1: The group has been developing cutting edge bioinformatic tools for the community. The authors also provided scripts and the processed data for reproducibility. I have no doubt in their implementation of the methodology. I also understand the reasons of the objective tone throughout the manuscript. However, the authors made very little claims with biological significance. The conclusion of the study is vague with almost nothing mentioned in the abstract. What are the cross-organ effects in fibrosis identified in this study? I believe some additional claims would facilitate the reader with less technical knowledge to grasp the study better.

    We understand the reviewer's concern regarding the lack of an explicit discussion of biological significance in the abstract and other parts of the manuscript, as most of the manuscript focuses on comparing studies at different levels. Our study defines which fibrosis-associated transcriptional patterns are reproducibly detectable across the currently available public single-cell datasets, while also identifying where cross-organ interpretation remains limited. We observed that some disease-associated transcriptional patterns recur across organs and studies, particularly in mesenchymal and endothelial compartments. In contrast, other compartments, including myeloid cells, showed weaker cross-organ agreement, which may reflect either greater tissue-context dependence or stronger sensitivity to differences in disease stage, sampling, and annotation. Finally, we observed a convergence of fibrotic signals in a subset of mesenchymal cells and showed which genes are specifically expressed in actively scarring regions across organs, with TIMP1 consistently identified as highly expressed by disease-associated fibroblasts in fibrotic regions across tissues and modalities.

    Our results should be interpreted as robust and reproducible cross-dataset fibrosis signatures rather than definitive evidence for a specific pathophysiological mechanism. Therefore, we believe that the primary contribution of this study lies not in assigning causal roles to individual genes or pathways, but in providing a systematic framework for identifying fibrosis-associated programs that are reproducibly observed across studies, organs, and disease etiologies. As our analysis is entirely computational, we intentionally avoid making strong mechanistic claims without experimental validation. Instead, we envision this resource as a means to prioritize candidates and generate hypotheses for future functional studies.

    To address the reviewer's concern, we will make more explicit claims of our observations within our abstract and throughout the text to make our intentions and conclusions clearer. We will be more explicit about what information we are providing with our resource and how it can best be leveraged. We will further include our conclusions about cross-organ agreements described above, as well as specific observations from our analyses that help the reader to get a better grasp of the study.

    Reviewer #1, major comment 3: The authors performed multicellular factor modeling in each organ and identified factors that are distinct in fibrotic and reference tissue in Fig. 2B, e.g., factors 1 and 2 in heart. Are these factors driven by specific biological pathways? Could these factors also be used to identify common biological functions in fibrotic tissue across organs?

    We agree that, in principle, the latent factors identified by the multicellular factor models could be interrogated for their biological interpretation. Each factor is associated with a gene-weight vector per cell type, which can be analyzed similarly to a differential expression signature to identify enriched pathways and biological processes.

    However, we chose not to pursue a systematic factor-level interpretation for three reasons. First, as shown in Suppl. Figure 3, the contribution of individual factors to the separation between fibrotic and reference samples varies substantially across organs. In some organs, the distinction is largely captured by a single factor, whereas in others it is distributed across multiple factors. Second, because the models were trained independently for each organ, there is no direct correspondence between factor identities across organs, making cross-organ comparisons of individual factors difficult to interpret. Finally, we were not able to capture a fibrosis-related transcriptomic program from all organs.

    We therefore used the multicellular factor analysis primarily as an unsupervised approach to assess whether common fibrosis-associated variation could be detected across datasets. The observation that fibrotic and reference samples consistently separated along latent factors suggested the presence of shared disease-associated signals. For the subsequent biological interpretation, however, we opted for a supervised analysis framework based on differential expression and downstream functional enrichment, which allowed more direct and robust comparisons across organs and disease contexts.

    We will revise the manuscript to make this rationale more transparent to the reader. In addition, we will include an analysis demonstrating that the gene weights associated with the disease-relevant latent factors closely resemble the corresponding organ effect sizes in heart, kidney, and lung, illustrating that biological interpretation at the factor level yields conclusions that are highly consistent with those obtained from the supervised differential expression analysis. This further supports our decision to base the downstream functional analyses on the organ effect sizes, which provide a more straightforward framework for cross-organ comparison.

    Reviewer #1, major comment 4: Although strong organ-specific effects, the author detected similar transcriptional changes in endothelial and mesenchymal cells in heart and lung at Fig. 3B. The analysis on disease-associated fibroblasts also showed much higher overlapped between heart and lung compared to, e.g., liver and kidney in Fig. 4C. Are there additional shared fibrosis features or functions in mesenchymal cells or disease-associated fibroblasts in heart and lung?

    Reviewer #1, major comment 5: There seems to be certain degree of similarities among the epithelial cells in kidney and lung in Fig. 4B.

    Shared response for comments 4 and 5:

    Given the large number of possible comparisons, we decided to focus on the most widely shared signals (mesenchymal and endothelial) in our manuscript. However, as the reviewer notes, there are other comparisons, such as the one between epithelial cells from kidney and lung, or between endothelial cells from heart and lung, that may be important to report. We plan to revise the text in the section "Fibrotic disease programs within tissues" to explicitly discuss the observed similarities and to show the shared genes driving them in a supplementary figure.

    Reviewer #1, major comment 8: TNC appears in the lower bottom of the list in Fig. 6C. It is unclear why TNC was chosen as a board therapeutic target in the end.

    We agree that the original wording may have implied that TNC was selected because it was the top-ranked candidate in Figure 6C. This was not our intention. Rather, we chose TNC as an illustrative example because it emerged from our analysis without prior manual prioritization, has already been linked to fibrosis in specific disease contexts, and has been explored experimentally as a therapeutic target. At the same time, its role has not been investigated broadly across fibrotic diseases, making it a useful example of how the presented framework can identify candidates that may have relevance beyond the settings in which they were originally studied.

    We will revise the text to clarify that TNC is presented as one representative example from the set of prioritized candidates rather than as the single most highly ranked therapeutic target.

    Reviewer #1, minor comment 1: Is there additional measure that account for the datasets with lower RNA counts shown in Fig. S1?

    We thank the reviewer for highlighting this potential source of technical variation. We did not apply an additional correction specifically to account for datasets with lower RNA counts. Instead, to minimize the impact of differences in sequencing depth and cell-level sparsity across datasets, the majority of our analyses were performed on pseudobulk profiles rather than individual cells. Pseudobulk aggregation substantially reduces the influence of variation in RNA counts between cells and datasets, providing more robust estimates of gene expression. We therefore believe that differences in RNA counts had a limited impact on the main conclusions of the study. To illustrate this point, we plan on showing additional quality control summary plots for our pseudobulked data.
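    As a rough illustration of the pseudobulk idea described above, the following Python sketch sums raw counts over all cells of one sample and cell type and then rescales each profile to a common total. The column names (`sample`, `cell_type`), the helper names, and the toy counts are hypothetical and are not taken from the study's pipeline.

```python
import numpy as np
import pandas as pd

def pseudobulk(counts: pd.DataFrame, meta: pd.DataFrame) -> pd.DataFrame:
    """Sum raw counts over all cells sharing a (sample, cell_type) pair.

    counts: cells x genes raw count matrix (index = cell barcodes)
    meta:   per-cell annotations with hypothetical columns 'sample', 'cell_type'
    """
    keys = [meta.loc[counts.index, "sample"], meta.loc[counts.index, "cell_type"]]
    return counts.groupby(keys).sum()

def log_norm(pb: pd.DataFrame, target: float = 1e4) -> pd.DataFrame:
    """Scale each pseudobulk profile to `target` total counts, then log1p."""
    scaled = pb.div(pb.sum(axis=1), axis=0) * target
    return np.log1p(scaled)

# Toy example: four cells from two samples, all annotated as fibroblasts
counts = pd.DataFrame(
    [[1, 0, 3], [2, 1, 0], [0, 0, 5], [4, 2, 2]],
    index=["c1", "c2", "c3", "c4"],
    columns=["COL1A1", "TIMP1", "ACTB"],
)
meta = pd.DataFrame(
    {"sample": ["s1", "s1", "s2", "s2"], "cell_type": ["Fib"] * 4},
    index=counts.index,
)
pb = pseudobulk(counts, meta)   # one row per (sample, cell_type)
norm = log_norm(pb)             # depth-normalized, log-transformed profiles
```

    Because each profile is rescaled to the same total after aggregation, per-cell differences in RNA counts mainly affect how many counts enter the sum, not the relative expression estimates.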

    Reviewer #2

    Reviewer #2, major comment 3: Fig. 4B-C: the full list of organ-specific and overlapping genes should be given in a supplemental table.

    We thank the reviewer for this suggestion. We agree that providing the complete lists of organ-specific and overlapping genes improves the transparency and utility of the analysis. We will provide the full gene lists underlying Figures 4B-C as supplementary tables in the revised manuscript. These tables will provide the complete set of genes used for the reported overlap analyses and allow readers to further explore the identified organ-specific and shared fibrotic programs.

    Reviewer #2, major comment 6: Cell-cell communications analysis: It would be informative to add a circosplot highlighting the best cell-cell communication candidates in each organ. The authors should also provide the full list of predicted interactions in a supplementary table, including scores for each organ for each interaction. Additionally, it would be important to focus specifically on ligand-receptor pairs associated with growth factors and cytokines. While incorporating Visium data is very interesting and challenging, it may reduce sensitivity due to its relatively poor capture efficiency. This could particularly overemphasize the importance of collagens and other ECM-related factors, which are highly expressed.

    We agree that additional visualization and data availability would improve the presentation of the cell-cell communication analysis. First, we will add organ-specific visualizations highlighting the highest-confidence cell-cell communication candidates within each organ, providing a more intuitive overview of the predicted interactions. Second, we plan to include the complete list of predicted ligand-receptor interactions as supplementary tables, including the corresponding scores for each organ and gene annotations (e.g., cytokine, growth factor), allowing readers to explore the full set of predictions underlying the analyses.

    We also agree that highly expressed extracellular matrix components, such as collagens and proteoglycans, can dominate CCC analyses, especially when investigating fibrotic diseases. Indeed, this consideration motivated our final therapeutic target prioritization strategy (Figure 6). In this analysis, we specifically excluded collagens and proteoglycans, thereby enriching for extracellular signaling molecules that are more likely to represent biologically informative and therapeutically actionable cell-cell communication events. We will modify the results section to clarify our rationale for this analysis.

    Reviewer #2, major comment 8: Visium Dataset Analysis: It would be interesting to compare fibrotic areas across different organs by performing niche or topic analyses using supervised deconvolution approaches (such as RCTD). This would allow for a better estimation of cell composition and functional annotations of fibrotic and inflammatory areas.

    We agree that a cell type deconvolution would provide an informative framework for characterizing the cellular composition of fibrotic niches and its association with the fibrotic signatures we derived from single-cell data. We plan to address the reviewer's suggestion by running a cell type deconvolution analysis of the Visium datasets to estimate the enrichment of major cell populations within scar regions and compare them across organs. We hope that these additional analyses will provide complementary information on the cellular composition of these areas.

    Reviewer #2, minor comment 1: p11: the authors conclude that "cell proportions differed not only between patients and organs, but also that there was no uniform abundance change in disease". This result may reflect technical variability, particularly due to dissociation biases from very different organs or the use of different platforms. This limitation should be discussed.

    We agree that differences in cell type proportions may not only reflect biological variation but can also be influenced by technical factors, including organ-specific dissociation biases, differences in tissue processing, and the use of distinct sequencing platforms. We will expand the text to explicitly acknowledge these potential confounding factors and to emphasize that the observed differences in cell abundances should be interpreted with appropriate caution.

    Reviewer #2, minor comment 3: Panel E in Fig. 5 is difficult to read and needs to be improved.

    To improve the readability of the figure, we will include fewer ligand-receptor pairs and add grey boxes in the background to help the reader distinguish the ligand-receptor pairs from each other.

    Reviewer #3

    Reviewer #3, minor comment 1: P5: Some context regarding expected differences between single cell and single nuclei datasets here would be good (especially if some differences are potentially important).

    We agree that adding context regarding the expected differences between single cell and single nuclei datasets would add value to the manuscript. These differences have been investigated in the past and were shown to have an impact on the RNA-sequencing results and their interpretations (Van Melkebeke et al. 2024; Lake et al. 2023; Feng et al. 2026; Denisenko et al. 2020; Litviňuková et al. 2020; Koenitzer et al. 2020). We therefore plan to include more background information, including the distinct capture biases and transcriptomic characteristics, to highlight that these differences should be considered when comparing datasets generated using different protocols.

    Reviewer #3, minor comment 6: P12: Please clarify whether the multicellular factor model is fit jointly across all datasets within an organ, or separately per dataset followed by comparison. If fit jointly, how are batch/study effects handled? If fit separately, how are factors aligned across invocations?

    Is it possible to say how much of this consistency across datasets is due to non-fibrotic or non-disease state regulation? Are the disease-associated factors driven by coordinated changes across multiple cell types, or primarily by one dominant cell type? And if the latter, is this related to expression magnitude, or cell type abundance?

    We agree that the description of the multicellular factor model in the original manuscript did not provide sufficient methodological detail.

    The multicellular factor model was fitted jointly across all datasets within each organ, resulting in one model per organ (four models in total). Following the strategy proposed in the MOFA+ framework (Argelaguet et al. 2020), individual studies were treated as groups within the model, allowing the integration of multiple datasets while accounting for study-specific effects. Because the model uses cell type-specific pseudobulk profiles as separate views, the inferred factors reflect coordinated transcriptional changes across cell types rather than differences in single-cell abundance. Pseudobulk aggregation substantially reduces the influence of cell number variation, and we applied quality control thresholds to ensure that only samples with sufficient counts for each cell type were included.

    To further clarify the relationship between latent factors and fibrosis, we plan to add an analysis showing the proportion of variance explained (R²) by each factor across studies and cell types. The R² can be used as a proxy for the importance of a cell type in defining the latent factor. Whereas many latent factors capture sources of biological or technical variation unrelated to disease, only a subset consistently separates fibrotic from reference samples. These disease-associated factors therefore represent fibrosis-specific variation rather than general transcriptional structure, and they are the factors we highlighted in the manuscript text to support that different studies had a consistent disease signal.
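    To make the R² reasoning concrete, here is a minimal NumPy sketch of how the variance explained by each factor in each view (here, each cell type) could be computed from factor scores and weights. It assumes centered data and uses the common definition R² = 1 - SS_res / SS_tot; the function name, variable names, and toy data are hypothetical, and the snippet does not reproduce the MOFA+ implementation.

```python
import numpy as np

def r2_per_view(Y_views, Z, W_views):
    """Variance explained by each factor in each view.

    Y_views: dict view -> (samples x genes) centered data matrix
    Z:       (samples x factors) factor scores shared by all views
    W_views: dict view -> (genes x factors) view-specific weights
    Returns a dict view -> array with one R2 value per factor.
    """
    out = {}
    for view, Y in Y_views.items():
        W = W_views[view]
        total = np.sum(Y ** 2)  # total sum of squares (data assumed centered)
        r2 = []
        for k in range(Z.shape[1]):
            recon = np.outer(Z[:, k], W[:, k])  # rank-1 reconstruction by factor k
            r2.append(1.0 - np.sum((Y - recon) ** 2) / total)
        out[view] = np.array(r2)
    return out

# Toy example: factor 0 drives the "Fib" view, factor 1 drives the "Endo" view
rng = np.random.default_rng(0)
Z = rng.normal(size=(8, 2))
W = {
    "Fib": np.column_stack([rng.normal(size=5), np.zeros(5)]),
    "Endo": np.column_stack([np.zeros(5), rng.normal(size=5)]),
}
Y = {view: Z @ w.T for view, w in W.items()}
r2 = r2_per_view(Y, Z, W)
```

    In this toy setting, factor 0 explains all variance in the "Fib" view and none in the "Endo" view, which is the sense in which R² can indicate which cell type contributes to a factor.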

    We will incorporate these clarifications into the manuscript to make the modeling framework and its interpretation more transparent, and we will add analyses of the variance explained to provide further insight into the models.

    Reviewer #3, minor comment 12: P23: What conclusions should be drawn from the broad cell-type communication comparisons between organs in Fig. 5A? The text reports which broad cell-type pairs account for many upregulated ligand-receptor interactions, but it is not clear whether these comparisons identify fibrosis-specific communication or mainly reflect broad tissue architecture, cell-type abundance, etc.

    If the broad categories were chosen because finer cell-state annotations are not consistently available across studies, it would be helpful to state this limitation explicitly.

    We agree that the rationale and interpretation of the broad cell-cell communication analysis should be described more clearly in the manuscript.

    The analysis shown in Figure 5A is based on the organ-specific mixed-effects differential expression models and therefore reflects disease-associated changes in ligand and receptor expression between fibrotic and reference samples, rather than absolute expression levels. Figure 5A thus shows which cell type pairs increase their communication in fibrosis, based on the number of ligand-receptor pairs that are differentially expressed above a threshold. Because the mixed-effects models are fitted separately for each cell type, an increase in the proportion of one cell type is unlikely to inflate the number of upregulated communication events with another cell type. Consistent with this, we do not see a correlation between the increase in cell type proportion in the tissue (Figure 2A) and the number of upregulated genes in the mixed-effects models (Figure 4A). We therefore do not expect cell type proportions to strongly affect this particular analysis.
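    The counting logic can be sketched as follows. This is only an illustration under stated assumptions: the table layout (`cell_type`, `gene`, `effect`), the threshold value, and the rule that both ligand and receptor must exceed the threshold are hypothetical choices, not the exact criteria used in the manuscript.

```python
from itertools import product

import pandas as pd

def count_up_lr(de: pd.DataFrame, lr_pairs, threshold: float = 1.0) -> pd.DataFrame:
    """Count upregulated ligand-receptor pairs per sender/receiver cell type.

    de:       long table with hypothetical columns 'cell_type', 'gene', 'effect'
              (disease-vs-reference effect size estimated per cell type)
    lr_pairs: iterable of (ligand, receptor) tuples
    A pair is counted when the ligand effect in the sender AND the receptor
    effect in the receiver both exceed `threshold` (an assumed criterion).
    """
    eff = {
        (ct, gene): effect
        for ct, gene, effect in de[["cell_type", "gene", "effect"]].itertuples(index=False)
    }
    cell_types = sorted(de["cell_type"].unique())
    counts = pd.DataFrame(0, index=cell_types, columns=cell_types)
    for sender, receiver in product(cell_types, repeat=2):
        for ligand, receptor in lr_pairs:
            lig = eff.get((sender, ligand), float("nan"))
            rec = eff.get((receiver, receptor), float("nan"))
            if lig > threshold and rec > threshold:  # NaN comparisons are False
                counts.loc[sender, receiver] += 1
    return counts

# Toy differential expression results (values are made up)
de = pd.DataFrame(
    [
        ("Fib", "TGFB1", 2.0), ("Fib", "TGFBR2", 0.5), ("Fib", "PDGFRB", 1.8),
        ("Endo", "TGFB1", 0.2), ("Endo", "TGFBR2", 1.5), ("Endo", "PDGFB", 1.2),
    ],
    columns=["cell_type", "gene", "effect"],
)
lr_pairs = [("TGFB1", "TGFBR2"), ("PDGFB", "PDGFRB")]
lr_counts = count_up_lr(de, lr_pairs)  # rows = senders, columns = receivers
```

    The counts depend only on per-cell-type effect sizes, so a change in how abundant a cell type is does not enter this calculation directly.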

    We also agree that the use of broad cell type categories warrants clarification. These categories were chosen because they can be robustly harmonized across the diverse datasets included in this meta-analysis, whereas finer cell-state annotations are not consistently available or comparable across studies and organs. We plan to revise the manuscript to clarify both the interpretation of Figure 5A and the rationale for using broad cell type categories in this analysis.

    Reviewer #3, minor comment 14: P31: The therapeutic suggestions should come with some discussion that this is association rather than causation, as it's not established that these are causal drivers. MOXD1 seems compelling, especially if this has been observed to have a potential therapeutic effect in other fibrotic diseases, and this is an excellent outcome that justifies the meta-analysis approach. TNC is somewhat more speculative in this regard, so if there is any mechanistic or other motivations, it would be good to include them here.

    We agree that the therapeutic implications of our findings should be interpreted with appropriate caution, as our analyses identify associations rather than causal drivers of fibrosis.

    These candidates were selected based on the combination of our computational prioritization results and the existing literature, rather than a causal role that has been established by our analysis. Our intention was to provide representative examples of how the presented framework can recover biologically plausible candidates with existing experimental support while simultaneously suggesting their potential relevance across a broader range of fibrotic diseases. We plan to revise the discussion to more clearly emphasize that the proposed therapeutic candidates represent hypothesis-generating observations that require experimental validation.

    Reviewer #3, minor comment 16: P31: It would be nice to have what you think the issues are with the lack of patient metadata, and how these issues might manifest in the analyses (this links with the previous comment regarding disease stage).

    The lack of detailed clinical and histological metadata substantially limits the range of biological and clinical questions that can be addressed, thereby reducing the value that can be extracted from the considerable effort and cost associated with large-scale tissue sequencing studies. In the current study, we are mostly restricted to comparing fibrotic and reference samples because information such as disease stage, fibrosis severity, time since diagnosis, medication, treatment history, tissue sampling location, and other clinical covariates is largely unavailable or inconsistently reported across studies. If these metadata were available, they could be explicitly incorporated into the statistical models, allowing analyses that relate transcriptional changes to clinically relevant variables such as fibrosis severity or disease progression rather than simply disease status.

    Furthermore, additional patient metadata would allow potential confounding factors to be accounted for or controlled in the analysis. For example, treatment effects or other clinical characteristics could be modeled directly or specific patient groups could be excluded where appropriate, leading to a clearer separation of disease-associated biology from technical or clinical confounders.

    We will expand the Discussion to more explicitly describe these limitations and their potential impact on the interpretation of our results.

    Description of the revisions that have already been incorporated in the transferred manuscript

    To facilitate review of the revised manuscript, we have grouped our responses into two categories. First, we address comments that resulted in substantial new analyses, figures, or modifications to the interpretation of the results. Second, we address minor and editorial comments, which have already been directly incorporated into the revised manuscript.

    3.1 Comments requiring additional analyses or substantial revisions

    Reviewer #1

    Reviewer #1, major comment 2: The authors have pooled the data from at least five different disease per organ to identify the pan-fibrosis signature across diseases. Some of the diseases, e.g., pneumonitis, ICM, MI, MCD, ALD) may present more acute remodeling compared to the rest, which might exhibit distinct features that mask the analysis. The extent of fibrosis also varies very significantly. A correlation with histological data is required.

    We agree that fibrotic diseases differ substantially with respect to disease etiology, disease stage, extent of remodeling, and the degree of fibrosis present in the tissue. We had highlighted this as a key limitation of the study in the discussion:

    "Second, the limited availability of patient metadata leaves many aspects unresolved, including the exact diagnosis, disease severity, tissue sampling location, and the extent of fibrosis. If these aspects were better documented, they could be accounted for in the analysis and could allow a clearer distinction of physiological from pathophysiological fibrotic processes. Third, we treated all disease etiologies collectively under the term "fibrosis". However, the degree of fibrotic remodeling likely varies between conditions, and the dataset remains imbalanced in terms of sample representation across organs."

    While comprehensive histological and disease severity information was not consistently available across the published datasets included in our meta-analysis, we were able to further investigate this question in the subset of studies for which fibrosis-related metadata were available. Specifically, we derived organ-specific fibrosis signatures, scored these signatures across patients, and performed a per-study normalization. In these datasets, our derived organ fibrosis scores correlated with available fibrosis severity measurements, supporting the biological relevance of the identified programs (Figure S5A-D).

    In addition, these analyses indicate that fibrosis signature scores vary across disease etiologies, consistent with the reviewer's suggestion that different diseases may exhibit distinct degrees of fibrotic remodeling (Figure S5E). However, given that most of the etiologies are covered by a single study, it is not possible to disentangle these results from the type of controls used by each study and technical variability.

    Nevertheless, because detailed histological and clinical metadata are available only for a limited subset of studies, we believe that a comprehensive analysis of fibrosis severity, disease chronicity, and etiology-specific remodeling is not possible with the currently available data. Future studies with more uniformly annotated patient cohorts will be well-positioned to address these questions in greater depth. Our findings should therefore be interpreted as identifying molecular programs consistently associated with fibrotic disease across diverse conditions, rather than as a direct measure of fibrosis severity itself. We have included these observations in the results section "Identification of shared gene programs per tissue":

    "As multiple disease etiologies and disease stages were integrated in each organ, we asked whether the extracted organ-consensus genes were associated with fibrosis severity. However, fibrosis severity measurements were unavailable for the majority of studies, preventing a systematic assessment of severity across the integrated dataset. To nevertheless evaluate whether the identified programs captured biologically meaningful aspects of fibrosis, we derived organ-specific fibrosis signatures, scored these signatures across patients, and performed a per-study normalization. In datasets containing fibrosis severity measurements, our derived fibrosis signature scores correlated with fibrosis severity, supporting the biological relevance of the identified programs (Figure S5A-D). Furthermore, we observed differences in signature scores across disease etiologies (Figure S5E). However, because disease etiologies were unevenly distributed across studies, it remains difficult to distinguish true biological differences from study-specific technical effects. Overall, these results suggest that there is a part of the fibrotic program that appears to be shared within most tissues, primarily found in endothelial, mesenchymal, and epithelial cells. Furthermore, our findings indicate that the identified organ-consensus programs capture biologically meaningful aspects of fibrosis."

    To explain our methodology, we further added this section to our methods:

    "Fibrosis severity scoring

    To associate the organ-consensus gene signature with fibrosis severity, we first extracted an organ-consensus gene set per organ from the organ-specific gene ranking. Specifically, for each cell type and organ, genes were ranked based on the random-effects meta-analysis estimate obtained from differential expression analyses across studies. Only genes detected in at least three studies were considered for downstream analyses. Positively associated genes were required to have a non-negative lower confidence interval bound and were ranked by decreasing effect size, whereas negatively associated genes were required to have a non-positive upper confidence interval bound and were ranked by increasing effect size. The top 200 positively associated genes and the top 100 negatively associated genes were retained for each cell type-organ combination.

    To give each sample a fibrosis score, pseudobulk profiles were generated for each study by aggregating raw counts across all annotated cells per sample, excluding samples with fewer than three annotated cell types. Pseudobulk count matrices were normalized to 10,000 counts per sample, followed by log-transformation. Gene set activities were inferred per sample using decoupler's (124) (v1.9.0) univariate linear model (ULM) with curated organ-consensus gene sets, yielding enrichment scores for each sample.

    Finally, these enrichment scores were normalized per study: For each study, the mean and standard deviation of enrichment scores were calculated for all control samples. Sample-level scores were then centered against the corresponding study-specific control mean and additionally converted to standardized scores by dividing by the control standard deviation."
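
    The scoring procedure quoted above can be illustrated with a minimal Python sketch. It is not the code used in the study: decoupler is not called, and the ULM score is approximated here as the t-statistic of a simple linear regression of one sample's expression on the gene-set weights. The function names, the input layout, the +1/-1 weights, the reading of the confidence-interval filter as "the interval does not cross zero", and the use of the sample standard deviation (ddof=1) are all assumptions made for illustration.

```python
import numpy as np


def select_consensus_genes(meta, n_up=200, n_down=100, min_studies=3):
    """Pick signed gene-set weights from meta-analysis results.

    `meta` is a hypothetical list of dicts with the keys
    gene, estimate, ci_low, ci_high and n_studies.
    """
    kept = [g for g in meta if g["n_studies"] >= min_studies]
    # Assumption: keep genes whose confidence interval does not cross zero.
    up = [g for g in kept if g["estimate"] > 0 and g["ci_low"] >= 0]
    down = [g for g in kept if g["estimate"] < 0 and g["ci_high"] <= 0]
    up = sorted(up, key=lambda g: -g["estimate"])[:n_up]
    down = sorted(down, key=lambda g: g["estimate"])[:n_down]
    weights = {g["gene"]: 1.0 for g in up}
    weights.update({g["gene"]: -1.0 for g in down})
    return weights


def pseudobulk_lognorm(counts, sample_ids, target=1e4):
    """Sum raw counts (cells x genes) per sample, scale to `target`, log1p."""
    ids = np.asarray(sample_ids)
    samples = sorted(set(sample_ids))
    pb = np.vstack([counts[ids == s].sum(axis=0) for s in samples]).astype(float)
    pb = pb / pb.sum(axis=1, keepdims=True) * target
    return samples, np.log1p(pb)


def ulm_score(expr, weights):
    """t-statistic of the slope when regressing expression on weights.

    `expr` and `weights` are 1D arrays over the same genes; genes outside
    the gene set carry a weight of zero.
    """
    n = expr.size
    r = np.corrcoef(weights, expr)[0, 1]
    return r * np.sqrt((n - 2) / (1 - r ** 2))


def normalize_to_controls(scores, is_control):
    """Center on the control mean and divide by the control standard deviation."""
    scores = np.asarray(scores, dtype=float)
    ctrl = scores[np.asarray(is_control, dtype=bool)]
    return (scores - ctrl.mean()) / ctrl.std(ddof=1)
```

    In this sketch, `normalize_to_controls` would be applied separately within each study, so that a score of 0 corresponds to the average control sample of that study.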

    Reviewer #1, major comment 7:* The graphs in Fig. S6A do not clearly present how the disease-associated fibroblasts are identified. The true identities of disease should also be plotted in these UMAPs. The results indicating these cells expressed myofibroblast signature should also be shown confirming that these cells are not other mesenchymal cells, e.g., pericytes or smooth muscle cells.*

    We agree that the original supplementary figures did not sufficiently illustrate how disease-associated fibroblast populations were identified and distinguished from other mesenchymal cell types. To improve transparency, we have replaced the original Figures S6A-C with four organ-specific supplementary figures (Figures S7-S10). For each organ, we now provide:

    Cluster-level compositional analyses showing changes in abundance between healthy and fibrotic samples. (A)

    Percentage of mesenchymal cells labeled as disease-associated fibroblasts (blue) and "rest" per study. (B)

    Expression of canonical marker genes for myofibroblasts, pericytes, and smooth muscle cells across clusters. (C)

    The top marker genes for the cluster(s) selected as disease-associated fibroblasts. (C)

    UMAP visualizations colored by disease etiology and disease condition (fibrosis vs. control), the study, and the original author-provided cell state annotations, including myofibroblast/activated fibroblast annotations where available. (D - G)

    UMAP visualizations colored by the final annotations used in the subsequent analysis. (H)

    These additions make the selection procedure substantially more transparent and provide multiple independent lines of evidence supporting the identification of disease-associated fibroblast populations.

    The rationale for the selected clusters is now evident from the revised supplementary figures. In the lung, the selected cluster 3 exhibits a clear increase in abundance in fibrotic samples, expresses canonical myofibroblast markers, and corresponds closely to activated fibroblast/myofibroblast annotations provided in the original studies. In the heart, the selected cluster 1 was the only population showing a robust disease-associated expansion together with strong myofibroblast marker expression and agreement with published annotations. Although another small cluster (cluster 4) displayed partial myofibroblast characteristics, its very low abundance would have a negligible impact on our pseudobulk-based analyses. In the liver, the selected cluster showed consistent expansion across studies and expressed canonical myofibroblast markers, although author-provided annotations were not available for direct comparison. Finally, the kidney datasets presented the greatest integration challenges, likely due to differences between single-cell and single-nucleus protocols. Here, we selected two clusters (cluster 0 and cluster 4) that increased in fibrosis and expressed fibroblast-associated markers, while excluding another expanding cluster (cluster 2) that showed a pericyte-like expression profile. Overall, our final annotations were broadly consistent with the original study annotations wherever such information was available.

    Changes in the manuscript:

    "We integrated the mesenchymal cell population per organ and identified a disease-associated cluster by compositional analysis (Figure 4A, Figures S7-S10)."

    Furthermore, we added the following section to our methods to clarify our methodology:

    "Candidate clusters were required to show consistent enrichment in fibrotic samples and a transcriptional profile characteristic of activated fibroblasts/myofibroblasts. In cases where multiple candidate populations were present, clusters with low abundance or expression profiles inconsistent with myofibroblast identity (e.g., pericyte-like populations) were excluded. Final cluster assignments were validated against the original study annotations whenever available."

    Reviewer #2

    Reviewer #2, major comment 1:* Fig.4A: Fibroblast Population Analysis. The authors integrated the fibroblast populations per organ to identify a disease-associated cluster by compositional analysis. In some models, more than one pathological clusters are revealed by the analysis. Shouldn't they be included as pathological, or at least excluded, from the reference population used as a control for differential expression?*

    We thank the reviewer for this important comment. We agree that, in some organs, more than one cluster shows features associated with disease and that the selection of disease-associated fibroblast populations should therefore be carefully justified. To improve transparency, we have substantially expanded the supplementary analyses and replaced the original Figures S6A-C with four organ-specific supplementary figures (Figures S7-S10), as described in our answer to Reviewer #1, major comment 7.

    Regarding the reviewer's suggestion to exclude additional potentially pathological clusters from the reference population, we chose not to do so. In many cases, the identity of these secondary clusters is less clear, and excluding them would introduce an additional layer of subjective decision-making that may not necessarily improve robustness. Instead, we used a conservative strategy in which only well-supported disease-associated fibroblast populations were explicitly selected. Furthermore, all downstream analyses of disease-associated fibroblasts were performed using pseudobulk profiles. Because pseudobulk aggregation emphasizes broad transcriptional trends, we expect the resulting signatures to be relatively robust to the inclusion or exclusion of small, ambiguously annotated subpopulations. For these reasons, we believe that retaining the remaining mesenchymal populations in the reference group provides the most objective and reproducible framework for the differential expression analysis.

    For changes in the manuscript associated with this comment, please see our answer to Reviewer #1, major comment 7.

    *Reviewer #2, major comment 7: *Scar-specific cell-cell communication: Using only COL1A1 as a marker may not be the best option, as this gene is also expressed in normal areas. Suggestion: Use a score combining the best fibrosis-associated genes across the four organs to define fibrotic areas more accurately?

    We thank the reviewer for this suggestion. We agree that COL1A1 is not exclusively expressed in fibrotic regions and can also be detected in normal tissue. To make the analysis more robust, we revised our approach and no longer rely on a single marker gene. Instead, we now compute an enrichment score based on a broader set of established extracellular matrix components, including all collagens and proteoglycans collected by Naba et al. (2012), thereby identifying regions characterized by active matrix deposition rather than expression of COL1A1 alone.

    We then assess the spatial colocalization of candidate ligands and receptors with these ECM-enriched regions across the entire tissue section and focus on the strongest colocalization signals. Importantly, this spatial analysis is subsequently integrated with the disease-associated fibroblast analysis, allowing us to prioritize genes that are both enriched in disease-associated fibroblasts and localized to ECM-rich regions.

    We acknowledge that ECM-rich regions are not necessarily equivalent to fibrotic scar tissue and that some physiologically matrix-producing regions may also be captured by this approach. However, because the analysis is performed across entire tissue sections and multiple independent samples, we expect such regions to contribute primarily as background signal for fibrotic slides. By focusing on the strongest and most consistently colocalizing ligands and receptors across samples, the analysis is designed to identify signals robustly associated with ECM-rich regions rather than being driven by isolated areas of physiological matrix expression.

    We considered the reviewer's suggestion of defining fibrotic regions using fibrosis-associated genes derived from our single-cell analyses. However, we chose not to pursue this strategy because it would introduce a degree of circularity into the analysis. Specifically, the same fibrosis-associated genes would first be used to define the fibrotic regions against which the spatial association of candidate ligands and receptors is evaluated, and would then be used again in the gene expression ranking of disease-associated fibroblasts. Our aim, in contrast, is to compare the genes identified in our meta-analysis against an independent data modality. By using an independent ECM-based definition of scar regions, we avoid this potential bias and maintain a clearer separation between the identification of fibrotic regions and the prioritization of disease-associated signaling molecules.

    We compared the previous results (COL1A1-to-gene colocalization) with the revised results (ECM enrichment-to-gene colocalization) and found that the two were highly correlated in each organ (Review Plan Figure 1). To support our expectation that the pathophysiological ECM signature largely overshadows physiological ECM expression, we quantified ECM enrichment scores per slide (Figure 6B). We consider the revised analysis more robust than the previous one because it combines several genes into a single score.

    We have updated Figure 6 and its text with these new results in our manuscript:

    "To refine these insights, we next focused on identifying ligands and receptors that are specifically expressed in actively scarring regions. We prioritized these molecules because, as extracellular signaling factors and cell-surface proteins, they are directly accessible to therapeutic intervention and therefore represent particularly attractive candidate targets. Structural extracellular matrix molecules were excluded as candidate genes in this analysis and were used instead for the identification of fibrotic scar regions.

    Accordingly, we calculated an ECM enrichment score for each spatial spot, based on a broad set of established structural extracellular matrix components, consisting of all collagens and proteoglycans collected by Naba et al. (2012). We then computed the spatial colocalization of all remaining ligands and receptors with the identified scarring regions (see methods). Finally, we compared the scar-localization of each gene per organ to the organ-consensus scores of disease fibroblasts (Figure 6A). ECM enrichment scores were significantly elevated in fibrotic compared with control samples across all four organs (Wilcoxon rank-sum test: heart p = 0.005; lung p = 0.002; liver p = 0.014; kidney p = 4e-6, Figure 6B), indicating that pathological extracellular matrix production substantially exceeds physiological ECM turnover. Overall, we observed a low correlation between scar localization of ligands and receptors and organ effect size in each organ (R in heart = 0.32, liver = 0.38, lung = 0.09, kidney = 0.11), suggesting several cell types and states to be involved in scar-tissue gene expression or a fibrotic gene expression change that goes beyond the scar area (Figure 6C). When comparing the overlap between top-ranked genes per organ (upper 20th percentile in gene regulation and colocalization), we observed 8 genes that were identified in 3 out of 4 organs (VIM, TIMP1, FSTL1, CCN2, ANXA2, FBN1, FN1, THBS2), and 2 genes (TIMP2, MRC2) that were identified in all four organs (Figure 6D)."
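
    The overlap step at the end of this passage can be sketched as follows. This is an illustrative reading, not the study's code: it assumes that "upper 20th percentile in gene regulation and colocalization" means a gene must lie at or above the 80th percentile of both metrics within an organ, and the function names and dictionary inputs are hypothetical.

```python
from collections import Counter

import numpy as np


def top_genes(effect, coloc, q=80):
    """Genes at or above the q-th percentile of both metrics in one organ.

    `effect` and `coloc` are hypothetical dicts mapping gene -> value.
    """
    genes = sorted(set(effect) & set(coloc))
    e = np.array([effect[g] for g in genes], dtype=float)
    c = np.array([coloc[g] for g in genes], dtype=float)
    e_cut, c_cut = np.percentile(e, q), np.percentile(c, q)
    return {g for g, ev, cv in zip(genes, e, c) if ev >= e_cut and cv >= c_cut}


def organ_counts(top_per_organ):
    """Count in how many organs each gene passes the per-organ filter."""
    return Counter(g for genes in top_per_organ.values() for g in genes)
```

    Genes with a count of 3 or 4 in `organ_counts` would correspond to the "3 out of 4 organs" and "all four organs" groups described in the text.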

    Furthermore, we updated Supplementary Figure 12 to include ECM enrichment scores instead of COL1A1 expression.

    Finally, we updated the methods section:

    "To identify actively scarring regions, we performed an enrichment analysis of the gene set consisting of collagens and proteoglycans using decoupler's (124) (v1.9.0) univariate linear model (ULM). The spatial colocalization of scarring regions and targets of interest was estimated with the bivariate Moran's R metric implemented in LIANA+ (130) v1.5.0 per target and Visium slide."
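
    For readers unfamiliar with the metric, a global bivariate Moran's R can be sketched in a few lines of NumPy. This is a generic illustration rather than the LIANA+ implementation: the Gaussian distance kernel, its bandwidth, the row normalization of the weights, and the function names are assumptions, and LIANA+ may construct and normalize spatial weights differently.

```python
import numpy as np


def row_normalised_gaussian_weights(coords, bandwidth):
    """Spatial weights from spot coordinates (n_spots x n_dims).

    Assumption: Gaussian kernel on Euclidean distance, no self-weights,
    each row scaled to sum to one.
    """
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    np.fill_diagonal(w, 0.0)
    return w / w.sum(axis=1, keepdims=True)


def bivariate_morans_r(x, y, w):
    """Global bivariate Moran's R between two per-spot variables.

    Here x would be the ECM enrichment score and y the expression of a
    candidate ligand or receptor. Positive values mean that high x tends
    to sit next to high y.
    """
    zx = x - x.mean()
    zy = y - y.mean()
    return (zx @ w @ zy) / (np.sqrt((zx ** 2).sum()) * np.sqrt((zy ** 2).sum()))
```

    Computing this value once per target gene and slide yields one colocalization score per gene, which can then be compared with the organ-consensus effect sizes.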

    Reviewer #3

    *Reviewer #3, minor comment 13: *P30: The staging or severity of each of the diseases seems like quite a strong confounder, especially if there is a bias for sampling tissues that are late stage. It would be nice to see this addressed more explicitly in the results, perhaps with some comparisons between those that are identified as earlier and later stage in the respective fibrotic diseases (if these annotations exist).

    We thank the reviewer for raising this important point. We agree that disease stage and severity are potential confounding factors in any meta-analysis of fibrotic diseases and that a bias toward sampling late-stage disease could influence the molecular programs identified.

    Unfortunately, disease staging and fibrosis severity annotations were not consistently available across the published datasets included in our analysis. As a result, we were unable to systematically stratify samples into early- and late-stage disease groups across all organs and disease etiologies. We have therefore highlighted this limitation in the discussion:

    "Second, the limited availability of patient metadata leaves many aspects unresolved, including the exact diagnosis, disease severity, tissue sampling location, and the extent of fibrosis. If these aspects were better documented, they could be accounted for in the analysis and could allow a clearer distinction of physiological from pathophysiological fibrotic processes."

    Nevertheless, we sought to address this concern in the subset of studies for which fibrosis-related severity measurements were available. Specifically, we derived organ-specific fibrosis signatures, scored these signatures across patients, and performed per-study normalization. In these datasets, fibrosis signature scores correlated with available fibrosis severity measurements, supporting the biological relevance of the identified programs (Figure S5A-D). In addition, these analyses indicate that fibrosis signature scores vary across disease etiologies, consistent with the reviewer's suggestion that different diseases may exhibit distinct degrees of fibrotic remodeling (Figure S5E).

    Nevertheless, because detailed histological and clinical metadata are available only for a limited subset of studies, we believe that a comprehensive analysis of fibrosis severity, disease chronicity, and etiology-specific remodeling is not possible with the currently available data, and that the available metadata are insufficient to robustly compare early- and late-stage disease across the full collection of datasets. We agree that a systematic investigation of stage-specific fibrotic programs would be highly valuable and represents an important direction for future studies using more comprehensively annotated patient cohorts.

    For changes in the manuscript associated with this comment, please see our answer to Reviewer #1, major comment 2.

    3.2 Editorial corrections or clarity improvements

    Reviewer #1

    *Reviewer #1, major comment 6: *The authors focused on the common functions between mesenchymal and endothelial cells among organs in Fig. 3H and I. Are there cell type specific effects here but shared across organs?

    We thank the reviewer for this question. The results shown in Figures 3H and 3I already represent cell type-specific functional enrichments, as the analyses were performed independently for each cell type before identifying pathways that are consistently altered across organs. Thus, the reported enrichments correspond to cell type-specific effects that are shared across fibrotic diseases in different tissues.

    At the same time, we agree with the reviewer that an interesting observation emerging from these analyses is the overlap in the enriched biological processes identified across different cell types. This suggests that, despite clear cell type-specific transcriptional responses, multiple cell populations converge on a common set of fibrosis-associated pathways. To avoid potential confusion, we have revised the text to clarify that Figures 3H and 3I display cell type-specific enrichments and that the overlap between cell types reflects convergence on shared biological processes rather than identical gene-level responses. Furthermore, we pointed out one difference shown in the plots: the enrichment of neuronal development and axonogenesis pathways in mesenchymal cells.

    "This association with development was further supported by the functional characterization of upregulated genes per organ and cell type."

    [...] "In addition, enrichment of neuronal development and axonogenesis pathways points to activation of projection-related programs, which were not present in the endothelial cell population (Figure 3I). "

    *Reviewer #1, major comment 9: *It is unclear why only known ligands and receptors are included in the therapeutic target identification analysis in Fig. 6B.

    Our intention was to focus the therapeutic target identification analysis on known ligands and receptors, while excluding major extracellular matrix (ECM) components, because ligands and receptors are generally more amenable to therapeutic intervention and therefore represent particularly attractive candidate targets. To clarify this rationale, we have revised the manuscript text to explicitly describe the criteria used for target selection and the motivation for restricting the analysis to this subset of genes. The corresponding clarification has been added to the results section "Scar-specific cell-cell communication":

    "To refine these insights, we next focused on identifying ligands and receptors that are specifically expressed in actively scarring regions. We prioritized these molecules because, as extracellular signaling factors and cell-surface proteins, they are directly accessible to therapeutic intervention and therefore represent particularly attractive candidate targets. Structural extracellular matrix molecules were excluded as candidate genes in this analysis and were used instead for the identification of fibrotic scar regions."

    *Reviewer #1, minor comment 2: *The description or legend for the colors is missing in Fig. 3A

    We thank the reviewer for this comment. The color legend was included in the original version of Figure 3A; however, we agree that its placement did not make it sufficiently prominent and may have reduced its visibility. To improve clarity, we have revised the figure layout and repositioned the legend of Figure 3A above the plot so that the color annotation is more readily identifiable.

    *Reviewer #1, minor comment 3: *FAP appears to be the top gene with robust upregulation in fibrotic heart, lung, liver, and kidney in Fig. 3E, which is also a well-establish surrogate of fibroblast activity and tissue fibrosis in clinical settings (for instance, PMID: 38279381) but not mentioned anywhere in the text.

    We thank the reviewer for highlighting the upregulation of FAP across fibrotic organs. We agree that FAP is a well-established marker of activated fibroblasts and tissue fibrosis and therefore deserves explicit mention at this stage of the analysis. We have revised the text accompanying Figure 3E to highlight FAP as one of the most consistently upregulated genes across organs and to note its established relevance in fibrotic disease:

    "One of the most robustly upregulated genes across organs was prolyl endopeptidase FAP (FAP), a well-established marker gene of activated fibroblasts that has been shown to be functionally relevant in fibrotic diseases in several clinical settings."

    *Reviewer #1, minor comment 4: *Although it is clear that this study was performed at a much larger scale, the additional gain compared to the previous attempt on identification of shared feature in fibrotic heart, lung, liver, and kidney should be mentioned (PMID: 41752153).

    We thank the reviewer for pointing out this relevant study (PMID: 41752153). We agree that it represents an important previous effort to identify shared features across fibrotic diseases and should be discussed. We have therefore revised the Introduction to acknowledge this work and clarify how the present study extends beyond it. Specifically, while the previous study compared fibrotic heart, lung, liver, and kidney tissues, it was based on a limited number of studies and disease contexts per organ. In contrast, our analysis integrates a substantially larger collection of datasets spanning multiple disease etiologies within each organ, enabling a more systematic assessment of conserved and tissue-specific fibrotic programs across diverse fibrotic diseases.

    "Recent studies have sought to define shared molecular features across fibrotic diseases affecting the heart, lung, liver, and kidney (15). However, these analyses were based on one study and limited disease contexts per organ, restricting their ability to systematically assess the robustness and generalizability of shared fibrotic programs across diverse disease etiologies."

    Reviewer #2

    *Reviewer #2, major comment 4: *Fig.4D: Among this top list, DNM3OS has been indeed characterized as a regulator of the TGF-β pathway in lung fibrosis and should be cited (PMID: 30964696). Interestingly, this lncRNA encodes a cluster of miRNA, including miR-199a-5p, that has been found deregulated in various fibrotic models including lung, kidney and liver (PMID: 23459460).

    We thank the reviewer for highlighting the functional relevance of DNM3OS in fibrosis. We checked the literature and agree that its role as a regulator of TGF-β signaling and the involvement of its associated miRNA cluster, including miR-199a-5p, provide important context for interpreting our findings.

    We have therefore expanded the discussion of Fig. 4D and DNM3OS in the manuscript and added the suggested references. Specifically, we now note that DNM3OS was consistently upregulated across organs and that both DNM3OS and its associated miRNA miR-199a-5p have been implicated as downstream effectors of TGF-β signaling involved in myofibroblast activation in lung fibrosis, as well as in experimental models of liver and kidney fibrosis.

    "Furthermore, long noncoding RNA dynamin 3 opposite strand (DNM3OS) was consistently upregulated across organs. DNM3OS and its associated miRNA, miR-199a-5p, have been identified as downstream effectors of TGF-β signaling and implicated in myofibroblast activation in lung fibrosis (76), as well as in experimental mouse models of liver and kidney fibrosis (77)."

    *Reviewer #2, major comment 5: *Fig. 3F-I and Fig. 4E: the list of the predicted downstream genes for each TF should be provided in a supplemental table

    The transcription factor target gene sets used in these analyses were not generated as part of this study but were obtained from previously published and publicly available regulatory network resources. Because these target gene lists are extensive and already available through the original resource, we did not include them as supplementary tables. To improve transparency and reproducibility, we have revised the manuscript to clearly state the source of these regulatory networks and provide the corresponding reference(s) and access information, allowing readers to retrieve the complete target gene sets used in our analyses. Therefore, in the section "Common aspects of fibrosis across tissues in endothelial and mesenchymal cells", we now state that the collection is publicly available and refer to the methods section:

    "From organ effect sizes, we also inferred transcription factor (TF) activities per organ using CollecTRI (54), a curated, publicly available collection of TF-targets, and identified the most commonly upregulated TFs based on the up- or downregulation of the genes they regulate across organs (see methods)."

    In addition, we specifically state in the methods section how the regulons can be accessed:

    "CollecTRI regulons are publicly accessible as described in the original publication (64), for instance at https://zenodo.org/records/8192729?preview_file=CollecTRI_regulons.csv."

    *Reviewer #2, minor comment 2: *Several panels (Fig.3F-I, Fig.4E-F) need to be improved, in particular the dot plots. with the same order for organs than for the other panels and another range for the size of the dots (-log10 pvalue) to reduce the max size of the dot as well as the enrichment score to expand the value of the z-score.

    We thank the reviewer for these suggestions regarding figure presentation. To improve the readability and consistency of the dot plots, we have made several changes to the figures. We believe these changes substantially improve the interpretability of the figures while preserving the underlying biological signal.

    First, we reordered the organs in Figures 3F-I and 4E-F (see above, in answer to Reviewer #1, minor comment 2 and below, respectively) to match the ordering used throughout the remainder of the manuscript. Second, we expanded the displayed enrichment score range from −2 to 2 to −4 to 4. While many values remain relatively homogeneous, this reflects the fact that these panels were specifically designed to highlight the most consistently shared and strongly regulated signals across organs. Third, we adjusted the dot size scaling for the adjusted p-values. To further improve the visualization of statistical significance, we now explicitly indicate significance using circle outlines: features with an adjusted p-value

    Reviewer #2, minor comment 4: The study is meticulously designed and clearly presented, employing a robust combination of computational approaches. To the reviewer's knowledge, this is the first systematic, cross-organ meta-analysis of fibrosis, offering a comprehensive characterization of both organ-specific and shared gene programs associated with fibrotic processes. A particularly commendable aspect of this work is the provision of a rich and accessible dataset through an interactive data browser, which will serve as a valuable resource for the scientific community at large. The impact of this study is broad and multidisciplinary, benefiting not only computational biologists but also experimental biologists and clinicians working in the field of fibrosis.

    We appreciate the positive assessment of our work and would like to thank the reviewer for recognizing the value of the systematic cross-organ analysis and the interactive data browser. We are pleased that the reviewer considers the study to be a useful resource for the fibrosis research community and appreciates its potential relevance to computational and experimental researchers, as well as clinicians.

    Reviewer #3

    *Reviewer #3, minor comment 2: *P6: 43 ± 9 % - this looks a little strange as a percentage, leaving it as a count would probably be clearer as its quite a small number. Please clarify here what 'feature count' here refers to.

    We agree that the notation "43 ± 9%" may be less intuitive. However, we chose to retain the percentage because it summarizes the proportion of female samples across datasets rather than the total number of samples, which varies substantially between studies. To improve clarity, we removed the variability term and now report only the percentage of samples in the section "Data curation for a cross-organ comparison of fibrotic diseases" (p.6):

    "In studies with available gender information (16/22 datasets), 43 % of samples were female on average (Figure 1D)."

    In addition, we clarified the meaning of "feature count" by replacing this term with "gene count" throughout the text and in Suppl. Figure 1B.

    *Reviewer #3, minor comment 3: *P8: Caption: Could you expand a bit upon this 'molecular change severity' in the text?

    We thank the reviewer for pointing this out. We agree that at this point in the manuscript, the concept of "molecular change severity" is not yet clear. It is described later in the manuscript at the beginning of the section "Fibrotic disease programs within tissues" and refers to our analysis with scDist. To make this clearer at its first mention, we have revised the caption to explicitly direct readers to the relevant section and figures. The caption now states:

    "Studies displayed in grey were excluded after an initial assessment of molecular change severity between patient groups, as discussed in the section 'Fibrotic disease programs within tissues' (Figure S2A-D & methods)."

    We believe this addition improves clarity while avoiding duplication of the more detailed explanation provided later in the manuscript.

    *Reviewer #3, minor comment 4: *Do the author annotated cell types correspond reasonably well with your cell type labels, in those datasets where its present?

    We would like to clarify that, except in two studies, we did not perform de novo cell type annotation. Instead, we used the cell type annotations provided by the original study authors and harmonized them into broader cell type categories based on their names to enable comparisons across studies and organs. The mapping between the original study annotations and these harmonized categories is already provided in Supplementary Table 1. To make this more explicit, the text now states:

    "To enable a comparison across tissues, we grouped cells into five broad categories based on the authors' annotations: endothelial-, epithelial-, mesenchymal-, lymphoid-, and myeloid cells (mappings available in Suppl. Table 1)."

    Furthermore, the consistency of these annotations was assessed by examining the expression of cell type marker genes, as shown in Figure 1F, which supports the validity of the harmonized cell type labels used throughout the study.

    *Reviewer #3, minor comment 5: *P11: A little more information on scDist and what the distances are calculated based on would be good here.

    We thank the reviewer for this suggestion. We agree that the original description did not sufficiently explain how ScDist quantifies molecular differences between conditions. We have therefore expanded the text to clarify that ScDist is a mixed-effects modeling framework and that larger distances correspond to stronger disease-associated transcriptional perturbations:

    "To do so, we applied ScDist (36), a mixed-effects modeling framework that quantifies transcriptomic differences between conditions while accounting for donor-to-donor variability (see methods). For each cell type, ScDist estimates a distance in gene expression space between healthy and fibrotic cells, with larger values indicating stronger disease-associated transcriptional changes."

    Furthermore, we added to the methods:

    "To assess disease-associated transcriptional shifts within each cell type, we applied scDist (v1.1.2) (117) to estimate transcriptional distances between fibrotic and control samples. scDist uses a linear mixed-effects model that separates condition-associated transcriptional changes from inter-individual variability by including the disease condition as a fixed effect and donor-specific variation as a random effect."
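
The description above corresponds to a model of roughly the following form. This is a sketch in our own notation: the symbols and the distributional assumptions are illustrative rather than quoted from the manuscript, and the scDist publication gives the exact formulation.

```latex
% Assumed notation (not from the manuscript):
%   z_{ij}   normalized expression vector of cell i from donor j, within one cell type
%   x_j      condition indicator for donor j (0 = control, 1 = fibrotic)
%   \beta    fixed effect of the disease condition
%   \omega_j random effect capturing donor-to-donor variability
\begin{aligned}
  z_{ij} &= \alpha + x_j\,\beta + \omega_j + \varepsilon_{ij},
  \qquad \omega_j \sim \mathcal{N}(0, \tau^2 I),
  \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2 I), \\
  D &= \lVert \beta \rVert_2 .
\end{aligned}
```

Under these assumptions, the reported distance $D$ is the magnitude of the condition effect once donor-level variation has been absorbed by $\omega_j$, so a larger $D$ indicates a stronger disease-associated shift for that cell type.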

    *Reviewer #3, minor comment 7: *P14: Are these genes known to be implicated in fibrotic diseases? I know that this is discussed further later, but a few words here would be good.

    We added context on several of the mentioned genes to the text:

    "Notably, several of the highest-ranked genes by our analysis are well-established stress-response and fibrosis markers, such as POSTN (38,39), SPP1 (40), VCAN (41,42), COL15A1 (21), C3 (43,44), FABP4 (45), and VWF (46,47), providing confidence that the identified signatures capture true disease processes instead of study-specific occurrences."

    *Reviewer #3, minor comment 8: *P17: Fig 3H: enrichment -> enrichment score? (same elsewhere)

    We thank the reviewer for noting this ambiguity. We agree that the term "enrichment" was imprecise in this context. To improve clarity and consistency, we have revised the figure legends of Fig. 3F-I and Fig. 4E-F to explicitly refer to the reported metric as the enrichment score rather than simply enrichment. The updated Figure 3 can be found in our answer to Reviewer #1, minor comment 2, and the updated Figure 4 in our answer to Reviewer #2, minor comment 2.

    *Reviewer #3, minor comment 9: *P19: ULM is used a few times in the captions, but only ever defined in the methods.

    We agree that the abbreviation ULM was not sufficiently defined in the main text and figure legends. To improve readability, we now define the term ULM as univariate linear model at its first occurrence in the figure legends (Figure 3I, page 18).

    The figure caption now reads:

    "For F-I: Dots show the enrichment score (positive: upregulated in fibrosis), while sizes show the -log10 of the adjusted p-values of univariate linear model (ULM) enrichments."
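
To illustrate what a univariate linear model (ULM) enrichment score measures, the following is a minimal sketch, not the authors' implementation. It assumes the common formulation in which gene-level statistics are regressed on a regulator's target weights and the t-statistic of the slope is reported as the score. The function name, gene labels, and values are hypothetical.

```python
import math


def ulm_score(gene_stats, target_weights):
    """Return the slope t-statistic from regressing gene-level statistics
    on a regulator's target weights (genes outside the target set get weight 0)."""
    genes = sorted(gene_stats)
    x = [target_weights.get(g, 0.0) for g in genes]
    y = [gene_stats[g] for g in genes]
    n = len(genes)
    if n < 3:
        raise ValueError("need at least three genes")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("target weights must vary across genes")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err


# Hypothetical gene-level statistics (e.g. fibrosis vs. control) and target set.
toy_stats = {"G1": 2.0, "G2": 1.5, "G3": 1.8, "G4": -0.2, "G5": 0.1, "G6": -0.3}
activating = {"G1": 1.0, "G2": 1.0, "G3": 1.0}
repressing = {g: -w for g, w in activating.items()}

score_up = ulm_score(toy_stats, activating)
score_down = ulm_score(toy_stats, repressing)
```

In this toy case the targets carry higher statistics than the non-targets, so the score is positive, matching the caption's reading of positive scores as upregulation in fibrosis. Flipping the sign of the weights flips the sign of the score.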

    *Reviewer #3, minor comment 10: *P20: 'disease relevant cell states' - this might need rewording to better reflect the compositional analysis, and not imply that this identifies cell states rather than clusters of cells.

    We agree that compositional analysis formally identifies cell clusters enriched in disease rather than directly establishing biological cell states. We have revised the text to refer to disease-associated mesenchymal populations/clusters identified through compositional analysis rather than "disease-relevant cell states":

    "To identify disease-associated mesenchymal subpopulations in our datasets, we integrated the mesenchymal cell population per organ and identified a disease-associated cluster by compositional analysis"

    "We also explored disease-associated mesenchymal subpopulation-specific gene expression and the spatial localization of ligands and receptors."

    *Reviewer #3, minor comment 11: *P22: Fig 4D: This could do with more dynamic range on the colour axis, as most things are near or above the scale.

    We thank the reviewer for this suggestion and agree that the original color scale provided limited visual separation between highly concordant features. We note that this is, in part, a consequence of the panel's design, as Figure 4D specifically highlights genes that are consistently and strongly regulated across organs and therefore exhibit relatively similar effect sizes. Nevertheless, to improve visual discrimination, we have adjusted the color scale of Figure 4D (and similarly, Figure 3D and E) to provide greater dynamic range and enhance the visibility of differences between genes while preserving the underlying data. We believe this modification improves the interpretability of the figures. The new figures 3 and 4 can be found in our answers to Reviewer #1, minor comment 2 and Reviewer #3, minor comment 8, respectively.

    *Reviewer #3, minor comment 15: *It would be nice to keep the gene naming schemes consistent (i.e., MOXD1 and TNC), especially within the same discussion.

    We thank the reviewer for this suggestion and agree that consistent gene nomenclature improves readability. We have therefore revised the discussion text to use consistent naming.

    *Reviewer #3, minor comment 17: *'some studies have highlighted the disease-relevance of specific cell states' -> please cite

    To support this statement, we have added the appropriate references describing disease-relevant cell states in fibrotic tissues:

    "Lastly, with the exception of the mesenchymal cell population, our analysis primarily focused on broad cell type categories, even though some studies have highlighted the disease-relevance of specific cell states (22, 73, 74, 7, 75, 33)."

    *Reviewer #3, minor comment 18: *Code availability: I think the 'fi' digraph in the link for https://github.com/saezlab/organfibrosis breaks it, but after correcting it manually I can access the repository.

    We thank the reviewer for noting this issue. The hyperlink functions correctly in the submitted manuscript PDF, but we are not sure in which format the reviewer received the manuscript. We will work with the editorial team during the publishing process to ensure that the repository link will be displayed correctly and remains fully accessible in the published version.

    Description of analyses that authors prefer not to carry out

    Reviewer #1

    Reviewer #2

    *Reviewer #2, major comment 2: *Myeloid Cell Analysis: given the importance of myeloid cells in fibrotic processes, particularly the origin of pathological cells (often monocyte-derived macrophages), it would be highly informative to adopt a similar approach to determine whether myeloid subpopulations differ depending on the affected organ.

    We thank the reviewer for this suggestion and agree that myeloid cells play a critical role in fibrosis. A systematic comparison of disease-associated myeloid states across organs would therefore be highly valuable. In the present study, however, we chose to focus our state-level analysis on mesenchymal cells because they represent the principal effector population responsible for extracellular matrix deposition and scar formation across fibrotic diseases and because they showed a promising overlap between tissues at the broad cell type level. In contrast, our cross-organ analyses indicate weaker transcriptional conservation among myeloid cells (highest cross-organ disease score prediction AUROC mesenchymal: 0.88; myeloid: 0.72), suggesting that organ-specific immune responses may contribute more strongly than shared fibrosis-associated programs.
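
As a reminder of what the AUROC values quoted above measure, here is a minimal sketch. It computes the AUROC as the probability that a randomly chosen fibrotic sample receives a higher disease score than a randomly chosen healthy one, with ties counted as one half. The function name and the example scores are hypothetical and are not taken from the study.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive sample has the higher score; ties count 0.5."""
    positives = [s for s, lab in zip(scores, labels) if lab == 1]
    negatives = [s for s, lab in zip(scores, labels) if lab == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical disease scores for four samples (1 = fibrotic, 0 = healthy).
perfect = auroc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
partial = auroc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

A value of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which is the scale on which the mesenchymal and myeloid values above should be read.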

    Moreover, our integrated dataset combines both single-cell and single-nucleus sequencing studies, which are known to differ in transcript capture and cell type recovery, especially in immune cells (Feng et al. 2026; Van Melkebeke et al. 2024b; Denisenko et al. 2020). These technical differences already complicated the robust comparison of mesenchymal populations, and we expect they would present an even greater challenge for the identification and comparison of fine-grained myeloid cell states across studies and organs. We therefore chose to focus our detailed state-level analysis on mesenchymal populations, where the biological question was most directly aligned with the central objective of identifying conserved fibrogenic programs across organs.

    Extending the same analysis to myeloid populations would therefore require a comprehensive integration, annotation, and validation effort that would substantially expand the scope of the current study.

    Reviewer #3

    References

    Argelaguet, Ricard, Damien Arnol, Danila Bredikhin, et al. 2020. "MOFA+: A Statistical Framework for Comprehensive Integration of Multi-Modal Single-Cell Data." Genome Biology 21 (1): 111. https://doi.org/10.1186/s13059-020-02015-1.

    Denisenko, Elena, Belinda B. Guo, Matthew Jones, et al. 2020. "Systematic Assessment of Tissue Dissociation and Storage Biases in Single-Cell and Single-Nucleus RNA-Seq Workflows." Genome Biology 21 (1): 130. https://doi.org/10.1186/s13059-020-02048-6.

    Feng, Xue, Yu Feng, Sayed Haidar Abbas Raza, Yun Ma, and Hongyu Deng. 2026. "Single Cell and Single Nucleus RNA Sequencing in Liver Tissues: Applications and Prospects in Model and Non-Model Organisms." Frontiers in Genetics 17 (April): 1781941. https://doi.org/10.3389/fgene.2026.1781941.

    Koenitzer, Jeffrey R., Haojia Wu, Jeffrey J. Atkinson, Steven L. Brody, and Benjamin D. Humphreys. 2020. "Single-Nucleus RNA-Sequencing Profiling of Mouse Lung. Reduced Dissociation Bias and Improved Rare Cell-Type Detection Compared with Single-Cell RNA Sequencing." American Journal of Respiratory Cell and Molecular Biology 63 (6): 739-47. https://doi.org/10.1165/rcmb.2020-0095MA.

    Lake, Blue B., Rajasree Menon, Seth Winfree, et al. 2023. "An Atlas of Healthy and Injured Cell States and Niches in the Human Kidney." Nature 619 (7970): 585-94. https://doi.org/10.1038/s41586-023-05769-3.

    Litviňuková, Monika, Carlos Talavera-López, Henrike Maatz, et al. 2020. "Cells of the Adult Human Heart." Nature 588 (7838): 466-72. https://doi.org/10.1038/s41586-020-2797-4.

    Naba, Alexandra, Karl R. Clauser, Sebastian Hoersch, Hui Liu, Steven A. Carr, and Richard O. Hynes. 2012. "The Matrisome: In Silico Definition and In Vivo Characterization by Proteomics of Normal and Tumor Extracellular Matrices*." Molecular & Cellular Proteomics 11 (4): M111.014647. https://doi.org/10.1074/mcp.M111.014647.

    Van Melkebeke, Lukas, Jef Verbeek, Dora Bihary, et al. 2024a. "Comparison of the Single-Cell and Single-Nucleus Hepatic Myeloid Landscape within Decompensated Cirrhosis Patients." Frontiers in Immunology 15 (February). https://doi.org/10.3389/fimmu.2024.1346520.

    Van Melkebeke, Lukas, Jef Verbeek, Dora Bihary, et al. 2024b. "Comparison of the Single-Cell and Single-Nucleus Hepatic Myeloid Landscape within Decompensated Cirrhosis Patients." Frontiers in Immunology 15 (February). https://doi.org/10.3389/fimmu.2024.1346520.

  2. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #3

    Evidence, reproducibility and clarity

    This paper describes an integrated analysis of several single cell and spatial RNA sequencing datasets to uncover common programs within fibrotic diseases. Many of the signals observed in the scRNA-seq analysis are related to the ECM, and therefore the authors specifically use spatial sequencing data from each of the tissues to investigate local cell-cell communication with fibrotic scar regions. Using this analysis, the authors propose several potential therapeutic targets, and provide an interactive web app to view the results of their analyses.

    I would like to congratulate the authors on a well-written and interesting manuscript. I have no major concerns regarding the paper as a whole, but I think several points would benefit from clarification, particularly around interpretation of disease heterogeneity and the therapeutic implications.

    Results

    p5:

    Some context regarding expected differences between single cell and single nuclei datasets here would be good (especially if some differences are potentially important).

    p6:

    43 ± 9 % - this looks a little strange as a percentage; leaving it as a count would probably be clearer, as it's quite a small number.

    Please clarify what 'feature count' refers to here.

    p8:

    Caption: Could you expand a bit upon this 'molecular change severity' in the text?

    Do the author-annotated cell types correspond reasonably well with your cell type labels in those datasets where they are present?

    p11:

    A little more information on scDist and what the distances are calculated based on would be good here.

    p12:

    Please clarify whether the multicellular factor model is fit jointly across all datasets within an organ, or separately per dataset followed by comparison. If fit jointly, how are batch/study effects handled? If fit separately, how are factors aligned across invocations?

    Is it possible to say how much of this consistency across datasets is due to non-fibrotic or non-disease state regulation? Are the disease-associated factors driven by coordinated changes across multiple cell types, or primarily by one dominant cell type? And if the latter, is this related to expression magnitude, or cell type abundance?

    p14:

    Are these genes known to be implicated in fibrotic diseases? I know that this is discussed further later, but a few words here would be good.

    p17:

    Fig 3H: enrichment -> enrichment score? (same elsewhere)

    p19:

    ULM is used a few times in the captions, but only ever defined in the methods.

    p20:

    'disease relevant cell states' - this might need rewording to better reflect the compositional analysis, and not imply that this identifies cell states rather than clusters of cells.

    p22:

    Fig 4D: This could do with more dynamic range on the colour axis, as most things are near or above the scale.

    p23:

    What conclusions should be drawn from the broad cell-type communication comparisons between organs in Fig. 5A? The text reports which broad cell-type pairs account for many upregulated ligand-receptor interactions, but it is not clear whether these comparisons identify fibrosis-specific communication or mainly reflect broad tissue architecture, cell-type abundance, etc.

    If the broad categories were chosen because finer cell-state annotations are not consistently available across studies, it would be helpful to state this limitation explicitly.

    p30:

    The staging or severity of each of the diseases seems like quite a strong confounder, especially if there is a bias for sampling tissues that are late stage. It would be nice to see this addressed more explicitly in the results, perhaps with some comparisons between those that are identified as earlier and later stage in the respective fibrotic diseases (if these annotations exist).

    p31:

    The therapeutic suggestions should come with some discussion that this is association rather than causation, as it's not established that these are causal drivers. MOXD1 seems compelling, especially if this has been observed to have a potential therapeutic effect in other fibrotic diseases, and this is an excellent outcome that justifies the meta-analysis approach. TNC is somewhat more speculative in this regard, so if there is any mechanistic or other motivations, it would be good to include them here.

    It would be nice to keep the gene naming schemes consistent (i.e., MOXD1 and TNC), especially within the same discussion.

    p31:

    It would be nice to have what you think the issues are with the lack of patient metadata, and how these issues might manifest in the analyses (this links with the previous comment regarding disease stage).

    'some studies have highlighted the disease-relevance of specific cell states' -> please cite

    Code availability: I think the 'fi' digraph in the link for https://github.com/saezlab/organfibrosis breaks it, but after correcting it manually I can access the repository.

    Significance

    This is a well-presented analysis of single-cell, single-nucleus, and spatial transcriptomics data derived from patients with a range of fibrotic diseases, with the aim of developing an integrated description of fibrosis-associated programs across organs. This integrated analysis is used to nominate potential therapeutic targets, many of which are compatible with current understanding of fibrosis and therefore provide validity for the approach. The results are also made available through a web application that can be queried easily.

    The main limitations of the study arise from the nature and heterogeneity of the available data. In particular, limitations in dataset composition and clinical annotation mean that important aspects such as disease progression, severity, sampling location, and fibrosis stage cannot be systematically studied.

    The novelty of the study lies in its cross-organ, gene-centric integration of fibrotic disease datasets across a sizeable patient cohort, including analyses of inferred interactions between broad cell-type compartments. This provides a useful precursor to deeper mechanistic studies of fibrotic regulation, and a resource for researchers interested in fibrosis-associated signatures and candidate mechanisms. More generally, it is a good example of how public datasets can be integrated within systems biomedicine.

    My background is in computational biology and biophysics.

  3. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #2

    Evidence, reproducibility and clarity

    The article by Küchenhoff et al. presents a comprehensive meta-analysis of single-cell transcriptomic data from healthy and fibrotic human tissues encompassing 20 studies and 25 disease etiologies across the heart, liver, kidney, and lung. They identified organ-specific as well as cross-organ fibrosis-associated gene expression profiles in major cell types including fibroblasts, epithelial, endothelial and immune cells. Additionally, they also conduct a focused analysis on transcription factors and intercellular communication patterns in fibrotic regions, supported by both scRNA-seq and spatial transcriptomics data.

    The study is well-designed and clearly presented, with a robust combination of computational approaches that enhance the characterization of both organ-specific and shared gene programs in fibrosis. The authors also provide a rich and accessible dataset through an interactive data browser, which will be highly useful for the scientific community. While most of the data are convincing, some clarifications and improvements are needed, as detailed below.

    Major comments:

    • Fig. 4A: Fibroblast Population Analysis. The authors integrated the fibroblast populations per organ to identify a disease-associated cluster by compositional analysis. In some models, more than one pathological cluster is revealed by the analysis. Shouldn't these be included as pathological, or at least excluded from the reference population used as a control for differential expression?
    • Myeloid Cell Analysis: given the importance of myeloid cells in fibrotic processes, particularly the origin of pathological cells (often monocyte-derived macrophages), it would be highly informative to adopt a similar approach to determine whether myeloid subpopulations differ depending on the affected organ.
    • Fig. 4B-C: the full list of organ-specific and overlapping genes should be given in a supplemental table.
    • Fig. 4D: Among this top list, DNM3OS has indeed been characterized as a regulator of the TGF-β pathway in lung fibrosis and should be cited (PMID: 30964696). Interestingly, this lncRNA encodes a cluster of miRNAs, including miR-199a-5p, that has been found to be deregulated in various fibrotic models including lung, kidney and liver (PMID: 23459460).
    • Fig. 3F-I and Fig. 4E: the list of the predicted downstream genes for each TF should be provided in a supplemental table.
    • Cell-cell communications analysis: It would be informative to add a Circos plot highlighting the best cell-cell communication candidates in each organ. The authors should also provide the full list of predicted interactions in a supplementary table, including scores for each organ for each interaction. Additionally, it would be important to focus specifically on ligand-receptor pairs associated with growth factors and cytokines. While incorporating Visium data is very interesting and challenging, it may reduce sensitivity due to its relatively poor capture efficiency. This could particularly overemphasize the importance of collagens and other ECM-related factors, which are highly expressed.
    • Scar-specific cell-cell communication: Using only COL1A1 as a marker may not be the best option, as this gene is also expressed in normal areas. Suggestion: Use a score combining the best fibrosis-associated genes across the four organs to define fibrotic areas more accurately?
    • Visium Dataset Analysis: It would be interesting to compare fibrotic areas across different organs by performing niche or topic analyses using supervised deconvolution approaches (such as RCTD). This would allow for a better estimation of cell composition and functional annotations of fibrotic and inflammatory areas.
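
The suggestion above to define fibrotic areas with a combined score rather than COL1A1 alone could be sketched as follows. This is an illustrative assumption about one simple way to build such a score (z-scoring each gene across spots, then averaging per spot), not a method used in the manuscript. The gene panel, function name, and expression values are hypothetical.

```python
from statistics import mean, pstdev


def composite_score(expression, genes):
    """Per-spot mean of gene-wise z-scores.

    expression maps each gene to a list of per-spot expression values,
    with all lists in the same spot order."""
    z_rows = []
    for gene in genes:
        values = expression[gene]
        mu = mean(values)
        sd = pstdev(values)
        # A gene with constant expression carries no information; score it 0.
        z_rows.append([(v - mu) / sd if sd > 0 else 0.0 for v in values])
    # Average across genes for each spot.
    return [mean(column) for column in zip(*z_rows)]


# Hypothetical expression values for three spots.
toy_expression = {"COL1A1": [1.0, 5.0, 9.0], "POSTN": [0.0, 2.0, 4.0]}
spot_scores = composite_score(toy_expression, ["COL1A1", "POSTN"])
# Spots above a chosen threshold (here 0, an arbitrary choice) are flagged.
fibrotic_spots = [i for i, s in enumerate(spot_scores) if s > 0]
```

Averaging z-scores keeps a single highly expressed gene from dominating the score, which addresses the concern that COL1A1 is also expressed in normal areas. The threshold would need to be calibrated against histology.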

    Minor comments:

    • p11: the authors conclude that "cell proportions differed not only between patients and organs, but also that there was no uniform abundance change in disease". This result may reflect technical variability, particularly due to dissociation biases from very different organs or the use of different platforms. This limitation should be discussed.
    • Several panels (Fig. 3F-I, Fig. 4E-F) need to be improved, in particular the dot plots: use the same order of organs as in the other panels, reduce the maximum dot size (-log10 p-value), and expand the range of the enrichment score (z-score).
    • Panel E in Fig. 5 is difficult to read and needs to be improved.
    • Figure Improvements: Fig. 3F-I and Fig. 4E-F: The dot plots could be improved by i) using the same order for organs as in other panels for consistency; ii) adjusting the dot size scale (-log10 p-value) to reduce the maximum dot size and expand the range of enrichment scores (z-score). Fig. 5E: This panel is difficult to read and needs improvement for clarity.

    Referees cross-commenting

    I agree with the comments made by the other reviewers, who effectively highlight the merits and value of this study and point out a few issues for improvement or clarification.

    Significance

    The study is meticulously designed and clearly presented, employing a robust combination of computational approaches. To the reviewer's knowledge, this is the first systematic, cross-organ meta-analysis of fibrosis, offering a comprehensive characterization of both organ-specific and shared gene programs associated with fibrotic processes.

    A particularly commendable aspect of this work is the provision of a rich and accessible dataset through an interactive data browser, which will serve as a valuable resource for the scientific community at large. The impact of this study is broad and multidisciplinary, benefiting not only computational biologists but also experimental biologists and clinicians working in the field of fibrosis.

    Reviewer's expertise: The reviewer has extensive experience in functional genomics and fibrosis research, including single-cell-based approaches, but is not specialized in bioinformatics.

  4. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

    Referee #1

    Evidence, reproducibility and clarity

    Summary:

    Küchenhoff and co-authors aimed to explore the shared and organ-specific gene expression profiles in tissue fibrosis through an integrative analysis of 20 publicly available scRNA-Seq datasets obtained from human heart, liver, lung, and kidney. Despite strong organ-specific effects, they identified a consensus fibrosis gene signature across different disease etiologies within and among organs. Shared gene expression profiles across organs were enriched in endothelial and mesenchymal cells. Analysis focused on a subset of fibrosis-enriched fibroblasts revealed consistent upregulation of collagen-related pathways and dysregulation of development-related transcription factors across organs. Cell-cell communication analysis detected robust upregulation of ECM-integrin interactions between mesenchymal and endothelial cells in multiple organs. The authors further proposed several targets based on spatial co-expression with COL1A1 in Visium datasets and on the preceding scRNA-Seq-based analysis. They also made their results publicly available and easily accessible through a web dashboard.

    Major comments:

    1. The group has been developing cutting-edge bioinformatic tools for the community. The authors also provided scripts and the processed data for reproducibility. I have no doubt about their implementation of the methodology. I also understand the reasons for the objective tone throughout the manuscript. However, the authors made very few claims of biological significance. The conclusion of the study is vague, with almost nothing mentioned in the abstract. What are the cross-organ effects in fibrosis identified in this study? I believe some additional claims would help readers with less technical knowledge to grasp the study better.
    2. The authors have pooled the data from at least five different diseases per organ to identify the pan-fibrosis signature across diseases. Some of the diseases (e.g., pneumonitis, ICM, MI, MCD, ALD) may present more acute remodeling than the rest and might exhibit distinct features that mask the analysis. The extent of fibrosis also varies considerably. A correlation with histological data is required.
    3. The authors performed multicellular factor modeling in each organ and identified factors that are distinct in fibrotic and reference tissue in Fig. 2B, e.g., factors 1 and 2 in heart. Are these factors driven by specific biological pathways? Could these factors also be used to identify common biological functions in fibrotic tissue across organs?
    4. Despite strong organ-specific effects, the authors detected similar transcriptional changes in endothelial and mesenchymal cells in heart and lung in Fig. 3B. The analysis of disease-associated fibroblasts also showed a much higher overlap between heart and lung than between, e.g., liver and kidney in Fig. 4C. Are there additional shared fibrosis features or functions in mesenchymal cells or disease-associated fibroblasts in heart and lung?
    5. There seems to be a certain degree of similarity between the epithelial cells in kidney and lung in Fig. 4B.
    6. The authors focused on the common functions between mesenchymal and endothelial cells among organs in Fig. 3H and I. Are there cell-type-specific effects that are nevertheless shared across organs?
    7. The graphs in Fig. S6A do not clearly show how the disease-associated fibroblasts are identified. The true disease identities should also be plotted in these UMAPs. Results indicating that these cells express a myofibroblast signature should also be shown, confirming that these cells are not other mesenchymal cells, e.g., pericytes or smooth muscle cells.
    8. TNC appears near the bottom of the list in Fig. 6C. It is unclear why TNC was chosen as a broad therapeutic target in the end.
    9. It is unclear why only known ligands and receptors are included in the therapeutic target identification analysis in Fig. 6B.

    Minor comments:

    1. Is there an additional measure that accounts for the datasets with lower RNA counts shown in Fig. S1?
    2. The description or legend for the colors is missing in Fig. 3A.
    3. FAP appears to be the top gene with robust upregulation in fibrotic heart, lung, liver, and kidney in Fig. 3E. It is also a well-established surrogate of fibroblast activity and tissue fibrosis in clinical settings (for instance, PMID: 38279381) but is not mentioned anywhere in the text.
    4. Although it is clear that this study was performed at a much larger scale, the additional gain compared with the previous attempt at identifying shared features in fibrotic heart, lung, liver, and kidney should be mentioned (PMID: 41752153).

    Significance

    This is the first study reporting the transcriptomic changes in tissue fibrosis in heart, lung, liver, and kidney at a large scale across different diseases in a cell-type-specific manner. This study implemented state-of-the-art bioinformatics that focuses not only on shared features among organs, but also on the similarities across organs. The manuscript highlights the similar molecular changes within endothelial and mesenchymal cells, especially from heart and lung. The authors performed spatial co-expression analysis with COL1A1, further increasing the robustness of target identification for the fibrotic core. The authors also made their results publicly available, which will benefit the fibrosis research community and facilitate the development of therapeutics against tissue fibrosis.