Curated and harmonised transcriptomics datasets of interstitial lung diseases
Abstract
This study provides manually curated and homogenised transcriptomics data from patients with interstitial lung disease (ILD), retrieved from the NCBI Gene Expression Omnibus and European Nucleotide Archive repositories. The compendium includes 30 transcriptomics datasets generated with DNA microarray and RNA sequencing technologies, for a total of 1,371 samples. All datasets underwent metadata curation and harmonisation, data quality checks, and preprocessing with standardised procedures. Furthermore, a robust data model was developed to standardise phenotypic data, thereby enhancing comparability across heterogeneous datasets. Gene expression data and lists of differentially expressed genes computed between ILD and healthy samples are provided. Among the ILDs included in this study, idiopathic pulmonary fibrosis (IPF) is the most represented worldwide. Co-expression networks inferred from IPF and healthy samples are also included. This work improves the Findability, Accessibility, Interoperability, and Reusability (FAIR) of publicly available ILD transcriptomics data, providing a platform to implement and validate integrated systems biology and pharmacology approaches for novel ILD diagnostics and therapeutics.