An integrated machine learning model of transcriptomic genes in multi-center chronic obstructive pulmonary disease reveals the causal role of TIMP4 in airway epithelial cell
Abstract
Background: Chronic obstructive pulmonary disease (COPD) is a heterogeneous syndrome, resulting in inconsistent findings across studies. Identifying a core set of genes consistently involved in COPD pathogenesis, independent of patient variability, is essential.
Methods: We integrated lung tissue sequencing data from patients with COPD across two centers. We used weighted gene co-expression network analysis and machine learning to identify 13 potential pathogenic genes common to both centers. Additionally, a gene-based model was constructed to distinguish COPD at the molecular level and validated in independent cohorts. Gene expression in specific cell types was analyzed, and Mendelian randomization was used to confirm associations between candidate genes and lung function/COPD.
Results: Tissue inhibitor of metalloproteinase 4 (TIMP4) was identified as a key pathogenic gene and validated in COPD cohorts. Further analysis using single-cell sequencing from mice and patients with COPD revealed that TIMP4 is involved in ciliated cells. In primary human airway epithelial cells cultured at the air-liquid interface, TIMP4 overexpression reduced ciliated cell numbers.
Conclusions: We developed a 13-gene model for distinguishing COPD at the molecular level and identified TIMP4 as a potential hub pathogenic gene. This finding provides insights into shared disease mechanisms and positions TIMP4 as a promising therapeutic target for further investigation.