Quantification of collagen and associated features from H&E-stained whole slide pathology images across cancer types using a physics-based deep learning model
Abstract
Background
Collagen is the major component of the extracellular matrix (ECM), and its structural organization undergoes significant transformation during tumorigenesis. Visualizing collagen in histological tissue sections would aid the study of tumor growth, encapsulation, and invasion. However, such visualization requires special stains such as Picrosirius Red (PSR) or Masson’s Trichrome (MT) or, more recently, second-harmonic generation (SHG) imaging of unstained tissue sections. PSR and MT both suffer from significant inter- and intra-lab stain variability, and SHG, while considered a ground truth by many, is limited by system complexity and reliability, cost, and throughput. These technical hurdles limit more widespread assessment of collagen in tissue samples.
Methods
Using high-contrast, high-throughput polarization imaging of PSR-stained slides to generate ground truth training images, we developed a deep learning model (iQMAI) to infer the presence of collagen directly from hematoxylin and eosin (H&E)-stained whole-slide images (WSIs) with high specificity. After iQMAI inference across WSIs, individual collagen fibers were extracted, and features describing overall collagen intensity and fiber morphology were computed. iQMAI pixel- and feature-wise outputs were compared to ground truth polarization imaging to assess model performance. The trained iQMAI model was deployed on H&E-stained WSIs from the TCGA LUAD, LUSC, LIHC, and PAAD datasets for evaluation. iQMAI-derived collagen features were compared to tissue composition, gene expression, and overall survival.
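As an illustration of the kind of fiber morphology features mentioned above, the following minimal sketch computes length, tortuosity, and orientation for a single fiber. It is not the iQMAI pipeline: the input format (an ordered list of centerline points), the function name `fiber_features`, and the feature definitions (tortuosity as path length divided by end-to-end distance, angle of the end-to-end chord folded into 0 to 180 degrees) are assumptions based on common usage, and the abstract does not state how these features were actually defined.

```python
import math

def fiber_features(points):
    """Compute simple morphology features for one fiber centerline.

    `points` is an ordered list of (x, y) pixel coordinates
    (a hypothetical input format chosen for this sketch).
    """
    if len(points) < 2:
        raise ValueError("a fiber needs at least two points")
    # Path length: sum of distances between consecutive centerline points.
    path_length = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    (x0, y0), (x1, y1) = points[0], points[-1]
    # Chord: straight-line distance between the two fiber endpoints.
    chord = math.dist((x0, y0), (x1, y1))
    # Assumed definition: tortuosity = path length / chord (1.0 = straight).
    tortuosity = path_length / chord if chord > 0 else math.inf
    # Orientation of the chord, folded into [0, 180) because fibers are undirected.
    angle = math.degrees(math.atan2(y1 - y0, x1 - x0)) % 180.0
    return {"length": path_length, "tortuosity": tortuosity, "angle_deg": angle}

straight = fiber_features([(0, 0), (3, 4)])        # a straight fiber
bent = fiber_features([(0, 0), (3, 0), (3, 4)])    # same endpoints, one bend
```

Here the straight fiber has tortuosity 1.0 and the bent fiber 1.4 (path length 7 over a chord of 5). Fiber width and angle relative to a reference structure would need the segmentation mask and a reference orientation, which this sketch leaves out.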
Results
The iQMAI model shows significant generalization across multiple indications. iQMAI collagen predictions were similar to polarization imaging measurements of the same sample, with a mean structural similarity index (SSIM) of 0.84 (95% CI 0.69-0.93), a mean patch-wise RMSE of 0.04 (95% CI 0.02-0.08), and a linear correlation (R² = 0.93). Comparing features of the collagen fibers extracted from iQMAI vs. polarization images yielded similar linear correlations between computed fiber tortuosity, length, width, and relative angle. The relationship between collagen fiber density and fibroblast density was distinct in non-small cell lung cancer (LUAD and LUSC), hepatocellular carcinoma (LIHC), and pancreatic ductal adenocarcinoma (PAAD). In PAAD, fiber density and fiber width were both negatively associated with the LRRC-15 gene expression signature, and increased fiber width was associated with longer overall survival.
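To make the pixel-wise agreement metrics concrete, here is a minimal sketch of a mean patch-wise RMSE and an R² between a predicted and a ground truth intensity map. The details are assumptions: images are represented as equal-size 2D lists of floats, patches are non-overlapping squares, and R² is taken as the squared Pearson correlation of the flattened pixels. The abstract does not specify the patch size, normalization, or the exact R² definition used, and the helper names `patch_rmse` and `r_squared` are hypothetical.

```python
import math

def patch_rmse(pred, truth, patch):
    """Mean RMSE over non-overlapping patch x patch tiles of two equal-size 2D grids."""
    rows, cols = len(truth), len(truth[0])
    scores = []
    for r in range(0, rows - patch + 1, patch):
        for c in range(0, cols - patch + 1, patch):
            # Squared pixel errors inside the current tile.
            sq = [(pred[i][j] - truth[i][j]) ** 2
                  for i in range(r, r + patch)
                  for j in range(c, c + patch)]
            scores.append(math.sqrt(sum(sq) / len(sq)))
    if not scores:
        raise ValueError("patch size is larger than the image")
    return sum(scores) / len(scores)

def r_squared(pred, truth):
    """Squared Pearson correlation of flattened pixels (assumed R^2 definition).

    Assumes neither image is constant, otherwise the denominator is zero.
    """
    x = [v for row in pred for v in row]
    y = [v for row in truth for v in row]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)

truth = [[0.0, 0.2], [0.4, 0.6]]
pred = [[0.1, 0.3], [0.5, 0.7]]   # toy data: truth shifted by a constant 0.1
rmse = patch_rmse(pred, truth, patch=2)
r2 = r_squared(pred, truth)
```

In this toy case the RMSE is 0.1 while R² is 1.0, which shows why the two are reported together: a constant offset leaves the linear correlation perfect but still registers as pixel error. SSIM is omitted here; in practice it could be computed with an existing implementation such as `skimage.metrics.structural_similarity` from scikit-image.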
Conclusions
iQMAI is a deep learning model that accurately predicts collagen from H&E-stained WSIs, allowing for spatially resolved quantification of collagen morphology and enabling investigation of the interplay between collagen and other components of the tumor microenvironment (TME). We demonstrate an example of the utility of iQMAI-based collagen assessment in PAAD, where collagen features are associated with immunosuppressive cancer-associated fibroblasts and overall survival. Understanding the relationship between collagen, TME composition, and disease progression may aid the development of effective immunotherapies in PAAD and other cancer types.