PathQC: Determining Molecular and Physical Integrity of Tissues from Histopathological Slides

Abstract

Quantifying the molecular and physical integrity of tissue is essential for biobank development. However, current assessment methods rely either on destructive testing, which depletes valuable biospecimens, or on manual evaluation, which does not scale and introduces interindividual variation. To overcome these challenges, we present PathQC, a deep learning framework that directly predicts the tissue RNA Integrity Number (RIN) and the extent of autolysis from hematoxylin and eosin (H&E)-stained whole-slide images of normal tissue biopsies. PathQC first extracts morphological features from the slide using a recently developed digital pathology foundation model (UNI); a supervised model then learns to predict RIN and autolysis scores from these features. PathQC is trained on and applied to the Genotype-Tissue Expression (GTEx) cohort, which comprises 25,306 non-diseased post-mortem samples across 29 tissues from 970 donors, for which paired ground-truth RIN and autolysis scores were available. In this cohort, PathQC predicted RIN and autolysis score with average correlations of 0.47 and 0.45, respectively, with notably high performance in Adrenal Gland tissue for RIN (R=0.82) and in Colon tissue for autolysis (R=0.83). We provide a pan-tissue model that predicts RIN and autolysis score for a new slide from any tissue type (GitHub). Overall, PathQC will enable scalable measurement of molecular and physical integrity from routine H&E images, thereby improving the quality of both biobank generation and retrospective analysis of existing biobanks.
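The per-tissue and average correlations reported above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the abstract does not state which correlation coefficient was used or how the average was taken, so the sketch assumes Pearson correlation and an unweighted mean over tissues. The function names and the toy values are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def per_tissue_correlation(records):
    """Group (tissue, measured, predicted) tuples by tissue and correlate each group."""
    grouped = defaultdict(lambda: ([], []))
    for tissue, measured, predicted in records:
        grouped[tissue][0].append(measured)
        grouped[tissue][1].append(predicted)
    return {tissue: pearson_r(m, p) for tissue, (m, p) in grouped.items()}


def mean_correlation(per_tissue):
    """Unweighted mean of per-tissue correlations (an assumption, see lead-in)."""
    return sum(per_tissue.values()) / len(per_tissue)


# Toy, made-up RIN values for illustration only; not data from the study.
records = [
    ("Adrenal Gland", 6.0, 6.2),
    ("Adrenal Gland", 7.0, 7.1),
    ("Adrenal Gland", 8.0, 7.9),
    ("Colon", 5.0, 6.0),
    ("Colon", 6.0, 5.5),
    ("Colon", 7.0, 6.5),
]
by_tissue = per_tissue_correlation(records)
average_r = mean_correlation(by_tissue)
```

Averaging per-tissue correlations, rather than pooling all samples into one correlation, keeps between-tissue differences in baseline RIN or autolysis from inflating the score; whether the reported averages were computed this way is an assumption here.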
