Interpretable machine learning for coeliac disease diagnosis: quantitative morphometry of duodenal biopsies

Abstract

Coeliac disease affects approximately 1% of the global population and remains substantially underdiagnosed. Histopathological assessment of duodenal biopsies is the diagnostic gold standard but is subject to approximately 20% inter-observer disagreement. While machine learning approaches show promise, most prior work relies on black-box models with limited interpretability, restricting clinical adoption.

Methods

We present an interpretable pipeline that follows established histopathological criteria by extracting clinically meaningful morphological features from H&E-stained whole-slide images. Five sequential stages perform pre-processing, semantic segmentation of villi, crypts, intraepithelial lymphocytes (IELs) and enterocytes, crypt morphometry, villus length estimation via a novel polyline-based keypoint model, and coeliac disease classification using three quantitative features: IEL-to-enterocyte ratio, villus-to-crypt area ratio, and villus-length-to-crypt-depth ratio. Training and validation used data from four institutions; independent testing used 1,357 WSIs from two further institutions including one with a previously unseen scanner manufacturer, spanning five diagnostic categories: coeliac disease, normal mucosa, chronic inflammation, gastric metaplasia, and gastric heterotopia.
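The three classification features named above are simple ratios, so the way they could be derived from per-slide measurements can be illustrated with a minimal sketch. The container `SlideMeasurements`, its field names, and the function `morphological_features` are hypothetical, introduced only for illustration; the abstract does not specify units, aggregation across multiple villi or crypts, or the calibration step, so none of those are modelled here.

```python
from dataclasses import dataclass


@dataclass
class SlideMeasurements:
    """Hypothetical per-slide measurements (names and units are assumptions)."""
    iel_count: int          # number of intraepithelial lymphocytes detected
    enterocyte_count: int   # number of enterocytes detected
    villus_area: float      # total segmented villus area
    crypt_area: float       # total segmented crypt area (same unit as villus_area)
    villus_length: float    # representative villus length
    crypt_depth: float      # representative crypt depth (same unit as villus_length)


def morphological_features(m: SlideMeasurements) -> dict[str, float]:
    """Compute the three ratio features described in the abstract."""
    # Guard against empty segmentations, which would make a ratio undefined.
    if m.enterocyte_count <= 0 or m.crypt_area <= 0 or m.crypt_depth <= 0:
        raise ValueError("denominators must be positive to form the ratios")
    return {
        "iel_to_enterocyte": m.iel_count / m.enterocyte_count,
        "villus_to_crypt_area": m.villus_area / m.crypt_area,
        "villus_length_to_crypt_depth": m.villus_length / m.crypt_depth,
    }
```

Because each feature is a plain ratio of measured quantities, a pathologist can check any individual value against the slide, which is the sense in which the pipeline is interpretable.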

Results

Semantic segmentation achieved villus and crypt precision and recall of 87–90%. Villus length estimation correlated strongly with expert annotations (Pearson’s r=0.85; mean relative error 13.5% post-calibration). All three morphological features significantly separated coeliac disease from all non-coeliac diagnostic groups across internal and external datasets (p<0.01 in all comparisons). On the test set, the diagnostic classifier achieved an accuracy of 94.5%, PPV of 92.9%, NPV of 94.7%, and AUC of 0.982.
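For readers less familiar with the reported metrics, the sketch below shows the standard definitions of accuracy, PPV, NPV, Pearson's r, and mean relative error. The function names are hypothetical, and this is not the authors' evaluation code. The abstract does not report the underlying confusion-matrix counts or the paired length measurements, so the sketch takes them as inputs rather than reproducing the published figures.

```python
import math


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Accuracy, PPV and NPV from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # fraction of all slides classified correctly
        "ppv": tp / (tp + fp),           # of predicted coeliac, fraction truly coeliac
        "npv": tn / (tn + fn),           # of predicted non-coeliac, fraction truly non-coeliac
    }


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_relative_error(predicted: list[float], reference: list[float]) -> float:
    """Mean of |predicted - reference| / reference over paired measurements."""
    return sum(abs(p - r) / r for p, r in zip(predicted, reference)) / len(reference)
```

Here `pearson_r` and `mean_relative_error` would be applied to model-estimated and expert-annotated villus lengths, and `diagnostic_metrics` to the coeliac versus non-coeliac predictions on the test set.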

Conclusions

This interpretable framework achieves strong multi-centre diagnostic performance while producing quantitative morphological outputs (villus length, crypt depth, and IEL-to-enterocyte ratio) that directly reflect established histopathological criteria. It represents a meaningful step towards standardised AI-assisted coeliac disease diagnosis.
