Interpretable machine learning applied to high-dimensional salivary proteomics accurately classifies pediatric inflammatory bowel diseases
Abstract
Background and aims
Inflammatory bowel diseases (IBD), including Crohn’s disease (CD), ulcerative colitis (UC), and IBD-unclassified (IBD-U), are chronic inflammatory disorders of the gastrointestinal tract. Current methods for classification and longitudinal monitoring are invasive, expensive, and often delayed, limiting timely diagnosis and management. This study reports the first application of high-dimensional salivary proteomics integrated with interpretable artificial intelligence/machine learning (AI/ML) to define a minimal protein signature for pediatric IBD classification with the goal of informing therapeutic decision-making.
Methods
Unstimulated saliva from pediatric CD, UC, and IBD-U patients was analyzed using Alamar Biosciences’ NULISAseq Inflammation Panel 250 (250 proteins). Logistic regression with recursive feature elimination identified a minimal discriminative signature. Performance was tested in independent follow-up samples. SHapley Additive exPlanations (SHAP) quantified patient-specific protein contributions and assessed biological similarity of IBD-U to CD and UC.
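As an illustration of the feature-selection step described above, the following is a minimal sketch of recursive feature elimination wrapped around logistic regression. It is not the study's implementation: the abstract does not state the software, hyperparameters, elimination step size, or validation scheme, so the gradient-descent fit, the L2 penalty, the helper names (`fit_logistic`, `rfe`, `make_toy_data`), and the synthetic data are all assumptions made for the example. The toy data have 8 standardized features, of which only the first two differ between the two classes.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, cols, epochs=200, lr=0.5, l2=0.01):
    """Fit L2-regularized logistic regression on the given columns
    by batch gradient descent. Returns (weights by column, intercept)."""
    w = {c: 0.0 for c in cols}
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = {c: 0.0 for c in cols}
        grad_b = 0.0
        for row, label in zip(X, y):
            err = sigmoid(b + sum(w[c] * row[c] for c in cols)) - label
            grad_b += err
            for c in cols:
                grad_w[c] += err * row[c]
        b -= lr * grad_b / n
        for c in cols:
            w[c] -= lr * (grad_w[c] / n + l2 * w[c])
    return w, b


def rfe(X, y, n_keep):
    """Recursive feature elimination: refit, drop the feature with the
    smallest absolute coefficient, repeat until n_keep features remain.
    Assumes features are on a comparable scale (e.g. standardized)."""
    cols = list(range(len(X[0])))
    while len(cols) > n_keep:
        w, _ = fit_logistic(X, y, cols)
        weakest = min(cols, key=lambda c: abs(w[c]))
        cols.remove(weakest)
    return cols


def make_toy_data(n=120, n_features=8, seed=0):
    """Synthetic two-class data (hypothetical, not study data):
    features 0 and 1 are shifted by class, the rest are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        shift = 1.0 if label else -1.0
        X.append([rng.gauss(shift if j < 2 else 0.0, 1.0)
                  for j in range(n_features)])
        y.append(label)
    return X, y


X, y = make_toy_data()
selected = rfe(X, y, n_keep=2)
print(sorted(selected))  # indices of the retained features
```

In a real analysis the number of retained features would itself be chosen by cross-validation, and the elimination loop would be run inside each training fold so that the held-out samples play no part in selecting the signature.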
Results
Differential abundance analysis between UC and CD revealed 53 significantly different proteins. ML identified a 14-protein signature comprising chemokines/cytokines (CCL1, IFNA1;IFNA13, IL12p70, IL34, TNFSF11/RANKL), receptors/ligands (CD40LG, ICOSLG, IL1R2, IL17RA), structural/tissue-remodeling proteins (CD93, GFAP, SPP1), and growth factors/immune modulators (GDF2, GZMA). The model achieved 96.2% overall accuracy in first-visit samples and 86.4% overall accuracy in follow-up testing. SHAP revealed patient-specific drivers and suggested biological alignment of IBD-U cases toward CD-like or UC-like profiles.
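To make the idea of patient-specific protein contributions concrete, the sketch below computes SHAP values for a linear model by hand. For a logistic regression, under the assumption that features are treated as independent, the SHAP value of feature j on the log-odds scale is the coefficient times the sample's deviation from the background mean, w_j (x_j - mean_j), and the contributions sum to the difference between the sample's log-odds and the base value. The coefficients, protein names, background values, and class coding (positive log-odds = CD-like, negative = UC-like) are hypothetical placeholders, not results from the study, and the abstract does not state how IBD-U alignment was actually scored.

```python
def linear_shap(weights, intercept, background, sample):
    """Per-feature SHAP values for a linear model on the log-odds scale,
    assuming independent features.

    weights    : dict feature -> coefficient (hypothetical values)
    background : list of dicts used to estimate each feature's mean
    sample     : dict feature -> value for the patient being explained
    Returns (base_value, contributions by feature).
    """
    means = {p: sum(row[p] for row in background) / len(background)
             for p in weights}
    # Base value: model output at the background mean of every feature.
    base_value = intercept + sum(weights[p] * means[p] for p in weights)
    # Each feature's contribution is its coefficient times its deviation.
    contributions = {p: weights[p] * (sample[p] - means[p]) for p in weights}
    return base_value, contributions


# Hypothetical model and data, for illustration only.
weights = {"protein_A": 1.2, "protein_B": -0.8, "protein_C": 0.3}
intercept = -0.1
background = [
    {"protein_A": 0.0, "protein_B": 1.0, "protein_C": 2.0},
    {"protein_A": 2.0, "protein_B": 3.0, "protein_C": 0.0},
]
unclassified_sample = {"protein_A": 2.0, "protein_B": 1.0, "protein_C": 1.0}

base_value, contributions = linear_shap(
    weights, intercept, background, unclassified_sample
)
log_odds = base_value + sum(contributions.values())

# Assumed coding: positive log-odds leans CD-like, negative leans UC-like.
leaning = "CD-like" if log_odds > 0 else "UC-like"

# Rank proteins by how strongly they push this patient's prediction.
for protein, value in sorted(contributions.items(),
                             key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{protein}: {value:+.2f}")
print(f"log-odds = {log_odds:.2f} ({leaning})")
```

Because the contributions are additive, the same decomposition can be read per patient: two patients with the same predicted class may be driven there by different proteins.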
Conclusions
This first-in-field integration of salivary proteomics with interpretable AI/ML demonstrates that accurate, noninvasive classification of pediatric IBD is possible using minimal biomarker sets. The approach establishes a scalable framework for future longitudinal monitoring and supports earlier and more precise therapeutic interventions.