AI-Driven Two-Component System Classifier for Pediatric MDR Pathogens

Abstract

The global rise of multidrug-resistant (MDR) bacterial infections in pediatric populations poses a serious challenge to effective clinical management and antimicrobial stewardship. Two-component systems (TCSs), comprising sensor kinases and response regulators, play a pivotal role in bacterial adaptation, virulence, and resistance mechanisms, making them promising targets for diagnostic and therapeutic innovation. This study employs a machine learning-driven bioinformatics pipeline to identify and prioritize potential TCS biomarkers across MDR pediatric pathogens for integration into next-generation diagnostic biosensors. Genomic datasets from clinically relevant MDR bacteria were curated and analyzed to extract TCS-associated gene and protein signatures. Multiple supervised learning models were trained on Pfam domain features, with XGBoost, Random Forest, and a Stacking Ensemble achieving high overall accuracies (0.9883-0.9885). The dominant Non-TCS class was predicted with near-perfect accuracy, but detection of minority subclasses varied because of severe class imbalance, particularly for rare groups such as CpxA-like and EnvZ-like proteins (n=2 each). Moderate F1-scores were obtained for generic response regulators and OmpR-like proteins. Feature importance analysis identified a small set of highly discriminative domains, including PF01339, PF00702, PF07679, PF03997, and PF04886, associated with conserved regulatory and signaling motifs. These results demonstrate that Pfam domain signatures offer biologically meaningful features for TCS classification, while highlighting the need for expanded datasets or embedding-based features to improve minority-class prediction. Overall, this work provides a scalable, AI-driven foundation for TCS biomarker discovery in support of diagnostic biosensors for MDR pediatric pathogens.
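The gap the abstract reports between overall accuracy and minority-class detection can be illustrated with a minimal sketch. The labels and counts below are a hypothetical toy example, not the study's data, and the helper names `accuracy` and `per_class_f1` are assumptions introduced only for illustration. The sketch shows how a classifier can score very high accuracy while a rare class receives an F1 of zero.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_f1(y_true, y_pred):
    """One-vs-rest F1 for every label present in y_true."""
    scores = {}
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class is never hit.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Hypothetical toy labels (NOT the study's data): one dominant class,
# one small class, and one very rare class with only two members.
y_true = ["Non-TCS"] * 990 + ["OmpR-like"] * 8 + ["EnvZ-like"] * 2
# The toy model recovers 5 of 8 OmpR-like proteins and misses both
# EnvZ-like proteins, labelling every miss as Non-TCS.
y_pred = ["Non-TCS"] * 990 + ["OmpR-like"] * 5 + ["Non-TCS"] * 3 + ["Non-TCS"] * 2

acc = accuracy(y_true, y_pred)
f1 = per_class_f1(y_true, y_pred)
macro_f1 = sum(f1.values()) / len(f1)

print(f"accuracy = {acc:.4f}")
for label, score in f1.items():
    print(f"F1[{label}] = {score:.4f}")
print(f"macro F1 = {macro_f1:.4f}")
```

In this toy setting the accuracy is dominated by the majority class, while the macro-averaged F1 weights every class equally and so exposes the missed rare class. That is one reason per-class or macro-averaged metrics are often reported alongside accuracy for imbalanced problems.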
