Comprehensive interaction profiling and machine learning prediction of bacteriophage infectivity across clinically diverse Pseudomonas aeruginosa
Abstract
The rise of antibiotic-resistant bacterial infections has driven renewed interest in bacteriophage therapy, in which viruses that specifically kill bacteria are used as targeted antimicrobials. Pseudomonas aeruginosa, a WHO critical-priority pathogen that causes severe infections in hospitalized and immunocompromised patients, presents a major challenge for phage therapy because of its extraordinary genetic diversity. Phages effective against one bacterial strain often fail against others, and existing cross-resistance-profiling approaches require iterative empirical testing of each new patient isolate. To establish a genome-based framework for rapid phage-isolate matching, we assembled a collection of 95 genomically diverse P. aeruginosa phages representing 20 genera and tested each against 99 genetically diverse clinical isolates, generating 9,405 infection outcome measurements. Bacterial O-antigen serotype emerged as the dominant determinant of strain susceptibility, while defense systems, anti-defense systems, and prophage burden contributed smaller strain-specific effects. The full curated multivariate model explained 47% of strain-susceptibility variance. Machine-learning (ML) models integrating these features and pangenome-derived gene clusters reached a per-strain AUROC of 0.86. In an in vivo proof-of-concept test against a single held-out strain, the ML-designed cocktail produced a ∼12-fold greater median CFU reduction than the expert-designed cocktail (CG; q = 0.045), with both cocktails substantially reducing burden relative to the untreated control (∼113-fold for ML, ∼9-fold for CG; both q < 10⁻³). SHAP analysis of the model identified bacterial surface-architecture genes (LPS biosynthesis, outer membrane proteins, type IV pili) as the dominant predictors, with defense-system content modulating which specific phages succeed against a strain rather than uniformly damping susceptibility.
Together, these results establish a genome-based framework for predicting phage susceptibility in genetically diverse clinical isolates.
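For readers unfamiliar with the per-strain AUROC reported above, the following is a minimal sketch of one way such a metric could be computed. It is not the authors' pipeline. It assumes that AUROC is calculated separately for each bacterial strain across all phages tested on it and then averaged over strains, which the abstract does not specify. The function names, record layout, and toy values are hypothetical.

```python
from collections import defaultdict


def auroc(labels, scores):
    """AUROC via the Mann-Whitney U statistic; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined when a strain has only one outcome class
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def per_strain_auroc(records):
    """records: iterable of (strain_id, infected 0/1, predicted_score).

    Assumption: one AUROC per strain, then an unweighted mean over the
    strains for which AUROC is defined.
    """
    by_strain = defaultdict(lambda: ([], []))
    for strain, label, score in records:
        by_strain[strain][0].append(label)
        by_strain[strain][1].append(score)
    per_strain = {s: auroc(y, p) for s, (y, p) in by_strain.items()}
    defined = [v for v in per_strain.values() if v is not None]
    mean = sum(defined) / len(defined) if defined else None
    return per_strain, mean


# Hypothetical toy data: each tuple is one phage-strain test.
records = [
    ("strainA", 1, 0.9), ("strainA", 0, 0.2),
    ("strainA", 1, 0.4), ("strainA", 0, 0.5),
    ("strainB", 1, 0.8), ("strainB", 0, 0.1),
    ("strainC", 0, 0.3), ("strainC", 0, 0.6),  # no infections observed
]
per_strain, mean_auroc = per_strain_auroc(records)
```

Grouping by strain before scoring means the metric reflects how well phages are ranked for a given isolate, which is the question that matters when choosing a cocktail for one patient. A pooled AUROC over all 9,405 pairs could instead be dominated by differences between broadly susceptible and broadly resistant strains.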