Genomic and Machine Learning Approaches for Predicting Gut-Microbe Infections in Preterm Infants: A Systematic Review of African Studies


Abstract

Preterm infants in sub-Saharan Africa carry a disproportionately high burden of gut-microbe-associated morbidities, including necrotizing enterocolitis (NEC), late-onset sepsis (LOS), and antimicrobial-resistant (AMR) bloodstream infections. Multi-omics approaches such as shotgun metagenomics, 16S rRNA profiling, metabolomics, proteomics, and transcriptomics, together with machine-learning (ML) prediction, have transformed neonatal infection research worldwide. In African neonatal cohorts, however, they remain sparsely used, fragmented, and methodologically heterogeneous, although initial findings indicate that they can improve risk prediction. This systematic review aimed to map the genomic, multi-omics, and computational strategies currently used to study gut-microbe-associated infections in African preterm infants; to evaluate their methodological quality and predictive performance; and to compare African data with data from the rest of the world in order to identify the most promising routes to clinical translation. PubMed, Web of Science, AJOL, Google Scholar, bioRxiv, and medRxiv were searched up to November 2025, complemented by searches of high-impact journals (PLOS ONE, BMC Series, Nature journals, Clinical Infectious Diseases/Lancet Microbe). Reference lists and grey literature were also screened. Eligible articles studied African preterm (<37 weeks gestation) cohorts and used high-throughput genomic or multi-omics analyses (shotgun metagenomics, 16S sequencing, metatranscriptomics, proteomics, metabolomics, or microbial GWAS) and/or ML models predicting NEC, sepsis, colonization, AMR, or mortality. Screening was done in two stages; 871 records were assessed, and 17 studies were eligible. Extracted data covered clinical, laboratory, computational, and performance variables. Risk of bias was assessed with the Newcastle-Ottawa Scale, JBI tools, and QUADAS-2 (for predictive models).
Because of heterogeneity, the findings were synthesized narratively. The 17 included studies were conducted in Tanzania, Zimbabwe, The Gambia, Burkina Faso, Ethiopia, Kenya, South Africa, and multi-country sub-Saharan African cohorts. The omics tools used were shotgun metagenomics (n=8), 16S rRNA sequencing (n=3), pathogen whole-genome sequencing (n=2), transcriptomics/proteomics (n=3), and integrative multi-omics (n=3). Three studies used ML models such as XGBoost, LightGBM, KNN, and gradient boosting, with AUROC values of 0.71-0.82 [1,5,6]. Distinctive biomarkers included resistome signatures preceding NEC, maternal gut microbiome predictors of birthweight and growth, and cord-blood transcriptomic markers predictive of mortality (AUC > 0.90). Major gaps included small sample sizes, under-representation of countries, lack of standardized laboratory and computational pipelines, and lack of external validation. Even this small body of African neonatal genomic research indicates that early identification of infection through resistome profiles, maternal microbiome characteristics, and host-response biomarkers is both feasible and highly promising. Clinically deployable ML tools combined with shotgun metagenomics may improve diagnostics in resource-limited neonatal units. Achieving this urgently requires large, longitudinal, multi-country African cohorts with harmonized multi-omics workflows and better data-sharing infrastructure.
