Finding the known unknowns: minimal machine learning models of resistance identify novel antibiotic resistance discovery opportunities in Klebsiella pneumoniae

Abstract

Bacterial antimicrobial resistance (AMR) poses a significant public health threat. The advent of global awareness and affordable whole-genome sequencing has yielded an ever-growing collection of bacterial genome sequence datasets and corresponding antibiotic resistance metadata. This enables the use of computational techniques, including machine learning (ML), to predict phenotypes and discover novel AMR-associated variants. With the great variety of resistance mechanisms to interrogate and the number of datasets that can be mined, there is a need to identify where novel AMR marker discovery is most necessary. Multiple databases and annotation pipelines exist to identify AMR variants known to be associated with resistance to specific antibiotics or antibiotic classes; however, the completeness of these databases varies, and for some antibiotics even the most complete databases remain insufficient for accurate classification. Here, we couple these pipelines with predictive ML models, which we call “minimal models” of resistance. We predict the binary resistance phenotypes of 20 major antimicrobials in the genomically diverse pathogen Klebsiella pneumoniae. We present a detailed comparison of the annotation pipelines and drug resistance databases currently available, and we identify their shortcomings in phenotype prediction, highlighting opportunities for novel marker discovery. We further provide a description of a Bacterial and Viral Bioinformatics Resource Center (BV-BRC) database, highlighting the observed AMR mechanism as the key for phenotype prediction in this dataset. This analysis has relevance for all those seeking to use or improve drug resistance databases. It provides a critical review of the differences in annotation tools and databases commonly used in bacterial AMR studies, identifying existing gaps and novel AMR marker discovery niches. It outlines guidance for the establishment of a real standard dataset for the development and benchmarking of ML models of AMR.

Article activity feed

  1. Georgios Feretzakis

    Review 3: "Finding the Known Unknowns: Minimal Machine Learning Models of Resistance Identify Novel Antibiotic Resistance Discovery Opportunities in Klebsiella Pneumoniae"

    Peer reviewers commend the study for its robust methodology, novel comparative analysis of AMR databases, and its relevance to improving genome-based resistance prediction.

  2. Samuel Shelburne

    Review 2: "Finding the Known Unknowns: Minimal Machine Learning Models of Resistance Identify Novel Antibiotic Resistance Discovery Opportunities in Klebsiella Pneumoniae"

    Peer reviewers commend the study for its robust methodology, novel comparative analysis of AMR databases, and its relevance to improving genome-based resistance prediction.

  3. Lara Urban

    Review 1: "Finding the Known Unknowns: Minimal Machine Learning Models of Resistance Identify Novel Antibiotic Resistance Discovery Opportunities in Klebsiella Pneumoniae"

    Peer reviewers commend the study for its robust methodology, novel comparative analysis of AMR databases, and its relevance to improving genome-based resistance prediction.

  4. Strength of evidence

    Reviewer(s): L Urban (University of Zurich) | 📗📗📗📗◻️
    S Shelburne (MD Anderson Cancer Center) | 📘📘📘📘📘
    G Feretzakis (Hellenic Open University) | 📗📗📗📗◻️