Interpretable and accurate prediction models for metagenomics data

Abstract

Background

Microbiome biomarker discovery for patient diagnosis, prognosis, and risk evaluation is attracting broad interest. Selected groups of microbial features provide signatures that characterize host disease states such as cancer or cardio-metabolic diseases. Yet current predictive models stemming from machine learning still behave as black boxes and seldom generalize well. Their interpretation is challenging for physicians and biologists, which makes them difficult to trust and use routinely in the physician–patient decision-making process. Novel methods that provide interpretability and biological insight are needed. Here, we introduce “predomics”, an original machine learning approach inspired by microbial ecosystem interactions that is tailored for metagenomics data. It discovers accurate predictive signatures and provides unprecedented interpretability. The decision provided by the predictive model is based on a simple yet powerful score computed by adding, subtracting, or dividing the cumulative abundances of microbiome measurements.
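To make the idea of such a score concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the predomics implementation: the function names, the mode labels ("add", "subtract", "divide"), the small constant guarding against division by zero, and the decision threshold are all hypothetical and do not come from the abstract.

```python
def cumulative_abundance(abundances, feature_indices):
    """Sum the abundances of the selected microbial features."""
    return sum(abundances[i] for i in feature_indices)


def signature_score(abundances, group_a, group_b=(), mode="subtract"):
    """Hypothetical interpretable score built from cumulative abundances.

    mode="add":      sum of the features in group_a
    mode="subtract": sum(group_a) - sum(group_b)
    mode="divide":   sum(group_a) / sum(group_b)
    """
    sum_a = cumulative_abundance(abundances, group_a)
    sum_b = cumulative_abundance(abundances, group_b)
    if mode == "add":
        return sum_a
    if mode == "subtract":
        return sum_a - sum_b
    if mode == "divide":
        # Assumption: a tiny constant avoids division by zero when
        # none of the group_b features is detected in the sample.
        return sum_a / (sum_b + 1e-12)
    raise ValueError(f"unknown mode: {mode}")


def predict(score, threshold):
    """Assumed decision rule: class 1 if the score exceeds the threshold."""
    return 1 if score > threshold else 0


# Illustrative relative abundances for four features in one sample.
sample = [0.10, 0.05, 0.20, 0.15]
score = signature_score(sample, group_a=[0, 2], group_b=[1, 3], mode="subtract")
label = predict(score, threshold=0.0)
```

Because the score is a plain sum, difference, or ratio, a reader can see which features push the decision in which direction, which is the kind of interpretability the abstract describes.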

Results

Tested on >100 datasets, predomics models prove simple and highly interpretable. Despite this simplicity, they are at least as accurate as state-of-the-art methods. The family of best models discovered during the learning process makes it possible to distil biological information and to decipher the predictability signatures of the studied condition. In a proof-of-concept experiment, we successfully predicted body corpulence and metabolic improvement after bariatric surgery using pre-surgery microbiome data.

Conclusions

Predomics is a new algorithm that helps provide reliable and trustworthy diagnostic decisions in the microbiome field. It is in accord with societal and legal requirements that call for an explainable artificial intelligence approach in the medical field.

Article activity feed

  1. Now published in GigaScience doi: 10.1093/gigascience/giaa010

    Authors: Edi Prifti (1, 2), Yann Chevaleyre (3), Blaise Hanczar (4), Eugeni Belda (1), Antoine Danchin (5), Karine Clément (6, 7), Jean-Daniel Zucker (1, 2)

    Affiliations:
    1. Institute of Cardiometabolism and Nutrition, Integromics, ICAN, Paris, France
    2. Sorbonne University, IRD, UMMISCO, UMI 209, Paris, France
    3. Paris-Dauphine University, PSL Research University, CNRS, UMR 7243, LAMSADE, France
    4. IBISC, University Paris-Saclay, University Evry, Evry, France
    5. Institute of Cardiometabolism and Nutrition, ICAN, Paris, France
    6. Sorbonne University, INSERM, Nutriomics team, Paris, France
    7. Assistance Publique-Hôpitaux de Paris, Nutrition department, CRNH Ile de France, Paris, France

    A version of this preprint has been published in the Open Access journal GigaScience (see https://doi.org/10.1093/gigascience/giaa010), where the paper and peer reviews are published openly under a CC-BY 4.0 license.

    These peer reviews were as follows:

    Reviewer 1: http://dx.doi.org/10.5524/REVIEW.102131

    Reviewer 2: http://dx.doi.org/10.5524/REVIEW.102132