Confounder-free Predictive Models for Microbiome-based Host Phenotype Prediction
Abstract
As in many fields, confounding effects (or biases) present a significant challenge in microbiome research, including the use of microbiome data to predict host phenotypes. If not properly addressed, confounders can lead to spurious associations, biased predictions, and misleading interpretations. One notable example is the medication metformin, which is commonly prescribed to treat type 2 diabetes (T2D) and is known to influence the gut microbiome. In this study, we propose confounder-free predictive models for human phenotype prediction using microbiome data. These models use an end-to-end approach within an adversarial min-max optimization framework to derive features that are invariant to confounding factors, while accounting for the intrinsic correlations between confounders and prediction outcomes. We implemented two versions of confounder-free predictors using different network architectures: one based on a fully connected network (referred to as FNN CF) and another incorporating prior biological knowledge (referred to as MicroKPNN CF). We evaluated our models on microbiome datasets associated with T2D, where metformin acts as a confounder. Our results demonstrate that confounder-free predictors achieve higher accuracy than models that do not account for confounders, and that they more effectively identify microbial markers associated with the phenotype rather than markers influenced by metformin. Of the two confounder-free models, the prior-knowledge-guided approach showed slightly lower prediction accuracy than the fully connected model but offered greater interpretability, providing additional insights into the underlying biological mechanisms.
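
The adversarial min-max idea described above can be written in one generic form as follows. This is a sketch under stated assumptions, not the authors' exact objective. All symbols are introduced here for illustration: $E$ is a feature encoder, $P$ a phenotype predictor, $C$ a confounder predictor, $x_i$ a microbiome profile, $y_i$ the phenotype label (for example, T2D status), $c_i$ the confounder (for example, metformin use), and $\lambda > 0$ a trade-off weight. The subset $S$ of samples on which the confounder term is evaluated is likewise an assumption, included as one possible way to respect the intrinsic correlation between confounder and outcome; taking $S$ to be all samples gives the unconditioned version.

```latex
% Generic adversarial objective (illustrative; symbols are assumptions, not taken from the paper)
\[
\min_{\theta_E,\,\theta_P}\;\max_{\theta_C}\;
\frac{1}{N}\sum_{i=1}^{N}
  \mathcal{L}_{\mathrm{pred}}\bigl(P(E(x_i)),\,y_i\bigr)
\;-\;
\lambda\,\frac{1}{|S|}\sum_{i\in S}
  \mathcal{L}_{\mathrm{conf}}\bigl(C(E(x_i)),\,c_i\bigr)
\]
```

The inner maximization over $\theta_C$ is equivalent to minimizing $\mathcal{L}_{\mathrm{conf}}$, so $C$ learns to recover the confounder from the features as well as it can. The outer minimization over $\theta_E$ and $\theta_P$ lowers the prediction loss while raising the confounder loss, which pushes $E$ toward features that predict the phenotype but carry little information about the confounder.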