ABaCo: Addressing Heterogeneity Challenges in Metagenomic Data Integration with Adversarial Generative Models

Read the full article See related articles

Discuss this preprint

Start a discussion What are Sciety discussions?

Listed in

This article is not in any list yet, why not save it to one of your lists.
Log in to save this article

Abstract

The rapid advancement of high-throughput metagenomics has produced extensive and heterogeneous datasets with significant implications for environmental and human health. Integrating these datasets is crucial for understanding the functional roles of microbiomes and the interactions within microbial communities. However, this integration remains challenging due to technical heterogeneity and the inherent complexity of these biological systems. To address these challenges, we introduce ABaCo, a generative model that combines a Variational Autoencoder (VAE) with an adversarial discriminator specifically designed to handle the unique characteristics of metagenomic data. Our results demonstrate that ABaCo effectively integrates metagenomic data from multiple studies, corrects technical heterogeneity, outperforms existing methods, and preserves taxonomic-level biological signals. We have developed ABaCo as an open-source, fully documented Python library to facilitate, support and enhance metagenomics research in the scientific community.

Article activity feed