Correlation measures in metagenomic data: the blessing of dimensionality

Alessandro Fuschi
Alessandra Merlotti
Thi Dong Binh Tran
Hoan Nguyen
George M. Weinstock
Daniel Remondini

Abstract

Microbiome analysis has revolutionized our understanding of various biological processes, spanning human health, epidemiology (including antimicrobial resistance and horizontal gene transfer), and environmental and agricultural studies. At the heart of microbiome analysis lies the characterization of microbial communities through the quantification of microbial taxa and their dynamics. In the study of bacterial abundances, it is increasingly relevant to consider the relationships among taxa and to embed these data in the framework of network theory, which allows characterization of features such as node relevance, pathways, and community structure. In this study, we address the primary biases encountered when reconstructing networks through correlation measures, particularly those arising from the compositional nature of the data, within-sample diversity, and the presence of a high number of unobserved species. These factors can lead to inaccurate correlation estimates. To tackle these challenges, we use simulated data to demonstrate how many of these issues can be mitigated by applying typical transformations designed for compositional data. These transformations enable straightforward measures such as Pearson’s correlation to correctly identify positive and negative relationships among relative abundances, especially in high-dimensional data, without any need for further corrections. However, some challenges persist, such as data sparsity: neglecting it can result in an underestimation of negative correlations.
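
The idea of transforming compositional data before computing Pearson's correlation can be illustrated with a minimal sketch. The abstract does not name which transformations the authors use; the centered log-ratio (CLR) below is an assumption, chosen as one commonly used transformation for compositional data. The simulation setup (log-normal abundances, 200 taxa, one positively and one negatively correlated pair) and the names `clr` and `simulate` are likewise hypothetical and are not taken from the article. The sketch assumes strictly positive abundances and does not handle zeros, so it says nothing about the sparsity problem.

```python
import numpy as np


def clr(abundances):
    """Centered log-ratio transform, applied row-wise (one row per sample).

    Assumes every entry is strictly positive; zeros are not handled here.
    """
    log_x = np.log(np.asarray(abundances, dtype=float))
    # Subtracting the row mean of the logs is the same as dividing each
    # entry by the geometric mean of its sample before taking the log.
    return log_x - log_x.mean(axis=1, keepdims=True)


def simulate(n_samples=500, n_taxa=200, rho=0.8, seed=0):
    """Hypothetical simulation: log-normal absolute abundances closed to 1.

    Taxa 0 and 1 are positively correlated on the log scale,
    taxa 2 and 3 are negatively correlated, all others are independent.
    """
    rng = np.random.default_rng(seed)
    cov = np.eye(n_taxa)
    cov[0, 1] = cov[1, 0] = rho
    cov[2, 3] = cov[3, 2] = -rho
    log_abund = rng.multivariate_normal(np.zeros(n_taxa), cov, size=n_samples)
    absolute = np.exp(log_abund)
    # Closure: only relative abundances are observed in each sample.
    return absolute / absolute.sum(axis=1, keepdims=True)


relative = simulate()

# Pearson correlation between taxa (columns) after the CLR transform.
corr = np.corrcoef(clr(relative), rowvar=False)

print("taxa 0 vs 1:", round(corr[0, 1], 2))
print("taxa 2 vs 3:", round(corr[2, 3], 2))
```

In this toy setting the number of taxa is large, so the row-wise centering term perturbs each taxon only slightly and the signs of the planted correlations are recovered. With few taxa the same centering term would distort the estimates more strongly.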

Version published to 10.1101/2024.02.29.582875 on bioRxiv, Mar 4, 2024

