Ratios in Disguise, Truths Arise: Glycomics Meets Compositional Data Analysis

Abstract

Comparative glycomics data are compositional data residing on the Aitchison simplex: measured glycans are parts of a whole, expressed as relative abundances, which are then compared between conditions. Applying traditional statistical analyses to this type of data often yields misleading conclusions, such as spurious “decreases” of glycans between conditions when other structures sharply increase in abundance, or routine false-positive rates of >25% for differential abundance. We introduce a compositional data analysis framework, specifically tailored to comparative glycomics, that accounts for these data dependencies. We employ centered log-ratio (CLR) and additive log-ratio (ALR) transformations, augmented with a model incorporating scale uncertainty/information, yielding the most robust and sensitive glycomics data analysis pipeline. Applying this model to many publicly available comparative glycomics datasets, we show that it controls false-positive rates and yields new biological findings. Additionally, we present new modalities for analyzing comparative glycomics data within this framework. Alpha- and beta-diversity enable exploration of glycan distributions within and between biological samples, while cross-class glycan correlations shed light on complex and previously undetected interdependencies. These new approaches reveal deeper insights into glycome variations that are critical to understanding the roles of glycans in health and disease.
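To make the compositional effect concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' pipeline. The function names (`closure`, `clr`, `alr`) and the abundance values are hypothetical and chosen for illustration only, and all values are assumed to be strictly positive (zero handling is not covered). The sketch shows how a sharp increase in one glycan produces apparent "decreases" in the relative abundances of the others, and how the standard CLR and ALR transformations are computed.

```python
import numpy as np

def closure(x):
    """Rescale abundances so that each sample sums to 1 (relative abundances)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centered log-ratio: log of each part minus the mean log (log geometric mean)."""
    logx = np.log(closure(x))
    return logx - logx.mean(axis=-1, keepdims=True)

def alr(x, ref=-1):
    """Additive log-ratio: log of each remaining part relative to a reference part."""
    logx = np.log(closure(x))
    ref = ref % logx.shape[-1]                    # normalize negative indices
    ref_col = np.take(logx, [ref], axis=-1)       # keep the last axis for broadcasting
    rest = np.delete(logx, ref, axis=-1)          # drop the reference part
    return rest - ref_col

# Hypothetical absolute abundances of four glycans (illustrative values only).
# Only the fourth glycan changes between conditions.
control = np.array([10.0, 20.0, 30.0, 40.0])
treated = np.array([10.0, 20.0, 30.0, 160.0])

# Relative abundances of the three unchanged glycans drop in the treated sample.
print("relative, control:", closure(control))
print("relative, treated:", closure(treated))

# CLR values sum to zero within a sample.
print("CLR, treated:", clr(treated))

# With an unchanged glycan as ALR reference, only the fourth glycan shifts.
print("ALR difference (ref = glycan 0):", alr(treated, ref=0) - alr(control, ref=0))
```

In this toy case, ALR against an unchanged reference glycan recovers the fact that only one glycan changed. In real data an unchanged reference is generally not known in advance, which is where the scale uncertainty model described in the abstract comes in; that model is not sketched here.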