Model-based Standardization of Correlation Coefficients Improves Multi-Omic Clustering and Biological Signal Discovery

Max Robinson
Heeju Noh
Lance Pflieger
Noa Rappoport

Abstract

Multi-omic data pose a particular challenge for Weighted Correlation Network Analysis (WCNA or WGCNA) because platforms or batches differ in characteristics such as resolution, accuracy, dynamic range, and sources of spurious variation. When unaccounted for, these differences can bias clustering toward single-batch clusters and make it more sensitive to "noisier" batches. Here we propose mitigating these effects using null models fitted separately to the bulk of analyte-analyte correlations within each batch and across each pair of batches. We then map the batch-specific null models to a standard null model, removing batch-dependent distributional differences. This approach is compatible with any correlation-based clustering approach. Since the null model represents information not captured in individual pairwise correlations, we show how to incorporate this additional information into both distance-based clustering and WCNA. For distance-based clustering, we increase distances corresponding to correlations consistent with the null model. For WCNA, we provide a new soft threshold (adjacency) function based on the likelihood of a correlation under the null model. The resulting network can be easily incorporated into the WCNA workflow. These methods are implemented in the R package standardcor, and we illustrate the package on simulated data as well as an existing multi-omic dataset.
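
To make the standardization idea concrete, here is a minimal Python sketch. It is not the standardcor API. The abstract does not state which null model family the authors fit, so the sketch assumes a Gaussian null on Fisher z-transformed correlations, estimated robustly from the bulk of values with the median and the median absolute deviation. The function names (fit_null, standardize, null_adjacency) and the target_scale parameter are hypothetical, chosen only for illustration. In a multi-omic setting, each within-batch block and each cross-batch block of the correlation matrix would get its own fit before being mapped to the shared null.

```python
import math
import random
import statistics


def _clip(r, eps=1e-12):
    # Keep r strictly inside (-1, 1) so that atanh is defined.
    return max(-1.0 + eps, min(1.0 - eps, r))


def fit_null(correlations):
    """Fit an assumed Gaussian null to Fisher-z values of one block.

    Median and MAD are used so that the fit follows the bulk of the
    correlations and is not driven by a few strong ones.
    """
    z = [math.atanh(_clip(r)) for r in correlations]
    center = statistics.median(z)
    mad = statistics.median([abs(v - center) for v in z])
    return center, 1.4826 * mad  # 1.4826 rescales the MAD to a Gaussian sd


def standardize(correlations, null, target_scale):
    """Map correlations from a block-specific null to a shared standard null.

    The standard null is assumed to be Gaussian in z-space with mean 0
    and standard deviation target_scale.
    """
    center, scale = null
    if scale == 0:
        raise ValueError("null scale is zero; cannot standardize")
    return [
        math.tanh((math.atanh(_clip(r)) - center) / scale * target_scale)
        for r in correlations
    ]


def null_adjacency(r, target_scale):
    """Soft threshold: probability that a null correlation is less extreme.

    Values near 0 are consistent with the null; values near 1 are not.
    This is one possible likelihood-based choice, not the paper's function.
    """
    z = abs(math.atanh(_clip(r))) / target_scale
    return math.erf(z / math.sqrt(2.0))


# Two simulated blocks whose null correlations differ in spread and offset.
rng = random.Random(0)
block_a = [math.tanh(rng.gauss(0.00, 0.10)) for _ in range(500)]
block_b = [math.tanh(rng.gauss(0.05, 0.30)) for _ in range(500)]

std_a = standardize(block_a, fit_null(block_a), target_scale=0.2)
std_b = standardize(block_b, fit_null(block_b), target_scale=0.2)
adjacency_a = [null_adjacency(r, target_scale=0.2) for r in std_a]
```

After the mapping, both blocks share the same fitted center and scale, so a threshold or distance applied to the standardized correlations treats the two batches alike. For distance-based clustering, the same null fit could be used to inflate distances for correlations that fall inside the bulk of the null.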

Version published to 10.1101/2025.11.17.688875 on bioRxiv
Nov 17, 2025

