Three Classes of Confound in Gene-Regulatory-Network Inference: A Systematic Audit and Open-Source Diagnostic Toolkit

Ihor Kendiukhov

Abstract

Background: Gene regulatory networks (GRNs) inferred from single-cell RNA-seq are used to prioritize transcription-factor–target hypotheses, yet edge rankings can be inflated by confounds that are rarely audited systematically. Individual confound classes (technical batch effects, genomic-proximity co-expression, and degree-distribution artifacts) have been studied in isolation, but no prior work has conducted a unified audit across all three on the same datasets and inference methods.

Results: We present grn_confound_audit, an open-source Python package that implements a unified three-class confound audit covering technical bias (batch, donor, and assay-method leakage), genomic-structural bias (chromosomal proximity inflation), and topological bias (degree-distribution artifacts). Across three Tabula Sapiens tissues and 12 inference methods, the tool reveals that: (i) donor and batch identity are recoverable from edge features at AUC 0.85–0.97; (ii) prior-heavy methods show 2–3× genomic-proximity enrichment, which attenuates to 1.15–1.28× under degree-preserving rewiring; and (iii) no individual edge reaches FDR ≤ 0.10 under topological null calibration, despite strong global separation (z-scores 12–60). The three confound classes are largely orthogonal, and joint filtering retains only about 28% of candidate edges. Perturbation validation using CRISPR data shows that technically blacklisted edges are perturbation-significant at a 2.7-fold lower rate.

Conclusions: The grn_confound_audit toolkit enables routine multi-class confound diagnostics for any scored GRN edge list, producing per-edge quality indices, standardised reports, and actionable recommendations. We propose that confound auditing become a standard component of GRN publications alongside accuracy benchmarks.
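To make the degree-preserving rewiring comparison in result (ii) concrete, the following is a minimal sketch of how genomic-proximity enrichment could be measured against a rewired null. It is not the grn_confound_audit API: the function names, the 1 Mb same-chromosome window, and the number of swaps per null network are all illustrative assumptions.

```python
import random


def rewire_degree_preserving(edges, n_swaps, rng):
    """Shuffle a directed edge list with double-edge swaps.

    Each accepted swap turns (a->b, c->d) into (a->d, c->b), so every
    node keeps its out-degree and in-degree. Hypothetical helper.
    """
    edges = list(edges)
    edge_set = set(edges)
    done, attempts = 0, 0
    max_attempts = 100 * max(n_swaps, 1)  # guard against graphs with no valid swap
    while done < n_swaps and attempts < max_attempts:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if a == d or c == b:
            continue  # would create a self-loop
        if (a, d) in edge_set or (c, b) in edge_set:
            continue  # would create a duplicate edge
        edge_set.discard((a, b))
        edge_set.discard((c, d))
        edge_set.add((a, d))
        edge_set.add((c, b))
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


def proximal_fraction(edges, chrom, pos, window=1_000_000):
    """Fraction of edges whose two genes lie on the same chromosome
    within `window` base pairs (the window size is an assumption)."""
    hits = sum(
        1
        for tf, target in edges
        if chrom[tf] == chrom[target] and abs(pos[tf] - pos[target]) <= window
    )
    return hits / len(edges)


def proximity_enrichment(edges, chrom, pos, n_null=200, seed=0, window=1_000_000):
    """Return (observed fraction, mean null fraction, fold enrichment)."""
    rng = random.Random(seed)
    observed = proximal_fraction(edges, chrom, pos, window)
    null = [
        proximal_fraction(
            rewire_degree_preserving(edges, 10 * len(edges), rng), chrom, pos, window
        )
        for _ in range(n_null)
    ]
    null_mean = sum(null) / n_null
    fold = observed / null_mean if null_mean > 0 else float("inf")
    return observed, null_mean, fold


if __name__ == "__main__":
    # Toy gene coordinates and edges, invented purely for illustration.
    chrom = {"TF1": "1", "TF2": "2", "g1": "1", "g2": "1", "g3": "2", "g4": "3"}
    pos = {"TF1": 100_000, "TF2": 500_000, "g1": 150_000,
           "g2": 900_000, "g3": 600_000, "g4": 50_000}
    edges = [("TF1", "g1"), ("TF1", "g2"), ("TF2", "g3"), ("TF2", "g4")]
    print(proximity_enrichment(edges, chrom, pos, n_null=50))
```

Comparing the observed proximal fraction against a degree-matched null, rather than a uniformly random edge set, separates proximity signal from the effect of hub genes; this is one plausible reading of why the reported enrichment shrinks under rewiring.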

Version published to 10.21203/rs.3.rs-8997629/v1 on Research Square
Mar 26, 2026
