The Limits of Cross-Species WGCNA: Library Imbalance and Signal Dilution Constrain Effector Gene Recovery in Dual-Organism RNA-seq

Amit Fenn
Ralph Hückelhoven
Nadia Kamal

Abstract

Dual-organism RNA sequencing (RNA-seq) experiments, in which the transcriptomes of a host and a microbe are sequenced simultaneously, are increasingly used to study plant–microbe interactions. A central analytical goal is to identify effector proteins and their host targets through gene co-expression. Weighted Gene Co-expression Network Analysis (WGCNA) is the dominant tool for gene co-expression analysis, yet its ability to recover interaction-interface genes from a merged dual-organism matrix has not been systematically characterised. Here we present a simulation framework using real gene models from Hordeum vulgare (barley) and Blumeria graminis f. sp. hordei M.Liu & Hambl. (powdery mildew) to evaluate single-network WGCNA across a gradient of plant-to-fungal library size ratios (1:1 to 20:1), three levels of co-expression signal strength, and three WGCNA network construction types (signed, unsigned, signed hybrid). We embed 20 model effector genes (bridge genes) driven by a mixed host–pathogen eigengene and evaluate their recovery using four metrics aligned with the biological objective: cross-species hub rank, top-decile hub enrichment, bridge gene detection rate, and bridge co-separation (the fraction of effector–target pairs co-assigned to the same detected module). Across 225 simulation runs (15 conditions × 5 replicates × 3 network types), bridge genes are robustly identifiable as cross-species connectivity hubs (mean rank 0.92 versus 0.50 for module genes), but co-assignment of effector–target pairs to the same module fails in 41% of runs because of scale-free topology collapse. Library ratio (η² = 0.22) and signal strength (η² = 0.12) are the primary determinants of co-separation, whereas the choice of network type accounts for less than 2% of the variance. A read-depth bias systematically inflates pathogen gene hub ranks relative to host genes at high library ratios.
These results establish that single-network WGCNA can identify effector candidates as cross-species hubs under a broad range of conditions, but that reliable co-assignment requires adequate pathogen read depth and a strong co-expression signal, both of which must come from experimental design rather than analytical parameterisation.
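
The abstract names cross-species hub rank as a metric but does not give its formula here. The following is a minimal Python sketch of one plausible reading, not the authors' implementation. It assumes an unsigned WGCNA-style adjacency |cor|^β with β = 6, defines a gene's cross-species connectivity as its summed adjacency to genes of the other organism, and ranks all genes together on a 0 to 1 scale. The toy data, the function names `cross_species_hub_rank` and `genes_from`, and all sizes other than the 20 bridge genes are hypothetical choices made for illustration.

```python
import numpy as np


def cross_species_hub_rank(expr, is_pathogen, beta=6):
    """Rank genes (0 = lowest, 1 = highest) by connectivity to the other organism.

    expr        : samples x genes expression matrix
    is_pathogen : boolean vector, True for pathogen genes, False for host genes
    beta        : soft-thresholding power (assumed value, not from the source)
    """
    corr = np.corrcoef(expr, rowvar=False)      # gene-by-gene Pearson correlation
    adj = np.abs(corr) ** beta                  # unsigned soft-threshold adjacency
    np.fill_diagonal(adj, 0.0)                  # ignore self-connections
    to_host = adj[:, ~is_pathogen].sum(axis=1)  # summed adjacency to host genes
    to_pathogen = adj[:, is_pathogen].sum(axis=1)
    # Each gene is scored against the organism it does NOT belong to.
    cross = np.where(is_pathogen, to_host, to_pathogen)
    order = cross.argsort().argsort()           # 0-based rank of each gene
    return order / (len(cross) - 1)


# Toy simulation (hypothetical sizes): one host module, one pathogen module,
# unstructured background genes, and 20 pathogen "bridge" genes driven by a
# mixed host-pathogen eigengene.
rng = np.random.default_rng(0)
n_samples, signal = 40, 0.9
host_eig = rng.standard_normal(n_samples)
path_eig = rng.standard_normal(n_samples)
mixed_eig = (host_eig + path_eig) / np.sqrt(2)


def genes_from(driver, n_genes):
    """Simulate genes as signal * driver plus independent Gaussian noise."""
    noise = rng.standard_normal((n_samples, n_genes))
    return signal * driver[:, None] + np.sqrt(1 - signal**2) * noise


expr = np.hstack([
    genes_from(host_eig, 40),                 # host module genes
    rng.standard_normal((n_samples, 50)),     # host background genes
    genes_from(path_eig, 40),                 # pathogen module genes
    rng.standard_normal((n_samples, 50)),     # pathogen background genes
    genes_from(mixed_eig, 20),                # pathogen bridge genes
])
is_pathogen = np.r_[np.zeros(90, dtype=bool), np.ones(110, dtype=bool)]
is_bridge = np.zeros(200, dtype=bool)
is_bridge[-20:] = True

ranks = cross_species_hub_rank(expr, is_pathogen)
print("mean rank, bridge genes:", ranks[is_bridge].mean())
print("mean rank, other genes: ", ranks[~is_bridge].mean())
```

In this sketch the bridge genes correlate with the host module through the mixed eigengene, so their cross-species connectivity should exceed that of genes driven by a single organism's eigengene. The sketch does not model library size ratios, module detection, or the read-depth bias discussed in the abstract.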

Version published to bioRxiv on May 5, 2026 (DOI: 10.64898/2026.04.30.721941)

Related articles

De novo protein discovery in non-model organisms
This article has 1 author:
1. Asif Ali
This article has no evaluations. Latest version May 13, 2026

EffectorGeneP: accurate gene annotation in pathogen genomes from infection transcriptomes
This article has 12 authors:
1. Jana Sperschneider
2. Camilla Langlands-Perry
3. Jian Chen
4. Jibril Lubega
5. Taj Arndell
6. David Lewis
7. Eva Henningsen
8. Cheryl Blundell
9. Thomas Vanhercke
10. Kostya Kanyuka
11. Melania Figueroa
12. Peter Dodds
This article has no evaluations. Latest version May 5, 2026

geneSync: Gene Symbol Harmonization for Large-scale RNA-seq Data Integration
This article has 2 authors:
1. Zhijun Feng
2. Ting Li
This article has no evaluations. Latest version May 7, 2026
