A compositional reanalysis of poly(A)-selected RNA-seq reveals tank effects but no survival-associated differences

Abstract

Poly(A)-selected RNA-seq datasets are routinely generated in aquaculture research, yet the microbial information contained in unmapped reads is seldom explored because nonhost transcripts are scarce and contamination is a concern. In this study, we repurposed Atlantic salmon gill RNA-seq data to assess whether meaningful microbial signals can be recovered using a contamination-aware, compositionally appropriate framework. Unmapped reads were classified with a custom Kraken2 database composed exclusively of complete, circularized salmon-associated bacterial genomes, together with all available Atlantic salmon assemblies and the human genome. Although microbial sequences represented only a small fraction of total reads, 21 genera were detected across samples. Genus-level profiles, Jaccard-based ordination, and ANCOM-BC analyses consistently revealed clear differences between tanks, whereas no associations were observed with sex or survival status. Three species exhibited significant tank-specific effects, indicating that environmental factors contributed the strongest detectable structure in the data. The limited microbial diversity recovered here reflects the expected constraints of poly(A)-enriched libraries, yet the results demonstrate that unmapped reads from host-derived RNA-seq can still provide informative environmental signatures when analyzed with curated reference databases and compositional statistical approaches. This strategy offers a practical means of extracting exploratory microbiome information from existing transcriptomic datasets.
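
As an illustration of the Jaccard-based comparison mentioned in the abstract, the following minimal sketch computes pairwise Jaccard distances from presence/absence genus profiles. It is not the authors' pipeline: the sample names, genus labels, and the helper `pairwise_distances` are hypothetical placeholders, and a real analysis would derive the profiles from Kraken2 output before passing the distances to an ordination method.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two sets of detected genera.

    Defined as 1 - |A ∩ B| / |A ∪ B|; two empty profiles are treated
    as identical (distance 0.0) to avoid division by zero.
    """
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(profiles):
    """Return {(sample_i, sample_j): distance} for every unordered pair."""
    return {
        (s, t): jaccard_distance(profiles[s], profiles[t])
        for s, t in combinations(sorted(profiles), 2)
    }


# Hypothetical presence/absence profiles (assumed data, not from the study).
samples = {
    "tank1_fish1": {"GenusA", "GenusB", "GenusC"},
    "tank1_fish2": {"GenusA", "GenusB"},
    "tank2_fish1": {"GenusC", "GenusD"},
}

distances = pairwise_distances(samples)
for pair, d in distances.items():
    print(pair, round(d, 3))
```

Because Jaccard distance uses only presence or absence, it ignores read counts, which may suit data in which microbial reads are sparse; the resulting distance matrix could then be supplied to an ordination such as PCoA.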
