Quantification of organelle genome contamination in public Silene latifolia RAD-seq datasets

Abstract

Objective: Restriction site-associated DNA sequencing (RAD-seq) is widely used for linkage mapping and for population and phylogenetic inference, yet off-target reads originating from chloroplast and mitochondrial genomes may bias downstream analyses. We quantified organelle-derived reads in publicly available Silene latifolia RAD-seq libraries and compared them with whole-genome sequencing (WGS) and cDNA (RNA-seq) datasets.

Results: We analyzed 42 libraries from public repositories (26 RAD-seq, 9 WGS, 7 cDNA). Organelle-derived reads were detected in every library. Most RAD-seq libraries contained few organelle reads (23/26 libraries <5%), but three RAD-seq runs showed extreme organelle carry-over (26.4-32.9%). Across RAD-seq libraries, organelle-mapped reads were predominantly mitochondrial rather than chloroplast. Organelle carry-over can therefore vary markedly among RAD-seq libraries.

Conclusions: Organelle-derived reads are detectable in all examined S. latifolia libraries and may be extreme in a subset of RAD-seq runs; routine quantification and removal of organelle-mapped reads before SNP calling may be a useful component of RAD-seq quality control to minimize potential downstream bias.
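As an illustration of the kind of per-library quantification described above, the following minimal Python sketch computes the chloroplast, mitochondrial, and combined organelle read fractions from per-contig mapped read counts. It is not the authors' pipeline. It assumes the reads were aligned to a reference that includes the organelle genomes and that counts are supplied in the four-column, tab-delimited layout produced by `samtools idxstats` (contig name, length, mapped, unmapped). The contig names `chrC` and `chrM`, the function name, and the choice of all reads (mapped plus unmapped) as the denominator are assumptions made for this example.

```python
import csv
import io

# Hypothetical contig names; replace with those used in your own reference.
CHLOROPLAST_CONTIGS = {"chrC"}
MITOCHONDRION_CONTIGS = {"chrM"}


def organelle_fractions(idxstats_text,
                        cp_names=CHLOROPLAST_CONTIGS,
                        mt_names=MITOCHONDRION_CONTIGS):
    """Return organelle read fractions from idxstats-style text.

    Each row is: contig name, contig length, mapped reads, unmapped reads.
    Assumption: the denominator is every read in the file (mapped + unmapped).
    """
    total = 0
    chloroplast = 0
    mitochondrion = 0
    for row in csv.reader(io.StringIO(idxstats_text), delimiter="\t"):
        if len(row) < 4:
            continue  # skip blank or malformed lines
        name = row[0]
        mapped = int(row[2])
        unmapped = int(row[3])
        total += mapped + unmapped
        if name in cp_names:
            chloroplast += mapped
        elif name in mt_names:
            mitochondrion += mapped
    if total == 0:
        raise ValueError("no reads found in input")
    return {
        "chloroplast": chloroplast / total,
        "mitochondrion": mitochondrion / total,
        "organelle": (chloroplast + mitochondrion) / total,
    }


# Made-up counts for one library, for illustration only.
example = (
    "chr1\t1000\t700\t0\n"
    "chrC\t150\t50\t0\n"
    "chrM\t300\t200\t0\n"
    "*\t0\t0\t50\n"
)
fractions = organelle_fractions(example)
```

With these made-up counts, a quarter of the reads map to organelle contigs, most of them to the mitochondrial one. A library-level table of such fractions could then be used to decide which runs warrant removal of organelle-mapped reads before SNP calling.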
