Data-driven design of LNA-blockers for efficient contaminant removal in Ribo-seq libraries

Dario A. Ricciardi
Franziska E. Peter
Maik Böhmer

Abstract

Ribo-Seq libraries often contain a large proportion of non-coding RNA fragments, which can substantially reduce the information yield of these experiments. Contaminants can comprise up to 90% of a Ribo-Seq library and show high sequence variability and diverse fragmentation, which limits the effectiveness of rRNA depletion kits with fixed target sequences. We developed a workflow to identify experiment-specific contaminants from a small-scale, preliminary sequencing run. This enables the design of locked nucleic acid (LNA) oligonucleotides that target the contaminating fragments, thereby preventing their amplification during library preparation. This process requires only a single pipetting step and no additional purification. In a proof-of-concept experiment, just five LNAs reduced contaminating fragments by over 30%, doubling the amount of useful sequencing data from Ribo-Seq experiments.

We provide a script to identify and visualize contaminants and optimized target sequences, guidelines for designing custom LNA sets, and a collection of predesigned LNAs for Arabidopsis thaliana under various common growth conditions. Together, these serve as a foundation for a public LNA repository.
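The contaminant-identification step can be illustrated with a minimal sketch. This is not the authors' script. It assumes that the reads from the preliminary run have already been restricted to those mapping to non-coding RNA, and that the most abundant exact read sequences are reasonable first candidates for LNA targets. The function names `parse_fastq_sequences` and `top_contaminants` and the toy reads are hypothetical.

```python
from collections import Counter


def parse_fastq_sequences(lines):
    """Yield the sequence line (second line) of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def top_contaminants(reads, n=5):
    """Return the n most abundant read sequences.

    Each entry is (sequence, read count, fraction of all reads given).
    Assumption: `reads` contains only reads already assigned to
    non-coding RNA, so abundance here approximates contaminant load.
    """
    counts = Counter(reads)
    total = sum(counts.values())
    return [(seq, count, count / total) for seq, count in counts.most_common(n)]


# Toy FASTQ content (hypothetical reads, not data from the preprint).
toy_fastq = [
    "@r1", "ACGTACGT", "+", "IIIIIIII",
    "@r2", "ACGTACGT", "+", "IIIIIIII",
    "@r3", "GGCCGGCC", "+", "IIIIIIII",
    "@r4", "ACGTACGT", "+", "IIIIIIII",
]

for seq, count, fraction in top_contaminants(parse_fastq_sequences(toy_fastq), n=2):
    print(f"{seq}\t{count}\t{fraction:.2f}")
```

Ranking exact sequences is the simplest possible choice. Because the abstract stresses diverse fragmentation, a practical version would likely need to group overlapping fragments of the same parent RNA before choosing target sequences.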

Significance Statement

Ribo-Seq libraries often contain abundant non-coding RNA contaminants, which, because of their high sequence variability and diverse fragmentation, are challenging to remove. We present a computational pipeline that identifies experiment-specific target sequences and allows for their efficient depletion using custom LNA probes in a single pipetting step, thereby increasing sequencing yield and reducing costs. A public LNA repository will support sharing validated targets within the research community.

Version published to 10.1101/2025.09.11.675547 on bioRxiv
Sep 16, 2025
