Benchmarking long-read genome assemblers for three sequencing protocols and three agricultural species

Clément Birbes
Andreea Dréau
Denis Milan
Carole Iampietro
Christine Gaspin
Cécile Donnadieu
Matthias Zytnicki
Christophe Klopp

Abstract

Today, long-read technologies make it possible to produce telomere-to-telomere genome assemblies. These assemblies permit more accurate whole-genome analyses, including in repeated regions.

The genome assembly process can chain several steps, such as read correction, contig assembly, contig polishing, scaffolding, and gap filling. Each step usually requires at least one software package and input sequences. For the end user, it is not necessarily clear which combination of tools and read types will work best for a given genome.

In this work, we focus on contig production, which is a central and complex task of the assembly process. It aims to produce the longest possible error-free sequences, called contigs, from reads. While it is possible to produce contigs from short reads, long reads are now widely preferred, since they produce much longer contigs.

We evaluate several contig-producing software packages (usually called assemblers) on long reads generated by two sequencers using three protocols, for three complex eukaryotic species with different characteristics. Our aim is twofold. First, we would like to give readers insight into the impact of sequencing technology and assembler combinations, to help them make their choice for a given genome. Second, we would like to present different assembly metrics and provide a critical view of their interpretation.

Version published to 10.1101/2025.02.14.638238 on bioRxiv
Feb 18, 2025

Related articles

A Benchmarking Framework to Catalyze Individual Human Genome Projects

This article has 3 authors:
1. Manjushri kalpande
2. Apoorva Ganesh
3. Subhashini Srinivasan
This article has no evaluations. Latest version Dec 17, 2025

Nanopore Data-Driven Near-T2T Genome Assembly of Hippophae rhamnoides ssp. mongolica Rousi

This article has 15 authors:
1. Alexander Arkhipov
2. Nadezhda Bolsheva
3. Elena Pushkova
4. Vladislav Babenko
5. Yury Zubarev
6. Vera Kovalenko
7. Fedor Kostromskoy
8. Elizaveta Ivankina
9. Ekaterina Dvorianinova
10. Nikolai Barsukov
11. Daiana Krupskaya
12. Elena Borkhert
13. Ksenia Klimina
14. Nataliya Melnikova
15. Alexey Dmitriev
This article has no evaluations. Latest version Dec 15, 2025

Shotgun metagenomics: a deep insight into the composition and function of the complex microbial world

This article has 7 authors:
1. Grazia Visci
2. Elisabetta Notario
3. Giuseppe Defazio
4. Mariano Francesco Caratozzolo
5. Bruno Fosso
6. Marinella Marzano
7. Graziano Pesole
This article has no evaluations. Latest version Jan 30, 2026
