First chromosome-scale genome assemblies and comprehensive structural characterization of Tunisian durum wheat (Triticum turgidum subsp. durum) landraces Chili and Mahmoudi
Abstract
Durum wheat (Triticum turgidum subsp. durum) is a globally important crop for pasta and couscous production. Chili and Mahmoudi are historically significant Tunisian landraces valued for exceptional grain quality, high protein content, and adaptation to arid Mediterranean climates. Yet no high-quality reference genome assembly was available for either landrace before this work. We assembled both genomes from publicly available PacBio HiFi long reads and Illumina Hi-C proximity ligation data deposited under NCBI BioProject PRJNA1420514. HiFi reads were assembled with hifiasm v0.25.0 in primary mode, and Hi-C scaffolding was performed with YaHS v1.2a.2 after read alignment with BWA-MEM. Assembly quality was assessed with QUAST v5.3.0 and BUSCO v5.8.0 (embryophyta_odb10 lineage). The Chili assembly spans 10.84 Gbp across 3,472 scaffolds, with a scaffold N50 of 844.9 Mbp and BUSCO completeness of 99.4%. The Mahmoudi assembly spans 10.70 Gbp across 3,258 scaffolds, with a scaffold N50 of 2,072 Mbp and BUSCO completeness of 99.3%. Merqury v1.3 confirmed high base accuracy (QV 68.0 for Chili, QV 68.3 for Mahmoudi) and high k-mer completeness (>98% for both). Independent validation with wfmash showed 98.6% mean alignment identity across 11,172 chromosome-to-reference alignments. Post-assembly characterization, covering GC content profiling, centromere architecture, ribosomal DNA arrays, and structural variation, revealed extensive genome-level detail. Both assemblies substantially exceed the contiguity of existing durum wheat references and represent the first chromosome-scale-contiguity genomic resources for North African durum wheat landraces. The Mahmoudi accession carries a putative 2B-3B homeologous fusion on chromosome 3B (4,710 Mbp), a structural novelty in an ancient Tunisian landrace. The assemblies and pseudomolecules are available from Zenodo (DOI: 10.5281/zenodo.20366290).
The workflow was executed reproducibly on the public Galaxy Europe platform, demonstrating that reference-quality plant genome assembly is achievable without local HPC infrastructure.
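For readers unfamiliar with the contiguity metric reported above, the following is a minimal sketch of how a scaffold N50 can be computed from a list of scaffold lengths. It is not the QUAST implementation; the function name `n50` and the toy lengths in the usage note are illustrative assumptions, not values from these assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")

    ordered = sorted(lengths, reverse=True)  # longest scaffolds first
    half_total = sum(ordered) / 2

    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy example (lengths in arbitrary units, not real data).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

In the toy example the total length is 32, and the two longest sequences (8 + 8) already reach half of it, so `toy_n50` is 8. A larger N50 therefore means that fewer, longer scaffolds account for half of the assembly.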
Graphical Abstract
Highlights
- First chromosome-scale-contiguity assemblies for Chili and Mahmoudi landraces
- Chili genome: 10.84 Gbp, scaffold N50 844.9 Mbp, BUSCO 99.4%
- Mahmoudi genome: 10.70 Gbp, scaffold N50 2,072 Mbp, BUSCO 99.3%
- Merqury QV ~68 confirms base accuracy, >98% k-mer completeness
- >140-fold scaffold N50 improvement over Svevo v1 reference
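To put the reported QV values in context, the sketch below converts between a Phred-scaled consensus quality value and a per-base error probability, assuming the standard Phred definition QV = -10 · log10(error rate). The function names are illustrative and are not part of Merqury.

```python
import math


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Convert a per-base error probability to a Phred-scaled quality value."""
    if error_rate <= 0:
        raise ValueError("error_rate must be positive")
    return -10 * math.log10(error_rate)


# QV values reported in the abstract for the two assemblies.
chili_error = qv_to_error_rate(68.0)
mahmoudi_error = qv_to_error_rate(68.3)

# Approximate number of bases per expected consensus error.
chili_bases_per_error = 1 / chili_error
```

Under this definition, a QV of 68.0 corresponds to an error probability of roughly 1.6 × 10⁻⁷ per base, or about one expected consensus error per 6.3 Mbp.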