The first near-complete assembly of the hexaploid bread wheat genome, Triticum aestivum

Abstract

Common bread wheat, Triticum aestivum, has one of the most complex genomes known to science, with 6 copies of each chromosome, enormous numbers of near-identical sequences scattered throughout, and an overall haploid size of more than 15 billion bases. Multiple past attempts to assemble the genome have produced assemblies that were well short of the estimated genome size. Here we report the first near-complete assembly of T. aestivum, using deep sequencing coverage from a combination of short Illumina reads and very long Pacific Biosciences reads. The final assembly contains 15 344 693 583 bases and has a weighted average (N50) contig size of 232 659 bases. This represents by far the most complete and contiguous assembly of the wheat genome to date, providing a strong foundation for future genetic studies of this important food crop. We also report how we used the recently published genome of Aegilops tauschii, the diploid ancestor of the wheat D genome, to identify 4 179 762 575 bp of T. aestivum that correspond to its D genome components.
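
The N50 statistic quoted above can be illustrated with a minimal sketch. It assumes the usual definition: the contig length L such that contigs of length L or longer contain at least half of all assembled bases. The function name `n50` and the toy contig lengths are hypothetical and chosen only for illustration; they are not taken from the assembly or from the authors' software.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Assumed definition: the largest length L such that contigs of
    length >= L together hold at least half of the total bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly: five contigs, 300 bases in total.
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))  # prints 80: 100 + 80 = 180 covers at least half of 300

# Share of the reported assembly assigned to the D genome,
# using the two base counts given in the abstract.
d_genome_bp = 4_179_762_575
assembly_bp = 15_344_693_583
print(f"{d_genome_bp / assembly_bp:.1%}")
```

Because N50 is weighted by length, a few long contigs raise it far more than many short ones, which is why it is reported alongside total assembly size rather than a plain mean contig length. By the same abstract figures, the D genome component is a little over a quarter of the assembled bases.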

Article activity feed

  1. Now published in GigaScience doi: 10.1093/gigascience/gix097

    Authors: Aleksey V. Zimin (1, 2), Daniela Puiu (1), Richard Hall (3), Sarah Kingan (3), Steven L. Salzberg (1, 5)

    Affiliations:
    1 Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD
    2 Institute for Physical Sciences and Technology, University of Maryland, College Park, MD
    3 Pacific Biosciences, Menlo Park, CA
    5 Departments of Biomedical Engineering, Computer Science, and Biostatistics, Johns Hopkins University, Baltimore, MD

    For correspondence: salzberg@jhu.edu

    A version of this preprint has been published in the open access journal GigaScience (see https://doi.org/10.1093/gigascience/gix097), where the paper and its peer reviews are published openly under a CC-BY 4.0 license.

    These peer reviews were as follows:

    Reviewer 1: http://dx.doi.org/10.5524/REVIEW.100877
    Reviewer 2: http://dx.doi.org/10.5524/REVIEW.100879
    Reviewer 3: http://dx.doi.org/10.5524/REVIEW.100878
    Reviewer 4: http://dx.doi.org/10.5524/REVIEW.100880