The 1001G+ project: A curated collection of Arabidopsis thaliana long-read genome assemblies to advance plant research

Abstract

Arabidopsis thaliana was the first plant for which a high-quality genome sequence became available. The publication of the first reference genome sequence almost 25 years ago was already accompanied by genome-wide data on sequence polymorphisms in another accession, or naturally occurring strain. Since then, inventories of genome-wide diversity have been generated at increasingly precise levels. High-density genotype data for A. thaliana, including those from the 1001 Genomes Project, were key to demonstrating the enormous power of GWAS in inbred populations of wild plants, and the comparison of intraspecific polymorphism with interspecific divergence has illuminated many aspects of plant genome evolution. Over the past decade, an increasing number of nearly complete genome sequences have been published for many more accessions. Here, we highlight the diversity of a curated collection of previously published and so far unpublished genome sequences assembled using different types of long reads, including PacBio Continuous Long Reads (CLR), PacBio High Fidelity (HiFi) reads, and Oxford Nanopore Technologies (ONT) reads. This 1001 Genomes Plus (1001G+) resource is being made available at http://1001genomes.org. We invite colleagues with as-yet unpublished genome assemblies from A. thaliana accessions to contribute to this effort.
