A chromosome-scale draft genome sequence of horsegram (Macrotyloma uniflorum)

Abstract

Horsegram (Macrotyloma uniflorum [Lam.] Verdc.) is an underutilized warm-season diploid legume (2n = 20, 22). Because of its ability to grow under water-deficient and marginal soil conditions, horsegram is a preferred choice in the era of global climate change. In recognition of its potential as a crop species, we generated and analyzed a draft genome sequence for a horsegram variety, HPK-4. Ten chromosome-scale pseudomolecules were created by aligning Illumina scaffold sequences onto a linkage map. The total length of the ten pseudomolecules was 259.2 Mbp, covering 89% of the total length of the assembled sequences. A total of 36,105 genes were predicted on the assembled sequences. Diversity analysis of 89 horsegram accessions by dd-RAD-Seq identified 277 single nucleotide polymorphisms (SNPs), suggesting narrow genetic diversity among the horsegram accessions. This is the first attempt to generate a draft genome sequence of horsegram and will provide a reference for sequence-based analysis of horsegram germplasm.
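The two assembly figures in the abstract imply a total assembly size that is not stated there. If the ten pseudomolecules (259.2 Mbp) cover 89% of the assembled sequence, the whole assembly is roughly 291 Mbp. The sketch below only does that arithmetic. The 89% is a rounded value, so the result is approximate, and the variable names are illustrative.

```python
# Figures quoted in the abstract.
pseudomolecule_length_mbp = 259.2  # total length of the ten pseudomolecules
fraction_anchored = 0.89           # share of assembled sequence covered (rounded in the abstract)

# Implied total assembly length; approximate because 89% is a rounded figure.
implied_assembly_mbp = pseudomolecule_length_mbp / fraction_anchored

# Sequence that was assembled but not placed on a pseudomolecule.
unanchored_mbp = implied_assembly_mbp - pseudomolecule_length_mbp

print(f"implied assembly length: ~{implied_assembly_mbp:.0f} Mbp")
print(f"unanchored sequence: ~{unanchored_mbp:.0f} Mbp")
```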

Article activity feed

  1. Horsegram

    **Reviewer 2. Penghao Wang** The authors present a new pseudo-chromosome-level draft genome sequence of the legume horsegram, together with bioinformatics analyses based on the data. The assembly is of good quality and the bioinformatics analysis is sound. The resources made available by the study should prove valuable to researchers working on the plant and to the legume community as a whole. The paper is generally well written and easy to follow, although a few grammatical errors remain. The bioinformatics methodology is sound and the software used fits the goals of the study. However, the authors need to give more detail on some analysis components, e.g. the parameter set, the software version, and the operating system, so that the analysis can be reproduced. For example, in the Methods section, line 76, Jellyfish was used to estimate the genome size, but the parameters, version, and OS were not mentioned. On line 78, the k-mer size is the most important parameter for SOAPdenovo2, but what about the rest? SSPACE 2.0 was used for scaffolding; what were the insert sizes? Platanus, MaSuRCA, truSPAdes, RepeatMasker, and AUGUSTUS all involve a number of parameters, and details on how they were used need to be provided, because the results can differ sharply with different parameters. Some figures appear to have been created with specific tools, and these tools need to be acknowledged and referenced. For example, was Circos used to generate the circular plot in Fig. 5? In addition, I could not find captions for the main figures.

    Recommendation: Minor Revision
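To illustrate why Reviewer 2 asks for the k-mer parameters behind the genome size estimate, here is a minimal sketch of the general k-mer counting approach: count all k-mers in the reads, find the most common depth, and divide the total k-mer count by that depth. This is not the authors' pipeline and does not call Jellyfish. The function names, the toy genome, and the choice of `k = 15` are assumptions made for the example, and real reads would also need error k-mers filtered out before the peak is picked.

```python
import random
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer of length k across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def estimate_genome_size(reads, k):
    """Estimate genome size as total k-mers divided by the peak k-mer depth."""
    counts = count_kmers(reads, k)
    # Histogram: depth -> number of distinct k-mers seen at that depth.
    histogram = Counter(counts.values())
    # Toy data has no sequencing errors, so the modal depth is the coverage peak.
    peak_depth = max(histogram, key=histogram.get)
    total_kmers = sum(counts.values())
    return total_kmers / peak_depth


# Hypothetical toy input: a random 1,000 bp "genome" read in full 20 times.
random.seed(0)
genome = "".join(random.choice("ACGT") for _ in range(1000))
reads = [genome] * 20

estimate = estimate_genome_size(reads, k=15)
print(f"estimated size with k=15: {estimate:.0f} bp")
```

In this toy case the estimate is the number of 15-mers in the genome (1000 - 15 + 1), which is close to the true length. The value of `k` changes both the k-mer total and the position of the depth peak, which is why an estimate cannot be reproduced without knowing it.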

  2. Summary

    **Reviewer 1. Tianzuo Wang** Is the language of sufficient quality? It could be improved.

    Shirasawa et al. report a chromosome-scale draft genome sequence of horsegram and perform a comparative genomics analysis. 1. If PacBio data were used, the quality of the genome could be improved considerably. 2. Among the legumes, only the genomes of P. vulgaris, V. angularis and L. japonicus were used for the phylogenetic analysis. Soybean and Medicago, as model legume plants, should at least be added. 3. The section "Whole genome structure in horsegram" should come before "Diversity analysis in genetic resources", "Genes related to drought tolerance", and "Transcript sequencing, gene prediction and annotation", because the genome information is the foundation of the other analyses.

    Recommendation: Major Revision