A Chromosome-scale draft genome sequence of horsegram (Macrotyloma uniflorum)

Horsegram [Macrotyloma uniflorum (Lam.) Verdc.] is an underutilized warm-season diploid legume (2n=20, 22). It is consumed as a food legume in India, and as animal feed and fodder in Africa and Australia. Because of its ability to grow under water-deficient and marginal soil conditions, horsegram is a preferred choice in an era of global climate change. In recognition of its potential as a crop species, we generated and analyzed a draft genome sequence for HPK-4. The genome sequences of HPK-4 were generated on the Illumina platform. Ten chromosome-scale pseudomolecules were created by aligning scaffold sequences onto a linkage map. The total length of the ten pseudomolecules was 259.2 Mb, covering 89% of the total length of the assembled sequences. A total of 36,105 genes were predicted on the assembled sequences, and 14,736 were considered to be horsegram-specific genes by comparative analysis with Phaseolus vulgaris, Vigna angularis, Lotus japonicus and Arabidopsis thaliana. The results of macrosynteny analysis suggested that the genome structure of horsegram is more similar to that of V. angularis than to that of P. vulgaris. Diversity analysis of 91 horsegram accessions by dd-RAD-Seq indicated narrow genetic diversity among the accessions. This is the first attempt to generate a draft genome sequence in horsegram, and it will provide a reference for sequence-based analysis of the horsegram germplasm to elucidate the genetic basis of important traits.

Article activity feed

  1. Horsegram

    **Reviewer 2. Penghao Wang** The authors present a paper describing a new pseudo-chromosome-level draft genome sequence of the legume horsegram, together with some bioinformatics analyses based on the data. The presented assembly is of good quality and the bioinformatics analysis performed is sound. The resources made available by the study should prove valuable to researchers working on the plant and to the legume community as a whole. The paper is generally well written and I personally found it quite easy to follow, although a few grammatical errors can be found. The bioinformatics methodology used in the study is sound and the software used fits the goals of the study. However, the authors need to present more details on some analysis components, e.g. the parameter sets used for the software, the software versions, the OS, etc., so that the analysis can be better reproduced. For example, in the Methods section, line 76, the Jellyfish program was used to estimate the genome size, but the parameters, version and OS used to run the software were not mentioned. Line 78: for SOAPdenovo2, apart from the k-mer size, the most important parameter, what about the rest? SSPACE 2.0 was used for scaffolding; what were the insert sizes? Platanus, MaSuRCA, TruSPAdes, RepeatMasker and AUGUSTUS all involve a number of parameters, and details on how they were used need to be provided, because the results can differ sharply with different parameters. Some figures appear to have been created using specific tools, and these tools need to be acknowledged and referenced. For example, was Circos used to generate the circular plot in Fig 5? In addition, I could not find captions for all the main figures.

    Recommendation: Minor Revision

  2. Summary

    **Reviewer 1. Tianzuo Wang** Is the language of sufficient quality? It could be improved.

    Shirasawa et al. reported a chromosome-scale draft genome sequence of horsegram and performed a comparative genomics analysis. 1. If PacBio data were used, the quality of the genome could be improved considerably. 2. Only the genomes of P. vulgaris, V. angularis and L. japonicus among the legumes were used for phylogenetic analysis. Soybean and Medicago, as model legume plants, should at least be added. 3. The section "Whole genome structure in horsegram" should come before "Diversity analysis in genetic resources", "Genes related to drought tolerance", and "Transcript sequencing, gene prediction and annotation", because genome information is the foundation of the other analyses.

    Recommendation: Major Revision
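
Reviewer 2's request for software versions, parameter sets and OS details could be met by keeping a small machine-readable record alongside the analysis. The following is a minimal sketch under stated assumptions, not the authors' method: the `ToolRun` class, the `build_manifest` function and the parameter keys (`kmer_size`, `insert_sizes`) are hypothetical names chosen for illustration, and every `UNKNOWN` value is a placeholder for information the manuscript did not report. Only the SSPACE version (2.0) comes from the review text.

```python
import json
import platform
from dataclasses import asdict, dataclass, field


@dataclass
class ToolRun:
    """One analysis step: which tool, which version, which parameters."""
    tool: str
    version: str
    parameters: dict = field(default_factory=dict)


def build_manifest(runs):
    """Bundle the tool runs with OS details into a JSON-serialisable dict."""
    return {
        "os": {
            "system": platform.system(),
            "release": platform.release(),
            "machine": platform.machine(),
        },
        "python": platform.python_version(),
        "runs": [asdict(run) for run in runs],
    }


# Placeholder values only: "UNKNOWN" marks details not reported in the
# manuscript, and the parameter keys are illustrative, not real CLI flags.
runs = [
    ToolRun("Jellyfish", "UNKNOWN", {"kmer_size": "UNKNOWN"}),
    ToolRun("SSPACE", "2.0", {"insert_sizes": "UNKNOWN"}),
]

manifest = build_manifest(runs)
manifest_json = json.dumps(manifest, indent=2, sort_keys=True)
print(manifest_json)
```

Writing `manifest_json` to a file next to the assembly outputs, or including it as supplementary material, would let a reader see at a glance which settings produced which result.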