Whole-genome sequencing of wild and ancestral Dura provides insight into the untapped genomic information of undomesticated oil palm (Elaeis guineensis Jacq.)
Abstract
Oil palm (Elaeis guineensis Jacq.) is a globally important crop, and its genetic improvement benefits from comprehensive genome sequencing. Here, we report the whole-genome sequencing and annotation of two key genetic resources, the wild (Eg-DCM) and ancestral (Eg-DBG) Dura accessions, using a combination of short- and long-read sequencing technologies. De novo assembly followed by polishing, proximity ligation, and reference-guided scaffolding yielded high-quality assemblies with ungapped lengths of 1.71 Gb (Eg-DBG) and 1.48 Gb (Eg-DCM). Both genomes exhibited high completeness, with over 97% of Benchmarking Universal Single-Copy Orthologs (BUSCOs) recovered across the Eukaryota, Viridiplantae, and Embryophyta datasets. Repetitive elements, particularly retrotransposons, dominated both genomes, accounting for 46.10% of Eg-DBG and 43.85% of Eg-DCM. Gene prediction initially identified 61,256 (Eg-DBG) and 53,985 (Eg-DCM) genes, which were refined into high-confidence gene sets of 39,263 (Eg-DBG) and 35,298 (Eg-DCM) genes. Additionally, 1,760 and 1,684 putative resistance (R) genes were identified in Eg-DCM and Eg-DBG, respectively, with similar class distributions. The five major R gene classes were KIN, RLK, RLP, CNL, and CK. The assembled whole-genome sequences and annotated genes of Eg-DBG and Eg-DCM offer insight into the untapped genomic information of undomesticated accessions and, with further research, could inform future breeding and crop improvement efforts in oil palm.