Genomic insights into Indian wheat stripe rust pathotypes from long-read hybrid assemblies
Abstract
Background: Stripe rust, caused by Puccinia striiformis f. sp. tritici (Pst), poses a significant threat to global wheat production. Resistance in wheat cultivars is frequently overcome due to rapid evolution of pathogen virulence. Until recently, genome assemblies of Indian Pst pathotypes were based exclusively on short-read sequencing, which is limited in resolving the highly repetitive and heterozygous dikaryotic genomes of rust fungi.

Results: We generated hybrid genome assemblies for five Indian Pst pathotypes (110S119, 238S119, 46S119, 110S84, and 78S84) using high-coverage PacBio and Illumina sequencing. Assembly with Maryland Super-Read Celera Assembler (MaSuRCA) resulted in genome sizes ranging from 75.21 Mb (110S119) to 83.03 Mb (78S84), with contig counts ranging from 286 to 877. All assemblies exhibited GC content > 44% and > 90% completeness based on Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis, indicating high assembly quality. Gene prediction with Funannotate identified 14,559 to 15,283 protein-coding genes per pathotype. Functional classification of predicted proteins was performed using InterProScan. Phylogenetic analysis based on single-copy orthologs clustered the five Indian pathotypes into a single clade, with 78S84 and 238S119 forming one subgroup, and 110S119 and 46S119 another.

Conclusions: These high-quality genome assemblies represent the first long-read-based resources for Indian Pst pathotypes and provide valuable genomic insights into stripe rust diversity and evolution. They will serve as a foundation for rust surveillance, evolutionary studies, and the development of durable resistance in wheat.
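The summary metrics quoted in the abstract (assembly size in Mb, contig count, GC content) can be recomputed from any assembly FASTA file. The following is a minimal sketch of that calculation, not the authors' pipeline: the function names `parse_fasta` and `assembly_stats` and the toy sequences are hypothetical, and GC content is assumed here to be calculated over unambiguous A/C/G/T bases only.

```python
from typing import Dict, Iterable, List


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {contig_name: uppercase sequence}."""
    chunks: Dict[str, List[str]] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name;
            # fall back to a generated name if the header is empty.
            parts = line[1:].split()
            name = parts[0] if parts else f"contig_{len(chunks) + 1}"
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line.upper())
    return {k: "".join(v) for k, v in chunks.items()}


def assembly_stats(contigs: Dict[str, str]) -> Dict[str, float]:
    """Return contig count, total size (bp and Mb) and GC percentage."""
    total_bp = sum(len(seq) for seq in contigs.values())
    gc = sum(seq.count("G") + seq.count("C") for seq in contigs.values())
    # Assumption: ambiguous bases such as N are excluded from the GC denominator.
    acgt = sum(seq.count(base) for seq in contigs.values() for base in "ACGT")
    return {
        "contigs": len(contigs),
        "total_bp": total_bp,
        "total_mb": total_bp / 1e6,
        "gc_percent": 100.0 * gc / acgt if acgt else 0.0,
    }


# Toy input standing in for a real assembly file (hypothetical sequences).
toy_fasta = """>contig_1 example
GGCC
AATT
>contig_2
GC
""".splitlines()

stats = assembly_stats(parse_fasta(toy_fasta))
print(stats)
```

For a real assembly, the same functions could be applied to an open file handle, for example `assembly_stats(parse_fasta(open(path)))`, where `path` points to the FASTA file of interest. Completeness figures such as the BUSCO percentages come from a separate tool and are not reproduced by this sketch.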