Integrative Genomic and Machine Learning Approaches Reveal Evolutionary Signatures in the Winged Bean Mitochondrial Genome
Abstract
The mitochondrial genome (mitogenome) of Psophocarpus tetragonolobus (winged bean), a nutritionally valuable yet genomically underexplored tropical legume, was assembled using high-coverage PacBio long reads and Illumina short reads. The 366,925 bp circular genome encodes 64 genes (38 protein-coding, 20 tRNA, and 6 rRNA genes) and contains nine fragmented protein-coding genes, indicative of dynamic mitogenome architecture. Repeat profiling revealed 100 dispersed repeats (30–110 bp) and 25 SSRs (4.95% of the genome), with assembly graph inspection and recombination models supporting subgenomic circles and isoforms. Comparative analyses across 15 legumes showed pervasive purifying selection, with positive selection at specific codons of atp4, ccmB, cox1, nad3, and rps10. Codon usage bias differed markedly between organelles: mitochondrial genes exhibited moderate bias consistent with neutral expectations, whereas chloroplast genes showed greater variability, suggesting additional selective constraints. Synteny mapping revealed multiple conserved and inverted regions between organelles, highlighting structural divergence. Leveraging 14 codon bias metrics, we implemented the first machine learning framework for organelle genome classification in plants, achieving an AUC of up to 0.96 and identifying GC3s as the most influential feature. This integrative genomic, evolutionary, and ML-based approach advances understanding of P. tetragonolobus mitogenome evolution and establishes a proof of concept with potential cross-kingdom applications.
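To illustrate GC3s, the feature the abstract reports as most influential, here is a minimal Python sketch. It assumes the standard genetic code and the common definition of GC3s: the fraction of G or C at the third position of synonymous codons, excluding Met, Trp, and stop codons. The function name `gc3s` and the example sequence are hypothetical and not taken from the preprint. The preprint's actual pipeline may compute the metric differently.

```python
# Codons excluded from GC3s under the standard genetic code (an assumption):
# ATG (Met) and TGG (Trp) have no synonymous alternatives; the rest are stops.
EXCLUDED = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(cds: str) -> float:
    """Return the G+C fraction at third positions of synonymous codons."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Keep only unambiguous codons that have at least one synonymous alternative.
    synonymous = [
        c for c in codons
        if c not in EXCLUDED and set(c) <= set("ACGT")
    ]
    if not synonymous:
        raise ValueError("no synonymous codons found in sequence")
    gc_third = sum(1 for c in synonymous if c[2] in "GC")
    return gc_third / len(synonymous)


# Hypothetical toy coding sequence: ATG GCC GCT TAA
example = "ATGGCCGCTTAA"
print(gc3s(example))
```

In a classification setting like the one described, a value such as this would be computed per gene and used as one column of the feature matrix, alongside the other codon bias metrics.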