Nutritional Genomics of Tepary Bean (Phaseolus acutifolius): Genome‑wide association analysis and genomic prediction of seed nutritional traits and size
Abstract
Background
Tepary bean (Phaseolus acutifolius) is a drought- and heat-tolerant, nitrogen-fixing legume that offers a promising low-input protein source. However, the genetic factors influencing its seed protein content and amino acid profile are not well understood. We evaluated 206 diverse accessions and four controls in organic fields, measuring hundred-seed weight (HSW), seed width, total soluble seed protein (%), and the profiles of nineteen free amino acids. Using genotyping-by-sequencing, we identified 49,384 high-quality SNPs and conducted GWAS on best linear unbiased predictors (BLUPs) with multiple models (GLM, MLM, BLINK, FarmCPU) while controlling for population structure.

Results
We found genome-wide significant associations with protein percentage on Chr08, with candidate genes including a WNK kinase and a 2-oxoglutarate/Fe(II)-dependent oxygenase. Trait-specific loci were identified for fifteen of the nineteen free amino acids, indicating a modular genetic architecture. Notably, the essential amino acids threonine, methionine, and lysine each had unique significant loci, providing the first tepary-specific markers for these nutritionally important traits. Fewer but stable associations with seed size were observed on Chr02 (HSW; V-ATPase subunit) and Chr07 (seed width; Aux/IAA). Genomic prediction models showed high predictive ability for seed size (r ≈ 0.90–0.96) and moderate predictive ability for protein and amino acid traits (r ≈ 0.15–0.45), consistent with their polygenic and modular genetic structure.

Conclusion
By integrating GWAS with genomic prediction, we identify candidate genes, trait-specific genomic regions, and reliable benchmarks for predicting protein concentration, essential amino acids, and seed size in tepary bean. The agreement between association signals and prediction accuracy supports a dual breeding approach that combines marker-assisted selection for key loci with genomic selection to capture residual polygenic variation. This combined framework strengthens opportunities to enhance seed nutritional quality without compromising seed size and offers synteny-based entry points for gene discovery and introgression across Phaseolus species.
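
For readers unfamiliar with how a predictive ability such as r ≈ 0.90 is typically obtained, the following is a minimal sketch, not the authors' pipeline. The abstract does not state which prediction model or validation scheme was used; ridge regression on SNP genotypes with 5-fold cross-validation is assumed here purely for illustration. The function name `predictive_ability`, the shrinkage value `lam`, and the data are all hypothetical; the data are simulated and only borrow the panel size of 206 accessions.

```python
import numpy as np


def predictive_ability(X, y, lam, k=5, seed=0):
    """Cross-validated Pearson r between ridge predictions and observed values.

    X   : (n, p) genotype matrix, SNPs coded 0/1/2
    y   : (n,) phenotype vector (e.g. BLUPs)
    lam : ridge shrinkage parameter (assumed value, not from the study)
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    folds = np.array_split(rng.permutation(n), k)
    preds = np.empty(n)
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        # Center markers and phenotype using training lines only.
        mu_x = X[train].mean(axis=0)
        mu_y = y[train].mean()
        Xtr = X[train] - mu_x
        Xte = X[test] - mu_x
        # Dual-form ridge: solve an n_train x n_train system instead of
        # p x p, which matters when SNPs far outnumber lines.
        K = Xtr @ Xtr.T
        alpha = np.linalg.solve(K + lam * np.eye(len(train)), y[train] - mu_y)
        preds[test] = mu_y + Xte @ (Xtr.T @ alpha)
    return np.corrcoef(preds, y)[0, 1]


# Simulated stand-in data (NOT the study's data): 206 lines, 200 SNPs.
rng = np.random.default_rng(1)
n, p = 206, 200
X = rng.binomial(2, 0.5, size=(n, p)).astype(float)
beta = rng.normal(size=p)               # hypothetical marker effects
g = X @ beta                            # simulated genetic values
h2 = 0.8                                # assumed heritability
e = rng.normal(scale=np.sqrt(g.var() * (1 - h2) / h2), size=n)
y = g + e

r = predictive_ability(X, y, lam=25.0)
print(f"cross-validated predictive ability r = {r:.2f}")
```

Each line's phenotype is predicted only from a model trained on the other folds, so r reflects how well marker data alone rank unseen lines. Under this scheme, traits controlled by many small-effect loci with lower heritability would be expected to yield lower r than highly heritable traits.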