Incorporating gene expression and environment for genomic prediction in wheat
Abstract
The adoption of novel molecular strategies such as genomic selection (GS) in crop breeding has been key to maintaining rates of genetic gain through increased efficiency and a shorter evaluation cycle relative to conventional selection. In the search for improved methodologies that incorporate novel sources of variation for the assessment of genetic merit, GS remains a focus of crop breeding research globally. Here we explored the role transcriptome data could play in enhancing GS using wheat as a test case. Across 286 wheat lines, we integrated phenotype and multi-omic data from controlled environment and field experiments, including ca. 40K single nucleotide polymorphisms (SNPs), abundance data for ca. 50K transcripts and metadata (e.g. categorical environments), to predict individual genetic merit for two agronomic traits, flowering time and height. We evaluated the performance of different model scenarios based on linear (GBLUP) and Gaussian/nonlinear (RKHS) regression in the Bayesian analytical framework. These models explored the relative contributions of different combinations of explanatory variables: additive genomic (G), transcriptomic (T) and environment (E), with and without non-additive epistasis and G × E random effects. In controlled environments, where traits were measured under contrasting daylength regimes (long and short days), transcriptome abundance outperformed other explanatory variables when considered independently, while the model combining SNP, environment and G × E marginally outperformed the transcriptome.
The best performing model for prediction of both flowering time and height combined all data types, G × E and epistasis, with the GBLUP framework performing slightly better overall than RKHS across all tests. Under field conditions, we similarly found that models combining all variables were superior, with the GBLUP and RKHS methods performing equally well. However, the relative contribution of the transcriptome was reduced. Our results show there is a predictive advantage to direct inclusion of the transcriptome for genomic evaluation in wheat breeding. However, the complexity and cost of generating large-scale transcriptome data are likely to limit its feasibility for commercial breeding. We demonstrate that combining less costly environmental covariates with conventional genomic data provides a practical alternative, with gains similar to those from the transcriptome when environments are well characterised.
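The abstract does not state how the G, T, E and G × E covariance structures were constructed. As a minimal sketch only, the NumPy code below assumes a standardised linear kernel for GBLUP-style terms, a Gaussian kernel with a median-distance bandwidth for an RKHS-style term, and a Hadamard-product construction for G × E, which is one common choice and not necessarily the authors'. All function names, dimensions and simulated data are hypothetical.

```python
import numpy as np


def linear_kernel(X):
    """GBLUP-style relationship matrix from a (lines x features) matrix.

    Assumption: columns are centred and scaled to unit variance, and
    constant (monomorphic) columns are dropped.
    """
    X = np.asarray(X, dtype=float)
    Z = X - X.mean(axis=0)
    sd = Z.std(axis=0)
    keep = sd > 0
    Z = Z[:, keep] / sd[keep]
    return Z @ Z.T / Z.shape[1]


def gaussian_kernel(X, h=1.0):
    """RKHS-style Gaussian kernel exp(-h * d^2 / median(d^2)).

    Assumption: the bandwidth is scaled by the median off-diagonal
    squared Euclidean distance; h is a hypothetical tuning parameter.
    """
    X = np.asarray(X, dtype=float)
    sq = (X ** 2).sum(axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(D2, 0.0)
    off_diag = D2[np.triu_indices_from(D2, k=1)]
    return np.exp(-h * D2 / np.median(off_diag))


def expand(K, line_index):
    """Expand a (lines x lines) kernel to (records x records)."""
    line_index = np.asarray(line_index)
    return K[np.ix_(line_index, line_index)]


# Simulated toy data (hypothetical sizes, far smaller than the study).
rng = np.random.default_rng(0)
n_lines = 6
snps = rng.integers(0, 3, size=(n_lines, 50))   # SNP genotypes coded 0/1/2
expr = rng.normal(size=(n_lines, 40))           # transcript abundances

# Each line is observed once per daylength regime, giving 12 records.
line_idx = np.repeat(np.arange(n_lines), 2)
env = np.tile(np.array(["long", "short"]), n_lines)

G = linear_kernel(snps)          # additive genomic kernel
T = linear_kernel(expr)          # linear transcriptomic kernel
T_rkhs = gaussian_kernel(expr)   # nonlinear transcriptomic kernel

K_G = expand(G, line_idx)
K_E = (env[:, None] == env[None, :]).astype(float)  # same-environment indicator
K_GxE = K_G * K_E                # Hadamard product: G x E covariance
```

Each kernel would then enter a Bayesian mixed model as the covariance structure of its own random effect, so that model scenarios can be compared by adding or removing terms.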
Highlights
- Incorporating transcriptome and environment in genomic prediction.
- Model comparisons and Bayesian inference.
- Differential random effects of transcriptome, SNP and environment.