Cost-effective genomic prediction for fertility traits: A comparison of machine learning and conventional models using low-coverage sequencing in Holstein heifers
Abstract
Fertility is one of the major factors affecting the efficiency of dairy herds, and genomic selection (GS) for milk yield that ignores fertility has led to a decline in heifer fertility. Considering the cost of breeding, it is important to improve the accuracy of GS for fertility traits in a cost-effective way. This study investigates the genomic prediction (GP) of fertility traits in Holstein heifers using both machine learning (ML) and conventional methods, based on SNP array data and low-coverage sequencing data. We collected phenotype and pedigree records for 45,320 Holstein heifers and generated genomic data for 3,683 of them using low-coverage sequencing (lcGWS). We first estimated the heritability of age at first service (AFS), gestation length (GLh) and age at first calving (AFC). We then compared three ML methods (kernel ridge regression (KRR), support vector regression, and random forest regression) with GBLUP, ssGBLUP and BayesR3 in terms of GP accuracy and unbiasedness. Inputs for the ML methods included genomic relationship matrices (GRM), principal components, and SNPs. The heritability of the three fertility traits ranged from 0.09 to 0.48. Prediction accuracy from imputed low-coverage sequencing data was comparable to that from standard SNP chips. When both pedigree and genotypic data were used for GS, ssGBLUP yielded the highest prediction accuracy for AFS and GLh. Crucially, when only genomic data were used, KRR with a GRM as input (KRR_GRM) improved GP accuracy by up to 28.57% compared to GBLUP and by up to 9.46% compared to BayesR3. Our results highlight the effectiveness of low-coverage sequencing data in breeding applications and the potential of ML to enhance GP accuracy for fertility traits, offering practical insights for dairy breeding programs.
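To make the KRR_GRM idea concrete, the following is a minimal sketch of kernel ridge regression that uses a genomic relationship matrix as a precomputed kernel, and then scores predictions by accuracy (Pearson correlation) and unbiasedness (regression slope of observed on predicted values). It is not the study's implementation: the VanRaden-style GRM construction, the helper names (`vanraden_grm`, `krr_fit_predict`, `accuracy_and_slope`), the penalty `lam`, and the simulated genotypes and phenotypes are all assumptions made for illustration.

```python
import numpy as np

def vanraden_grm(M):
    """Build a genomic relationship matrix from genotypes coded 0/1/2.

    Assumes the VanRaden (method 1) form G = ZZ' / (2 * sum(p * (1 - p)));
    the abstract does not state which GRM definition the study used.
    """
    p = M.mean(axis=0) / 2.0            # allele frequency per SNP
    Z = M - 2.0 * p                     # centre each SNP column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def krr_fit_predict(K, y, train, test, lam):
    """Kernel ridge regression with a precomputed kernel K (here, the GRM)."""
    K_tt = K[np.ix_(train, train)]
    mu = y[train].mean()                # handle the intercept by centring
    alpha = np.linalg.solve(K_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + K[np.ix_(test, train)] @ alpha

def accuracy_and_slope(y_obs, y_pred):
    """Accuracy = Pearson r; unbiasedness = slope of y_obs regressed on y_pred."""
    r = np.corrcoef(y_obs, y_pred)[0, 1]
    slope = np.cov(y_obs, y_pred)[0, 1] / np.var(y_pred, ddof=1)
    return r, slope

# Simulated toy data (hypothetical; NOT the study's heifer data).
rng = np.random.default_rng(0)
n, m = 200, 500
freqs = rng.uniform(0.1, 0.9, size=m)
M = rng.binomial(2, freqs, size=(n, m)).astype(float)
y = M @ rng.normal(0.0, 0.1, size=m) + rng.normal(0.0, 1.0, size=n)

G = vanraden_grm(M)
idx = rng.permutation(n)
train, test = idx[:150], idx[150:]

# lam is an arbitrary placeholder; in practice it would be tuned.
y_hat = krr_fit_predict(G, y, train, test, lam=1.0)
r, slope = accuracy_and_slope(y[test], y_hat)
print(f"accuracy r = {r:.3f}, slope = {slope:.3f}")
```

A slope close to 1 indicates predictions that are neither inflated nor deflated. How the penalty and any other hyperparameters were tuned in the study is not stated in the abstract, so `lam=1.0` here carries no meaning beyond making the sketch run.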
Implications
This study confirms that low-coverage sequencing offers both accuracy and cost-effectiveness for GS of fertility traits in Holstein heifers. It also provides a decision-making framework for breeders: the ssGBLUP model is the preferred choice when pedigree data are combined with genomic data, while machine learning methods perform best when only genomic data are available. The study thus offers a practical tool for achieving efficient and economical genetic improvement in dairy cattle.