Harmonizing Heterogeneous Datasets: Imputation and quality control for multi genotyping platform and multi-breed LD SNP panel analysis in Cattle

Akanksha Kesharwani
Ananthasayanam Sudhakar
Nilesh Nayee
Sujit Saha
Swapnil Gajjar
Tejas Gohil


Abstract

Low-density (LD) SNP chips are widely used in cattle genomics for applications such as breed characterization, genomic selection, and breeding value estimation. However, differences in SNP content across chip versions and missing genotypes can limit the utility of these datasets. Imputation against a unified reference panel can address both problems by improving data completeness and enabling cross-platform compatibility. In this study, 30,124 cattle DNA samples from five breeds, namely Gir, Sahiwal, Kankrej, crossbred Holstein Friesian (CBHF), and crossbred Jersey (CBJY), were genotyped using various versions of INDUSCHIP, a low-density SNP chip, across Illumina and Affymetrix platforms. To ensure data compatibility, genotype data underwent rigorous quality control and were standardized onto a unified reference panel. Missing genotypes were imputed using this panel, and a masked analysis showed an average concordance of 94.56% between genotyped and imputed data. Further evaluation using the dosage R-squared (DR2) metric showed that most imputed SNPs achieved DR2 scores above 0.75, indicating high imputation accuracy and reliability across all breeds. The imputed dataset generated in this study provides a robust and harmonized genomic resource for cattle breeding programs. This resource supports critical applications such as breed purity assessment, genomic selection, and breeding value estimation, enhancing the accuracy and efficiency of genetic improvement initiatives.
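
The abstract does not spell out how the masked analysis was scored. As a minimal sketch of one common approach, the Python snippet below hides a subset of known genotype calls, compares them with the imputed calls, and reports the fraction that match exactly. The function name `masked_concordance`, the 0/1/2 allele-count coding, and the toy data are all assumptions made for illustration; they are not taken from the preprint.

```python
def masked_concordance(true_genotypes, imputed_genotypes, masked_positions):
    """Return the fraction of masked calls whose imputed value equals the true value."""
    if not masked_positions:
        raise ValueError("no masked positions to evaluate")
    matches = sum(
        1 for i in masked_positions if true_genotypes[i] == imputed_genotypes[i]
    )
    return matches / len(masked_positions)


# Hypothetical toy data: genotypes coded as alternate-allele counts (0, 1, 2).
true_calls = [0, 1, 2, 1, 0, 2, 1, 1, 0, 2]
masked = [1, 3, 5, 7, 9]          # positions hidden before imputation
imputed_calls = list(true_calls)
imputed_calls[7] = 2              # pretend one masked call was imputed incorrectly

concordance = masked_concordance(true_calls, imputed_calls, masked)
print(f"concordance = {concordance:.2%}")  # 4 of 5 masked calls match
```

Only the masked positions enter the score, so genotypes that were never hidden cannot inflate the result.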

Version published to 10.21203/rs.3.rs-7518091/v1 on Research Square
Oct 9, 2025

Related articles

Comparison of BLUPF90IOD3 and MiXBLUP implementations of the single-step model applied to the Polish national dairy cattle evaluation

This article has 4 authors:
1. Dawid Słomian
2. Michalina Jakimowicz
3. Tomasz Suchocki
4. Joanna Szyda
This article has no evaluations. Latest version: Dec 22, 2025

Derivation of prediction error variance for non-genotyped individuals in genomic selection

This article has 3 authors:
1. Vinícius Junqueira
2. Marcos Jun-Iti Yokoo
3. Fernando Flores
This article has no evaluations. Latest version: Dec 17, 2025

Genomic Diversity, Inbreeding, and Selection Signatures in Duroc, Landrace, and Yorkshire Pigs from a Long-Term Closed Breeding System

This article has 5 authors:
1. Huangyi Tang
2. Henrique A. Mulim
3. Shi-Yi Chen
4. Allan P. Schinckel
5. Hinayah R. Oliveira
This article has no evaluations. Latest version: Dec 22, 2025
