PyMSQ: a Python package for fast Mendelian sampling (co)variance and haplotype-based similarity in genomic selection

Abstract

Background

While genomic selection (GS) accelerates genetic gain by using dense marker data to compute genomic estimated breeding values (GEBVs), prolonged application can reduce haplotype diversity and increase inbreeding. To address these risks, recent research has emphasized Mendelian sampling variance (MSV) and covariance (MSC), which capture within-family (co)variation not fully reflected in GEBVs. In parallel, theoretical advances have introduced a haplotype-based similarity measure that targets shared heterozygous segments, enabling more direct control over haplotype diversity, either on its own or in combination with conventional coancestry-based and genomic relationship matrices.

Results

We present PyMSQ, an open-source Python package that implements two key developments: (1) a matrix-based approach for computing MSV and MSC in single-trait, multi-trait, and zygotic contexts, and (2) a haplotype-based similarity metric. By combining this matrix-based framework with optimized scientific libraries, PyMSQ computes these quantities up to 332-fold faster than gamevar, a publicly available alternative, while preserving numerical accuracy. Using a Holstein-Friesian dataset, we demonstrate PyMSQ’s effectiveness in deriving MSV and MSC. We also demonstrate its novel similarity measure, which complements standard genomic relationship matrices by explicitly quantifying shared heterozygous segments rather than overall allele sharing, and so provides additional insight for balancing immediate gains against long-term diversity.
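To make the matrix-based idea concrete, the following is a minimal NumPy sketch of a gametic Mendelian sampling (co)variance written as a quadratic form. It is not PyMSQ's API: the function names are hypothetical, and the sketch assumes a single chromosome, alleles coded 0/1 on phased haplotypes, marker positions in Morgans, and Haldane's mapping function, under which the covariance of the segregation indicators at two markers a distance d apart is exp(-2d)/4.

```python
import numpy as np


def segregation_covariance(pos_morgan):
    """Covariance matrix R of Mendelian sampling between markers.

    Assumes Haldane's mapping function, so two markers at distance d
    (in Morgans) have covariance exp(-2 * d) / 4.
    """
    pos = np.asarray(pos_morgan, dtype=float)
    dist = np.abs(pos[:, None] - pos[None, :])  # pairwise distances
    return 0.25 * np.exp(-2.0 * dist)


def signed_effects(hap1, hap2, effects):
    """Marker effects signed by phase; zero at homozygous markers."""
    diff = np.asarray(hap1, dtype=float) - np.asarray(hap2, dtype=float)
    return diff * np.asarray(effects, dtype=float)


def ms_covariance(hap1, hap2, effects_a, effects_b, R):
    """Gametic Mendelian sampling covariance between two traits.

    Passing the same effects for both traits gives the variance (MSV).
    """
    m_a = signed_effects(hap1, hap2, effects_a)
    m_b = signed_effects(hap1, hap2, effects_b)
    return float(m_a @ R @ m_b)


# Hypothetical toy data: two completely linked heterozygous markers.
R = segregation_covariance([0.0, 0.0])
trait1 = [2.0, 3.0]
trait2 = [1.0, 1.0]

# Coupling phase: both favourable alleles on the same haplotype.
print(ms_covariance([1, 1], [0, 0], trait1, trait1, R))  # 6.25
# Repulsion phase: the effects partly cancel within each gamete.
print(ms_covariance([1, 0], [0, 1], trait1, trait1, R))  # 0.25
# Covariance between the two traits in coupling phase.
print(ms_covariance([1, 1], [0, 0], trait1, trait2, R))  # 2.5
```

Because R depends only on the marker map, it can be built once and reused for every individual and trait, which is the kind of saving a matrix formulation offers over per-individual loops.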

Conclusion

By facilitating the practical use of MSV, MSC, and a haplotype-based similarity metric, PyMSQ enables breeders and quantitative geneticists to adopt haplotype diversity constraints, whether as a standalone criterion or in synergy with optimal contribution selection. This framework opens new possibilities for preserving key haplotypic segments, ultimately supporting more sustainable genomic selection strategies. PyMSQ is freely available under an MIT License at https://github.com/aromemusa/PyMSQ.
