Estimating haplotype values and mutation effects in the context of a local DNA tree

Abstract

Background

Single nucleotide polymorphism (SNP) arrays provide genome-wide coverage of polymorphic sites across populations, but most quantitative genetics models assume that the effects of these mutations remain constant across generations and populations. This assumption overlooks population dynamics and genetic architecture that are central to trait expression, particularly when inferences are made across populations. Whole-genome sequence (WGS) data capture all variants and should, in principle, overcome these limitations, but their use has delivered only modest gains in prediction accuracy at considerable computational cost. Ancestral recombination graphs (ARGs) offer a representation of genomic variation that describes how it is shaped by haplotype inheritance between generations (represented as local DNA trees) and by the associated mutations. This study investigates how a generative model on a local DNA tree can improve the estimation of mutation and haplotype effects, especially for rare or population-specific variants.

Methods

We developed a TBLUP approach that uses local DNA tree information to estimate haplotype and mutation effects. In each local DNA tree, branches connecting ancestral and descendant haplotypes represent DNA inheritance, and the trait effect associated with a branch is that of the mutation(s) it carries. Summing these branch-specific mutation effects from the tree root to each haplotype defines the haplotype values. This recursive structure yields a sparse and computationally efficient approach for estimating haplotype and mutation effects from the local DNA tree and phenotypes. We show how the TBLUP approach relates to and differs from the SNP-BLUP and GBLUP approaches, and demonstrate it with cattle mitochondrial DNA, a non-recombining genomic region, using both simulated and real data.
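The root-to-haplotype summation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy tree, the branch effects, and the function name `haplotype_values` are all hypothetical. It covers only the recursion that defines haplotype values from known branch effects, not their estimation from phenotypes.

```python
# Hypothetical toy tree: parent[i] is the index of node i's parent haplotype.
# The root has parent None, and every parent is listed before its children.
parent = [None, 0, 0, 1, 1, 2]

# Hypothetical branch effects: branch_effect[i] is the summed effect of the
# mutation(s) carried by the branch leading into node i (0.0 for the root).
branch_effect = [0.0, 0.5, -0.25, 1.0, 0.0, 0.75]


def haplotype_values(parent, branch_effect):
    """Return each node's value as the sum of branch effects from the root."""
    values = [0.0] * len(parent)
    for node, par in enumerate(parent):
        if par is None:
            # The root has no ancestor to inherit a value from.
            values[node] = branch_effect[node]
        else:
            # Recursion: a haplotype's value is its parent's value plus
            # the effect of the mutation(s) on the connecting branch.
            values[node] = values[par] + branch_effect[node]
    return values


values = haplotype_values(parent, branch_effect)
print(values)
```

Each haplotype depends only on its parent and its own incoming branch, so a single pass over the nodes suffices. This is one way to see why the recursive structure is sparse.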

Results and conclusions

The TBLUP approach was computationally more efficient than the SNP-BLUP and GBLUP approaches and produced more accurate estimates of haplotype values and mutation effects, which can vary between haplotypes. The accuracy increased when phenotypes were available for haplotypes across the local DNA tree rather than only for the recent haplotypes. Incorporating local DNA tree information enhances the use of genomic data in quantitative genetics. Extending the TBLUP approach to full ARGs will enable analysis across multiple local DNA trees (accounting for and leveraging recombination), which will further improve quantitative genetic modelling and practical applications.
