A high-resolution atlas of cattle regulatory variants and their cross-species activity in matched human cells


Abstract

Identifying causal noncoding variants underlying complex traits in cattle remains challenging because high-resolution functional maps of regulatory variation are lacking. Here we combine massively parallel reporter assays with graph genomics to measure autonomous transcriptional activity from >1.5 billion DNA fragments spanning both cattle subspecies. In primary bovine cells, we assay >15 million variants and identify >150,000 expression-modulating variants enriched at cattle eQTL and GWAS loci. This enables the refinement of broad association signals to small sets of candidate functional regulatory variants. Our haplotype-aware framework captures rare, multi-allelic and tightly linked variants poorly resolved by conventional eQTL studies, and quantifies the disproportionate impact of larger variants on transcription. Furthermore, we use these data to train a deep-learning model that successfully predicts bovine promoter activity directly from sequence. Profiling the same cattle DNA in matched primary human cells reveals widespread conservation of promoter and enhancer activity, allelic effects and regulatory grammar, supporting the transfer of annotations and models across species. However, species-dependent effects are enriched in evolutionarily young sequences and p53-family motifs, highlighting the limits to simple cross-species extrapolation. Together, these data provide a high-resolution atlas of cattle regulatory variation and a framework for prioritising causal noncoding variants for cattle trait improvement.
