D4Z4End2End: complete genetic and epigenetic architecture of D4Z4 macrosatellites in FSHD, BAMS and reference cohorts


Abstract

The D4Z4 locus is a macrosatellite array on chromosome 4q that normally comprises 8 to >100 3.3-kb repeat units. Its size and repetitiveness render it refractory to most sequencing technologies; consequently, its genetic and epigenetic architectures remain incompletely understood despite their relevance to human health, in particular facioscapulohumeral muscular dystrophy (FSHD). Molecular diagnosis of FSHD following clinical description currently involves complex, multi-step and low-resolution assays that aim to identify contractions on permissive haplotypes (FSHD type 1) or epigenetic reactivation (FSHD type 2) due to pathogenic variants in the epigenetic machinery (most often in SMCHD1). Here we leverage ultra-long whole-genome and Cas9-targeted sequencing to develop a fast and accurate workflow, D4Z4End2End, to comprehensively characterise the genetics and methylation of D4Z4 alleles. We apply it to samples from patients affected by FSHD1, FSHD2, and another disease caused by SMCHD1 variants, Bosma arhinia microphthalmia syndrome (BAMS), as well as publicly available data from the 1000 Genomes Project and Human Pangenome Reference Consortium. We attain high-read-depth sequencing of full-length D4Z4 arrays of up to 40 repeat units (~132 kb), accurately capture contracted arrays, genetic mosaicism, and pathogenic SMCHD1 variants, and generate accurate consensus sequences of the full set of D4Z4 alleles for variant analysis. Moreover, we identify new allelic variants, analyse complex D4Z4 rearrangements including in-cis duplications, and reveal striking length- and SMCHD1 status-dependent methylation patterns across the D4Z4 array. Our findings provide new insights into human macrosatellite genetics and epigenetics, and demonstrate the potential of long-read nanopore sequencing to accelerate FSHD research and diagnostics.
