Characterization of full-length CNBP expanded alleles in myotonic dystrophy type 2 patients by Cas9-mediated enrichment and nanopore sequencing

Curation statements for this article:
  • Curated by eLife

    Evaluation Summary:

    To precisely diagnose DM2 caused by CCTG repeat expansion in CNBP, the authors established a Cas9-mediated target enrichment system followed by Nanopore sequencing and analysis. The authors are fully aware of the limitations of the current diagnostic tests for DM2 and clearly present the novel findings revealed by Cas9 nanopore sequencing. The findings of the current study suggest that Cas9 nanopore sequencing can be very useful for accurate genetic diagnosis of DM2 and for understanding the genotype-phenotype correlation of this disease.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #2 agreed to share their names with the authors.)

This article has been Reviewed by the following groups

Abstract

Myotonic dystrophy type 2 (DM2) is caused by CCTG repeat expansions in the CNBP gene, comprising 75 to >11,000 units and featuring extensive mosaicism, making it challenging to sequence fully expanded alleles. To overcome these limitations, we used PCR-free Cas9-mediated nanopore sequencing to characterize CNBP repeat expansions at the single-nucleotide level in nine DM2 patients. The length of normal and expanded alleles can be assessed precisely using this strategy, which agrees with traditional methods and reveals the degree of mosaicism. We also sequenced an entire ~50 kbp expansion, which has not been achieved previously for DM2 or any other repeat-expansion disorder. Our approach precisely counted the repeats and identified the repeat pattern for both short interrupted and uninterrupted alleles. Interestingly, in the expanded alleles, only two DM2 samples featured the expected pure CCTG repeat pattern, while the other seven also presented TCTG blocks at the 3′ end, which have not been reported before in DM2 patients but were confirmed here with orthogonal methods. The demonstrated approach simultaneously determines repeat length, structure/motif, and the extent of somatic mosaicism, promising to improve the molecular diagnosis of DM2 and achieve more accurate genotype–phenotype correlations for the better stratification of DM2 patients in clinical trials.
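
To make the idea of "repeat length and structure/motif" concrete, the following is a minimal Python sketch of how a repeat tract could be broken down into consecutive motif blocks such as (TG), (TCTG) and (CCTG). It is not the authors' pipeline: the function name `decompose_repeat`, the greedy longest-run scan, and the toy sequence are all assumptions made for illustration, and the sketch assumes an error-free consensus sequence, whereas raw nanopore reads would need error-tolerant handling.

```python
from typing import List, Sequence, Tuple

# Motifs mentioned in the text. The greedy scan below is an illustrative
# assumption, not the method used in the study.
MOTIFS = ("CCTG", "TCTG", "TG")


def decompose_repeat(seq: str, motifs: Sequence[str] = MOTIFS) -> List[Tuple[str, int]]:
    """Split a repeat tract into (motif, copy_number) blocks, left to right.

    At each position the motif giving the longest uninterrupted run is taken.
    A base that matches no motif is reported as a one-base block, so
    interruptions stay visible in the output.
    """
    seq = seq.upper()
    blocks: List[Tuple[str, int]] = []
    i = 0
    while i < len(seq):
        best_motif, best_copies, best_span = None, 0, 0
        for motif in motifs:
            copies = 0
            # Count back-to-back copies of this motif starting at position i.
            while seq.startswith(motif, i + copies * len(motif)):
                copies += 1
            span = copies * len(motif)
            if span > best_span:
                best_motif, best_copies, best_span = motif, copies, span
        if best_motif is None:
            # No motif matches here: record the single base as an interruption.
            best_motif, best_copies, best_span = seq[i], 1, 1
        if blocks and blocks[-1][0] == best_motif:
            # Merge with the previous block if it has the same unit.
            blocks[-1] = (best_motif, blocks[-1][1] + best_copies)
        else:
            blocks.append((best_motif, best_copies))
        i += best_span
    return blocks


if __name__ == "__main__":
    # Toy allele (hypothetical, far shorter than a real expansion).
    toy = "TG" * 3 + "TCTG" * 2 + "CCTG" * 4 + "TCTG" * 2
    print(decompose_repeat(toy))
    # [('TG', 3), ('TCTG', 2), ('CCTG', 4), ('TCTG', 2)]
```

In this toy case the trailing `('TCTG', 2)` block corresponds to the kind of 3′ TCTG block described above, while a pure allele would yield a single CCTG block.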

Article activity feed

  1. Author Response

    Reviewer #1 (Public Review):

    The authors of this study adopted Cas9-mediated enrichment of the target locus and Nanopore long-read sequencing to accurately count repeat numbers in the CNBP gene, which has been notoriously difficult to call precisely. They also compared their results with those of the conventional approach, validating their approach. It is an interesting read and shows a pathway that a clinic can take in the near future.

    However, this paper's novel contributions need to be emphasised as there are some papers that utilized Nanopore sequencing to elucidate short repeats (https://pubmed.ncbi.nlm.nih.gov/35245110/; https://bmcmedgenomics.biomedcentral.com/articles/10.1186/s12920-020-00853-3).

    The reviewer is correct that ONT sequencing had already been used to analyse the microsatellite within the CNBP gene (Stevanovski et al. 2021; Mitsuhashi et al. 2021); however, this was confined to CNBP alleles in the normal range. Moreover, the approaches used have some critical drawbacks. The work of Mitsuhashi and colleagues exploited ONT whole-genome sequencing, which is not applicable in routine practice because of its very high cost. The group of Stevanovski used the recently introduced “Read Until” feature of ONT sequencing to analyse microsatellites in 37 disease-associated loci. This feature allows selective sequencing of pre-defined target DNA molecules, thus enabling targeted sequencing with advantages similar to those of the Cas9-mediated sequencing presented here. However, enrichment levels achieved by “Read Until” (5x) are consistently lower than those obtained with the Cas9 approach (500x), due to higher background. This may be an important issue for extremely long CNBP alleles, which can be at a disadvantage during sequencing compared with shorter contaminating fragments (Shruti V Iyer, BioRxiv 2022).

    These aspects, which underlie the advantages of the Cas9-mediated sequencing presented here, have now been reported in the “Discussion” section (lines 337-348).

    Another issue is the clinical utility of the approach. Although it is precise, it is not totally clear whether this accuracy is required in clinical practice, as the repeat status does not completely correlate with phenotypic severity.

    The genotype-phenotype issue in DM2 is still an open question and relies on a single study by Day et al. (2003; PMID:12601109), in which Southern blot analysis was used to determine the length of the DM2 mutation. Because of the extremely large size of the CCTG expansions and the somatic instability of the repeat, Southern blot fails to detect the DM2 mutation in about 20% of known carriers, whose expansion length remains undeterminable. Moreover, detectable expanded alleles can appear as single discrete bands, multiple bands, or smears, with no indication of the degree of mosaicism. The absence of a precise genotype-phenotype correlation may thus be largely due to the technical difficulties in analysing such expansions in detail. Although the clinical utility of the presented approach may therefore not be immediate because of this lack of knowledge, we believe that the use of long-read sequencing in large cohorts of DM2 patients could definitively clarify whether the length, composition, and degree of mosaicism of the DM2 mutation are associated with the severity of the DM2 clinical phenotype and/or with the age at disease onset.

    Considerations related to the clinical utility of the approach have now been included in the “Discussion” section (lines 294-300 and lines 420-428).

    Lastly, the familial cases (A1-A4) are not clearly described. What are their relationships, and why are their copy numbers not exactly the same? Is it because of extreme recombination and variation even within a family, or does it just reflect limited accuracy?

    Cases A1-A4 derive from a large consanguineous DM2 family, whose pedigree is now reported in Figure S1. The extreme variability in the (CCTG) and (TCTG) copy numbers within the family is typical of DM2 patients, as reported in Day et al., 2003. A tendency towards contraction rather than expansion of the CCTG array can also be observed in this family, in agreement with literature data. The meiotic instability of the (CCTG)n and (TCTG)n distal tract is probably due to unequal recombination events and errors during DNA replication/repair of this highly repetitive region, which give rise to somatic and germline mosaicism. The variability in the number of (TG)v, by contrast, likely reflects the limited accuracy of the method, as discussed for the healthy alleles (Table 2). The 5′ (TG)v and (TCTG)w arrays are indeed expected to be polymorphic in the general population but stable within the same individual and across meiotic transmissions. Consistent with this, we now show in Figure S4 that all family members share an equivalent pattern of TG repeats. Such small inconsistencies probably reflect ONT sequencing errors and could be addressed by using the most recent base-calling algorithm and possibly the more accurate Q20+ chemistry. In line with the Reviewer’s observations, all these aspects are now discussed in more depth in the manuscript, with the support of the additional Figures S1 and S4 (see Results lines 215-219 and Discussion lines 364-368).

    The study lacks a validation cohort of prospective patients.

    The reviewer is correct: this is a pilot study on a limited number of DM2 patients. We are aware that validation in a larger cohort of DM2 patients would be desirable to further confirm our results. This limitation of the study has been clearly indicated in the “Discussion” section (lines 385-392). Unfortunately, the majority of available DNA samples derive from retrospective analyses, and the DNA quantity/quality was not always sufficient for ONT sequencing. We are planning to collect at least 30 new DNA samples from prospective DM2 cases, either sporadic or familial. However, the limited number of DM2 patients referred to our centre (about 1-2 patients/month) and the low incidence of DM2 in the Italian population (Vanacore et al., 2016) make this collection and validation unfeasible in the short term.

  2. Evaluation Summary:

    To precisely diagnose DM2 caused by CCTG repeat expansion in CNBP, the authors established a Cas9-mediated target enrichment system followed by Nanopore sequencing and analysis. The authors are fully aware of the limitations of the current diagnostic tests for DM2 and clearly present the novel findings revealed by Cas9 nanopore sequencing. The findings of the current study suggest that Cas9 nanopore sequencing can be very useful for accurate genetic diagnosis of DM2 and for understanding the genotype-phenotype correlation of this disease.

    (This preprint has been reviewed by eLife. We include the public reviews from the reviewers here; the authors also receive private feedback with suggested changes to the manuscript. Reviewer #1 and Reviewer #2 agreed to share their names with the authors.)

  3. Reviewer #1 (Public Review):

    The authors of this study adopted Cas9-mediated enrichment of the target locus and Nanopore long-read sequencing to accurately count repeat numbers in the CNBP gene, which has been notoriously difficult to call precisely. They also compared their results with those of the conventional approach, validating their approach. It is an interesting read and shows a pathway that a clinic can take in the near future.

    However, this paper's novel contributions need to be emphasised as there are some papers that utilized Nanopore sequencing to elucidate short repeats (https://pubmed.ncbi.nlm.nih.gov/35245110/; https://bmcmedgenomics.biomedcentral.com/articles/10.1186/s12920-020-00853-3). Another issue is the clinical utility of the approach. Although it is precise, it is not totally clear whether this accuracy is required in clinical practice, as the repeat status does not completely correlate with phenotypic severity.

    Lastly, the familial cases (A1-A4) are not clearly described. What are their relationships, and why are their copy numbers not exactly the same? Is it because of extreme recombination and variation even within a family, or does it just reflect limited accuracy?

    The study lacks a validation cohort of prospective patients.

  4. Reviewer #2 (Public Review):

    This is an interesting study in which the authors conducted Cas9-mediated enrichment and nanopore sequencing to characterize the full-length CNBP expanded alleles in nine myotonic dystrophy type 2 (DM2) patients. The authors are fully aware of the limitations of the current diagnostic tests for DM2 and clearly present the novel findings revealed by Cas9 nanopore sequencing.

    Cas9 nanopore sequencing enabled sequencing of the expanded alleles of CNBP up to nearly 47 kb at single-nucleotide resolution. Detailed patterns of the expanded alleles (a pure CCTG pattern vs. a combination with TCTG blocks at the 3′ end) and the extent of somatic mosaicism could also be characterized by Cas9 nanopore sequencing. These intriguing findings suggest that Cas9 nanopore sequencing can be very useful for accurate genetic diagnosis of DM2 and for understanding the genotype-phenotype correlation of this disease.