Multi-omic re-analysis increases diagnostic yield in individuals with Cornelia de Lange syndrome
Abstract
Background
Approximately half of the individuals with a clinically diagnosed Mendelian condition do not receive a molecular diagnosis. Current standard-of-care diagnostic pipelines, which focus largely on exonic sequence variants, may not be comprehensive enough to identify all pathogenic variants. A comprehensive analytical approach capable of identifying noncoding and structural variants is needed to bridge the diagnostic gap. Cornelia de Lange syndrome (CdLS) is a multisystem developmental diagnosis caused primarily by pathogenic variants in one of six known CdLS genes (NIPBL, SMC3, SMC1A, HDAC8, RAD21, BRD4), although pathogenic variants in additional phenocopy genes have also been implicated. We hypothesized that individuals with a clinical diagnosis of CdLS and no molecular diagnosis harbor pathogenic, causative variants that are not identified or prioritized by current standard-of-care, exome-focused workflows.

Methods
We re-analyzed the genome sequencing data from a previously published cohort of 173 individuals with a clinical diagnosis of CdLS (Gabriella Miller Kids First cohort) and expanded the scope of analysis to include noncoding and structural variants. We used RNA-sequencing data from a subset of individuals (n = 62) to complement the DNA workflow.

Results
Re-analysis, including copy-number and structural variation and using transcriptome sequencing as a complementary assay, revealed molecular etiologies in an additional 37 probands, increasing the total diagnostic yield in this previously undiagnosed cohort to 60%. The new diagnoses were enriched for variants beyond the standard exonic SNVs/indels, including cryptic noncoding variants (promoter, deep intronic, large insertions), copy-number variants, balanced rearrangements such as inversions, and variants in additional genes that phenocopy CdLS. Transcriptome-aided re-analysis helped uncover cryptic noncoding DNA variants that lacked sufficient computational evidence for a splicing abnormality and yet produced aberrantly spliced mRNA.

Conclusions
Our results underscore the need for whole-genome (and transcriptome) sequencing and a comprehensive, unbiased analytical protocol that integrates structural and noncoding variants to exhaustively mine a phenotypically and genetically heterogeneous cohort and maximize its diagnostic yield. The additional diagnostic yield obtained solely from noncoding and structural variants highlights the limitations of an exome-focused analysis workflow and demonstrates the utility of transcriptome analysis beyond the use of splicing prediction tools.