Evaluation of computational genotyping of structural variation for clinical diagnoses
Abstract
Background
Structural variation (SV) plays a pivotal role in genetic disease. The discovery of SVs from the short reads produced by next-generation DNA sequencing methods is error-prone, with low sensitivity and high false discovery rates. These shortcomings can be partially overcome with extensive orthogonal validation methods or the use of long reads, but the current cost precludes their application for routine clinical diagnostics. In contrast, SV genotyping of known sites of SV occurrence is relatively robust and therefore offers a cost-effective clinical diagnostic tool with potentially few false-positive and false-negative results, even when applied to short-read DNA sequence data.
Results
We assess 5 state-of-the-art SV genotyping software methods, applied to short-read sequence data. The methods are characterized on the basis of their ability to genotype different SV types, spanning different size ranges. Furthermore, we analyze their ability to parse different VCF file subformats and assess their reliance on specific metadata. We compare the SV genotyping methods across a range of simulated and real data, including SVs that were not found with Illumina data alone. We assess sensitivity and the ability to filter initial false discovery calls, and we determine the impact of SV type and size on the performance of each SV genotyper. Overall, STIX performed the best on both simulated and GiaB-based SV calls, demonstrating a good balance between sensitivity and specificity.
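For readers unfamiliar with the two measures named above, the following is a minimal sketch of how sensitivity and false discovery rate are conventionally computed from counts of genotyped SVs. The function name and the example counts are hypothetical and are not taken from the study; the sketch illustrates the standard definitions only, not the authors' evaluation pipeline.

```python
def genotyping_metrics(true_positives, false_positives, false_negatives):
    """Return (sensitivity, false discovery rate) from SV genotyping counts.

    Hypothetical helper for illustration; the counts are assumed to come
    from comparing genotyper output against a truth set.
    """
    truth_total = true_positives + false_negatives   # SVs present in the truth set
    called_total = true_positives + false_positives  # SVs reported as present

    # Sensitivity: fraction of true SVs that the genotyper recovers.
    sensitivity = true_positives / truth_total if truth_total else 0.0
    # False discovery rate: fraction of reported SVs that are not real.
    fdr = false_positives / called_total if called_total else 0.0
    return sensitivity, fdr


# Illustrative counts only (not results from the paper).
sensitivity, fdr = genotyping_metrics(true_positives=90,
                                      false_positives=10,
                                      false_negatives=10)
print(f"sensitivity={sensitivity:.2f}, FDR={fdr:.2f}")
```

A genotyper that confirms most true SVs while rejecting false initial calls would show high sensitivity together with a low false discovery rate under these definitions.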
Conclusion
Our results indicate that, although SV genotyping software methods have superior performance to SV callers, there are limitations that suggest the need for further innovation.
Now published in GigaScience doi: 10.1093/gigascience/giz110
Varuna Chander (1), Richard A. Gibbs (1), Fritz J. Sedlazeck (1)
(1) Human Genome Sequencing Center, Baylor College of Medicine, One Baylor Plaza, Houston TX 77030
For correspondence: fritz.sedlazeck@bcm.edu
A version of this preprint has been published in the Open Access journal GigaScience (see paper https://doi.org/10.1093/gigascience/giz110 ), where the paper and peer reviews are published openly under a CC-BY 4.0 license.
These peer reviews were as follows:
Reviewer 1: http://dx.doi.org/10.5524/REVIEW.101890
Reviewer 2: http://dx.doi.org/10.5524/REVIEW.101891