Long-range PCR and Nanopore sequencing for localisation and phasing variants: an end-to-end clinical application workflow

Abstract

Background: Next-generation short-read sequencing has limited diagnostic utility for phasing distantly separated variants and for analysing genomic regions with high homology. Determining the phase of variants from parental chromosomes is critical for accurate identification of compound heterozygosity. Long-read sequencing technology can overcome these limitations by analysing long haplotypes that span separated variants. This study developed and validated a robust, end-to-end workflow for phasing and localising variants using long-range PCR (LR-PCR) and targeted Nanopore sequencing for clinical implementation.

Methods: NA24385 (HG002) reference DNA was used for all tests. Four PCR kits were tested to optimise LR-PCR for targets between 1 and 20 kb. Amplicons were barcoded and sequenced on Flongle flow cells, with up to eight amplicons on each flow cell. An in-house bioinformatic pipeline was developed to analyse the amplicons. The pipeline detects chimeric reads (a known PCR artefact) and incorporates Clair3 for variant calling and WhatsHap and HapCUT2 for phasing.

Results: The UltraRun LongRange PCR Kit achieved a 90% success rate for DNA amplification up to 22 kb. All 15 tested heterozygous single nucleotide variant (SNV) pairs, and 10 small InDels, with inter-variant distances from 5.8 to 21.4 kb, were phased with 100% concordance with the known phase. Furthermore, SNV calling within six low-mappability genes demonstrated precision and sensitivity of 100% against benchmark data. The median proportion of chimeric reads was 2.80% (range 1.79–16.12%) under optimised conditions.

Conclusions: This study establishes a reliable and affordable clinical diagnostic workflow for accurate phasing of variants separated by up to ~20 kb and for variant localisation in genomic regions that cannot be sequenced by short-read sequencing. This integrated approach enables implementation in diagnostic settings to resolve complex genetic findings and improve variant interpretation.
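
The idea behind read-based phasing of a variant pair can be illustrated with a minimal Python sketch. It is not the authors' pipeline, which relies on WhatsHap and HapCUT2; the function name `phase_pair`, the input format (one allele pair per read spanning both sites), and the `min_fraction` threshold are all hypothetical choices made for illustration. Reads carrying both reference or both alternate alleles support a cis arrangement, reads carrying one of each support trans, and a small minority of discordant reads is what a PCR chimera would be expected to produce.

```python
from collections import Counter


def phase_pair(reads, het_a, het_b, min_fraction=0.9):
    """Classify two heterozygous variants as 'cis', 'trans' or 'unresolved'.

    reads: iterable of (allele_at_site_a, allele_at_site_b), one tuple per
           read that spans both sites (hypothetical input format).
    het_a, het_b: (ref_allele, alt_allele) for each site.
    min_fraction: assumed threshold; the majority arrangement must reach
                  this share of informative reads to be called.
    """
    counts = Counter()
    for a, b in reads:
        # Skip reads showing an allele that is neither ref nor alt
        # (for example, a sequencing error at one of the sites).
        if a not in het_a or b not in het_b:
            continue
        a_is_alt = a == het_a[1]
        b_is_alt = b == het_b[1]
        # ref+ref or alt+alt on one read means the alt alleles share a
        # haplotype (cis); ref+alt or alt+ref means they do not (trans).
        counts["cis" if a_is_alt == b_is_alt else "trans"] += 1

    total = counts["cis"] + counts["trans"]
    if total == 0:
        return "unresolved", 0, 0
    best = max(("cis", "trans"), key=lambda k: counts[k])
    call = best if counts[best] / total >= min_fraction else "unresolved"
    return call, counts["cis"], counts["trans"]


# Toy data: 18 reads support cis; 1 discordant read mimics a chimera.
toy_reads = [("A", "T")] * 9 + [("G", "C")] * 9 + [("A", "C")]
result = phase_pair(toy_reads, het_a=("A", "G"), het_b=("T", "C"))
print(result)
```

In this toy input the discordant read is outvoted by the reads supporting cis, which is the intuition for why keeping the chimeric fraction low matters for confident phasing.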
