Long-read transcriptome analysis using IsoRanker for identifying pathogenic variants in Mendelian conditions
Abstract
Identifying pathogenic non-coding variants that contribute to Mendelian conditions remains challenging because the functional impact of these variants on gene function is often unknown. We present IsoRanker, a long-read transcriptome sequencing-based framework that prioritizes functionally relevant non-coding variants by detecting genes and novel isoforms with outlier expression, allelic imbalance, and/or nonsense-mediated decay (NMD). We generated paired cycloheximide-treated and untreated fibroblast transcriptomes from 31 individuals (3 with known transcript-altering rare variants and 28 with unsolved conditions) and linked transcripts to phased long-read genomes. IsoRanker recovered known transcript alterations in this cohort and remained robust when the data were subsampled to cohorts of 11 individuals and approximately 5 million full-length transcripts per individual. However, performance depended on the choice of de novo isoform caller, particularly for NMD-sensitive and novel isoforms. Among the 28 previously unsolved cases, IsoRanker deprioritized most fibroblast-expressed candidate splice site variants while nominating new leads. In one individual, IsoRanker prioritized HARS1, revealing biallelic non-coding variants that together produced a partial HARS1 loss of function and informed targeted therapy with histidine supplementation. These findings establish long-read, NMD-aware transcriptomics with IsoRanker as an effective approach for generating isoform-level functional evidence, improving the classification of non-coding variants, and supporting the diagnosis of individuals with rare diseases.
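To make the idea of NMD-aware outlier ranking concrete, the following is a minimal sketch, not IsoRanker's actual implementation or API. It assumes a simple scoring scheme that the abstract does not specify: for one isoform, the log2 ratio of read counts in the cycloheximide-treated versus untreated sample is taken as an NMD-sensitivity score, and each individual is ranked by a leave-one-out z-score against the rest of the cohort. The function names, the pseudocount, and the read counts are all hypothetical.

```python
import math
from statistics import mean, stdev


def chx_log2_ratio(untreated, treated, pseudocount=1.0):
    """Log2 ratio of treated to untreated read counts for one isoform.

    Assumption: counts are already normalized for sequencing depth.
    A large positive value suggests the isoform is stabilized when NMD
    is inhibited by cycloheximide.
    """
    return math.log2((treated + pseudocount) / (untreated + pseudocount))


def rank_outliers(scores):
    """Rank individuals by leave-one-out z-score, highest first.

    scores: dict mapping individual ID -> score (needs at least 3 entries,
    so that the standard deviation of the other individuals is defined).
    """
    ranked = []
    for individual, score in scores.items():
        others = [v for k, v in scores.items() if k != individual]
        mu = mean(others)
        sd = stdev(others)
        z = (score - mu) / sd if sd > 0 else 0.0
        ranked.append((individual, z))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical (untreated, treated) read counts for one isoform.
counts = {
    "P1": (5, 60),   # isoform rises sharply under cycloheximide
    "P2": (10, 12),
    "P3": (8, 9),
    "P4": (12, 11),
    "P5": (9, 10),
}

scores = {ind: chx_log2_ratio(u, t) for ind, (u, t) in counts.items()}
ranked = rank_outliers(scores)

for individual, z in ranked:
    print(f"{individual}\t{z:.2f}")
```

In this toy cohort, P1 ranks first because its isoform is far more abundant after cycloheximide treatment than in any other individual. A real analysis would also need depth normalization, allelic information from the phased genomes, and multiple-testing control, none of which are modeled here.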