Gene-Pseudogene Inversions as a Hidden Source of Missing Heritability

Abstract

Historically defined as non-functional copies of coding genes, pseudogenes are abundant yet underexplored elements of the human genome, despite growing evidence linking them to human disease. In a genome-wide screen, we identified 411 gene/pseudogene pairs located in opposite orientation, including 46 genes already associated with human disease; this arrangement is permissive for inversions. Next, by analysing long-read sequencing (LRS) data from the 1000 Genomes Project, we confirmed that at least 3.6% of healthy individuals carry an inversion involving one of these gene/pseudogene pairs; these inversions had previously gone undetected by short-read sequencing. Most importantly, we identified novel and recurrent inversions between SORD and its pseudogene SORD2P in 13 out of 151 patients (9%) affected by SORD-related Charcot–Marie–Tooth (CMT) neuropathy, including 6 out of 8 (75%) SORD-CMT cases in which only one pathogenic variant had been identified by short-read sequencing. This makes the SORD/SORD2P inversion the third most common pathogenic allele causing SORD-CMT. Notably, gene/pseudogene pairs showing chromatin contact in Micro-C data, including SORD/SORD2P, were more likely to undergo inversion. Overall, our results highlight gene/pseudogene inversions as a previously underrecognized type of pathogenic structural variant. Wider use of LRS could reveal their true prevalence and their contribution to the missing heritability in Mendelian diseases.
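As an illustration only, the orientation criterion behind the screen can be sketched in a few lines of Python. This is not the authors' pipeline: the abstract does not give the annotation source, distance limits, or similarity thresholds, so the sketch assumes only that a pair qualifies when the pseudogene and its parent gene lie on the same chromosome on opposite strands. The `Feature` record, the `parent_of` mapping, and all names and coordinates below are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Minimal annotation record (hypothetical; not a real annotation format)."""
    name: str
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def opposite_orientation_pairs(genes, pseudogenes, parent_of):
    """Return (gene, pseudogene) name pairs that share a chromosome
    but sit on opposite strands.

    Assumption: same chromosome + opposite strand is the only criterion.
    A real screen would likely add distance and sequence-identity filters.
    """
    genes_by_name = {g.name: g for g in genes}
    pairs = []
    for pseudo in pseudogenes:
        # Look up the parent gene; skip pseudogenes with no known parent.
        parent = genes_by_name.get(parent_of.get(pseudo.name))
        if parent is None:
            continue
        if parent.chrom == pseudo.chrom and parent.strand != pseudo.strand:
            pairs.append((parent.name, pseudo.name))
    return pairs


# Toy, made-up annotations purely to exercise the function.
genes = [
    Feature("GENE_A", "chr1", 1000, 5000, "+"),
    Feature("GENE_B", "chr2", 1000, 4000, "-"),
]
pseudogenes = [
    Feature("GENE_AP1", "chr1", 90000, 93000, "-"),  # same chrom, opposite strand
    Feature("GENE_BP1", "chr2", 50000, 52000, "-"),  # same chrom, same strand
    Feature("GENE_AP2", "chr3", 100, 900, "-"),      # different chromosome
]
parent_of = {"GENE_AP1": "GENE_A", "GENE_BP1": "GENE_B", "GENE_AP2": "GENE_A"}

candidate_pairs = opposite_orientation_pairs(genes, pseudogenes, parent_of)
```

In this toy input only the `GENE_A`/`GENE_AP1` pair satisfies the assumed criterion; the other two pseudogenes are excluded for sharing their parent's strand or lying on a different chromosome.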
