LINE-1 Depletion at Promoters of Neurodevelopmental Disorder Genes: A Genome-Wide Analysis
Abstract
LINE-1 retrotransposons constitute approximately 17% of the human genome and can influence gene expression when inserted near regulatory regions. Neurodevelopmental disorder (NDD) genes require precise spatiotemporal regulation during brain development; however, the relationship between LINE-1 occupancy and the promoter architecture of NDD-associated loci has not been systematically examined. Here, a genome-wide computational analysis compared LINE-1 element density in promoter regions (±2 kb from the transcription start site) between five NDD gene sets and a curated housekeeping gene set (n=1,982) derived from the HRT Atlas. The five NDD sets comprised genes annotated to autism spectrum disorder (SFARI Gene database), seizure (HP:0001250), attention-deficit/hyperactivity disorder (ADHD; HP:0007018), their phenotypic intersection, and syndromic NDD genes. All NDD gene sets exhibited significantly lower LINE-1 promoter occupancy than housekeeping genes (Mann-Whitney U test, p < 0.05 across all tested NDD subsets). A consistent gradient was observed, with genes annotated to both seizure and ADHD phenotypes showing the lowest LINE-1 occupancy (23.5% vs 31.1% in housekeeping genes; p = 0.0029, rank-biserial r = 0.082). The observed depletion was further supported by a length-matched control analysis (n=678 pairs, p = 0.036, r = 0.066), suggesting that the signal is not fully explained by intronic size differences. Furthermore, no significant differences in GC content (p = 0.3289) or CpG observed/expected ratios (p = 0.9665) were detected between NDD Tier 1 and housekeeping gene promoters, indicating that the results are not attributable to sequence composition biases. These findings are consistent with stronger selective constraint against LINE-1 insertions at NDD promoters, potentially reflecting the intolerance of these loci to transcriptional dysregulation.
The pronounced depletion in genes annotated to both seizure and ADHD phenotypes provides a genomic context for understanding the regulatory vulnerability of pleiotropic neurodevelopmental loci, with implications for interpreting non-coding variation in clinical genomics.
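For readers unfamiliar with the statistics reported above, the following is a minimal sketch of a two-sided Mann-Whitney U test with a rank-biserial effect size, written with the Python standard library only. It is not the authors' pipeline: the function name `mann_whitney_rank_biserial`, the normal approximation without continuity correction, the sign convention for r, and the toy 0/1 occupancy vectors are all assumptions made for illustration.

```python
import math
from collections import Counter


def mann_whitney_rank_biserial(x, y):
    """Return (U_x, rank-biserial r, two-sided p) for samples x and y.

    Assumptions of this sketch: the p-value uses the tie-corrected normal
    approximation without continuity correction, and r is signed so that
    r > 0 means values in x tend to exceed values in y.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")

    # Assign midranks to the pooled sample so tied values share one rank.
    pooled = sorted(list(x) + list(y))
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    # U for x: number of (x, y) pairs with x > y, counting ties as one half.
    rank_sum_x = sum(rank_of[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    # Rank-biserial correlation: (U_x - U_y) / (n1 * n2).
    r = 2 * u_x / (n1 * n2) - 1

    # Tie-corrected variance of U under the null hypothesis.
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every observation identical: no evidence of a shift
        return u_x, r, 1.0

    z = (u_x - n1 * n2 / 2) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u_x, r, p


if __name__ == "__main__":
    # Hypothetical toy data: 1 = promoter overlaps a LINE-1 element, 0 = it does not.
    ndd_promoters = [0, 0, 0, 1, 0, 0, 1, 0]
    housekeeping_promoters = [1, 0, 1, 1, 0, 1, 0, 1]
    u, r, p = mann_whitney_rank_biserial(ndd_promoters, housekeeping_promoters)
    print(f"U = {u:.1f}, rank-biserial r = {r:.3f}, p = {p:.4f}")
```

With binary occupancy indicators almost all observations are tied, which is why the tie correction matters here. Under this sign convention a negative r would indicate lower occupancy in the first sample; the abstract reports positive values and does not state its convention.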