Multicancer analyses of short tandem repeat variations reveal shared gene regulatory mechanisms
Abstract
Background
Short tandem repeats (STRs) have been reported to influence gene expression across various human tissues. While STR variations are enriched in colorectal (CRC), stomach (STAD) and endometrial (UCEC) cancers, particularly in tumors with microsatellite instability (MSI), their functional effects on gene expression and the underlying regulatory mechanisms remain poorly understood across these cancer types.
Results
Here, we leverage whole-exome sequencing and gene expression data to identify STRs for which repeat lengths are associated with the expression of nearby genes (eSTRs) in CRC, STAD and UCEC tumors. Our analyses reveal that tumor STR profiles effectively capture both MSI phenotype and population structure. While most eSTRs are cancer-specific, shared eSTRs across multiple cancers exhibit consistent effects on gene expression. Notably, coding-region eSTRs identified in all three cancer types show positive correlations with nearby gene expression. We further validate the functional effects of eSTRs by demonstrating associations between somatic eSTR mutations and gene expression changes during the transition from normal to tumor tissues, suggesting their potential roles in tumorigenesis. Combined with DNA methylation data, we perform the first quantitative analysis of the interplay between STR variations and DNA methylation in tumors. We identify eSTRs where repeat lengths are associated with methylation levels of nearby CpG sites (meSTRs) and show that over 70% of eSTRs are significantly linked to local DNA methylation. Importantly, the effects of meSTRs on DNA methylation remain consistent across cancer types.
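To make the eSTR idea concrete, the following is a minimal sketch of a per-STR association test between repeat length and expression of a nearby gene. The abstract does not specify the statistical model, so the standardized linear fit, the function names (`zscore`, `estr_effect`) and the toy data are all assumptions for illustration only. A real analysis would also adjust for covariates such as population structure and correct for multiple testing.

```python
from math import sqrt

def zscore(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        raise ValueError("no variation in values; association is undefined")
    return [(v - mean) / sd for v in values]

def estr_effect(repeat_lengths, expression):
    """Hypothetical helper: effect size of STR repeat length on expression.

    Both variables are standardized, so the ordinary least-squares slope
    equals the Pearson correlation and lies in [-1, 1].
    """
    if len(repeat_lengths) != len(expression):
        raise ValueError("one expression value is required per sample")
    x = zscore(repeat_lengths)
    y = zscore(expression)
    return sum(a * b for a, b in zip(x, y)) / len(x)

# Toy data (invented): mean repeat length per tumor sample and the
# normalized expression of a nearby gene in the same samples.
lengths = [10, 11, 12, 12, 13, 14]
expr = [1.0, 1.3, 1.9, 2.1, 2.4, 3.0]

beta = estr_effect(lengths, expr)
print(f"effect size: {beta:.3f}")
```

A positive effect size means that longer repeats go with higher expression, which is the direction the abstract reports for coding-region eSTRs shared by all three cancer types. The same scheme could be applied to methylation levels of nearby CpG sites in place of expression to sketch the meSTR test.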
Conclusions
Overall, our findings enhance the understanding of how functional STR variations influence gene expression and DNA methylation. Our study highlights shared regulatory mechanisms of STRs across multiple cancers, offering a foundation for future research into their broader implications in tumor biology.