Using Deep Learning Models of Gene Regulation to Guide Drug Prioritization

Abstract

Drug repurposing offers a cost-effective strategy to accelerate therapeutic discovery, but most computational approaches fail to model noncoding genetic variation. Because over 90% of genome-wide association study (GWAS) risk variants reside in noncoding regions, linking regulatory variation to therapeutic hypotheses remains a major challenge. Here, we developed an integrative deep learning framework that links allele-specific enhancer prediction to transcription factor (TF)-centered gene expression changes and drug-induced transcriptional profiles to prioritize candidate therapeutics. Our cell type-specific deep learning enhancer models accurately distinguish active enhancers across seven cell lines. Using breast cancer as a proof-of-concept, we found that GWAS heritability is significantly enriched in MCF7 enhancers, supporting MCF7 as the cellular context for this disease. Allele-specific variant scoring identified breast cancer risk variants with strong allele-dependent effects, and attribution-based motif discovery revealed enrichment of FOXA1-associated motif features, consistent with FOXA1 upregulation in primary tumors. Integration of the FOXA1 knockdown-induced and drug-induced gene expression profiles identified 63 candidate compounds for treatment of breast cancer, including 18 approved drugs, with recovery of the known breast cancer therapy fulvestrant. Among prioritized compounds, 54% showed anti-correlated transcriptional effects across eight core breast cancer pathways, compared to 5.3% of non-prioritized compounds. Integration of drug-gene interaction data further refined these to eight compounds with supporting experimental or clinical evidence. 
Together, these results establish a regulatory variant-guided drug repurposing framework that connects noncoding genetic variation to therapeutic candidates and provides a generalizable strategy for translating the noncoding genome into pharmacologically relevant hypotheses.
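
To make the idea of allele-specific variant scoring concrete, the following is a minimal sketch, not the authors' implementation. It assumes an allele effect can be summarized as the difference between a model's predicted enhancer activity for the alternate and reference alleles placed in the same flanking sequence. The function names (`toy_enhancer_score`, `build_allele_sequences`, `allele_effect`), the toy motif, and the example sequences are all hypothetical; the motif-counting scorer is only a stand-in for a trained cell type-specific enhancer model.

```python
from typing import Callable, Tuple

# Hypothetical toy motif used only for illustration; a real pipeline would
# use a trained deep learning enhancer model instead of motif counting.
TOY_MOTIF = "TGTTTAC"


def toy_enhancer_score(seq: str) -> float:
    """Stand-in scorer (assumption): number of toy-motif occurrences in seq."""
    return float(seq.upper().count(TOY_MOTIF))


def build_allele_sequences(
    left_flank: str, ref: str, alt: str, right_flank: str
) -> Tuple[str, str]:
    """Place each allele in the same flanking context."""
    return left_flank + ref + right_flank, left_flank + alt + right_flank


def allele_effect(
    ref_seq: str, alt_seq: str, score_fn: Callable[[str], float]
) -> float:
    """Allele-specific effect as score(alt) minus score(ref)."""
    return score_fn(alt_seq) - score_fn(ref_seq)


if __name__ == "__main__":
    # Made-up variant: the reference T completes the toy motif, the
    # alternate C disrupts it.
    ref_seq, alt_seq = build_allele_sequences("ACGTTG", "T", "C", "TTACGA")
    delta = allele_effect(ref_seq, alt_seq, toy_enhancer_score)
    print(f"ref={ref_seq} alt={alt_seq} delta={delta}")
```

Under this toy scorer, a negative difference means the alternate allele lowers the predicted activity. With a real model, variants could be ranked by the magnitude of this difference, although the exact scoring and thresholds used in the study are not stated in the abstract.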
