A structure-guided approach to non-coding variant evaluation for transcription factor binding using AlphaFold 3


Abstract

Non-coding single-nucleotide variants (SNVs) that alter transcription factor (TF) binding can affect gene expression and contribute to disease. Sequence-based methods can excel at predicting TF binding, but they rely on training data and can exhibit TF-specific biases. Here we propose a structure-guided approach for non-coding SNVs, using AlphaFold 3 (AF3) to model TF-DNA complexes and FoldX for downstream physics-based assessment. Benchmarked against SNP-SELEX data for six TFs (SPIB, ELK3, ETV4, SF-1, PAX5 and MEIS2), the FoldX-based strategy showed good agreement with experimental allele preferences. Interestingly, differences between alleles in AF3’s interface predicted template modelling score (ΔipTM) aligned even more closely with SNP-SELEX results, generally surpassing energy-based metrics. Application to known disease-associated variants recapitulated most reported effects for TFs including NKX2-5, GATA3 and USF2A-USF1. In these examples, considering both ΔipTM and FoldX energies proved more reliable than either metric alone. While less accurate than state-of-the-art sequence-based methods, this work demonstrates that structural modelling can yield interpretable insights into how non-coding variants influence TF binding. By highlighting both the promise and limitations of AF3 in this context, our study provides a framework for complementary structural evaluation of regulatory variants.
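The idea of weighing ΔipTM together with FoldX energies can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the sign conventions (ΔipTM as alternate minus reference, a lower interaction energy meaning tighter binding), the tolerance values, and all names such as `AlleleScores` and `call_preference` are assumptions made for illustration. The scores are passed in as plain numbers rather than parsed from AF3 or FoldX output files.

```python
from dataclasses import dataclass


@dataclass
class AlleleScores:
    """Scores for one allele's modelled TF-DNA complex (hypothetical container)."""
    iptm: float                # interface pTM reported for the complex, in [0, 1]
    interaction_energy: float  # physics-based interaction energy, kcal/mol (assumed: lower = tighter)


def delta_iptm(ref: AlleleScores, alt: AlleleScores) -> float:
    """Assumed convention: alternate minus reference; positive favours the alternate allele."""
    return alt.iptm - ref.iptm


def delta_energy(ref: AlleleScores, alt: AlleleScores) -> float:
    """Assumed convention: alternate minus reference; negative favours the alternate allele."""
    return alt.interaction_energy - ref.interaction_energy


def call_preference(ref: AlleleScores, alt: AlleleScores,
                    iptm_tol: float = 0.01, energy_tol: float = 0.5) -> str:
    """Combine both metrics into one call.

    The tolerances are illustrative placeholders, not values from the study.
    Returns "ref", "alt", "discordant" (metrics disagree) or "no_call".
    """
    d_iptm = delta_iptm(ref, alt)
    d_energy = delta_energy(ref, alt)

    # Each metric votes +1 (alternate preferred), -1 (reference preferred) or 0 (within tolerance).
    iptm_vote = 0 if abs(d_iptm) < iptm_tol else (1 if d_iptm > 0 else -1)
    energy_vote = 0 if abs(d_energy) < energy_tol else (1 if d_energy < 0 else -1)

    if iptm_vote * energy_vote < 0:
        return "discordant"
    total = iptm_vote + energy_vote
    if total > 0:
        return "alt"
    if total < 0:
        return "ref"
    return "no_call"


if __name__ == "__main__":
    # Made-up numbers purely to exercise the function.
    reference = AlleleScores(iptm=0.80, interaction_energy=-12.0)
    alternate = AlleleScores(iptm=0.70, interaction_energy=-9.0)
    print(call_preference(reference, alternate))
```

Reporting disagreement as a separate "discordant" outcome, rather than letting one metric override the other, reflects the observation that the two metrics together were more reliable than either alone; how to resolve such cases in practice is left open here.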
