Under which circumstances do genomic neural networks learn motifs and their interactions?
Abstract
The use of neural networks to model genomic data in sequence-to-function scenarios has soared over the last decade. There remains much debate about whether these models are interpretable, either inherently or through downstream interpretability and explainability (xAI) techniques. Conclusions are further complicated by the steady publication of novel models, each with its own architecture, evaluations, and xAI experimental designs. Here, we posit that many of these complications arise from the lack of an explicitly specified generative model, baseline comparators, and thorough evaluation. Consequently, we attempt to reconcile concerns about interpretability under a motif-based generative model by simulating over 1000 motif-based genetic architectures at scale and evaluating the ability of different model architectures to predict an outcome given a sequence as input. We first show that a single convolutional layer is sufficient to discover motifs in a sequence-to-function model because of the way it shares the gradient locally among nucleotides. We then build on this by showing that, across genetic and network architectures (including attention, LSTMs, and stacked convolutions), most models are capable of modeling motifs and their interactions, with certain models outperforming others across genetic contexts and sample sizes. Distinguishing between shallow-level interpretations of motifs and deeper, gradient-based interpretations, we show that these approaches discover separate but overlapping sets of motifs, depending on motif characteristics. Finally, we validate our findings on an experimental dataset and conclude that, while attention is accurate, there are genetic contexts in which other neural networks complement findings from attention-based models and produce higher correlations between predictive performance and interpretability. This work suggests that when a generative model is correctly specified, most models are interpretable to an extent, whether or not their architectures are inherently so. Moreover, our work highlights opportunities for methods development in motif discovery and implies that employing a mixture of model architectures may be best for biological discovery.
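To make the abstract's central ideas concrete, the sketch below illustrates (in PyTorch) what a single-convolution sequence-to-function model and a gradient-based attribution pass might look like. This is not the authors' code: the class name `SingleConvModel`, the filter count, motif width, sequence length, and the plain saliency attribution are all illustrative assumptions, shown only to clarify how one convolutional layer can act as a bank of motif detectors and how input gradients distribute relevance across nucleotides.

```python
# Illustrative sketch (not the paper's implementation): a minimal single-convolution
# sequence-to-function model plus a gradient-based attribution pass, assuming
# one-hot encoded DNA input of shape (batch, 4, sequence_length).
import torch
import torch.nn as nn

class SingleConvModel(nn.Module):
    """One convolutional layer whose filters can act as motif detectors,
    followed by global max pooling and a linear readout to a scalar output."""
    def __init__(self, n_filters: int = 32, motif_width: int = 12):
        super().__init__()
        self.conv = nn.Conv1d(in_channels=4, out_channels=n_filters, kernel_size=motif_width)
        self.act = nn.ReLU()
        self.readout = nn.Linear(n_filters, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.act(self.conv(x))        # (batch, n_filters, positions)
        pooled = h.max(dim=-1).values     # global max pool over positions
        return self.readout(pooled).squeeze(-1)

def saliency(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Gradient of the prediction with respect to the one-hot input; the
    per-nucleotide gradients highlight positions the model relies on."""
    x = x.clone().requires_grad_(True)
    model(x).sum().backward()
    return x.grad.detach()

if __name__ == "__main__":
    model = SingleConvModel()
    # Random one-hot sequences standing in for real genomic data.
    idx = torch.randint(0, 4, (8, 200))
    x = torch.nn.functional.one_hot(idx, num_classes=4).float().permute(0, 2, 1)
    grads = saliency(model, x)
    print(grads.shape)  # torch.Size([8, 4, 200])
```

In this framing, inspecting the learned convolutional filters would correspond to the abstract's "shallow-level" motif interpretations, while the `saliency` output corresponds to the deeper, gradient-based interpretations; the paper's actual attribution methods and architectures may differ.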