Predicting gene expression using millions of yeast promoters reveals cis-regulatory logic

Abstract

Motivation

Gene regulation involves complex interactions between multiple transcription factors. While early attempts to train deep neural networks to predict gene expression were limited to naturally occurring promoter sequences, the advent of massively parallel reporter assays has expanded the available training data by orders of magnitude. Despite these advances, a clear understanding of how to use deep learning to study gene regulation is still lacking.

Method

Here we investigate the complex association between gene promoters and expression in S. cerevisiae using Camformer, a residual convolutional neural network that ranked 4th in the Random Promoter DREAM Challenge 2022. We present the original model trained on 6.7 million random promoter sequences and investigate 270 alternative models to determine what factors contribute most to model performance. Finally, we use explainable AI to uncover regulatory signals.

Results

We show that Camformer accurately decodes the association between promoters and gene expression (r² = 0.914 ± 0.003, ρ = 0.962 ± 0.002) and provides a substantial improvement over the previous state of the art. Furthermore, we show that a much smaller model, with approximately 90% fewer parameters than the original, can still achieve high predictive performance. Using Grad-CAM and in silico mutagenesis, we demonstrate that the model learns both individual motifs and their hierarchy. For example, while an IME1 motif on its own increases gene expression, the co-occurrence of a UME6 motif acts as a switch that strongly reduces gene expression. Thus, deep learning models such as Camformer can provide detailed insights into cis-regulatory logic.
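As an illustration of the in silico mutagenesis idea mentioned above, the following minimal sketch substitutes every base at every position and records the change in predicted expression. It is not the authors' implementation: `toy_predict` is a hypothetical stand-in for a trained model, and the two motifs it uses ("TGAC" and "GGCG") are made up for this example and are not the IME1 or UME6 motifs.

```python
import numpy as np

BASES = "ACGT"

def toy_predict(seq: str) -> float:
    """Hypothetical stand-in for a trained expression model (not Camformer)."""
    score = float(seq.count("TGAC"))  # made-up activating motif: +1 per copy
    if "GGCG" in seq:                 # made-up repressing motif acting as a switch
        score -= 3.0
    return score

def in_silico_mutagenesis(seq: str, predict) -> np.ndarray:
    """Return a (len(seq), 4) matrix of prediction changes.

    Entry [i, j] is predict(mutant) - predict(reference), where the mutant
    carries BASES[j] at position i. Entries for the reference base stay 0.
    """
    reference_score = predict(seq)
    effects = np.zeros((len(seq), len(BASES)))
    for i, original in enumerate(seq):
        for j, base in enumerate(BASES):
            if base == original:
                continue
            mutant = seq[:i] + base + seq[i + 1:]
            effects[i, j] = predict(mutant) - reference_score
    return effects

seq = "AATGACAAGGCGAA"
effects = in_silico_mutagenesis(seq, toy_predict)

# Per-position importance: the largest absolute effect of any substitution.
importance = np.abs(effects).max(axis=1)
```

In this toy setting, positions inside either motif receive non-zero importance while flanking positions do not. With a real model, `predict` would wrap a forward pass over the encoded sequence, and the same loop would apply unchanged.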

Availability and Implementation

The data and code used and developed in our experiments are publicly available at: https://github.com/Bornelov-lab/Camformer
