A scalable method for modulating plant gene expression using a multispecies genomic model and protoplast-based massively parallel reporter assay
Abstract
Precision breeding tools such as CRISPR-Cas genome editing can speed up innovation in plant biotechnology and boost crop yields. The challenge remains to apply precision breeding methods efficiently to plant gene regulation. Endogenous gene regulatory sequences are subject to complex transcriptional control, which is a bottleneck in altering gene expression patterns predictably. Here we present the CRE.AI.TIVE platform, which enables upregulation of plant gene activity without a priori knowledge of individual cis-regulatory elements (CREs) or their specific location. The predictive machine learning model underpinning the platform was trained to predict a wide range of tissue-specific transcriptomic and epigenomic coverage datasets from the DNA sequence of 12 plant species, and it shows competitive performance on RNA-seq coverage prediction across all species. The platform further combines in silico DNA sequence mutagenesis with a protoplast-based massively parallel reporter assay (MPRA). We demonstrate the platform’s functionality by mutagenesis of a proximal promoter of the tomato gene SlbHLH96, yielding in silico predictions of variant gene activity. A total of 2,000 sequence candidates with varying predicted gene expression strength were validated with MPRA in plant protoplasts, identifying variants with significantly upregulated gene activity. A subset of the functional sequence variants was further evaluated individually with a fluorescence reporter assay and was observed to contain known cis-regulatory elements in a new order. The CRE.AI.TIVE platform offers a first-of-its-kind scalable method for gene upregulation in plants using native DNA sequences, without the need for CRE cataloguing or rational promoter design.
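The in silico step described above (mutate a promoter, score each variant with a sequence-to-expression model, rank candidates for MPRA) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_score` is a hypothetical GC-content stand-in for a trained model, random base substitution is an assumed mutagenesis strategy, and the reference sequence is invented. None of it reproduces the CRE.AI.TIVE model or the authors' actual procedure.

```python
import random

BASES = "ACGT"


def toy_score(seq):
    """Hypothetical stand-in for a trained model: returns the GC fraction."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def mutate(seq, n_mut, rng):
    """Substitute n_mut distinct positions, each with a different base."""
    chars = list(seq)
    for pos in rng.sample(range(len(chars)), n_mut):
        chars[pos] = rng.choice([b for b in BASES if b != chars[pos]])
    return "".join(chars)


def propose_variants(reference, n_candidates, n_mut, score_fn, seed=0):
    """Return (predicted change vs. reference, variant) pairs, best first.

    Duplicate variants are collapsed, so fewer than n_candidates
    pairs may be returned.
    """
    rng = random.Random(seed)
    variants = {mutate(reference, n_mut, rng) for _ in range(n_candidates)}
    ref_score = score_fn(reference)
    return sorted(((score_fn(v) - ref_score, v) for v in variants), reverse=True)


# Invented example sequence, not the SlbHLH96 promoter.
reference = "ATATATATGCGCATAT" * 2
ranked = propose_variants(reference, n_candidates=50, n_mut=3, score_fn=toy_score)
for delta, variant in ranked[:3]:
    print(f"{delta:+.3f}  {variant}")
```

In a realistic setting, `score_fn` would wrap a trained predictor, and candidates would be sampled across the range of predicted strengths rather than taken only from the top, since the abstract describes validating candidates with varying predicted expression.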