Contrastive modelling of transcription and transcript abundance in legumes using PlanTT
Abstract
Predicting the impacts of sequence variation on gene expression remains a challenging task. Further, in plants, we have a limited understanding of the relative contributions of different gene expression regulatory mechanisms. To address these limitations, we generated a comparative multiomic dataset comprising paired 3’-RNA-seq and PRO-seq data from matched tissues of reference genotypes of four legumes of the inverted repeat-lacking clade (Pisum sativum, Vicia faba, Lathyrus sativus and Medicago truncatula). Focusing on the challenging task of predicting expression differences between ortholog pairs from unseen orthogroups, we used this dataset and a novel prediction framework to build contrastive models that predict quantitative differences (effect size differences) in transcription and transcript abundance.