Alternative splicing in 300 human cells and tissues is sparsely distributed and strongly associated with cell identity

Abstract

In eukaryotes, alternative splicing allows RNA sequences to be changed as required by their cellular context. While gene expression, without regard to transcript isoforms, is commonly used to classify cell types, the significance of alternative splicing to cell identity is not fully understood. Using data-driven classification and clustering methods, we analysed the RNA Atlas, an ultra-deep total RNA and polyA capture sequencing dataset obtained from 300 different types of cell and tissue samples across the human body. We show that splice inclusion values can distinguish cell identity effectively, and sometimes better than gene expression. These specific differences in alternative splicing are biologically relevant, with many splice events expressed in a limited number of cell types. We show that distributions of splice inclusion levels are “sparser” than those of gene expression levels, being discontinuous and containing outliers, akin to a control dial with a few settings. Gene expression levels display finer variation, taking on a diverse range of values, akin to a control dial with many increments. Complementary analysis of comparable RNA-Seq samples from the ENCODE dataset yielded consistent results regardless of the splice event or gene expression quantification tools used, and regardless of whether RiboMinus™ or polyA capture-derived RNA libraries were chosen. Our results highlight the utility of alternative splicing data in data-driven and functional analyses, and the unique relationship between alternative splicing and cell identity.
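To make the "few settings" versus "many increments" contrast concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes a splice inclusion value is computed as the usual percent-spliced-in ratio (inclusion reads over inclusion plus exclusion reads) and uses Hoyer's sparsity measure as a stand-in, since the abstract does not say which sparsity statistic was used. The function names `psi` and `hoyer_sparsity` and all numbers are hypothetical toy values, not data from the RNA Atlas or ENCODE.

```python
import math


def psi(inclusion_reads, exclusion_reads):
    """Percent spliced in: fraction of reads that support including the event.

    Assumed definition for illustration; returns None when there is no coverage.
    """
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # PSI is undefined without supporting reads
    return inclusion_reads / total


def hoyer_sparsity(values):
    """Hoyer sparsity in [0, 1].

    0 when all values are equal, 1 when exactly one value is nonzero.
    Used here only as a stand-in for "sparser"; the source names no measure.
    """
    n = len(values)
    l1 = sum(abs(v) for v in values)
    l2 = math.sqrt(sum(v * v for v in values))
    if n < 2 or l2 == 0:
        return 0.0
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1)


# Toy example (invented numbers): one splice event across eight samples,
# included in a single sample only.
toy_read_counts = [(0, 20), (0, 15), (0, 30), (0, 25), (0, 18), (0, 22), (0, 27), (18, 2)]
toy_psi = [psi(inc, exc) for inc, exc in toy_read_counts]

# Toy example (invented numbers): log-scale expression of one gene in the
# same eight samples, varying in small increments.
toy_expr = [5.1, 4.8, 5.3, 4.9, 5.0, 5.2, 4.7, 5.1]

print("PSI sparsity:       ", hoyer_sparsity(toy_psi))
print("Expression sparsity:", hoyer_sparsity(toy_expr))
```

In this toy case the splice event behaves like a switch that is on in one sample, so its sparsity score is high, while the gene's expression varies smoothly and scores close to zero. Other statistics (for example kurtosis or the Gini index) could be substituted without changing the idea.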
