Systematic analysis of alternative splicing in time course data using Spycone
Abstract
Motivation
During disease progression or organism development, alternative splicing can produce isoform switches that follow similar temporal patterns across genes, reflecting co-regulation of alternative splicing in those genes. Tools for analysing dynamic processes usually neglect alternative splicing.
Results
Here, we propose Spycone, a splicing-aware framework for time course data analysis. Spycone exploits a novel isoform switch (IS) detection algorithm and offers downstream analyses such as network and gene set enrichment. We demonstrate the performance of Spycone using simulated data and real-world data of SARS-CoV-2 infection.
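To make the notion of an isoform switch concrete, the following is a minimal sketch, not Spycone's actual algorithm or API. It assumes a gene with exactly two isoforms and flags a switch wherever the isoform with the larger relative abundance changes between consecutive time points. The function names and the example expression values are hypothetical.

```python
def relative_abundance(expr_a, expr_b):
    """Per-time-point share of each isoform in the gene's total expression."""
    shares = []
    for a, b in zip(expr_a, expr_b):
        total = a + b
        # Guard against time points where the gene is not expressed at all.
        shares.append((a / total, b / total) if total > 0 else (0.0, 0.0))
    return shares


def find_switch_points(expr_a, expr_b):
    """Return indices t where the dominant isoform differs between t and t+1."""
    shares = relative_abundance(expr_a, expr_b)
    switches = []
    for t in range(len(shares) - 1):
        before = shares[t][0] - shares[t][1]
        after = shares[t + 1][0] - shares[t + 1][1]
        # A sign change in the abundance difference means the curves crossed.
        if before * after < 0:
            switches.append(t)
    return switches


# Hypothetical expression values for two isoforms over four time points.
iso_a = [80.0, 60.0, 30.0, 10.0]
iso_b = [20.0, 40.0, 70.0, 90.0]
print(find_switch_points(iso_a, iso_b))  # -> [1]
```

A real detector would also need to handle replicates, noise, and more than two isoforms per gene, which this sketch deliberately ignores.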
Availability and implementation
The Spycone package is available as a PyPI package. The source code of Spycone is available under the GPLv3 license at https://github.com/yollct/spycone and the documentation at https://spycone.readthedocs.io/en/latest/.
Supplementary information
Supplementary data are available at Bioinformatics online.
Article activity feed
SciScore for 10.1101/2022.04.28.489857:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
Each sentence from the manuscript is followed by the resources detected in it.

Sentence: For the SARS-CoV-2 dataset, we used Trimmomatic v0.39 [50] to remove Illumina adapter sequences and low quality bases (Phred score < 30) followed by Salmon v1.5.1 [51] for isoform quantification with a mapping-based model, the human genome version 38, and an Ensembl genome annotation version 104.
- Trimmomatic, suggested: None
- Salmon, suggested: (Salmon, RRID:SCR_017036)

Sentence: Protein-protein interaction network and Domain-domain interaction: A PPI network is obtained from BioGRID (v.4.4.208) [29] and a domain-domain interaction network from 3did (v2019_01) [23].
- BioGRID, suggested: (BioGrid Australia, RRID:SCR_006334)

Sentence: Clustering analysis: The clustering algorithms are implemented using the scikit-learn machine learning package in python (v0.23.2) [54] and tslearn (v0.5.1.0) time course machine learning package in python [17].
- python, suggested: None

Sentence: Gseapy is a python wrapper of GSEA and Enrichr [20].
- Enrichr, suggested: (Enrichr, RRID:SCR_001575)

Sentence: We used NEASE with KEGG and Reactome pathways.
- KEGG, suggested: None
- Reactome, suggested: (Reactome, RRID:SCR_003485)

Sentence: Finally, we performed motif enrichment analysis using the motifs module from the Biopython library [55].
- Biopython, suggested: (Biopython, RRID:SCR_007173)

Results from OddPub: Thank you for sharing your code.
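For readers unfamiliar with the quality cutoff quoted in the Trimmomatic sentence above (Phred score < 30), the short sketch below shows what that threshold means numerically. It uses the standard definition Q = -10 log10(P) and assumes the common Phred+33 ASCII encoding of FASTQ quality strings. The helper names are illustrative and are not part of Trimmomatic or Spycone.

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score Q into a base-call error probability."""
    return 10 ** (-q / 10)


def phred_from_ascii(char, offset=33):
    """Decode one FASTQ quality character, assuming Phred+33 encoding."""
    return ord(char) - offset


# A base with Q = 30 has an estimated 1-in-1000 chance of being miscalled.
print(phred_to_error_probability(30))

# '?' encodes Q = 30 under Phred+33, so it sits exactly at the cutoff.
print(phred_from_ascii("?"))
```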
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study:
Limitations: Spycone achieves high precision and considerably higher recall than the only competing tool TSIS. Nevertheless, the moderate recall we observe in particular in the presence of noise shows that there is further room for method improvement. In our simulation Model 2, where we allowed for isoform switches between minor isoforms, we observed a reduction in both precision and recall. Spycone identifies only two isoforms that switch per event, but in reality, an event could involve more than two isoforms. In the future, we should consider multiple-isoforms switches to handle more complex scenarios. Spycone uniquely offers features for detailed downstream analysis and allows for detecting the rewiring of network modules in a time course as a result of coordinated domain gain/loss. This type of analysis is limited by the availability of the structural annotation. However, the current developments in computational structural biology that could expand the information about domains and domain-domain interactions e.g., AlphaFold2 [48], will greatly strengthen our tool. Lastly, our PSSM-based approach for splicing factor analysis does not allow us to investigate splicing factors that bind indirectly through other adaptor proteins, requiring further experiments that establish binding sites for such proteins. Spycone was thus far applied exclusively to bulk RNA-seq data. When considering tissue samples, IS switches between time points could also be attributed to changes in cell...
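The PSSM-based splicing factor analysis mentioned in the limitations above can be illustrated with a small standard-library sketch. The manuscript reports using the motifs module of Biopython; the code below does not reproduce that module or the authors' pipeline. It only shows the general idea of a position-specific scoring matrix: log-odds scores built from aligned binding sites and then slid along a sequence. The binding sites, pseudocount, uniform background, and function names are all assumptions made for illustration.

```python
import math

ALPHABET = "ACGT"


def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build log2-odds scores per position from equal-length aligned sites."""
    length = len(sites[0])
    pssm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pssm.append(
            {base: math.log2((counts[base] / total) / background) for base in ALPHABET}
        )
    return pssm


def scan(sequence, pssm):
    """Score every window of the sequence; return (start, score) pairs."""
    width = len(pssm)
    scores = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        scores.append((start, sum(pssm[i][base] for i, base in enumerate(window))))
    return scores


# Hypothetical aligned binding sites, not taken from the manuscript.
sites = ["GCATG", "GCATG", "GCACG", "GCATG"]
pssm = build_pssm(sites)
best_start, best_score = max(scan("TTGCATGAA", pssm), key=lambda hit: hit[1])
print(best_start, round(best_score, 2))
```

Because the matrix only scores direct sequence matches, a factor recruited through an adaptor protein leaves no motif to find, which is the limitation the authors describe.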
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.