GUIdEStaR (G-quadruplex, uORF, IRES, Epigenetics, Small RNA, Repeats), the integrated metadatabase in conjunction with neural network methods

Abstract

GUIdEStaR integrates existing databases of various types of G-quadruplex, upstream Open Reading Frame (uORF), Internal Ribosome Entry Site (IRES), RNA and histone protein methylation, small RNA, and repeats. GUIdEStaR covers approximately 40,000 genes and 320,000 transcripts. Each mRNA transcript is divided into five regions (5’UTR, 3’UTR, exon, intron, and biological region), and each region carries presence-absence data for 169 different types of elements. Artificial intelligence (AI) based analysis of sequencing data has recently been gaining popularity in bioinformatics, and GUIdEStaR generates datasets that can be used as inputs to AI methods. On the GUIdEStaR homepage, users submit gene symbols by clicking a “Send” button; shortly afterward, result files in CSV format become available for download on the result website. Users can optionally have the result files sent to their email addresses. The entire database and example Java code are also freely available for download. Here, we demonstrate the database usage with three neural network classification studies: 1) a small RNA study that classifies transcription factor (TF) genes as TFs mediated either by small RNA originating from SARS-CoV-2 or by human microRNA (miRNA), 2) a cell membrane receptor study that classifies receptor genes as either with or without virus interaction, and 3) a nonsense-mediated mRNA decay (NMD) study that classifies cell membrane and nuclear receptors as either NMD targets or non-targets. GUIdEStaR is available as an easy-to-use web-based database at www.guidestar.kr and for download at https://sourceforge.net/projects/guidestar .
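
As an illustration of how per-region presence-absence data could become a fixed-length input for a neural network, the following is a minimal sketch and not the authors' code. It assumes a hypothetical long-format CSV with columns `transcript`, `region`, and `element_index`; the actual GUIdEStaR result files may be laid out differently. The five regions and the count of 169 element types come from the abstract; the function names and the numeric element indices are placeholders introduced here.

```python
import csv
import io

# The five transcript regions named in the abstract.
REGIONS = ["5'UTR", "3'UTR", "exon", "intron", "biological region"]
N_ELEMENT_TYPES = 169  # element types per region, per the abstract


def encode_transcript(present):
    """Flatten a set of (region, element_index) pairs into a binary vector.

    The vector has 5 * 169 = 845 positions; a 1 marks an element that is
    present in that region, and every other position stays 0 (absent).
    """
    vector = [0] * (len(REGIONS) * N_ELEMENT_TYPES)
    for region, element_index in present:
        if not 0 <= element_index < N_ELEMENT_TYPES:
            raise ValueError(f"element index out of range: {element_index}")
        offset = REGIONS.index(region) * N_ELEMENT_TYPES
        vector[offset + element_index] = 1
    return vector


def read_presence_csv(text):
    """Parse the hypothetical CSV layout and return {transcript: vector}."""
    by_transcript = {}
    for row in csv.DictReader(io.StringIO(text)):
        pair = (row["region"], int(row["element_index"]))
        by_transcript.setdefault(row["transcript"], set()).add(pair)
    return {name: encode_transcript(p) for name, p in by_transcript.items()}


if __name__ == "__main__":
    sample = (
        "transcript,region,element_index\n"
        "T1,5'UTR,0\n"
        "T1,exon,3\n"
        "T2,intron,168\n"
    )
    for name, vec in read_presence_csv(sample).items():
        print(name, len(vec), sum(vec))
```

Concatenating the regions in a fixed order keeps every transcript's vector the same length, which is what a feed-forward classifier typically expects; whether the three studies encode their inputs this way is not stated in the abstract.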

Article activity feed

  1. SciScore for 10.1101/2021.02.25.432957:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement: not detected.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: not detected.

    Table 2: Resources

    No key resources detected.


    Results from OddPub: Thank you for sharing your data.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • No funding statement was detected.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.