Knowledge synthesis from 100 million biomedical documents augments the deep expression profiling of coronavirus receptors
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
The COVID-19 pandemic demands assimilation of all available biomedical knowledge to decode its mechanisms of pathogenicity and transmission. Despite the recent renaissance in unsupervised neural networks for decoding unstructured natural languages, a platform for the real-time synthesis of the exponentially growing biomedical literature and its comprehensive triangulation with deep omic insights is not available. Here, we present the nferX platform for dynamic inference from over 45 quadrillion possible conceptual associations extracted from unstructured biomedical text, and their triangulation with Single Cell RNA-sequencing based insights from over 25 tissues. Using this platform, we identify intersections between the pathologic manifestations of COVID-19 and the comprehensive expression profile of the SARS-CoV-2 receptor ACE2. We find that tongue keratinocytes, airway club cells, and ciliated cells are likely underappreciated targets of SARS-CoV-2 infection, in addition to type II pneumocytes and olfactory epithelial cells. We further identify mature small intestinal enterocytes as a possible hotspot of COVID-19 fecal-oral transmission, where an intriguing maturation-correlated transcriptional signature is shared between ACE2 and the other coronavirus receptors DPP4 (MERS-CoV) and ANPEP (α-coronavirus). This study demonstrates how a holistic data science platform can leverage unprecedented quantities of structured and unstructured publicly available data to accelerate the generation of impactful biological insights and hypotheses.
The nferX Platform Single-cell resource - https://academia.nferx.com/
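To make the idea of a per-cell-type expression profile concrete, the following minimal sketch summarizes one gene across annotated cell types from a raw counts matrix. It is an illustration only and is not the nferX pipeline: the function name `summarize_gene_by_cell_type`, the counts-per-10,000 (CP10K) normalization, and the toy counts are all assumptions introduced here, not values or methods taken from the study.

```python
from collections import defaultdict


def summarize_gene_by_cell_type(counts, genes, cell_types, gene):
    """Summarize one gene per cell type (hypothetical helper, not from the paper).

    counts     : list of per-cell lists of raw counts, columns aligned to `genes`
    genes      : list of gene symbols
    cell_types : list of cell-type labels, one per cell
    Returns {cell_type: (fraction_of_cells_expressing, mean_cp10k)}.
    """
    gene_idx = genes.index(gene)
    n_cells = defaultdict(int)
    n_expressing = defaultdict(int)
    cp10k_sum = defaultdict(float)

    for row, cell_type in zip(counts, cell_types):
        n_cells[cell_type] += 1
        total = sum(row)
        if total == 0:
            continue  # a cell with no counts is treated as non-expressing
        if row[gene_idx] > 0:
            n_expressing[cell_type] += 1
        # Assumed normalization: scale each cell to 10,000 total counts.
        cp10k_sum[cell_type] += 1e4 * row[gene_idx] / total

    return {
        ct: (n_expressing[ct] / n_cells[ct], cp10k_sum[ct] / n_cells[ct])
        for ct in n_cells
    }


# Toy data, made up for illustration (not measurements from any study).
genes = ["ACE2", "DPP4", "ANPEP"]
counts = [
    [2, 0, 8],
    [1, 3, 6],
    [0, 0, 10],
    [0, 5, 5],
]
cell_types = ["enterocyte", "enterocyte", "goblet cell", "goblet cell"]

summary = summarize_gene_by_cell_type(counts, genes, cell_types, "ACE2")
for ct, (frac, mean) in summary.items():
    print(f"{ct}: {frac:.0%} of cells express ACE2, mean CP10K = {mean:.1f}")
```

Reporting both the fraction of expressing cells and the mean normalized expression is a common way to separate "few cells, high expression" from "many cells, low expression"; a real analysis would more likely use a sparse matrix and a dedicated single-cell library.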
Article activity feed
SciScore for 10.1101/2020.03.24.005702:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
Sentences: Single-cell data processing pipeline: For each study, a counts matrix was downloaded from a public data repository such as the Gene Expression Omnibus (GEO) or the Broad Institute Single Cell Portal (Table S1).
Resources: Gene Expression Omnibus (suggested: Gene Expression Omnibus (GEO), RRID:SCR_005012)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- No conflict of interest statement was detected. If there are no conflicts, we encourage authors to explicitly state so.
- No funding statement was detected.
- No protocol registration statement was detected.
