High-throughput SARS-CoV-2 and host genome sequencing from single nasopharyngeal swabs
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
During COVID-19 and other viral pandemics, rapid generation of host and pathogen genomic data is critical to tracking infection and informing therapies. There is an urgent need for efficient approaches to generating these data at scale. We have developed a scalable, high-throughput approach to generate high-fidelity, low-pass whole-genome and HLA sequencing, viral genomes, and a representation of the human transcriptome from single nasopharyngeal swabs of COVID-19 patients.
Article activity feed
SciScore for 10.1101/2020.07.27.20163147: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: not detected.
- Randomization: not detected.
- Blinding: not detected.
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
Table 2: Resources
Software and Algorithms (sentences and suggested resources)
- Sentence: Sample Collection and diagnostics: Residual VTM from SARS-CoV-2 positive nasopharyngeal swabs collected during clinical assessment of asymptomatic and symptomatic patients at Stanford Healthcare were used.
  Resources: Stanford Healthcare (suggested: None)
- Sentence: Non-SARS-CoV-2 reads were filtered out with Kraken2 [20], using an index of human and viral genomes in RefSeq (index downloaded from https://genexa.ch/sars2-bioinformatics-resources/).
  Resources: RefSeq (suggested: RefSeq, RRID:SCR_003496)
- Sentence: Reads per COVID gene were collected from the ReadsPerGene STAR output file, and the total mappable reads were collected from the Log.
  Resources: STAR (suggested: STAR, RRID:SCR_015899)
- Sentence: 500 μl of 1.3 pM DNA sequencing library was loaded into a MiniSeq Mid Output Kit (300-cycles) (FC-420-1004), and sequenced using MiniSeq DNA sequencer (Illumina Inc., San Diego, CA).
  Resources: MiniSeq (suggested: None)
- Sentence: [27,28] Host Sequence Alignment: Low-coverage FASTQ sequences underwent quality control assessment via FastQC v0.11.8 before alt-aware alignment to GRCh38.p12 using BWA-MEM v0.7.17-r1188.
  Resources: FastQC (suggested: FastQC, RRID:SCR_014583); BWA-MEM (suggested: Sniffles, RRID:SCR_017619)
- Sentence: After duplicate marking, base quality score recalibration was performed with Picard Tools’ BaseRecalibrator and high-confidence variant call sets from dbSNP and the 1000 Genomes Project.
  Resources: Picard (suggested: Picard, RRID:SCR_006525); dbSNP (suggested: dbSNP, RRID:SCR_002338); 1000 Genomes Project (suggested: 1000 Genomes Project and AWS, RRID:SCR_008801)
- Sentence: Quality control metrics, including coverage, were generated with Qualimap BAMQC v2.2.1, Samtools v1.10, and Mosdepth v0.2.9.
  Resources: Qualimap (suggested: QualiMap, RRID:SCR_001209); Samtools (suggested: SAMTOOLS, RRID:SCR_002105)
- Sentence: Finally, quality control reports for each sample were aggregated using MultiQC v1.9.
  Resources: MultiQC (suggested: MultiQC, RRID:SCR_014982)
- Sentence: Variant Calling, Imputation, PCA, Kinship: BAM files were used for an initial calling with bcftools v1.9 mpileup [29].
  Resources: bcftools (suggested: SAMtools/BCFtools, RRID:SCR_005227)
Results from OddPub: Thank you for sharing your code.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
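The Kraken2 filtering step quoted in the resources table (keeping only SARS-CoV-2 reads) can be illustrated with a minimal sketch. It assumes Kraken2's default per-read output, which is tab-separated with the classification status (C or U), the read ID, and the assigned taxonomy ID in the first three columns. The function name, the example read names, and the choice to keep only taxonomy ID 2697049 (SARS-CoV-2) are assumptions made for illustration; this is not the authors' code, and reads assigned to a parent taxon would need extra IDs in `keep_taxids`.

```python
from typing import AbstractSet, Iterable, Set

SARS_COV_2_TAXID = "2697049"  # NCBI taxonomy ID for SARS-CoV-2


def sars_cov_2_read_ids(
    kraken_lines: Iterable[str],
    keep_taxids: AbstractSet[str] = frozenset({SARS_COV_2_TAXID}),
) -> Set[str]:
    """Return IDs of classified reads whose taxonomy ID is in keep_taxids."""
    kept: Set[str] = set()
    for line in kraken_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        status, read_id, taxid = fields[0], fields[1], fields[2]
        if status == "C" and taxid in keep_taxids:
            kept.add(read_id)
    return kept


# Hypothetical Kraken2 output lines (made-up read names).
example = [
    "C\tread1\t2697049\t150\t2697049:116",  # SARS-CoV-2 read: kept
    "C\tread2\t9606\t150\t9606:116",        # human read: dropped
    "U\tread3\t0\t150\t0:116",              # unclassified read: dropped
]
print(sorted(sars_cov_2_read_ids(example)))
```

The returned IDs could then be used to subset the FASTQ files with whichever tool the pipeline already relies on.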
SciScore for 10.1101/2020.07.27.20163147: (What is this?)
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
- Institutional Review Board Statement: Institutional Review Board approval for anonymous sequencing of host and viral genomics was obtained from the Stanford University School of Medicine IRB.
- Randomization: not detected.
- Blinding: This also enabled confirmation of six blindly duplicated samples and 2 pairings of first-degree relatives through kinship analysis (Figure 2C).
- Power Analysis: not detected.
- Sex as a biological variable: not detected.
Table 2: Resources
Software and Algorithms (sentences and suggested resources)
- Sentence: The estimated phylogenetic tree is the maximum clade credibility tree obtained with BEAST 2 [15] using a fixed mutation rate of 1.04 × 10⁻³ per base per year, the Coalescent Extended Bayesian Skyline prior [16], and the HKY substitution model [17]. (F) RPKMs for individual SARS-CoV-2 genes were averaged over samples with similar CT values.
  Resources: BEAST (suggested: BEAST, SCR_010228)
- Sentence: Methods Sample Collection and diagnostics: Residual VTM from SARS-CoV-2 positive nasopharyngeal swabs collected during clinical assessment of asymptomatic and symptomatic patients at Stanford Healthcare were used.
  Resources: Stanford Healthcare (suggested: None)
- Sentence: Non-SARS-CoV-2 reads were filtered out with Kraken2 [20], using an index of human and viral genomes in RefSeq (index downloaded from https://genexa.ch/sars2-bioinformatics-resources/).
  Resources: RefSeq (suggested: RefSeq, SCR_003496)
- Sentence: Reads per COVID gene were collected from the ReadsPerGene STAR output file, and the total mappable reads were collected from the Log.
  Resources: STAR (suggested: STAR, SCR_015899)
- Sentence: 500 µl of 1.3 pM DNA sequencing library was loaded into a MiniSeq Mid Output Kit (300-cycles) (FC-420-1004), and sequenced using MiniSeq DNA sequencer (Illumina Inc., San Diego, CA).
  Resources: MiniSeq (suggested: None)
- Sentence: … KIR ligands (C1 and C2) and imputed HLA haplotypes [27,28]. Host Sequence Alignment: Low-coverage FASTQ sequences underwent quality control assessment via FastQC v0.11.8 before alt-aware alignment to GRCh38.p12 using BWA-MEM v0.7.17-r1188.
  Resources: FastQC (suggested: FastQC, SCR_014583); BWA-MEM (suggested: Sniffles, SCR_017619)
- Sentence: After duplicate marking, base quality score recalibration was performed with Picard Tools’ BaseRecalibrator and high-confidence variant call sets from dbSNP and the 1000 Genomes Project.
  Resources: Picard (suggested: Picard, SCR_006525); dbSNP (suggested: dbSNP, SCR_002338); 1000 Genomes Project (suggested: 1000 Genomes Project and AWS, SCR_008801)
- Sentence: Quality control metrics, including coverage, were generated with Qualimap BAMQC v2.2.1, Samtools v1.10, and Mosdepth v0.2.9.
  Resources: Qualimap (suggested: QualiMap, SCR_001209); Samtools (suggested: Samtools, SCR_002105)
- Sentence: Finally, quality control reports for each sample were aggregated using MultiQC v1.9.
  Resources: MultiQC (suggested: MultiQC, SCR_014982)
- Sentence: Variant Calling, Imputation, PCA, Kinship: BAM files were used for an initial calling with bcftools v1.9 mpileup [29].
  Resources: bcftools (suggested: SAMtools/BCFtools, SCR_005227)
- Sentence: Beckman independence fellowship (MJS) and American Heart Association (MJS, KS), the National Heart Lung and Blood Institute (K08HL143185 to VNP), The John Taylor Babbitt Foundation (VNP) and Sarnoff Cardiovascular Research Foundation (VNP)
  Resources: American Heart Association (suggested: American Heart Association, SCR_007210)
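The quoted sentences mention per-gene read counts taken from STAR's ReadsPerGene output and RPKMs for individual SARS-CoV-2 genes. A minimal sketch of how such a calculation might look follows, assuming the standard `ReadsPerGene.out.tab` layout (four summary rows starting with `N_`, then one row per gene with unstranded, strand-1, and strand-2 counts) and the usual RPKM definition, reads × 10⁹ / (gene length in bp × total mapped reads). The gene names, gene length, total read count, and the choice of the unstranded column are hypothetical and are not taken from the manuscript.

```python
import io
from typing import Dict, TextIO


def read_star_counts(handle: TextIO, column: int = 1) -> Dict[str, int]:
    """Parse a STAR ReadsPerGene.out.tab file into {gene_id: count}.

    column=1 selects the unstranded counts; 2 and 3 select the stranded ones.
    """
    counts: Dict[str, int] = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines and the N_unmapped / N_multimapping /
        # N_noFeature / N_ambiguous summary rows at the top of the file.
        if len(fields) < 4 or fields[0].startswith("N_"):
            continue
        counts[fields[0]] = int(fields[column])
    return counts


def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of gene per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical file contents with made-up gene names and counts.
example_tab = (
    "N_unmapped\t10\t10\t10\n"
    "N_multimapping\t5\t5\t5\n"
    "N_noFeature\t3\t3\t3\n"
    "N_ambiguous\t1\t1\t1\n"
    "geneA\t500\t250\t250\n"
    "geneB\t20\t10\t10\n"
)
counts = read_star_counts(io.StringIO(example_tab))
# Assumed values: geneA is 1000 bp long and the STAR log reports 1,000,000 mapped reads.
print(rpkm(counts["geneA"], 1000, 1_000_000))  # 500.0
```

The total mapped read count is passed in as a plain number here; in practice it would be read from the STAR log file named in the quoted sentence.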
Data from additional tools added to each annotation on a weekly basis.
About SciScore
SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore is not a substitute for expert review. SciScore checks for the presence and correctness of RRIDs (research resource identifiers) in the manuscript, and detects sentences that appear to be missing RRIDs. SciScore also checks to make sure that rigor criteria are addressed by authors. It does this by detecting sentences that discuss criteria such as blinding or power analysis. SciScore does not guarantee that the rigor criteria that it detects are appropriate for the particular study. Instead it assists authors, editors, and reviewers by drawing attention to sections of the manuscript that contain or should contain various rigor criteria and key resources. For details on the results shown here, including references cited, please follow this link.
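To make the idea of RRID detection concrete, here is a small sketch that pulls RRIDs out of free text with a regular expression. It is not SciScore's implementation, which is not described here; the pattern, the function name `extract_rrids`, and the sample sentence are illustrative assumptions, and the sketch only finds identifiers written with an explicit `RRID:` prefix without checking that they are valid.

```python
import re
from typing import List

# Matches identifiers such as "RRID:SCR_003496": the "RRID:" prefix, a
# letter-only registry code, an underscore, then an alphanumeric accession.
RRID_PATTERN = re.compile(r"\bRRID:\s*([A-Za-z]+_[A-Za-z0-9]+)")


def extract_rrids(text: str) -> List[str]:
    """Return all RRID accessions found in text, in order of appearance."""
    return RRID_PATTERN.findall(text)


sample = (
    "Reads were filtered against RefSeq (RefSeq, RRID:SCR_003496) and "
    "counted with STAR (STAR, RRID:SCR_015899); no identifier is given "
    "for MiniSeq."
)
print(extract_rrids(sample))  # ['SCR_003496', 'SCR_015899']
```

A sentence that names a tool but yields an empty list would be a candidate for the "missing RRID" flag described above.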