Employing a systematic approach to biobanking and analyzing clinical and genetic data for advancing COVID-19 research

Abstract

Within the GEN-COVID Multicenter Study, biospecimens from more than 1,000 SARS-CoV-2-positive individuals have so far been collected in the GEN-COVID Biobank (GCB). Sample types include whole blood, plasma, serum, leukocytes, and DNA. The GCB links samples to detailed clinical data available in the GEN-COVID Patient Registry (GCPR). The cohort includes hospitalized patients (74.25%), subdivided into those who were intubated, treated with CPAP-BiPAP, treated with O2 supplementation, or given no respiratory support (9.5%, 18.4%, 31.55%, and 14.8%, respectively), and non-hospitalized subjects (25.75%), who were either pauci- or asymptomatic. More than 150 clinical patient-level data fields have been collected and binarized for statistical analysis according to the organs and systems primarily affected by COVID-19: heart, liver, pancreas, kidney, chemosensors, innate or adaptive immunity, and clotting system. Hierarchical clustering analysis identified five main clinical categories: (1) severe multisystemic failure with either thromboembolic or pancreatic variant; (2) cytokine storm type, either severe with liver involvement or moderate; (3) moderate heart type, either with or without liver damage; (4) moderate multisystemic involvement, either with or without liver damage; (5) mild, either with or without hyposmia. GCB and GCPR are further linked to the GEN-COVID Genetic Data Repository (GCGDR), which includes data from whole-exome sequencing and high-density SNP genotyping. The data are available for sharing through the Network for Italian Genomes, in its dedicated COVID-19 section. The study objective is to systematize this comprehensive data collection and to begin identifying multi-organ involvement in COVID-19, defining genetic parameters for infection susceptibility within the population, and genetically mapping COVID-19 severity and clinical complexity among patients.
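To make the clustering step concrete, the following is a minimal sketch of agglomerative (hierarchical) clustering over binarized organ/system flags. It is not the authors' pipeline: the abstract does not state the distance metric or linkage used, so the Hamming distance and average linkage below are assumptions, and the four toy patients, the `SYSTEMS` list, and all function names are invented for illustration.

```python
from itertools import combinations

# Hypothetical column order for the binarized flags (1 = involved, 0 = not).
SYSTEMS = ["heart", "liver", "pancreas", "kidney",
           "chemosensors", "immunity", "clotting"]


def hamming(a, b):
    """Fraction of organ/system flags on which two patients differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_linkage(c1, c2, vectors):
    """Mean pairwise Hamming distance between two clusters of patient indices."""
    total = sum(hamming(vectors[i], vectors[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(vectors, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], vectors),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Invented toy patients, NOT study data.
patients = [
    [1, 1, 1, 1, 0, 1, 1],  # multisystemic involvement
    [1, 1, 0, 1, 0, 1, 1],  # multisystemic involvement, pancreas spared
    [0, 0, 0, 0, 1, 0, 0],  # mild, with hyposmia
    [0, 0, 0, 0, 0, 0, 0],  # mild, without hyposmia
]

clusters = sorted(agglomerate(patients, n_clusters=2))
print(clusters)
```

On this toy input the two multisystemic patients end up in one group and the two mild patients in the other. A real analysis would more likely rely on an established library and would need to justify the choice of metric, linkage, and number of clusters.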

Article activity feed

  1. SciScore for 10.1101/2020.07.24.20161307:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement
    IRB: It started its activity on March 16, 2020, following approval by the Ethical Review Board of the Promoter Center, University of Siena (Protocol n. 16929, approval dated March 16, 2020).
    Consent: Written informed consent was obtained from all individuals who contributed samples and data.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: Heart involvement was considered on the basis of one or more of the following abnormal data: a cardiac Troponin T (cTnT) value higher than the reference range (<15 ng/L) (indicative of ischemic disorder), an increase in the N-terminal (NT)-pro hormone BNP (NT-proBNP) value (reference value <88 pg/ml for males and <153 pg/ml for females) (indicative of heart failure), and the presence of arrhythmias (indicative of electric disorder).
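The heart-involvement rule quoted in the last row can be read as a simple binarization step. The sketch below encodes it under stated assumptions: a value equal to the reference limit is treated as abnormal, a missing measurement is treated as not abnormal, and the function and constant names are hypothetical rather than taken from the manuscript.

```python
from typing import Optional

# Reference limits quoted in the manuscript excerpt above.
CTNT_LIMIT_NG_L = 15.0
NT_PROBNP_LIMIT_PG_ML = {"male": 88.0, "female": 153.0}


def heart_involvement(
    ctnt_ng_l: Optional[float],
    nt_probnp_pg_ml: Optional[float],
    sex: str,
    arrhythmia: bool,
) -> int:
    """Return 1 if at least one cardiac criterion is abnormal, else 0.

    Assumptions: values at or above the limit count as abnormal, and
    missing values (None) never trigger the flag.
    """
    abnormal_ctnt = ctnt_ng_l is not None and ctnt_ng_l >= CTNT_LIMIT_NG_L
    limit = NT_PROBNP_LIMIT_PG_ML[sex]  # sex-specific reference value
    abnormal_bnp = nt_probnp_pg_ml is not None and nt_probnp_pg_ml >= limit
    return int(abnormal_ctnt or abnormal_bnp or arrhythmia)


# The same NT-proBNP value is abnormal for a male but not for a female.
print(heart_involvement(10.0, 100.0, "male", False))    # 1
print(heart_involvement(10.0, 100.0, "female", False))  # 0
```

The example illustrates why the excerpt was picked up under "Sex as a biological variable": an NT-proBNP of 100 pg/ml exceeds the male reference value but not the female one.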

    Table 2: Resources

    Software and Algorithms

    Sentence: To achieve this overall aim, the following specific objectives are being pursued: i) to perform sequencing (WES) on 2,000 COVID-19 patient samples [performed by the University of Siena (UNISI)]; ii) to perform genotyping (GWAS) on 2000 COVID-19 patients [performed by the Institute for Molecular Medicine of Finland (FIMM)]; iii) to associate the host genetic data obtained on 2,000 COVID-19 patients with severity and prognosis; iv) to share phenotypic data and samples across the GEN-COVID consortium platform as well as in cooperation with research institutions and national platforms through the GEN-COVID Disease Registry and Biobank; v) to share genetic data through the Network of Italian Genome (NIG: http://www.nig.cineca.it/, NIG database: http://nigdb.cineca.it) at CINECA, the largest Italian computing center.
    Resource: Biobank
    Suggested: (HIV Biobank, RRID:SCR_004691)

    Sentence: Library enrichment was tested by qPCR and the size distribution and concentration were determined using Agilent Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA).
    Resource: Agilent Bioanalyzer
    Suggested: None

    Sentence: Quality checks (SNP calling quality, cluster separation, and Mendelian and replication error) were done using GenomeStudio analysis software (Illumina).
    Resource: GenomeStudio
    Suggested: (GenomeStudio, RRID:SCR_010973)

    Sentence: The computer package Plink v1.90 [8] was used to process 700k SNP-genotyping data and to calculate SNP genotype statistics.
    Resource: Plink
    Suggested: (PLINK, RRID:SCR_001757)

    Sentence: The resulting plot is obtained with the Python Seaborn package.
    Resource: Python
    Suggested: (IPython, RRID:SCR_001658)
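As a rough illustration of what "SNP genotype statistics" can mean in the Plink row above, the sketch below computes a per-SNP call rate and minor allele frequency from genotypes coded as alternate-allele counts. This is not PLINK's implementation and the manuscript excerpt does not say which statistics were computed; the 0/1/2/None coding, the function name `snp_stats`, and the toy genotypes are assumptions made for the example.

```python
def snp_stats(genotypes):
    """Per-SNP call rate and minor allele frequency (MAF).

    genotypes: alternate-allele counts per sample (0, 1, or 2),
    with None marking a missing call. Assumes diploid, biallelic SNPs.
    """
    if not genotypes:
        raise ValueError("at least one sample is required")
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    if not called:
        # No usable calls, so the allele frequency is undefined.
        return {"call_rate": call_rate, "maf": None}
    alt_freq = sum(called) / (2 * len(called))  # two alleles per sample
    return {"call_rate": call_rate, "maf": min(alt_freq, 1 - alt_freq)}


# Invented toy genotypes for eight samples, NOT study data.
stats = snp_stats([0, 1, 2, None, 1, 0, 0, 1])
print(stats)
```

Here seven of eight samples are called (call rate 0.875) and five of the fourteen called alleles are alternate, giving a MAF of 5/14. In practice these values would come from the genotyping software's own frequency and missingness reports rather than hand-written code.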

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible.


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy-to-digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding.