Functional prediction and comparative population analysis of variants in genes for proteases and innate immunity related to SARS-CoV-2 infection


Abstract

The new coronavirus SARS-CoV-2 can infect humans and causes a novel disease, COVID-19. To understand the host genetic component of COVID-19, we focused on variants in genes encoding proteases and genes involved in innate immunity that could be important for susceptibility and resistance to SARS-CoV-2 infection.

Analysis of sequence data for the coding regions of the FURIN, PLG, PRSS1, TMPRSS11a, MBL2 and OAS1 genes in 143 unrelated individuals from the Serbian population identified 22 variants with a potential functional effect. In silico analyses (PolyPhen-2, SIFT, MutPred2 and Swiss-Pdb Viewer) predicted that 10 variants could impact the structure and/or function of the encoded proteins. These protein-altering variants (p.Gly146Ser in FURIN; p.Arg261His and p.Ala494Val in PLG; p.Asn54Lys in PRSS1; p.Arg52Cys, p.Gly54Asp and p.Gly57Glu in MBL2; p.Arg47Gln, p.Ile99Val and p.Arg130His in OAS1) may have predictive value for inter-individual differences in the response to SARS-CoV-2 infection.
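The variant names above follow the HGVS protein-level convention (reference residue, position, substituted residue). As a minimal sketch, assuming only simple missense substitutions written with three-letter amino acid codes, the notation can be parsed as follows. The names `ProteinSubstitution`, `parse_substitution` and `short_form` are hypothetical helpers introduced for illustration and are not part of the study's pipeline.

```python
import re
from typing import NamedTuple

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


class ProteinSubstitution(NamedTuple):
    """Hypothetical container for one missense substitution."""
    ref: str       # reference residue, three-letter code
    position: int  # 1-based residue position
    alt: str       # substituted residue, three-letter code


# Matches only simple substitutions such as "p.Gly146Ser".
_PATTERN = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_substitution(hgvs: str) -> ProteinSubstitution:
    """Parse a simple HGVS protein substitution; raise ValueError otherwise."""
    match = _PATTERN.match(hgvs.strip())
    if match is None:
        raise ValueError(f"not a simple protein substitution: {hgvs!r}")
    ref, pos, alt = match.groups()
    if ref not in THREE_TO_ONE or alt not in THREE_TO_ONE:
        raise ValueError(f"unknown amino acid code in {hgvs!r}")
    return ProteinSubstitution(ref, int(pos), alt)


def short_form(sub: ProteinSubstitution) -> str:
    """Return the one-letter shorthand, e.g. G146S."""
    return f"{THREE_TO_ONE[sub.ref]}{sub.position}{THREE_TO_ONE[sub.alt]}"


if __name__ == "__main__":
    # Variants taken from the abstract, grouped by gene.
    variants = {
        "FURIN": ["p.Gly146Ser"],
        "MBL2": ["p.Arg52Cys", "p.Gly54Asp", "p.Gly57Glu"],
    }
    for gene, names in variants.items():
        for name in names:
            print(gene, name, short_form(parse_substitution(name)))
```

The sketch deliberately rejects anything other than a single-residue substitution (frameshifts, stop-gain variants, deletions), since only missense variants are listed in the abstract.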

Next, we performed a comparative population analysis of the same variants using data extracted from the 1000 Genomes Project. Population genetic variability was assessed using delta MAF and Fst statistics. Our study pointed to 7 variants in the PLG, TMPRSS11a, MBL2 and OAS1 genes with noticeable divergence in allele frequencies between populations worldwide. Three of them, all in the MBL2 gene, were predicted to be damaging, making them the most promising population-specific markers related to SARS-CoV-2 infection.
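To make the two statistics concrete, here is a minimal sketch for a single biallelic locus. It assumes that delta MAF means the absolute difference in minor allele frequency between two populations, and it uses Wright's heterozygosity-based definition of Fst with equally weighted populations. The abstract does not say which Fst estimator the authors used, so this may differ from their calculation. The function names and the example frequencies are hypothetical.

```python
from typing import Sequence


def delta_maf(maf_a: float, maf_b: float) -> float:
    """Absolute difference in minor allele frequency between two populations
    (assumed definition of delta MAF)."""
    return abs(maf_a - maf_b)


def wright_fst(freqs: Sequence[float]) -> float:
    """Wright's Fst = (H_T - H_S) / H_T for one biallelic locus.

    freqs holds the frequency of the same allele in each population;
    populations are weighted equally (an assumption of this sketch).
    """
    if not freqs:
        raise ValueError("at least one population frequency is required")
    if any(p < 0.0 or p > 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    n = len(freqs)
    p_bar = sum(freqs) / n
    # Expected heterozygosity of the pooled population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        # Locus is monomorphic everywhere: no differentiation to measure.
        return 0.0
    # Mean expected heterozygosity within populations.
    h_sub = sum(2.0 * p * (1.0 - p) for p in freqs) / n
    return (h_total - h_sub) / h_total


if __name__ == "__main__":
    # Hypothetical frequencies of one allele in two populations.
    pop_a, pop_b = 0.10, 0.50
    print("delta MAF:", delta_maf(pop_a, pop_b))
    print("Fst:", wright_fst([pop_a, pop_b]))
```

Under this definition Fst is 0 when the populations share the same allele frequency and 1 when each population is fixed for a different allele, which is why larger values flag variants as candidate population-specific markers.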

Comparing allele frequencies between the Serbian and other populations, we found that genetic divergence at the selected loci was highest with African populations, followed by East Asian, Central and South American, and South Asian populations. Among European populations, the highest divergence was observed with the Italian population.

In conclusion, we identified 4 variants in genes encoding proteases (FURIN, PLG and PRSS1) and 6 in genes involved in innate immunity (MBL2 and OAS1) that might be relevant for the host response to SARS-CoV-2 infection.

Article activity feed

  1. SciScore for 10.1101/2020.05.13.093690:

    Please note, not all rigor criteria are appropriate for all manuscripts.

    Table 1: Rigor

    Institutional Review Board Statement:
        Consent: Written informed consent was obtained from all participants.
        IRB: The study was conducted in accordance with the Helsinki Declaration and approved by the Ethics Committee of the Institute of Molecular Genetics and Genetic Engineering, University of Belgrade.
    Randomization: not detected.
    Blinding: not detected.
    Power Analysis: not detected.
    Sex as a biological variable: A total of 143 unrelated Serbian individuals (84 males and 59 females) were previously analyzed by an NGS approach using the Illumina Clinical Exome Sequencing TruSight One Gene Panel (Illumina, San Diego, CA, USA), as previously described [23].

    Table 2: Resources

    Software and Algorithms
    Sentences / Resources

    Sentence: A total of 143 unrelated Serbian individuals (84 males and 59 females) were previously analyzed by an NGS approach using the Illumina Clinical Exome Sequencing TruSight One Gene Panel (Illumina, San Diego, CA, USA), as previously described [23].
        Resource: NGS
        suggested: (PM4NGS, RRID:SCR_019164)

    Sentence: Genotype data were extracted from the VCF files of Phase 3 variant calls of the 1000 Genomes Project (1kGP) sample collection (https://www.internationalgenome.org/) via the Ensembl Data Slicer Tool.
        Resource: 1000 Genomes Project
        suggested: (1000 Genomes Project and AWS, RRID:SCR_008801)

    Sentence: To predict the effect of nonsynonymous amino acid substitutions, we used in silico prediction algorithms: PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2), SIFT/PROVEAN (http://provean.jcvi.org/index.php) and MutPred2 (http://mutpred.mutdb.org/).
        Resource: PolyPhen-2
        suggested: None
        Resource: MutPred2
        suggested: None
        Resource: http://mutpred.mutdb.org/
        suggested: (MutPred, RRID:SCR_010778)

    Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).


    Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

    Results from TrialIdentifier: No clinical trial numbers were referenced.


    Results from Barzooka: We did not find any issues relating to the usage of bar graphs.


    Results from JetFighter: We did not find any issues relating to colormaps.


    Results from rtransparent:
    • Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
    • Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
    • No protocol registration statement was detected.

    About SciScore

    SciScore is an automated tool that is designed to assist expert reviewers by finding and presenting formulaic information scattered throughout a paper in a standard, easy to digest format. SciScore checks for the presence and correctness of RRIDs (research resource identifiers), and for rigor criteria such as sex and investigator blinding. For details on the theoretical underpinning of rigor criteria and the tools shown here, including references cited, please follow this link.