Impact of natural selection on global patterns of genetic variation and association with clinical phenotypes at genes involved in SARS-CoV-2 infection
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Viruses have been strong sources of selective pressure throughout human evolutionary history. Investigating genetic diversity and detecting signatures of natural selection at host genes related to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection help to identify functionally important variation. We conducted a large study of global genomic variation at host genes that play a role in SARS-CoV-2 infection, with a focus on underrepresented African populations. We identified nonsynonymous and regulatory variants at ACE2 that appear to be targets of recent natural selection in some African populations. We detected evidence of ancient adaptive evolution at TMPRSS2 in the human lineage. Genetic variants that are targets of natural selection are associated with clinical phenotypes common in patients with coronavirus disease 2019.
Article activity feed
SciScore for 10.1101/2021.06.28.21259529
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms
- Sentence: Genomic data: The genomic data used in this study were from three sources: the Africa 6K project (referred to as the “African Diversity” dataset) which is part of the TopMed consortium [25], the 1000 Genomes project (1KG) [26], and the Penn Medicine BioBank (PMBB).
  Resource: TopMed (suggested: None)
- Sentence: Whole genome sequencing (WGS) was performed to a median depth of 30X using DNA isolated from blood, PCR-free library construction and Illumina HiSeq X technology, as described elsewhere [25].
  Resource: WGS (suggested: None)
- Sentence: The 1000 Genomes datasets with super population ancestry labels (EUR, AFR, EAS, SAS, Other) were used as QDA training datasets to determine the genetic ancestry labels for the PMBB population.
  Resource: 1000 Genomes (suggested: 1000 Genomes Project and AWS, RRID:SCR_008801)
- Sentence: Variant annotations: We used Ensembl Variant Effect Predictor (VEP) for variant annotations [27].
  Resource: Ensembl Variant Effect Predictor (suggested: None)
  Resource: Variant (suggested: VARIANT, RRID:SCR_005194)
- Sentence: For pathogenicity predictions, we used CADD [28], SIFT [29], PolyPhen [30], Condel [31], and REVEL scores in Ensembl.
  Resource: Ensembl (suggested: Ensembl, RRID:SCR_002344)
- Sentence: We visualized the location of these regulatory and eQTL variants using the UCSC genome browser and highlighted the variants using Adobe Illustrator.
  Resource: Adobe Illustrator (suggested: Adobe Illustrator, RRID:SCR_010279)
- Sentence: Association Testing: We used the R SKAT package for conducting a gene-based dispersion test and Biobin [37, 38] for gene burden analysis.
  Resource: SKAT (suggested: SKAT, RRID:SCR_009396)
- Sentence: The chimpanzee sequence (Clint_PTRv2/panTro6) used in the analysis was obtained from the UCSC genome browser.
  Resource: UCSC genome browser (suggested: UCSC Genome Browser, RRID:SCR_005780)
- Sentence: The overlapped SNPs were uploaded to the UCSC browser for visualization.
  Resource: UCSC browser (suggested: None)
- Sentence: The ChIP-seq density dataset was obtained from http://remap.univ-amu.fr/ [33]. DNase-seq and ChIP-seq clusters, layered H3K4Me3 (often found near Promoters), H3K4Me1 and H3K27Ac (often found near Regulatory Elements) data are from ENCODE [32].
  Resource: ChIP-seq (suggested: ChIP-seq, RRID:SCR_001237)
Results from OddPub: Thank you for sharing your data.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.
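The resource table above quotes a "gene burden analysis" among the association tests. As a rough illustration of the general idea only, the sketch below collapses the rare variants in one gene into a per-individual carrier flag and compares carrier counts between cases and controls with a two-sided Fisher's exact test. It is not the BioBin or SKAT implementation and does not reproduce the authors' pipeline. The function names, the genotype layout (rows are individuals, entries are alternate-allele counts), and the pre-selected list of rare-variant columns are all assumptions made for this example.

```python
from math import comb


def carrier_flags(genotypes, rare_variant_idx):
    """Collapse rare variants: 1 if an individual carries an alternate
    allele at any of the selected sites, else 0 (hypothetical helper)."""
    return [int(any(row[i] > 0 for i in rare_variant_idx)) for row in genotypes]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, col1 = a + b, a + c

    def prob(x):
        # Hypergeometric probability of x carriers among the first row.
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)

    p_obs = prob(a)
    lo = max(0, row1 - (n - col1))
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def burden_test(genotypes, is_case, rare_variant_idx):
    """Return the carrier-by-status table and its Fisher p-value for one gene."""
    flags = carrier_flags(genotypes, rare_variant_idx)
    case_carriers = sum(f for f, s in zip(flags, is_case) if s)
    case_noncarriers = sum(1 - f for f, s in zip(flags, is_case) if s)
    ctrl_carriers = sum(f for f, s in zip(flags, is_case) if not s)
    ctrl_noncarriers = sum(1 - f for f, s in zip(flags, is_case) if not s)
    table = (case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers)
    return table, fisher_exact_two_sided(*table)


# Toy data (made up for illustration): 4 cases, 4 controls, 3 rare sites.
toy_genotypes = [
    [1, 0, 0], [0, 1, 0], [0, 0, 2], [1, 1, 0],  # cases
    [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0],  # controls
]
toy_is_case = [True] * 4 + [False] * 4
toy_table, toy_p = burden_test(toy_genotypes, toy_is_case, [0, 1, 2])
print(toy_table, round(toy_p, 4))
```

A collapsing test of this kind assumes the rare variants act in the same direction. Dispersion tests such as SKAT are designed for the case where they do not, which may be why both kinds of test are mentioned.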