Novel Immunoglobulin Domain Proteins Provide Insights into Evolution and Pathogenesis Mechanisms of SARS-Related Coronaviruses
Listed in
- Evaluated articles (ScreenIT)
Abstract
A novel coronavirus (SARS-CoV-2) is the causative agent of an emergent severe respiratory disease (COVID-19) in humans that is threatening to result in a global health crisis. By using genomic, sequence, structural and evolutionary analyses, we show that Alpha- and Beta-CoVs possess several novel families of immunoglobulin (Ig) domain proteins, including ORF8 and ORF7a from SARS-related coronaviruses and two protein groups from certain Alpha-CoVs. Among them, ORF8 is distinguished in being rapidly evolving, possessing a unique insert and a hypervariable position among SARS-CoV-2 genomes in its predicted ligand-binding groove. We also uncover many Ig proteins from several metazoan viruses which are distinct in sequence and structure but share an architecture comparable to that of CoV Ig domain proteins. Hence, we propose that deployment of Ig domain proteins is a widely used strategy by viruses, and that SARS-CoV-2 ORF8 is a potential pathogenicity factor which evolves rapidly to counter the immune response and facilitate transmission between hosts.
Article activity feed
SciScore for 10.1101/2020.03.04.977736:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
NIH rigor criteria are not applicable to paper type.
Table 2: Resources
Software and Algorithms

| Sentences | Resources |
|---|---|
| The program CD-HIT was used for similarity-based clustering (13). | CD-HIT, suggested: (CD-HIT, RRID:SCR_007105) |
| Based on the MSA, a similarity plot was constructed by a custom Python script, which calculated the identity between each subject sequence and the SARS-CoV-2 genome sequence based on a custom sliding window size and step size. | Python, suggested: (IPython, RRID:SCR_001658) |
| Similarity-based clustering was conducted by BLASTCLUST, a BLAST score-based single-linkage clustering method (ftp://ftp.ncbi.nih.gov/blast/documents/blastclust.html). | BLASTCLUST, suggested: (BLASTClust, RRID:SCR_016641) |
| Multiple sequence alignments were built by the KALIGN (14), MUSCLE (16) and PROMALS3D (17) programs, followed by careful manual adjustments based on the profile–profile alignment, the secondary structure information and the structural alignment. | KALIGN, suggested: (Kalign, RRID:SCR_011810) |
| The alignments were colored using an in-house alignment visualization program written in perl and further modified using adobe illustrator. | adobe illustrator, suggested: (Adobe Illustrator, RRID:SCR_010279) |
| Identification of distinct viral Ig domain proteins: By using the protein remote relationship detection methods, we generated a collection of distinct Ig domains from the Pfam database (21) and also from our local domain database. | Pfam, suggested: (Pfam, RRID:SCR_004726) |
| Then, we utilized the hmmscan program of the HMMER package (22) and RPS-BLAST (12, 23) to retrieve the homologs from viral genomes. | HMMER, suggested: (Hmmer, RRID:SCR_005305) |
| The tree diagram was generated using MEGA Tree Explorer (26) Entropy analysis: Position-wise Shannon entropy (H) for a given multiple sequence alignment was calculated using the equation: P is the fraction of residues of amino acid type i, and M is the number of amino acid types. | MEGA, suggested: (Mega BLAST, RRID:SCR_011920) |
| Since in these low sequence-identity cases, sequence alignment is the most important factor affecting the quality of the model (Cozzetto and Tramontano, 2005), alignments used in this study have been carefully built and cross-validated based on the information from HHpred and edited manually using the secondary structure information. | HHpred, suggested: (HHpred, RRID:SCR_010276) |
| Structural analysis and comparison were conducted using the molecular visualization program PyMOL (30). | PyMOL, suggested: (PyMOL, RRID:SCR_000305) |

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
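Two of the calculations quoted in Table 2, the sliding-window identity plot and the position-wise Shannon entropy, are described only in words, and the entropy equation itself did not survive in the quoted sentence. The following is a minimal Python sketch of how such calculations might look, not the authors' script. The function names, the default window and step sizes, the treatment of gap characters, and the use of a base-2 logarithm are all assumptions made for illustration.

```python
import math
from collections import Counter


def window_identity(ref, subj, window=6, step=3):
    """Percent identity between two aligned sequences in sliding windows.

    Both strings must come from the same alignment (equal length).
    Assumption: columns where both sequences have a gap are skipped.
    Returns a list of (window_start, percent_identity) tuples.
    """
    if len(ref) != len(subj):
        raise ValueError("aligned sequences must have equal length")
    points = []
    for start in range(0, len(ref) - window + 1, step):
        ref_win = ref[start:start + window]
        subj_win = subj[start:start + window]
        pairs = [
            (r, s) for r, s in zip(ref_win, subj_win)
            if not (r == "-" and s == "-")
        ]
        matches = sum(1 for r, s in pairs if r == s)
        identity = 100.0 * matches / len(pairs) if pairs else 0.0
        points.append((start, identity))
    return points


def column_entropy(column):
    """Shannon entropy of one alignment column.

    H = -sum(P_i * log2(P_i)), where P_i is the fraction of residues
    of type i in the column. Assumptions: gaps are ignored and the
    logarithm is base 2, so the result is in bits.
    """
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

For example, `window_identity("ACGTACGT", "ACGTTCGT", window=4, step=4)` yields one point per non-overlapping window, and `column_entropy("AACC")` gives 1 bit for a column split evenly between two residue types. A fully conserved column gives zero, so high values flag variable positions.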
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
