Novel Immunoglobulin Domain Proteins Provide Insights into Evolution and Pathogenesis of SARS-CoV-2-Related Viruses

Abstract

The ongoing COVID-19 pandemic strongly emphasizes the need for a more complete understanding of the biology and pathogenesis of its causative agent, SARS-CoV-2. Despite intense scrutiny, several proteins encoded by the genomes of SARS-CoV-2 and other SARS-like coronaviruses remain enigmatic. Moreover, the high infectivity of SARS-CoV-2 and its severity in certain individuals currently make wet-lab studies challenging. In this study, we used a series of computational strategies to identify several fast-evolving regions of SARS-CoV-2 proteins that are potentially under host immune pressure. Most notably, the hitherto-uncharacterized protein encoded by ORF8 is one of them. Using sensitive sequence and structural analysis methods, we show that ORF8 and several other proteins from alpha- and beta-coronaviruses comprise novel families of immunoglobulin domain proteins, which might function as immune modulators to delay or attenuate the host immune response against the viruses.

SciScore for 10.1101/2020.03.04.977736:

Please note, not all rigor criteria are appropriate for all manuscripts.

Table 1: Rigor

NIH rigor criteria are not applicable to paper type.

Table 2: Resources

Software and Algorithms
Sentences	Resources
The program CD-HIT was used for similarity-based clustering (13).	CD-HIT suggested: (CD-HIT, RRID:SCR_007105)
Based on the MSA, a similarity plot was constructed by a custom Python script, which calculated the identity between each subject sequence and the SARS-CoV-2 genome sequence based on a custom sliding window size and step size.	Python suggested: (IPython, RRID:SCR_001658)
Similarity-based clustering was conducted by BLASTCLUST, a BLAST score-based single-linkage clustering method (ftp://ftp.ncbi.nih.gov/blast/documents/blastclust.html).	BLASTCLUST suggested: (BLASTClust, RRID:SCR_016641)
Multiple sequence alignments were built by the KALIGN (14), MUSCLE (16) and PROMALS3D (17) programs, followed by careful manual adjustments based on the profile–profile alignment, the secondary structure information and the structural alignment.	KALIGN suggested: (Kalign, RRID:SCR_011810)
The alignments were colored using an in-house alignment visualization program written in Perl and further modified using Adobe Illustrator.	adobe illustrator suggested: (Adobe Illustrator, RRID:SCR_010279)
Identification of distinct viral Ig domain proteins: By using the protein remote relationship detection methods, we generated a collection of distinct Ig domains from the Pfam database (21) and also from our local domain database.	Pfam suggested: (Pfam, RRID:SCR_004726)
Then, we utilized the hmmscan program of the HMMER package (22) and RPS-BLAST (12, 23) to retrieve the homologs from viral genomes.	HMMER suggested: (Hmmer, RRID:SCR_005305)
The tree diagram was generated using MEGA Tree Explorer (26). Entropy analysis: Position-wise Shannon entropy (H) for a given multiple sequence alignment was calculated using the equation $H = -\sum_{i=1}^{M} P_i \log_2 P_i$, where $P_i$ is the fraction of residues of amino acid type i and M is the number of amino acid types.	MEGA suggested: (Mega BLAST, RRID:SCR_011920)
Since in these low sequence-identity cases, sequence alignment is the most important factor affecting the quality of the model (Cozzetto and Tramontano, 2005), alignments used in this study have been carefully built and cross-validated based on the information from HHpred and edited manually using the secondary structure information.	HHpred suggested: (HHpred, RRID:SCR_010276)
Structural analysis and comparison were conducted using the molecular visualization program PyMOL (30).	PyMOL suggested: (PyMOL, RRID:SCR_000305)
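
Two of the quoted methods, the sliding-window similarity plot and the position-wise Shannon entropy, are simple enough to illustrate. The following is a minimal Python sketch, not the authors' script (no code was released with the preprint). The function names, the default window and step sizes, the gap handling, and the base-2 logarithm in H = -sum(P_i * log2(P_i)) are all assumptions made for illustration.

```python
import math
from collections import Counter


def sliding_identity(ref, subj, window=200, step=50):
    """Percent identity between two aligned sequences, per sliding window.

    `ref` and `subj` are rows taken from the same multiple sequence
    alignment, so they must have equal length. Columns where both rows
    are gaps are skipped (an assumption; the source does not say how
    gaps were treated). Returns a list of (window_start, percent_identity).
    """
    if len(ref) != len(subj):
        raise ValueError("aligned sequences must have equal length")
    points = []
    for start in range(0, len(ref) - window + 1, step):
        pairs = [
            (a, b)
            for a, b in zip(ref[start:start + window], subj[start:start + window])
            if not (a == "-" and b == "-")
        ]
        if not pairs:
            continue  # window consists only of shared gaps
        matches = sum(1 for a, b in pairs if a == b)
        points.append((start, 100.0 * matches / len(pairs)))
    return points


def column_entropy(column, gap="-"):
    """Shannon entropy (in bits) of one alignment column, ignoring gaps."""
    residues = [c for c in column if c != gap]
    if not residues:
        return 0.0
    total = len(residues)
    fractions = (count / total for count in Counter(residues).values())
    return sum(-p * math.log2(p) for p in fractions)


def alignment_entropy(seqs):
    """Position-wise entropy for an alignment given as equal-length strings."""
    return [column_entropy(col) for col in zip(*seqs)]
```

With a base-2 logarithm, a fully conserved column scores 0 and a column in which all 20 amino acids are equally represented scores log2(20), about 4.32. Plotting the output of `sliding_identity` against window start gives a similarity plot of the kind the quoted sentence describes.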

Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).

Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.

Results from TrialIdentifier: No clinical trial numbers were referenced.

Results from Barzooka: We did not find any issues relating to the usage of bar graphs.

Results from JetFighter: We did not find any issues relating to colormaps.

Results from rtransparent:

Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
No protocol registration statement was detected.
