Assessing the evolution of SARS-CoV-2 lineages and the dynamic associations between nucleotide variations
This article has been reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
Despite seminal advances towards understanding the infection mechanism of SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), it continues to cause significant morbidity and mortality worldwide. Though mass immunization programmes have been implemented in several countries, the viral transmission cycle has shown a continuous progression in the form of multiple waves. A constant change in the frequencies of dominant viral lineages, arising from the accumulation of nucleotide variations (NVs) through favourable selection, is understandably expected to be a major determinant of disease severity and possible vaccine escape. Indeed, worldwide efforts have been initiated to identify specific virus lineage(s) and/or NVs that may cause a severe clinical presentation or facilitate vaccination breakthrough. Since host genetics is expected to play a major role in shaping virus evolution, it is imperative to study the role of genome-wide SARS-CoV-2 NVs across various populations. In the current study, we analysed the whole genome sequence of 3543 SARS-CoV-2-infected samples obtained from the state of Telangana, India (including 210 from our previous study), collected over an extended period from April 2020 to October 2021. We present a unique perspective on the evolution of prevalent virus lineages and NVs during this period. We also highlight the presence of specific NVs likely to be associated favourably with samples classified as vaccination breakthroughs. Finally, we report genome-wide intra-host variations at novel genomic positions. The results presented here provide critical insights into virus evolution over an extended period and pave the way to rigorously investigate the role of specific NVs in vaccination breakthroughs.
Article activity feed
-
The work presented is clear and the arguments well formed.
-
This is a study that would be of interest to the field and community. The reviewers have highlighted minor concerns with the work presented. Please ensure that you address their comments.

Please deposit the data underlying the work in the Society’s data repository Figshare account here: https://microbiology.figshare.com/submit. Please also cite this data in the Data Summary of the main manuscript and list it as a unique reference in the References section. When you resubmit your article, the Editorial staff will post this data publicly on Figshare and add the DOI to the Data Summary section where you have cited it. This data will be viewable on the Figshare website with a link to the preprint and vice versa, allowing for greater discovery of your work, and the unique DOI of the data means it can be cited independently.

Please provide more detail in the Methods section and ensure that software is consistently cited and its version and parameters included. The reviewers have provided detailed commentary on your manuscript, which I encourage you to address in full. In particular, there is a need to elaborate further on some of the methodologies used in your study and to improve some of the displayed figures. I look forward to receiving your revised manuscript.

Best wishes,
Dr Andrew Bosworth
-
Comments to Author
This paper builds on work published in 2021 on a smaller set of samples and is a comprehensive, well-written and well-conceived study. The appropriate techniques were used, and ethical approvals obtained. The results were discussed with reference to the relevant literature in a clear and concise manner. I have only a few, very minor, comments on clarifying some of the methodological issues.

Materials and Methods
Some clarification on how many samples provided sequence information would be helpful, perhaps in the Abstract where sample numbers are mentioned. The authors state that the whole genome sequences from 3543 samples were analysed. However, the Materials and Methods section states that samples with rt-PCR Ct values
Please rate the manuscript for methodological rigour
Good
Please rate the quality of the presentation and structure of the manuscript
Very good
To what extent are the conclusions supported by the data?
Strongly support
Do you have any concerns of possible image manipulation, plagiarism or any other unethical practices?
No
Is there a potential financial or other conflict of interest between yourself and the author(s)?
No
If this manuscript involves human and/or animal work, have the subjects been treated in an ethical manner and the authors complied with the appropriate guidelines?
Yes
-
Comments to Author
1. Methodological rigour, reproducibility and availability of underlying data
I have some concerns about some of the methodology presented in this paper. Firstly, line 119 specifies that extraction of total RNA occurred at BSL2; were any considerations taken into account for handling a level 3 pathogen? Line 130: You mention "The synthesized cDNA"; however, there is no report of how the cDNA was made. I assume this was part of the kits mentioned in the prior paragraph. Was cDNA generated with oligodT primers and random hexamers? For the in silico methodology, there is no mention of how primer sequences from the ARTIC V3 scheme were removed from sequencing reads in the pipeline. Please specify how you did this. If this hasn't been done, it can contribute to false variant discoveries. I have checked the code on GitHub, and this part is hashed out in the script where you have commented "mask primers using iVar". Please confirm how you accounted for this in your methodology. Line 158: you provide a link to virological.org; please add this to your reference list and specify which parts of the reference you have followed. Further comments regarding methodology: What method was used to translate the nucleotide sequences to amino acid sequences for you to infer phenotype changes?

2. Presentation of results
Figure 1: Please check for colourblind accessibility. Panel 1C is quite difficult to interpret because of the number of different colours; make sure key points are in the figure legend/manuscript.
Figure 2: There is no mention of N, Orf71/7b or nsp5 in the figure legend. However, this is a good visualisation of the variants increasing in frequency over time.
Figure 3: If possible, could you add titles for complete and partial vaccination on panel a? I found myself having to flick between pages to understand this figure, and I think these labels will help the reader to digest the information. This might also be a good opportunity to remind us how many samples were used in the analysis, with N=x in the figure legend or added to the figure. How does the time frame differ between the completely and partially vaccinated groups? I.e. were samples collected across the same time period? Is there any variation in this? Does this explain any of the differences, or are you confident it is vaccination status?
Figure 4: Presents the iSNVs observed over time at selected positions. If appropriate, it would be useful to show the amino acid changes that are associated with the nucleotide changes you are reporting.

3. How the style and organization of the paper communicates and represents key findings
The introduction is concise and sets the tone for the study whilst highlighting the importance of genomic data. One key focus of the study is to consider vaccine escape variants. Perhaps some more references and examples of this in the introduction would be useful. The results go on to describe the rise of the delta variant from March 2021, and how sub-variants of delta were being described from July 2021. They highlight the importance of having a closer look at the NVs in the next section. In the results section, I noticed C14408T being described as P4720L when this should be P323L, which is the hitchhiker mutation for S:D614G that emerged early on in the pandemic. Through an odds ratio analysis, the authors have found mutations that could be associated with breakthrough cases. The order of the results and the associated figures do take you through a logical story for the data that is being discussed.

4. Literature analysis or discussion
The introduction could have more literature to highlight other work that assesses the impact of variants on vaccine breakthrough. I am sure I have seen laboratory studies that talk about the reduced neutralising effect of new variants etc. So even if there is no epidemiological data, it would be good to discuss work that has been conducted experimentally. Likewise in the discussion. The authors highlight that, to their knowledge, this hasn't been done before. However, I believe there is certainly literature that could be brought in here to discuss the importance of their findings. The discussion highlights the pitfalls of the study and reinforces the need to assess this on larger sample sizes.

5. Any other relevant comments
In the abstract, line one, you use "its"; I think it would be clearer to refer to this as SARS-CoV-2 first. In the introduction the authors say that the WHO declared the pandemic in January 2020; this is incorrect and will need updating to March 2020. (Line 41)
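The numbering point raised under item 3 (C14408T and P323L) can be illustrated with a minimal sketch. It is not the authors' annotation method; it only converts a genome position to a codon number, assuming the reference coordinates of NC_045512.2 (ORF1ab starting at 266, nsp12 starting at 13442, and the -1 ribosomal frameshift re-reading base 13468). The function name `codon_index` is hypothetical.

```python
# Genome coordinates (1-based), assumed from reference NC_045512.2.
ORF1A_START = 266       # first base of the ORF1ab start codon
NSP12_START = 13442     # first base of nsp12 (RdRp)
FRAMESHIFT_POS = 13468  # base read twice by the -1 ribosomal frameshift


def codon_index(pos, start):
    """Return the 1-based codon number of genome position `pos` in a
    reading frame that begins at `start` (start <= FRAMESHIFT_POS).

    Positions at or beyond the frameshift are counted in the shifted
    frame, whose first codon begins at FRAMESHIFT_POS itself.
    """
    if pos < FRAMESHIFT_POS:
        return (pos - start) // 3 + 1
    # Complete codons translated before the ribosome slips back one base.
    codons_before = (FRAMESHIFT_POS - start + 1) // 3
    return codons_before + (pos - FRAMESHIFT_POS) // 3 + 1


if __name__ == "__main__":
    pos = 14408
    print("nsp12 codon:", codon_index(pos, NSP12_START))
    print("ORF1ab polyprotein codon:", codon_index(pos, ORF1A_START))
```

Under these assumptions, the same nucleotide position receives a different residue number depending on whether the mature nsp12 protein or the ORF1ab polyprotein is used as the frame of reference, so the manuscript should state which convention it follows.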
Please rate the manuscript for methodological rigour
Satisfactory
Please rate the quality of the presentation and structure of the manuscript
Satisfactory
To what extent are the conclusions supported by the data?
Partially support
Do you have any concerns of possible image manipulation, plagiarism or any other unethical practices?
No
Is there a potential financial or other conflict of interest between yourself and the author(s)?
No
If this manuscript involves human and/or animal work, have the subjects been treated in an ethical manner and the authors complied with the appropriate guidelines?
Yes
-
SciScore for 10.1101/2022.01.19.22269572:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
Ethics: IRB: The work was initiated following approvals from the Institutional Bioethics committee and Biosafety committee.
Sex as a biological variable: Sample collection strategy, dataset structure and features: A total of 3543 samples (1407 females and 2091 males (information unavailable for 45 samples)), representing the period April 1, 2020 to October 31, 2021, and belonging to Telangana, India, were analysed in this study (Table S1A).
Randomization: Odds ratio for estimating the association likelihoods of genomic alterations with vaccination breakthrough cases were estimated by creating contingency matrices for each NV identified in >5% of vaccinated samples and were compared with multiple random subsamples of non-vaccinated cases starting March 2021 onwards.
Blinding: not detected.
Power Analysis: not detected.
Cell Line Authentication: Authentication: The synthesized cDNA was amplified using a multiplex polymerase chain reaction (PCR) protocol, producing 98 amplicons across the SARS-CoV-2 genome (https://artic.network/).

Table 2: Resources
Software and Algorithms
Sentences | Resources
All reads shorter than 30 bases or with a Phred quality score <20, were discarded. | Phred; suggested: (Phred, RRID:SCR_001017)
Reads were assembled to generate consensus fasta file using samtools mpileup and the consensus module of iVar with a base assigned as consensus if it had a minimum depth of at least 10 reads (setting ivarMinDepth=10). | samtools; suggested: (SAMTOOLS, RRID:SCR_002105)
All structural representations were generated in PyMOL (The PyMOL Molecular Graphics System, Schrödinger, LLC). | PyMOL; suggested: (PyMOL, RRID:SCR_000305)

Results from OddPub: Thank you for sharing your data.
Results from LimitationRecognizer: An explicit section about the limitations of the techniques employed in this study was not found. We encourage authors to address study limitations.
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.
Results from scite Reference Check: We found no unreliable references.
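The Randomization entry in Table 1 describes odds ratios computed from contingency matrices for each NV, with vaccinated samples compared against multiple random subsamples of non-vaccinated cases. A minimal sketch of that general idea follows. It is not the authors' code: the function names, the number of draws, the subsample size and the 0.5 continuity correction for empty cells are all assumptions made for illustration, and the inputs are hypothetical lists of booleans (True meaning the sample carries the NV).

```python
import random


def odds_ratio(vacc_with, vacc_without, unvacc_with, unvacc_without):
    """Odds ratio of a 2x2 contingency table.

    Assumption: if any cell is zero, 0.5 is added to every cell
    (Haldane-Anscombe correction) to avoid division by zero.
    """
    cells = [vacc_with, vacc_without, unvacc_with, unvacc_without]
    if 0 in cells:
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    return (a * d) / (b * c)


def subsampled_odds_ratios(vaccinated, unvaccinated, n_draws=100, seed=0):
    """Odds ratios for one NV against repeated random subsamples of the
    non-vaccinated group.

    Assumption: each subsample has the size of the smaller group.
    """
    rng = random.Random(seed)
    k = min(len(vaccinated), len(unvaccinated))
    a = sum(vaccinated)           # vaccinated samples carrying the NV
    b = len(vaccinated) - a       # vaccinated samples without the NV
    ratios = []
    for _ in range(n_draws):
        draw = rng.sample(unvaccinated, k)
        c = sum(draw)             # subsampled non-vaccinated carrying the NV
        ratios.append(odds_ratio(a, b, c, k - c))
    return ratios


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    vaccinated = [True] * 30 + [False] * 70
    unvaccinated = [True] * 40 + [False] * 360
    ratios = sorted(subsampled_odds_ratios(vaccinated, unvaccinated))
    print("median odds ratio:", ratios[len(ratios) // 2])
```

Reporting the spread of the ratios across draws, rather than a single value, would show how sensitive an association is to the choice of non-vaccinated subsample.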
-