Genetic diversity and genomic epidemiology of SARS-CoV-2 during the first 3 years of the pandemic in Morocco: comprehensive sequence analysis, including the unique lineage B.1.528 in Morocco

Abstract

During the 3 years following the emergence of the COVID-19 pandemic, the African continent, like other regions of the world, was substantially impacted by COVID-19. In Morocco, the COVID-19 pandemic has been marked by the emergence and spread of several SARS-CoV-2 variants, leading to a substantial increase in the incidence of infections and deaths. Nevertheless, understanding of the genetic diversity, evolution, and epidemiology of several viral lineages remained limited in Morocco. This study sought to deepen the understanding of the genomic epidemiology of SARS-CoV-2 through a retrospective analysis. The main objective of this study was to analyse the genetic diversity of SARS-CoV-2 and identify distinct lineages, as well as assess their evolution during the pandemic in Morocco, using genomic epidemiology approaches. Furthermore, several key mutations in the functional proteins across different viral lineages were highlighted, along with an analysis of the genetic relationships amongst these strains to better understand their evolutionary pathways. A total of 2274 genomic sequences of SARS-CoV-2 isolated in Morocco between 2020 and 2023 were extracted from the GISAID EpiCoV database and subjected to analysis. Lineages and clades were classified according to the nomenclature of GISAID, Nextstrain, and Pangolin. The study was conducted and reported in accordance with STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines. An exhaustive analysis of 2274 genomic sequences led to the identification of 157 PANGO lineages, including notable lineages such as B.1, B.1.1, B.1.528, and B.1.177, as well as variants such as B.1.1.7, B.1.621, B.1.525, B.1.351, B.1.617.1, and B.1.617.2 with its notable sublineages AY.33, AY.72, AY.112, and AY.121, which evolved over time before being supplanted by Omicron in December 2021. Among the 2274 sequences analysed, Omicron and its subvariants had a prevalence of 59.5%. The most predominant clades were 21K, 21L, and 22B, which are phylogenetically related to BA.1, BA.2, and BA.5, respectively. In June 2022, Morocco saw a rapid resurgence of infections, with the emergence and coexistence of subvariants from clade 22B such as BA.5.2.20, BA.5, BA.5.1, BA.5.2.1, and BF.5, supplanting the subvariants BA.1 (clade 21K) and BA.2 (clade 21L), which became marginal. However, XBB (clade 22F) and its progeny, such as XBB.1.5 (23A), XBB.1.16 (23B), CH.1.1 (23C), XBB.1.9 (23D), XBB.2.3 (23E), EG.5.1 (23F), and XBB.1.5.70 (23G), have evolved sporadically. Furthermore, several notable mutations, such as H69del/V70del, G142D, K417N, T478K, E484K, E484A, L452R, F486P, N501Y, Q613H, D614G, and P681H/R, have been identified. Some of these SARS-CoV-2 mutations are known to be involved in increasing transmissibility, virulence, and antibody escape. This study identified several distinct lineages and mutations underlying the genetic diversity of Moroccan isolates and analysed their evolutionary trends. These findings provide a robust basis for better understanding the distinct mutations and their roles in the variation of transmissibility, pathogenicity, and antigenicity (immune evasion/reinfection). Furthermore, the noteworthy number of distinct lineages identified in Morocco highlights the importance of maintaining continuous surveillance of COVID-19. Moreover, expanding vaccination coverage would also help protect patients against more severe clinical disease.

Article activity feed

  1. Thank you for addressing the reviewer comments. There are, however, a number of points that I do not think have been addressed adequately; please consider making these changes in a revision. (1) The methods at lines 198-201 are not detailed enough to be reproducible. (2) The reviewers raise concerns about the readability of your figures. Whilst several improvements have been made, Fig 5b and 6b are too low resolution to be readable. Whilst I acknowledge that text will be small on a large phylogeny, a higher resolution will allow branch labels to be read when zoomed in.

  2. The reviewers and I agree that the methodology is sound and the insights important. The reviewers both raise concerns that require addressing in a revision. Please pay particular attention to the points raised regarding readability, context, and the appropriateness of variant names.

  3. Comments to Author

    General Comments

    There is a gap in the literature for a genomic analysis of the SARS-CoV-2 pandemic in Morocco covering the Omicron era; this manuscript is therefore novel and adds to the field of SC2 genomic epidemiology. However, I have some concerns about the fabric of the manuscript, specifically in the discussion and results sections. The manuscript exhaustively describes the progression of the SC2 pandemic in Morocco, but much of this information is very similar to the global pandemic progression, and the manuscript would be much stronger if it effectively placed this within a global context. One way to achieve this could be to avoid describing well-described SARS-CoV-2 lineages such as Alpha, Beta, Delta, and Omicron, and to focus specifically on the aspects of the pandemic progression in Morocco which stand out, attempting to explain and/or contextualise these differences with the global pandemic.

    I also noticed that there was only a very small amount of discussion of the B.1.528 lineage. Considering its presence in the title of the manuscript, I was confused that discussion of the lineage was so limited after the large part focused on it in the results section. I would specifically be keen to see discussion of what distinguished it from related or co-occurring lineages, both phylogenetically and genetically, and specifically of the potential impacts of the mutations which define the clade.

    I also suggest that the authors reconsider the usage of geographically named lineages, e.g. "Moroccan variant", "Indian variant", etc. The WHO has strongly discouraged their usage due to concerns about stigmatising peoples associated with those regions.

    Methodological rigour, reproducibility and availability of underlying data

    All tools used were well tested and publicly available. All data used in the analysis is only available via GISAID, which means it is not truly public; however, this is not the responsibility of the manuscript authors. The authors make some unsupported claims, for example the usage of the word "significant" throughout the manuscript without any statistical analysis. Another example is the reference to how the study "highlighted a convergent evolution of several variants particularly the Omicron variant", which is poorly supported. It is true that there is evidence of convergent evolution, but the manuscript does not support this claim or demonstrate it effectively.

    Presentation of results

    * Figure legends for tables specifically require more detail. The role of a figure legend is to walk the reader through the figure and guide their understanding.
    * Table 1 needs a much more detailed figure legend. What does the colour mean? What are the denominators of the percentages?
    * Using the total amount of a lineage as the percentage denominator here seems irrelevant to the point. Surely the percentage of that lineage within that year is much more important?
    * Similar issues with Table 2: it is not immediately clear what the denominator within the second column is in isolation.
    * The log10 scale of Figure 3 is very strange. Please consider using a log-transformed scale with the actual numbers on it, e.g. 1, 10, 100 equally spaced.
    * This figure is extremely busy and it is unclear what the desired takeaways of the figure are. I suggest that the authors consider what information they wish to communicate and what conclusions they want the reader to draw from that information.
    * Similar issues as above are present with Figure 4. It is also worth noting that it is not a heatmap, as is stated in the figure legend.
    * The b phylogeny in Figure 6 is too busy to be interpretable. What are you seeking to convey with the inclusion of this figure generally?
    * Figure 7 reads far better than the above. It is clear how the a and b panels relate to one another, but it would be desirable for the figure to be better utilised in the text; there is only one reference to it and the text doesn't seem to rely on the figure at all.

    How the style and organization of the paper communicates and represents key findings

    The manuscript does not often clearly highlight what is a key finding. For example, the long sections discussing defining mutations for various well-described lineages don't make it clear what is a new finding versus new evidence of previously observed and described trends. The manuscript authors might benefit from attempting to distil the manuscript into a small number of key findings and refocusing the manuscript on just these, while giving adequate regional and global context for those findings.

    Literature analysis or discussion

    Several sections of the results section are not mentioned at all within the discussion. If a result does not merit any further discussion, I question on what grounds it is included. When discussing the impacts of specific mutations, it would be desirable to see a more fleshed-out discussion of the evidence for and against their potential impact on the virus, especially in sections such as the end of the second paragraph of the discussion (lines 510-513), where there are no citations supporting the speculation about the potential function of the P1263L mutation.
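
    The denominator question raised for Table 1 can be illustrated with a minimal sketch. It is not the authors' analysis: the records below are hypothetical (year, lineage) pairs invented for illustration, and the function names are assumptions. It only shows how the same counts give different percentages depending on whether the denominator is all sequences in a year or all sequences of a lineage.

```python
from collections import Counter, defaultdict

# Hypothetical (year, lineage) records; NOT data from the manuscript.
records = [
    (2020, "B.1"), (2020, "B.1"), (2020, "B.1.528"), (2020, "B.1.1"),
    (2021, "B.1.1.7"), (2021, "B.1.1.7"), (2021, "B.1.617.2"), (2021, "B.1.528"),
]


def share_within_year(rows):
    """Return {year: {lineage: percent of that year's sequences}}."""
    per_year = defaultdict(Counter)
    for year, lineage in rows:
        per_year[year][lineage] += 1
    shares = {}
    for year, counts in per_year.items():
        total = sum(counts.values())  # denominator: all sequences in that year
        shares[year] = {lin: 100.0 * n / total for lin, n in counts.items()}
    return shares


def share_of_lineage_total(rows):
    """Return {lineage: {year: percent of that lineage's sequences}}."""
    per_lineage = defaultdict(Counter)
    for year, lineage in rows:
        per_lineage[lineage][year] += 1
    shares = {}
    for lineage, counts in per_lineage.items():
        total = sum(counts.values())  # denominator: all sequences of that lineage
        shares[lineage] = {yr: 100.0 * n / total for yr, n in counts.items()}
    return shares


within_year = share_within_year(records)
of_lineage = share_of_lineage_total(records)

print(within_year[2020]["B.1.528"])  # 25.0: 1 of 4 sequences from 2020
print(of_lineage["B.1.528"][2020])   # 50.0: 1 of 2 B.1.528 sequences overall
```

    In this toy data the lineage accounts for a quarter of one year's sequences but half of its own total, which is why a table legend needs to state which denominator is used.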

    Please rate the manuscript for methodological rigour

    Poor

    Please rate the quality of the presentation and structure of the manuscript

    Poor

    To what extent are the conclusions supported by the data?

    Partially support

    Do you have any concerns of possible image manipulation, plagiarism or any other unethical practices?

    No

    Is there a potential financial or other conflict of interest between yourself and the author(s)?

    No

    If this manuscript involves human and/or animal work, have the subjects been treated in an ethical manner and the authors complied with the appropriate guidelines?

    Yes

  4. Comments to Author

    Djorwe and colleagues present a paper detailing the genetic diversity and epidemiology of SARS-CoV-2 in Morocco through sequence analysis. No new sequencing data was procured during this study and all data was taken from publicly available sources, including GISAID. The authors carried out sequence analysis on a total of 2274 genomes within this database that were specific to Morocco. Within this analysis, they looked at phylogenetics and also the mutation profile of these sequences to establish the pattern of 'variant waves' that moved through the country in comparison to others. As this covered the initial three years of the pandemic, 7+ variants were identified within the 2274 sequences. Finally, the authors go on to a lengthy discussion about the analysis they have presented and where it fits in with the wider context of the pandemic.

    The methodological rigour is sound and based on globally recognised platforms and analysis pipelines, thereby also making it very reproducible. Likewise, the data is all sourced from public records and therefore is also sound. In general, the results are presented in a professional manner; however, I have some issues with the figures, as laid out in my comments below. In short, there is too much information on them and they are small, making the text hard to read.

    With regards to the style and organisation of the paper, I think it is in general written well; however, as noted in my comments below, there are some issues. There are many overstatements about how important this is. Whilst correct for Morocco, it is not so globally, given the tiny percentage of available sequences that were analysed. Whilst this was not the point of the study, the tone should befit the context.

    As for the literature analysis and discussion, I think it could be improved. The literature is somewhat narrow for the context (all SARS-CoV-2 variants and mutational profiles) and I think the discussion is too long and could be shortened and made more concise. Whilst the conclusions are supported by the data, I think the spread of the virus across Morocco is overplayed. Overlaying these sequencing data on a map of the country may help support this claim.

    Overall, this paper is quite long given the data it is based on. I think it needs to be made clear from the start that this is a retrospective study, as no new data is presented here that cannot be found in public databases (e.g. GISAID). On the other hand, I do think this sort of information is important at a national level, and understanding mutational profiles within geographical boundaries should be considered.

    Note, comments below contain both major and minor corrections.

    Comments:

    1. Lines 67-70. The authors make the claim that extending vaccine coverage may prevent new variants emerging. The vaccines have been shown not to prevent infection but to protect against more severe clinical disease. This must be altered accordingly.
    2. Line 108. Remove 'of contamination'; it is unnecessary.
    3. Line 123. 10 ORFs encoding 26 genes? This must be changed.
    4. Lines 148-150. This is a strongly worded statement. Yes, the claim is accurate, but specifically for Morocco. A global analysis of GISAID sequences would allow for better development of preventative measures. Suggested correction: tone down this language.
    5. Lines 201-203. This is too brief for the methodology and does not cite appropriate references for the various classification tools, e.g. Pangolin.
    6. Line 251. Whilst I am sure the study was meticulous, this is an unnecessary word for a results section. Remove.
    7. Lines 317-320. 'Notable' has been used repeatedly; swap one of these.
    8. Line 491. 'Elaboration' is not the right word here. 'Development' may suit better.
    9. Lines 546-551. The claim that the Alpha variant seems to have increased replication capacity due to a mutation is true, but I am not sure how your analysis provides any further evidence, given that you have done no in vitro analysis of this particular mutation.
    10. Lines 578 & 580. Nomenclature was changed to Greek letters to stop certain variants being associated with geographic locations. You should remove the term 'Indian variants' and stick with using Delta and Kappa.
    11. Line 636. Full stop missing at end of sentence.
    12. Lines 646-648. You have repeated the same thing in two sentences. Correct this. Given that this is a retrospective study based on publicly available data, it seems obvious to this reviewer that mutation profiles will have been seen elsewhere first.

    Figure issues in general: I have looked at every additional figure at the end of this article and I have come to the same conclusion. They are too busy, particularly Figure 3A, which is also hindered by its small size. The text on the figures is also small and hard to read, hindering full clarity on the data presented. I suggest the authors address the figure issue.

    Please rate the manuscript for methodological rigour

    Very good

    Please rate the quality of the presentation and structure of the manuscript

    Satisfactory

    To what extent are the conclusions supported by the data?

    Partially support

    Do you have any concerns of possible image manipulation, plagiarism or any other unethical practices?

    No

    Is there a potential financial or other conflict of interest between yourself and the author(s)?

    No

    If this manuscript involves human and/or animal work, have the subjects been treated in an ethical manner and the authors complied with the appropriate guidelines?

    Yes