Geographical reconstruction of the SARS‐CoV‐2 outbreak in Lombardy (Italy) during the early phase
This article has been Reviewed by the following groups
Listed in
- Evaluated articles (ScreenIT)
Abstract
The first autochthonous transmission of SARS‐CoV‐2 in Italy was documented by the Laboratory of Clinical Microbiology, Virology and Bioemergencies of L. Sacco Hospital (Milano, Italy) on 20th February 2020 in a 38‐year‐old male patient, who was diagnosed with pneumonia at the Codogno Hospital. Thereafter, Lombardy reported the highest prevalence of COVID‐19 cases in the country, especially in the Milano, Brescia and Bergamo provinces. The aim of this study was to assess whether distinct viral clusters were present in the six main provinces involved in Lombardy COVID‐19 cases, in order to highlight province‐dependent viral characteristics. A phylogenetic analysis was conducted on 20 full‐length genomes obtained from patients attending several Lombard hospitals from February 20th to April 4th, 2020, aligned with 41 Italian viral genome assemblies available in the GISAID database as of 30th March, 2020. Two main monophyletic clades, containing 8 and 53 isolates, respectively, were identified. Notably, Bergamo isolates mapped inside the small clade harbouring the M gene D3G mutation. The molecular clock analysis estimated a cluster divergence approximately one month before the first patient identification, supporting the hypothesis that different SARS‐CoV‐2 strains had spread worldwide at different times, but that their presence became evident only in late February, along with the emergence of the Italian epidemic. This epidemiological reconstruction therefore suggests that the initial circulation of the virus in Lombardy was ascribable to multiple introductions. The robustness of the phylogenetic reconstruction, however, will improve as more genomic sequences become available, in order to guarantee complete epidemiological surveillance.
Article activity feed
SciScore for 10.1101/2020.07.23.20159871:
Please note, not all rigor criteria are appropriate for all manuscripts.
Table 1: Rigor
Institutional Review Board Statement: IRB: Under Italian law, all sensitive data were deleted and only age, gender and sampling date were collected providing Ethics Committee approval unnecessary (Art.
Randomization: not detected.
Blinding: not detected.
Power Analysis: not detected.
Sex as a biological variable: not detected.
Table 2: Resources
Software and Algorithms
Sentence: Low quality reads bases were trimmed out using Trimmomatic software [10], using thirteen different parameter sets (LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36, LEADING:3 TRAILING:3 SLIDINGWINDOW:4:20 MINLEN:36, LEADING:3 TRAILING:3 SLIDINGWINDOW:4:25 MINLEN:36, LEADING:3 TRAILING:10 SLIDINGWINDOW:4:15 MINLEN:36, LEADING:3 TRAILING:10 SLIDINGWINDOW:4:20 MINLEN:36, LEADING:3 TRAILING:10 SLIDINGWINDOW:4:25 MINLEN:36, LEADING:3 TRAILING:20 SLIDINGWINDOW:4:15 MINLEN:36, LEADING:3 TRAILING:20 SLIDINGWINDOW:4:20 MINLEN:36, LEADING:3 TRAILING:20 SLIDINGWINDOW:4:25 MINLEN:36, MAXINFO:50:0.3, MAXINFO:50:0.5, MAXINFO:50:0.7, MAXINFO:50:0.9).
Resource: Trimmomatic. Suggested: (Trimmomatic, RRID:SCR_011848)
Sentence: Then, SNP calling was performed following the GATK Best Practice procedure [11], using the Wuhan-Hu-1 strain genome (accession MN908947.3) as reference.
Resource: GATK. Suggested: (GATK, RRID:SCR_001876)
Sentence: A global dataset including these 41 GISAID (Table 1) genome assemblies and the 20 genome assemblies produced in this study was produced and aligned using MAFFT [13].
Resource: MAFFT. Suggested: (MAFFT, RRID:SCR_011811)
Sentence: The low quality alignment regions at the extremities of the alignment were removed using Gblocks with default parameters.
Resource: Gblocks. Suggested: (Gblocks, RRID:SCR_015945)
Sentence: The Hasegawa-Kishino-Yano model (HKY) was found as the simplest evolutionary model using JmodelTest 2.1.10 [17]. Phylogenetic analysis was performed using a Bayesian Markov Chain Monte Carlo (MCMC) method implemented in BEAST, v.1.10.4 [18], with 10 million states and sampling every 1,000 steps.
Resource: JmodelTest. Suggested: (jModelTest, RRID:SCR_015244)
Resource: BEAST. Suggested: (BEAST, RRID:SCR_010228)
Results from OddPub: We did not detect open data. We also did not detect open code. Researchers are encouraged to share open data when possible (see Nature blog).
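The thirteen Trimmomatic parameter sets quoted in the Resources table follow a regular pattern: nine sliding-window sets (three TRAILING values crossed with three window quality thresholds) plus four MAXINFO sets. A minimal Python sketch that enumerates them is given here for illustration only; the single-end (`SE`) mode, the `trimmomatic` wrapper name, and the file names are assumptions, since the report does not state how the authors invoked the tool.

```python
from itertools import product


def trimmomatic_parameter_sets():
    """Return the thirteen trimming-step strings listed in the Resources table."""
    sets = []
    # Nine sliding-window sets: TRAILING in {3, 10, 20} x window quality in {15, 20, 25}.
    for trailing, quality in product((3, 10, 20), (15, 20, 25)):
        sets.append(
            f"LEADING:3 TRAILING:{trailing} SLIDINGWINDOW:4:{quality} MINLEN:36"
        )
    # Four MAXINFO sets: target length 50, strictness in {0.3, 0.5, 0.7, 0.9}.
    for strictness in ("0.3", "0.5", "0.7", "0.9"):
        sets.append(f"MAXINFO:50:{strictness}")
    return sets


def build_command(steps, reads_in="reads.fastq.gz", reads_out="trimmed.fastq.gz"):
    """Build an argument list for one run.

    Assumptions (not from the source): single-end mode, a `trimmomatic`
    wrapper on PATH, and hypothetical input/output file names.
    """
    return ["trimmomatic", "SE", reads_in, reads_out, *steps.split()]


if __name__ == "__main__":
    for i, steps in enumerate(trimmomatic_parameter_sets(), start=1):
        cmd = build_command(steps, reads_out=f"trimmed_set{i:02d}.fastq.gz")
        print(" ".join(cmd))
```

Generating the sets programmatically rather than typing them out makes it easy to confirm that the list really contains thirteen entries and that no combination is missing.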
Results from LimitationRecognizer: We detected the following sentences addressing limitations in the study: Given that the role of these mutations is still unclear, an Indian group investigated their possible influence on virus replication interference mediated by miRNA: they found out how some miRNA, present in different pathological conditions, are likely to bind to native N gene and repress its expression, thus helping in disease progression limitation; on the contrary, mutated variants could increase their chances of interference escape [26]. Another small subclade was found in the main one, characterized by the presence of N gene mutation V246I in three sequences, all from Friuli Venezia Giulia region. Besides the unique geographical origin, it is noticeable that in GISAID map ‘Geography’ the V246I mutation is actually present only in Italy, as well as the V246A one only in Israel. Their rarity could have two probable explanations: on the one hand, available data are still limited, making difficult to have a reliable distribution; on the other hand, these variations could have a negative influence on viral fitness, diminishing efficacy in replication and consequently virus transmission. Other mutations found in the present work had negligible influence on phylogenetic analysis, even a biological significance can not be excluded: viral genome and proteins are key factors in patients management and any variation can extremely burden the efficacy of drugs, vaccines and diagnostic tools or be related to a more severe clinical presentation [27,28]. In conclusion, this study gave insights...
Results from TrialIdentifier: No clinical trial numbers were referenced.
Results from Barzooka: We did not find any issues relating to the usage of bar graphs.
Results from JetFighter: We did not find any issues relating to colormaps.
Results from rtransparent:
- Thank you for including a conflict of interest statement. Authors are encouraged to include this statement when submitting to a journal.
- Thank you for including a funding statement. Authors are encouraged to include this statement when submitting to a journal.
- No protocol registration statement was detected.