Nextstrain automates real-time phylodynamic analysis of open data for endemic and emerging pathogens
This article has been Reviewed by the following groups
- Evaluated articles (PREreview)
Abstract
Motivation: Genome sequencing provides an exceptional window into the evolutionary and epidemiological dynamics of endemic and emerging pathogens, and thus allows for better, more targeted, public health interventions. Online genomic surveillance platforms can provide near real-time insight into these dynamics.
Results: Nextstrain provides continually updated real-time genomic surveillance for 21 viruses and the bacterial pathogen Mycobacterium tuberculosis, with most analyses relying solely on open sequence data. Each pathogen includes steps to fetch and curate open data, classify sequences using established nomenclature systems, perform phylodynamic analyses, and share the results publicly. These analyses are automated, with most running daily to provide continually updated snapshots of pathogen evolution.
Availability and Implementation: All source code is available at https://github.com/nextstrain. Phylodynamic results can be visualized and downloaded at https://nextstrain.org/pathogens, and open sequence data and curated metadata are available at https://nextstrain.org/pathogens/files.
Article activity feed
This Zenodo record is a permanently preserved version of a PREreview. You can view the complete PREreview at https://prereview.org/reviews/19962906.
A brief summary of the article: Nextstrain is a real-time genomic surveillance system that turns public pathogen sequence data into continuously updated evolutionary and epidemiological analyses. The paper explains the software and workflow architecture behind these pipelines, including how they are automated and customized for different viruses and for Mycobacterium tuberculosis. It also shows examples of outbreak response and broader public health use cases, such as mpox and avian influenza.
What is the main research question? How can a scalable, automated, open-source system for real-time pathogen genomic surveillance be built and maintained using publicly shared data?
What type of study design is used? It presents the design and implementation of computational pipelines, describes workflow automation, and illustrates the platform with case examples and summary statistics across pathogens.
What are the main findings? Nextstrain now supports automated real-time analyses for 19 pathogens using open data, with most pipelines updating almost daily. The authors show that the platform can rapidly provide situational awareness during outbreaks and can track transmission, geographic spread, lineage emergence, and spillover events. They also demonstrate that the system is modular and customizable enough to support different genomes, data sources, and public health questions.
Do the results support the authors' conclusions? Why or why not? Yes. The authors show that the system produces useful and timely genomic surveillance, with examples where the platform informed the interpretation of outbreaks.
What is one limitation of the study? The system depends on the availability, completeness, and reliability of public data, so missing or delayed submissions limit what the platform can show. The results could also have been compared quantitatively with those of alternative systems to support their validity.
2 strengths: The first strength is the automation. The second is the adaptability: the platform covers many pathogens and accommodates their different biological features through customized pipelines.
1 major concern: Sampling and data quality are uneven across regions, time periods, and pathogens, which can affect phylodynamic inferences.
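To make the sampling concern concrete, the following minimal sketch tabulates how many sequences fall into each region and month and reports a crude imbalance ratio. It is an illustration only, not part of the paper or of Nextstrain: the records, the field names (`strain`, `region`, `date`), and the helper functions are all hypothetical.

```python
from collections import Counter

# Hypothetical metadata records; field names and values are illustrative only
# and are not taken from the paper or from any Nextstrain dataset.
records = [
    {"strain": "s1", "region": "Europe", "date": "2024-01-15"},
    {"strain": "s2", "region": "Europe", "date": "2024-01-20"},
    {"strain": "s3", "region": "Europe", "date": "2024-01-28"},
    {"strain": "s4", "region": "Europe", "date": "2024-02-03"},
    {"strain": "s5", "region": "Africa", "date": "2024-01-09"},
    {"strain": "s6", "region": "Asia", "date": "2024-02-11"},
]


def count_by_region_month(rows):
    """Count sequences per (region, YYYY-MM) stratum."""
    return Counter((row["region"], row["date"][:7]) for row in rows)


def imbalance_ratio(counts):
    """Ratio of the largest to the smallest observed stratum.

    Strata with no sequences at all never appear in the counts, so this
    understates the imbalance when some regions or months are unsampled.
    """
    return max(counts.values()) / min(counts.values())


counts = count_by_region_month(records)
for (region, month), n in sorted(counts.items()):
    print(f"{region}\t{month}\t{n}")
print(f"imbalance ratio: {imbalance_ratio(counts):.1f}")
```

A ratio far above 1 suggests that a few strata dominate the dataset, which is the situation in which inferences about geographic spread or transmission are most likely to reflect sampling effort rather than pathogen dynamics.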
What makes a peer review fair and ethical? Objective and respectful review based on the paper's scientific merit rather than the reviewers' personal preferences or competition.
What is one potential bias you should be aware of when reviewing this paper? Nextstrain already has an established reputation (dating from 2018), and this paper reports significant expansions and refinements. The potential bias is that reviewers may already hold a positive view of Nextstrain, which could make them less critical.
Competing interests
The author declares that they have no competing interests.
Use of Artificial Intelligence (AI)
The author declares that they used generative AI to come up with new ideas for their review.