Fully resolved assembly of Cryptosporidium parvum

Vipin K Menon
Pablo C Okhuysen
Cynthia L Chappell
Medhat Mahmoud
Qingchang Meng
Harsha Doddapaneni
Vanesa Vee
Yi Han
Sejal Salvi
Sravya Bhamidipati
Kavya Kottapalli
George Weissenberger
Hua Shen
Matthew C Ross
Kristi L Hoffman
Sara Javornik Cregeen
Donna M Muzny
Ginger A Metcalf
Richard A Gibbs
Joseph F Petrosino
Fritz J Sedlazeck

Abstract

Background

Cryptosporidium parvum is an apicomplexan parasite commonly found across many host species, with a global infection prevalence in human populations of 7.6%. Understanding its diversity and genomic makeup can help in fighting established infections and preventing further transmission. The basis of every genomic study is a high-quality reference genome that is contiguous and complete, thus enabling comprehensive comparative studies.

Findings

Here, we provide a highly accurate and complete reference genome of Cryptosporidium parvum. The assembly is based on Oxford Nanopore reads and was improved using Illumina reads for error correction. We also outline how to evaluate and choose from different assembly methods based on 2 main approaches that can be applied to other Cryptosporidium species. The assembly encompasses 8 chromosomes and includes 13 telomeres that were resolved. Overall, the assembly shows a high completion rate with 98.4% single-copy BUSCO genes.
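As a rough illustration of what "13 telomeres resolved" across 8 chromosomes means (13 of 16 possible chromosome ends), the following minimal sketch counts chromosome ends that carry a tandem run of a telomeric repeat. It is not the authors' method. The function name, the window size, the repeat threshold, and the toy sequences are hypothetical, and the motif TTTAGG is an assumption about the C. parvum telomeric repeat that should be checked against the assembly itself.

```python
def count_resolved_telomeres(chromosomes, motif="TTTAGG", window=200, min_repeats=3):
    """Count chromosome ends that contain a tandem run of the telomeric motif.

    chromosomes: dict mapping a chromosome name to its sequence (str).
    motif: assumed telomeric repeat unit (an assumption, not from the paper).
    window: number of bases inspected at each end.
    min_repeats: consecutive motif copies required to call an end resolved.
    Sequences shorter than `window` would have overlapping ends and could be
    counted twice; real chromosomes are far longer than the window.
    """
    complement = str.maketrans("ACGT", "TGCA")
    rc_motif = motif.translate(complement)[::-1]  # reverse complement of the motif
    resolved = 0
    for seq in chromosomes.values():
        seq = seq.upper()
        for end in (seq[:window], seq[-window:]):
            # Accept the motif on either strand, since orientation differs per end.
            if motif * min_repeats in end or rc_motif * min_repeats in end:
                resolved += 1
    return resolved


# Toy sequences (made up for illustration, not real data).
toy = {
    "chrA": "CCTAAA" * 5 + "ACGT" * 50 + "TTTAGG" * 5,  # both ends capped
    "chrB": "ACGT" * 50 + "TTTAGG" * 4,                 # only the 3' end capped
}
resolved_ends = count_resolved_telomeres(toy)
possible_ends = 2 * len(toy)
print(f"{resolved_ends} of {possible_ends} ends resolved")
```

Applied to the figures in the abstract, 8 chromosomes give 16 possible ends, of which 13 (about 81%) are reported as resolved.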

Conclusions

This high-quality reference genome of a zoonotic IIaA17G2R1 C. parvum subtype isolate provides the basis for subsequent comparative genomic studies across the Cryptosporidium clade. This will enable an improved understanding of diversity and will support functional and association studies.

GigaScience
Mar 14, 2022

This work has been peer reviewed in GigaScience (see paper https://doi.org/10.1093/gigascience/giac010), which carries out open, named peer-review.

These reviews are published under a CC-BY 4.0 license and were as follows:

Reviewer 2: Juan Alzate

The present paper entitled "Fully resolved assembly of Cryptosporidium parvum" shows the results of the genomic sequencing of the protozoan parasite C. parvum using both 2nd-generation (NovaSeq) and 3rd-generation (ONT) sequencing technologies. Additionally, they assembled the C. parvum genome and compared their results with the previous C. parvum IOWA II reference. The authors also undertake some QC analysis to validate chromosome models.

The paper is interesting because there is a need to have a fully resolved Cryptosporidium genome. The sequencing by itself is not much of an achievement; the authors applied commercially available platforms. In the assembly process, they also used already known assemblers and mapping tools. I think BUSCO does not deliver the detailed results expected here. Maybe a more comprehensive analysis, including all the single-copy genes present in C. parvum, can help to better support the quality of the genome.

One additional recommendation is that the authors present a detailed analysis of single nucleotide variants (SNVs). These data can be extracted from the same BAM files that the authors already generated for the structural variant analysis. This analysis is particularly important because it can show readers how clonal the C. parvum strain used is.
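The kind of clonality check suggested here can be sketched as follows. This is a minimal illustration, not the authors' or the reviewer's analysis. It assumes per-site allele counts have already been extracted from the BAM files (for example with a pileup tool; that step is not shown), and it relies on the sequenced parasite stage being haploid, so that a clonal sample should show essentially one allele per site. All function names, thresholds, and counts below are hypothetical.

```python
def minor_allele_fraction(counts):
    """Fraction of reads at a site that do not support the most common allele."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (total - max(counts.values())) / total


def flag_mixed_sites(pileup, min_depth=20, min_fraction=0.1):
    """Return sites whose minor-allele fraction suggests a non-clonal sample.

    pileup: dict mapping (chromosome, position) to a dict of allele -> read count.
    min_depth: sites with fewer reads are skipped as uninformative.
    min_fraction: minor-allele fraction at or above which a site is flagged;
        the value is an arbitrary illustration, chosen to sit well above a
        typical per-base sequencing error rate.
    """
    flagged = []
    for site, counts in pileup.items():
        depth = sum(counts.values())
        if depth >= min_depth and minor_allele_fraction(counts) >= min_fraction:
            flagged.append(site)
    return flagged


# Toy allele counts (made up for illustration, not real data).
toy_pileup = {
    ("chr1", 100): {"A": 50, "G": 0},   # one allele only: consistent with a clone
    ("chr1", 250): {"A": 35, "G": 15},  # 30% minor allele: flagged
    ("chr2", 40): {"C": 5, "T": 4},     # depth too low: skipped
}
mixed_sites = flag_mixed_sites(toy_pileup)
print(mixed_sites)
```

A handful of flagged sites could still reflect mapping artifacts in repetitive regions, so such a list is a starting point for inspection rather than proof of a mixed infection.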

I don't know if this is possible, but can you compare your genome model with the one published on bioRxiv (DOI: 10.1101/2021.01.29.428682)?

Please make the raw read data public (NovaSeq and ONT raw reads).

Please explain in more detail in the Methods section how you find and analyze the structural variants.

I don't understand why the genome size is estimated. Could you explain it?

Reviewer 1: Matthew Knox

Overall, Menon et al. present a significant contribution to the field with this work. Their fully resolved assembly of Cryptosporidium parvum is, to my knowledge, the first to utilize long-read sequencing in whole-genome sequencing for this group of protozoan parasites. As such, it provides validation of previously published work while also improving on current reference standards and providing a robust and well-described analysis pipeline for future studies.

In my view, there are only a couple of issues with the paper that should be addressed. The first is a discussion of recent work using metabarcoding (e.g. DOI: 10.1016/j.meegid.2012.08.017, DOI: 10.1016/j.ijpara.2017.03.003), which demonstrates mixed infections in clinical samples from patients infected with Cryptosporidium that were missed with consensus Sanger sequencing. In some cases, mixtures of subtype families can be found, though dominance of a single subtype with a few closely related variants is more common and more likely in the current paper. Nonetheless, this may have implications for sequencing, since the purity of the "culture" cannot be guaranteed owing to the lack of reliable in vitro culture methods for Cryptosporidium.

The second issue I have is with the section on comparative genomics. Strictly speaking, calling this a comparative genomics analysis is not correct, since the authors do not compare genomes with genomes. Instead, it is based on comparison with a small subset of Sanger-generated sequences and does not add much to the paper in my view. If it is to be included, the text should be rephrased to better reflect the analyses, and the identity (species, subtype, subtype family) of the sequences downloaded from GenBank should be presented in more detail. Also, it is unclear what criteria were used to select these sequences from among the many hundreds available for C. parvum, and this should be stated too.

In addition to the significant comments above, I detected a few inconsistencies and typographical errors in the submission and have included minor comments (sticky notes) in the attached PDF document. I hope that the authors find this helpful in improving the manuscript.

Version published to 10.1093/gigascience/giac010
Jan 1, 2022
Version published to 10.1101/2021.07.07.451495 on bioRxiv
Jul 8, 2021
