Small pangenome of Candida parapsilosis reflects overall low intraspecific diversity
Abstract
Candida parapsilosis is an opportunistic yeast pathogen that can cause life-threatening infections in immunocompromised humans. Whole-genome sequencing (WGS) studies of the species have demonstrated remarkably low diversity, with strains typically differing by about 1.5 single nucleotide polymorphisms (SNPs) per 10 kb. However, SNP calling alone does not capture the full extent of genetic variation. Here, we define the pangenome of 372 C. parapsilosis isolates to determine variation in gene content. The pangenome consists of 5,859 genes, of which 48 are not found in the genome of the reference strain. The pangenome includes 5,791 core genes (present in ≥ 99.5% of isolates). Four genes, including the allantoin permease gene DAL4, were present in all isolates but were truncated in some strains. The truncated DAL4 was classified as a pseudogene in the reference strain CDC317. CRISPR-Cas9 gene editing showed that removing the early stop codon (producing the full-length Dal4 protein) is associated with improved use of allantoin as a sole nitrogen source. We find that the accessory genome of C. parapsilosis consists of 68 homologous clusters. These comprise 38 previously annotated genes, 27 novel paralogs of previously annotated genes and 3 uncharacterised ORFs. Approximately one-third of the accessory genome (24/68 genes) is associated with gene fusions between tandem genes in the major facilitator superfamily (MFS). Additionally, we identify two highly divergent C. parapsilosis strains and find that, despite their increased phylogenetic distance (∼30 SNPs per 10 kb), both strains have similar gene content to the other 372 isolates.
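To make the core/accessory split concrete, the following minimal Python sketch applies the ≥ 99.5% presence threshold stated above to a toy presence/absence table. It is an illustration only, not the pipeline used in the study: the function name split_pangenome, the gene labels and the toy data are all hypothetical. With 372 isolates, the threshold works out to 370.14, so a cluster must be present in at least 371 isolates (missing from at most one) to count as core.

```python
from fractions import Fraction

# Hypothetical helper: not the authors' pipeline, only an illustration of the
# ">= 99.5% of isolates" core-gene threshold quoted in the abstract.
def split_pangenome(presence, n_isolates, min_fraction=Fraction(995, 1000)):
    """Partition gene clusters into core and accessory lists.

    presence     -- dict mapping cluster name -> set of isolate IDs carrying it
    n_isolates   -- total number of isolates in the collection
    min_fraction -- minimum fraction of isolates for a cluster to be core
    """
    core, accessory = [], []
    for gene, carriers in presence.items():
        # Exact rational comparison avoids floating-point edge cases.
        if len(carriers) >= min_fraction * n_isolates:
            core.append(gene)
        else:
            accessory.append(gene)
    return core, accessory


# Toy data (invented): four clusters across 372 isolates numbered 0..371.
toy = {
    "geneA": set(range(372)),  # present in every isolate
    "geneB": set(range(371)),  # missing from one isolate
    "geneC": set(range(370)),  # missing from two isolates
    "geneD": set(range(10)),   # rare cluster
}
core, accessory = split_pangenome(toy, n_isolates=372)
print("core:", core)
print("accessory:", accessory)

# Consistency check on the counts reported in the abstract.
assert 5859 - 5791 == 68      # pangenome minus core = accessory clusters
assert 38 + 27 + 3 == 68      # annotated + novel paralogs + uncharacterised ORFs
```

Under this threshold, geneA and geneB are classified as core, while geneC and geneD fall into the accessory set. The closing assertions simply confirm that the reported totals (5,859 genes, 5,791 core, 68 accessory) are internally consistent.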
Importance
Candida parapsilosis is a human fungal pathogen, listed in the high priority group by the World Health Organization. It is an increasing cause of hospital-acquired and drug-resistant infection. Here, we studied the genetic diversity of 372 C. parapsilosis isolates, the largest genomic surveillance of this species to date. We show that there is relatively little genetic variation. However, we identified two more distantly related isolates from Germany, suggesting that further sampling may reveal more diversity. We find that the pangenome (the cumulative gene content of all isolates) is surprisingly small compared to other fungal species. Many of the non-core genes are involved in transport. We also find that variations in gene content are associated with nitrogen metabolism, which may contribute to the virulence characteristics of this species.