Linking geography, isolation source, and genomic diversity in a global Candida albicans phylogeny
Abstract
Candida albicans is a common commensal species in multiple sites in the human microbiome that can also be an opportunistic pathogen across the body. Previous phylogenomic analyses have identified major clades, but these studies often relied on imprecise genomic methods or unphased genomes, limited geographic and ecological sampling, and a phylogenetic resolution strategy that has not been universally standardized within the research community. Here, we address these gaps by reconstructing a whole-genome phylogeny to examine how geography and site of isolation contribute to phylogenetic structure in C. albicans. We analyzed phased genomes from 938 global isolates acquired from diverse clinical and ecological contexts, including soil, and applied an agnostic, threshold-based clustering approach to systematically define cluster boundaries. In addition, we examined genomic features such as aneuploidy, the distribution of mating-type locus (MTL), genome-wide heterozygosity, and RNA interference (RNAi) disruption. Our analyses preserved the previously defined major clusters while identifying six novel clusters, predominantly composed of highly admixed Asian isolates. Although geographic origin and isolation source were each significantly associated with cluster, these associations were confounded because isolates from specific regions were disproportionately derived from particular sources, preventing attribution of the observed clustering to either factor. Over 95% of the isolates were heterozygous at the MTL, although homozygous forms were enriched in some clusters. Analysis of the AGO1 PAZ domain revealed both known and novel RNAi variants, predominantly in a heterozygous state. Aneuploidy was present in 8% of isolates, spread across the phylogeny. Intra-host analysis of isolates from 95 people revealed predominantly clonal colonization, though fourteen of the individuals harboured multiple genetic clusters. This study refines the phylogenetic structure of C. albicans, demonstrating how genomic features such as aneuploidy, heterozygosity, MTL composition, and RNAi disruption vary across isolates and provide insights into genomic plasticity in this species.