Linkage-based ortholog refinement in bacterial pangenomes with CLARC
Abstract
Bacterial genomes exhibit significant variation in gene content and sequence identity. Pangenome analyses explore this diversity by classifying genes into core and accessory clusters of orthologous groups (COGs). However, strict sequence identity cutoffs can misclassify divergent alleles of the same gene as different genes, inflating accessory gene counts. CLARC (Connected Linkage and Alignment Redefinition of COGs; https://github.com/IndraGonz/CLARC) improves pangenome analyses by condensing accessory COGs using functional annotation and linkage information. Through this approach, orthologous groups are consolidated into more practical units of selection. In an analysis of more than 8,000 Streptococcus pneumoniae genomes, CLARC reduced accessory gene estimates by more than 30% and improved evolutionary predictions based on accessory gene frequencies. By refining COG definitions, CLARC offers insights into bacterial evolution and can aid genetic studies across diverse populations.
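
The condensing idea can be illustrated with a minimal Python sketch. This is not CLARC's implementation: the abstract does not state the actual criteria or thresholds, so the sketch assumes, purely for illustration, that two accessory COGs are candidates for merging when they carry the same functional annotation and never occur in the same genome. The function name `condense_accessory_cogs`, the 95% core cutoff, and the toy data are all hypothetical.

```python
def condense_accessory_cogs(presence, annotation, n_genomes, core_cutoff=0.95):
    """Greedily merge accessory COGs (illustrative assumption, not CLARC's code).

    presence:   dict mapping COG name -> set of genome IDs carrying it
    annotation: dict mapping COG name -> functional annotation string
    Returns a list of groups, each a dict with 'members', 'genomes', 'annotation'.
    """
    groups = []
    for cog in sorted(presence):
        genomes = presence[cog]
        # Skip core COGs: present in at least core_cutoff of all genomes (assumed cutoff).
        if len(genomes) / n_genomes >= core_cutoff:
            continue
        label = annotation.get(cog)
        for group in groups:
            same_function = label is not None and group["annotation"] == label
            # Assumed linkage criterion: the COG never co-occurs with the group.
            never_together = group["genomes"].isdisjoint(genomes)
            if same_function and never_together:
                group["members"].append(cog)
                group["genomes"] = group["genomes"] | genomes
                break
        else:
            # No compatible group found, so this COG starts its own group.
            groups.append({"members": [cog], "genomes": set(genomes), "annotation": label})
    return groups


# Toy data (hypothetical): four genomes, four COGs.
presence = {
    "cogA": {"g1", "g2"},
    "cogB": {"g3"},
    "cogC": {"g1", "g3"},
    "cogD": {"g1", "g2", "g3", "g4"},
}
annotation = {
    "cogA": "transporter",
    "cogB": "transporter",
    "cogC": "transporter",
    "cogD": "ribosomal protein",
}
groups = condense_accessory_cogs(presence, annotation, n_genomes=4)
merged_members = [g["members"] for g in groups]
```

In this toy case `cogA` and `cogB` share an annotation and never co-occur, so they collapse into one unit; `cogC` co-occurs with both and stays separate; `cogD` is core and is ignored. Three accessory COGs become two, which is the kind of reduction in accessory gene counts the abstract describes.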