Genomic and evolutionary factors influencing the prediction accuracy of optimal growth temperature in prokaryotes
Abstract
Bacteria and archaea have evolved diverse genomic adaptations to thrive across a wide range of temperatures. These adaptations include genome sequence optimizations, such as increased GC content in rRNA and tRNA and shifts in codon and amino acid usage, as well as the acquisition of functional genes conferring adaptation to specific temperatures. Because optimal growth temperature (OGT) can be determined experimentally only for cultured species, predicting OGT from genomic information has become increasingly important given the exponential increase in genomic data. Previous studies used machine learning to develop prediction models that integrate multiple features based on genome composition, but their accuracy varied with the target species: the models performed well for thermophiles but less accurately for psychrophiles. In this study, we curated the OGT and genomic data of 2,869 bacterial species to develop a novel prediction model incorporating features that reflect genomic adaptation toward lower temperatures. We found that species whose OGT shifted rapidly from that of their ancestors, including psychrophiles, were predicted less accurately by genome composition-based models. Incorporating gene presence/absence information associated with these rapid changes in OGT improved the prediction accuracy for psychrophiles. We also observed that OGT is phylogenetically more conserved in archaea than in bacteria, which may lead to long-term optimization of genome composition and explain the high predictability of OGT in archaea. These findings highlight the importance of integrating long- and short-term evolutionary adaptations in phenotype prediction models.
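To make the two kinds of features contrasted above more concrete, the following Python sketch builds one feature vector per genome by concatenating genome composition features (rRNA GC content and amino acid frequencies) with gene presence/absence indicators. It is only an illustration under stated assumptions and not the model described in this study: the function names, the choice of composition features, and the gene identifiers are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T or U)."""
    counts = Counter(seq.upper())
    gc = counts["G"] + counts["C"]
    total = gc + counts["A"] + counts["T"] + counts["U"]
    return gc / total if total else 0.0


def amino_acid_frequencies(proteins: list[str]) -> list[float]:
    """Relative frequency of each standard amino acid across all proteins."""
    counts = Counter()
    for protein in proteins:
        counts.update(protein.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    return [counts[a] / total if total else 0.0 for a in AMINO_ACIDS]


def feature_vector(rrna_seq, proteins, genes_present, gene_panel):
    """Concatenate composition features with gene presence/absence flags.

    gene_panel is a hypothetical, fixed list of gene family identifiers;
    genes_present is the set of those families found in this genome.
    """
    composition = [gc_content(rrna_seq)] + amino_acid_frequencies(proteins)
    presence = [1.0 if gene in genes_present else 0.0 for gene in gene_panel]
    return composition + presence


# Toy input: the sequences and gene names are placeholders, not real data.
panel = ["geneA", "geneB"]
vec = feature_vector("GGCCAUGC", ["MKV", "AAL"], {"geneB"}, panel)
```

Vectors built this way could be passed to any regression method with measured OGT as the target. Keeping the presence/absence flags alongside the composition features lets a model draw on both slowly accumulating sequence changes and more recent gene gains or losses.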