Sequence, Structural and Functional Diversity of the Ubiquitous DNA/RNA-Binding Alba Domain

The DNA/RNA-binding Alba domain is prevalent across all kingdoms of life. Discovered in archaea more than 40 years ago, this protein domain appears to have evolved from RNA- to DNA-binding, with a concomitant expansion in the range of cellular processes that it regulates. Indeed, Alba domain-containing proteins continue to exhibit functional ingenuity, with their most recent association being R-loop regulation in plants: they accomplish this by binding to DNA:RNA hybrid structures as heterodimers. To further explore Alba diversity and evolutionary relatedness, which could in turn lead to new functional revelations, we employed iterative searches in PSI-BLAST to identify 15,161 true Alba domain-containing proteins from the NCBI non-redundant protein database. Next, by building sequence similarity networks (SSNs), we identified 13 distinct clusters representing various Alba subgroups. These included the classic eukaryotic Rpp20/Pop7 and Rpp25/Pop6 proteins as well as novel fungal Alba proteins and Plasmodium-specific Albas. Conservation analysis indicated that, despite overall low primary sequence similarity between the SSN clusters, the homo- or heterodimer interface is highly conserved. Furthermore, the presence of the classic Alba fold was confirmed in representative sequences from each cluster by comparison of their Hidden Markov Model (HMM) profiles and ab initio three-dimensional structures. Notably, Alba domains from lower, unicellular eukaryotes and fungal Albas exhibit structural deviations towards their C-terminal end. Finally, phylogenetic analysis, while supporting SSN clustering, revealed the evolutionary branchpoint at which the eukaryotic Rpp20- and Rpp25-like clades emerged from archaeal Albas, and the subsequent taxonomic lineage-based divergence within each clade.
Taken together, this comprehensive analysis enhances our understanding of the evolutionary history of Alba domain-containing proteins across diverse organisms, their sequence and structural conservation, and functional implications for genome and RNA biology.
