A complete-genome view of phylum Nanobdellota and recurrent Form III RuBisCO transfer between archaea and Patescibacteriota
Abstract
The archaeal phylum Nanobdellota (formerly Nanoarchaeota) was previously represented by four complete genomes. We present 208 complete Nanobdellota genomes from Oxford Nanopore metagenomes of the Baltic Sea water column and Fennoscandian groundwater (69–201 m below sea level), each rotated to start at the ORC1/Cdc6 replication origin. This is a 52-fold expansion of complete-genome representation.
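As an illustration of what rotating a circular genome to an origin means, the following is a minimal sketch, not the authors' pipeline. It assumes the 0-based origin coordinate is already known (for example from an ORC1/Cdc6 annotation); the function name `rotate_circular` and the toy sequence are hypothetical, and strand handling is deliberately left out.

```python
def rotate_circular(seq: str, origin: int) -> str:
    """Rotate a circular sequence so that 0-based index `origin` becomes position 0.

    Hypothetical helper for illustration; the origin coordinate is assumed
    to come from an upstream annotation step that is not shown here.
    """
    if not seq:
        return seq
    origin %= len(seq)  # wrap coordinates that fall outside the sequence length
    return seq[origin:] + seq[:origin]


# Toy example: a 12-bp "genome" with an assumed origin at index 4.
genome = "GGCCATATTTAC"
rotated = rotate_circular(genome, 4)
print(rotated)  # ATATTTACGGCC
```

Because the molecule is circular, the rotation changes only the coordinate system and not the sequence content, which makes complete genomes directly comparable position by position.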
Across the ar53 supermatrix and a Nanobdellota-tuned 71-marker supermatrix on 1,239 taxa, the named GTDB orders within Nanobdellota are recovered as monophyletic clades, including the three orders that dominate our environmental sampling: Woesearchaeales, Pacearchaeales, and the GTDB placeholder order SCGC-AAA011-G17. This is consistent with the existing GTDB R232 order-level circumscription. We retire the SCGC-AAA011-G17 placeholder name, replacing it with a complete-genome-anchored SeqCode nomenclatural chain (Maxwellarchaeales ord. nov., Maxwellarchaeaceae fam. nov., Maxwellarchaeum gen. nov., and Maxwellarchaeum balticum sp. nov.) without altering the order-level circumscription.
Pacearchaeales and Maxwellarchaeales retain no central or energy metabolism beyond Form III RuBisCO, PEP synthase, and ferredoxin; Woesearchaeales retains partial glycolysis and a V/A-type ATPase. A 4,262-tip phylogeny of rbcL (the RuBisCO large-subunit gene) identifies nine candidate archaea-to-Patescibacteriota Form III RuBisCO transfer events, including one to a Baltic Minisyncoccia, versus two candidates in the reciprocal direction. This is consistent with archaea-to-CPR (Candidate Phyla Radiation, i.e. Patescibacteriota) being the more frequently identified direction in our data. All 256 Nanobdellota genomes (208 complete + 48 high-quality non-circular), the ar71 marker set with its 1,239-taxon ML tree, 154 Nanobdellota-trained HMMs for KEGG-ortholog detection in DPANN proteomes (94 ROBUST), and the 4,262-tip rbcL reference tree are released as a community resource at Zenodo (DOI 10.5281/zenodo.20174424; see Using the resource). The full analysis archive (alignments, intermediate trees, structural predictions, and per-step scripts) is released alongside them.