A Time Machine for Taxonomy

Abstract

The NCBI Taxonomy Database is the primary resource for linking genomic information to taxonomic relationships. It is widely used across scientific disciplines and is critically important to bioinformatics. The database changes continuously as researchers discover and refine taxonomic relationships. Yet tracking and comparing past taxonomic states is challenging, because changes are frequent and doing so requires sifting through numerous historical snapshots. To address this, we developed the Taxonomy Time Machine: a database for storing many snapshots of a taxonomic tree in a space-efficient manner. We have also created a web-based interface and a programmatic interface (API) to make these data more accessible. This tool can accurately reconstruct taxonomic lineages at any point in the history of the NCBI Taxonomy Database. We demonstrate that this tool is both perfectly accurate and significantly more efficient than loading and querying individual taxonomy snapshots, enabling its use on desktop computers as well as commodity web servers. We have made this tool available on the web (https://taxonomy.onecodex.com) and as open-source software under the MIT license (https://github.com/onecodex/taxonomy-time-machine).
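To make the idea of space-efficient snapshot storage concrete, here is a minimal Python sketch of one way such a store could work: a row is written only when a node's parent or name differs from its previous state, and a lineage at a given date is rebuilt by looking up the latest row at or before that date for each ancestor. This is an illustration under stated assumptions, not the Taxonomy Time Machine's actual schema or API. The class `TaxonomyHistory`, its methods, the taxon IDs, and the names are all hypothetical, and the root is assumed to be its own parent.

```python
from bisect import bisect_right
from collections import defaultdict


class TaxonomyHistory:
    """Toy versioned tree (hypothetical): stores a row only when a node changes."""

    def __init__(self):
        # tax_id -> parallel lists: change dates (ascending) and node states.
        # A state is (parent_id, name), or None if the node was deleted.
        self._dates = defaultdict(list)
        self._states = defaultdict(list)

    def add_snapshot(self, date, nodes):
        """Ingest one full snapshot; snapshots must be added in date order."""
        for tax_id, state in nodes.items():
            states = self._states[tax_id]
            if not states or states[-1] != state:  # new or changed node
                self._dates[tax_id].append(date)
                states.append(state)
        for tax_id, states in self._states.items():
            if tax_id not in nodes and states[-1] is not None:  # node removed
                self._dates[tax_id].append(date)
                states.append(None)

    def state_at(self, tax_id, date):
        """Return the node state in effect on `date`, or None if absent."""
        dates = self._dates.get(tax_id, [])
        i = bisect_right(dates, date)  # number of changes dated <= `date`
        return self._states[tax_id][i - 1] if i else None

    def lineage(self, tax_id, date):
        """Walk parent links as of `date`; returns names from root to node."""
        path, seen = [], set()
        while tax_id not in seen:  # the self-parented root ends the walk
            state = self.state_at(tax_id, date)
            if state is None:
                break
            seen.add(tax_id)
            parent_id, name = state
            path.append(name)
            tax_id = parent_id
        return path[::-1]

    def row_count(self):
        return sum(len(states) for states in self._states.values())


# Two made-up snapshots: "species x" moves from "Genus A" to "Genus B".
history = TaxonomyHistory()
history.add_snapshot("2020-01-01", {
    1: (1, "root"), 2: (1, "Bacteria"),
    10: (2, "Genus A"), 100: (10, "species x"),
})
history.add_snapshot("2022-01-01", {
    1: (1, "root"), 2: (1, "Bacteria"),
    10: (2, "Genus A"), 11: (2, "Genus B"), 100: (11, "species x"),
})

print(history.lineage(100, "2021-06-01"))
print(history.lineage(100, "2022-06-01"))
print(history.row_count())
```

In this toy example the two snapshots contain nine node records in total, but only six rows are stored, because unchanged nodes are not rewritten. ISO-formatted date strings are used so that plain string comparison matches chronological order.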
