ExTaxsI: an exploration tool of biodiversity molecular data
This article has been reviewed by the following groups
Listed in
- Evaluated articles (GigaScience)
Abstract
Background
The increasing availability of multi-omics data is leading to regularly revised estimates of existing biodiversity data. In particular, molecular data enable novel species to be characterized and add new genomic information to species that have already been observed. For this reason, managing and visualizing existing molecular data and their related metadata with easy-to-use IT tools has become key to designing future research. The more users are able to access biodiversity-related information, the greater the ability of the scientific community to expand its knowledge in this area.
Results
In this article we focus on the development of ExTaxsI (Exploring Taxonomy Information), an IT tool that can retrieve biodiversity data stored in NCBI databases and provide a simple and explorable visualization. We use 3 case studies to show how an efficient organization of the available data can lead to obtaining new information that is fundamental as a starting point for new research. Using this approach highlights the limits in the distribution of data availability, a key factor to consider in the experimental design phase of broad-spectrum studies such as metagenomics.
Conclusions
ExTaxsI can easily retrieve molecular data and their metadata and present them in an explorable visualization, with the aim of helping researchers to improve experimental designs and highlight the main gaps in the coverage of available data.
Article activity feed
This work has been peer reviewed in GigaScience (see paper https://doi.org/10.1093/gigascience/giab092), which carries out open, named peer-review.
These reviews are published under a CC-BY 4.0 license and were as follows:
Reviewer 2: Luiz Gadelha
This manuscript presents a tool called ExTaxsI for managing and plotting molecular and taxonomic data from NCBI. Information can be persisted in a local database as well as in FASTA-formatted sequences, which can be used to display the information as scatter or sunburst-pie plots and maps. The tool uses the Entrez API from NCBI to retrieve data. It also uses the ETE toolkit to manage taxonomic data. Three use cases were presented to demonstrate ExTaxsI:
- geospatial distribution and gene data of Atlantic cod and the Gadiformes Order,
- exploration of biodiversity data related to the SARS-CoV-2 pandemic.
Using ExTaxsI from the command line apparently produces consistent and correct outputs. However, ExTaxsI functionality seems to be available only through this command-line interface. This considerably limits the applicability of the tool, since many researchers incorporate such routines programmatically into their scripts. It would be more useful if ExTaxsI functions were additionally provided through a library that could be imported into Python scripts. This would enable more use cases and lead to wider applicability. Some issues in a previous submission of this manuscript were corrected. A more detailed comparison with related tools is included, and the installation instructions for the tool now work correctly. The documentation was also significantly improved.
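The library-plus-command-line arrangement suggested in this review can be sketched as follows. This is only an illustration under stated assumptions, not ExTaxsI's actual code: `build_esearch_url` and `main` are hypothetical names, and the sketch merely builds a query URL for NCBI's public E-utilities ESearch endpoint without contacting the server.

```python
"""Sketch: one query function usable from a script and from the command line."""
import argparse
from urllib.parse import urlencode

# Public NCBI E-utilities endpoint for Entrez searches (ESearch).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="nucleotide", retmax=20):
    """Return the ESearch URL for `term` in the Entrez database `db`.

    Hypothetical helper for illustration; it is not part of ExTaxsI.
    """
    if not term.strip():
        raise ValueError("search term must not be empty")
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


def main(argv=None):
    """Thin command-line wrapper that only parses arguments and delegates."""
    parser = argparse.ArgumentParser(description="Build an NCBI ESearch URL.")
    parser.add_argument("term", help="Entrez query, e.g. 'Gadus morhua[Organism]'")
    parser.add_argument("--db", default="nucleotide")
    parser.add_argument("--retmax", type=int, default=20)
    args = parser.parse_args(argv)
    print(build_esearch_url(args.term, db=args.db, retmax=args.retmax))
    return 0

# A console entry point would simply call main(); scripts would instead
# import build_esearch_url and use its return value directly.
```

Keeping the logic in a plain function and leaving the command-line layer as a thin wrapper is what would let the same routine be called from other Python scripts, which is the wider applicability the reviewer asks for.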
Reviewer 1: Iddo Friedberg
The authors have markedly improved the software in terms of usability and documentation. The manuscript could still use some language editing.