Latest preprint reviews

  1. CompactTree: a lightweight header-only C++ library and Python wrapper for ultra-large phylogenetics

    This article has 1 author:
    1. Niema Moshiri
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      As volumes of viral and bacterial sequence data grow exponentially, computational phylogenetics needs tools that can manage input data at this scale. This study introduces CompactTree, a C++ library designed for ultra-large phylogenetic trees with millions of tips. To address these scalability issues while remaining easy to incorporate into external code bases, CompactTree is a header-only library that improves performance through minimal dependencies, an optimized node representation, and memory-efficient tree structure schemes, resulting in significantly reduced memory footprints and improved processing times. Peer review requested more detail on the functionality and some real-world examples demonstrating the current utility of the tool. Although the library primarily supports the (text-based) Newick format, its increased scalability and extensibility hold promise for multiple biological and epidemiological applications that support more complex formats such as Nexus and NeXML. The tool is open source (GPLv3 licensed) and available on GitHub: https://niema.net/CompactTree

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations. Appears in 2 lists.
  2. Draft genome of the endangered visayan spotted deer (Rusa alfredi), a Philippine endemic species

    This article has 8 authors:
    1. Ma. Carmel F. Javier
    2. Albert C. Noblezada
    3. Persie Mark Q. Sienes
    4. Robert S. Guino-o
    5. Nadia Palomar-Abesamis
    6. Maria Celia D. Malay
    7. Carmelo S. del Castillo
    8. Victor Marco Emmanuel N. Ferriols
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      The Visayan spotted deer (Rusa alfredi) is a small, endangered, primarily nocturnal species of deer found in the rainforests of the Visayan Islands in the Philippines. The present study reports the first draft genome assembly for the species, addressing a critical gap in genomic data for this IUCN Red-Listed cervid. Generated using Illumina sequencing, the genome assembly spans 2.52 Gb, has a BUSCO completeness score of 95.5%, and encompasses 24,531 annotated genes. Phylogenetic analysis indicates a close evolutionary relationship between R. alfredi and Cervus species, suggesting that the genus Rusa is sister to Cervus. Peer review teased out more benchmarking results and the annotation files, demonstrating that this genomic resource is useful and usable for advancing population genetics and evolutionary studies, thereby informing conservation strategies and enhancing breeding programs for the critically threatened species. Providing whole genome sequences for other native species of Rusa could supply further genomic resources for detecting hybrids, which would also help the management and monitoring of these species, especially for the reintroduction of captive populations into the wild.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations. Appears in 2 lists.
  3. The assembly and annotation of two teinturier grapevine varieties, Dakapo and Rubired

    This article has 6 authors:
    1. Eleanore J. Ritter
    2. Noé Cochetel
    3. Andrea Minio
    4. Peter Cousins
    5. Dario Cantu
    6. Chad Niederhuth
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      Teinturier grapes produce berries with pigmented skin and flesh, and are used in red wine blends because they provide a deeper colour. This paper presents the genomes of two popular teinturier varieties (Dakapo and Rubired), sequenced, assembled, and annotated to provide additional resources for their use in breeding. For Dakapo, Nanopore and Illumina sequencing were combined and the result scaffolded to the existing grapevine assembly, generating a final assembly of 508.5 Mbp with 36,940 gene annotations. For Rubired, PacBio HiFi reads were assembled, scaffolded, and phased to generate a diploid assembly with two haplotypes 474.7-476.0 Mbp long and 56,681 annotated genes. Peer review helped validate the high quality of these genomes, which will hopefully enable more insight into the genetics of grapevine berry colour and other traits such as frost and mildew resistance.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations. Appears in 2 lists.
  4. SqueezeCall: nanopore basecalling using a Squeezeformer network

    This article has 1 author:
    1. Zhongxu Zhu
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      The accuracy of nanopore sequencing basecalling still needs to be improved. Building on recent advances in deep learning, this paper introduces SqueezeCall, a novel end-to-end tool for accurate basecalling. It uses the Squeezeformer architecture, which integrates local context extraction through convolutional layers with long-range dependency modeling via global context acquisition. Testing and peer review demonstrated that SqueezeCall outperformed traditional RNN and Transformer-based basecallers across multiple datasets, indicating its potential to refine genomic assembly and facilitate direct detection of modified bases in future genomic analytics. Ongoing work will focus on training on highly curated datasets, including known modifications, to further increase research value. SqueezeCall is MIT licensed and available from GitHub: https://github.com/labcbb/SqueezeCall

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations. Appears in 2 lists.
  5. A practical DNA data storage using an expanded alphabet introducing 5-methylcytosine

    This article has 11 authors:
    1. Deruilin Liu
    2. Demin Xu
    3. Liuxin Shi
    4. Jiayuan Zhang
    5. Kewei Bi
    6. Bei Luo
    7. Chen Liu
    8. Yuxiang Li
    9. Guangyi Fan
    10. Wen Wang
    11. Zhi Ping
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      DNA has huge potential as a data storage medium because of its incredibly high storage density and stability. This work addresses the potential of modified bases, specifically 5-methylcytosine (5mC), to enhance DNA data storage systems. The paper introduces a transcoding scheme named R+, which incorporates the modified 5mC base to increase information density beyond the standard limits. By encoding various file types into DNA sequences between 1.3 and 1.6 kb in size, the method achieves an average recovery rate of 98.97% (with reference), validating its effectiveness. In addition to a wet-lab protocol (hosted on protocols.io) for the experimental validation of the transcoding scheme, the work includes open source code for in-silico simulation tests. Peer review scrutinised the protocols and validation, confirming that they are reusable and provide convincing results. As nanopore sequencing has enabled the reading of these modified bases, it is timely to make them available as extra letters in the molecular alphabet for DNA data storage.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations. Appears in 2 lists.
  6. Polyploid genome assembly of Cardamine chenopodiifolia

    This article has 8 authors:
    1. Aurélia Emonet
    2. Mohamed Awad
    3. Nikita Tikhomirov
    4. Maria Vasilarou
    5. Miguel Pérez-Antón
    6. Xiangchao Gan
    7. Polina Yu. Novikova
    8. Angela Hay
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      This work presents the genome of Cardamine chenopodiifolia, an amphicarpic plant (developing two fruit types, one above and another below ground) in the mustard (Brassicaceae) family. Cardamines are also known as bittercresses and toothworts. Because the species is octoploid, creating a genome reference has been challenging; here the authors achieved this using PacBio HiFi long reads and Omni-C technology to assemble a fully phased, chromosome-level genome. The result is a 597 Mb genome assembled into 32 phased chromosomes (plus mitochondrial and plastid genomes), with a single gap in the centromeric region of chromosome 9. Peer review asked for additional QC and benchmarking, which helped demonstrate that the genome quality is very high, with only one gap and an N50 of 18.80 Mb. The data presented here may help to develop this species as an emerging model organism in the Brassicaceae for studying the development and evolution of amphicarpy by allopolyploidy.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations. Appears in 1 list.
  7. The genome of the sapphire damselfish Chrysiptera cyanea: a new resource to support further investigation of the evolution of Pomacentrids

    This article has 5 authors:
    1. Emma Gairin
    2. Saori Miura
    3. Hiroki Takamiyagi
    4. Marcela Herrera
    5. Vincent Laudet
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      The difference between anemonefish and other damselfish is currently a popular topic in coral reef research. In this study the authors provide a new high-quality non-anemonefish genome, which will be highly relevant for deepening such analyses. The species in question is the sapphire damselfish Chrysiptera cyanea, a widely distributed damselfish in the Indo-Pacific area that is often studied to elucidate the roles of various environmental controls on reproduction and to investigate related hormonal processes. To further the potential of biomolecular analyses based on this species, this study generated the first genome of a Chrysiptera fish, from a male individual collected in Okinawa, Japan. Using PacBio HiFi long-read sequencing with 94.5x coverage, a chromosome-scale genome was assembled and 28,173 genes were identified and annotated. Peer review gathered more parameters and details on the quality; the final assembly comprises 896 Mb across 91 contigs, with a BUSCO completeness of 97.6%. This reference genome should therefore be of high value for future genetic-based approaches, from population structure to gene expression analyses.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations. Appears in 2 lists.
  8. NeuroVar: an open-source tool for the visualization of gene expression and variation data for biomarkers of neurological diseases

    This article has 3 authors:
    1. Hiba Ben Aribi
    2. Najla Abassi
    3. Olaitan I. Awe

    Reviewed by GigaByte

    This article has 2 evaluations. Appears in 1 list.
  9. Whole-genome re-sequencing of the Baikal seal and other phocid seals for a glimpse into their genetic diversity, demographic history, and phylogeny

    This article has 3 authors:
    1. Marcel Nebenführ
    2. Ulfur Arnason
    3. Axel Janke
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      The Baikal seal (Pusa sibirica) lives in the landlocked, isolated habitat of Lake Baikal, which makes it unique among pinnipeds as the only freshwater seal. This paper presents reference-based assemblies of six newly sequenced Baikal seal individuals and one ringed seal individual, as well as the first short-read data for the harbor seal and the Caspian seal. These data aid the study of the genomic diversity of the Baikal seal and contribute baseline data to the limited genomic data available for seals. Peer review led to an extended description of the tools and parameters used in the revised manuscript, along with more information on the methods. The newly generated sequencing data should help to extend the phylogeny of the Phoca/Pusa group using genome-wide data and can also broaden the view of the genetic structure and diversity of the Baikal seal.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations. Appears in 2 lists.
  10. TSTA: thread and SIMD-based trapezoidal pairwise/multiple sequence-alignment method

    This article has 4 authors:
    1. Peiyu Zong
    2. Wenpeng Deng
    3. Jian Liu
    4. Jue Ruan
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      The article presents strategies for accelerating sequence alignment using multithreading and SIMD (Single Instruction, Multiple Data) techniques, and introduces a new algorithm called TSTA (Thread and SIMD-Based Trapezoidal Pairwise/Multiple Sequence-Alignment). This Technical Release write-up gives a detailed description of TSTA's performance in pairwise sequence alignment (PSA) and multiple sequence alignment (MSA), and compares it with various existing alignment algorithms, demonstrating the performance gains achieved by vectorized SIMD technology and the application of threading. Testing, the fixing of a few errors, and the addition of more background detail helped demonstrate that it can achieve faster comparison speeds. The results show TSTA's efficacy in pairwise and multiple sequence alignment, particularly with long reads, with considerable speed enhancements compared to existing tools.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations. Appears in 2 lists.