  1. Whole-genome re-sequencing of the Baikal seal and other phocid seals for a glimpse into their genetic diversity, demographic history, and phylogeny

    This article has 3 authors:
    1. Marcel Nebenführ
    2. Ulfur Arnason
    3. Axel Janke
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      Found only in the landlocked, isolated habitat of Lake Baikal, the Baikal seal (Pusa sibirica) is unique among all pinnipeds as the only freshwater seal. This paper presents reference-based assemblies of six newly sequenced Baikal seal individuals and one individual of the ringed seal, as well as the first short-read data of the harbor seal and the Caspian seal. These data aid the study of the genomic diversity of the Baikal seal and contribute baseline data to the limited genomic data available for seals. Peer review extended the description of the tools and parameters used in the revised manuscript, and provided some more information on the methods. This newly generated sequencing data should help to extend the phylogeny of the Phoca/Pusa group using genome-wide data and can also broaden the view into the genetic structure and diversity of the Baikal seal.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.
  2. TSTA: thread and SIMD-based trapezoidal pairwise/multiple sequence-alignment method

    This article has 4 authors:
    1. Peiyu Zong
    2. Wenpeng Deng
    3. Jian Liu
    4. Jue Ruan
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      The article presents strategies for accelerating sequence alignment using multithreading and SIMD (Single Instruction, Multiple Data) techniques, and introduces a new algorithm called TSTA (Thread and SIMD-Based Trapezoidal Pairwise/Multiple Sequence-Alignment). The Technical Release write-up presents a detailed description of TSTA's performance in pairwise sequence alignment (PSA) and multiple sequence alignment (MSA), and compares it with various existing alignment algorithms, demonstrating the performance gains achieved by vectorized SIMD technology and the application of threading. After testing, debugging a few errors, and adding some more background detail, the work demonstrates that it can achieve faster comparison speeds. It shows TSTA's efficacy in pairwise and multiple sequence alignment, particularly with long reads, and showcases considerable speed enhancements compared to existing tools.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.
  3. Chromosome-level genome assembly and annotation of the crested gecko, Correlophus ciliatus, a lizard incapable of tail regeneration

    This article has 3 authors:
    1. Marc A. Gumangan
    2. Zheyu Pan
    3. Thomas P. Lozito
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      The crested gecko (Correlophus ciliatus) is a lizard species endemic to New Caledonia and a potentially interesting model organism due to its unusual (for a gecko) inability to regenerate amputated tails. With that in mind, this paper presents a new reference genome for the species, assembled using the PacBio Sequel II platform and Dovetail Omni-C libraries. This produced a genome with a total size of 1.65 Gb, 152 scaffolds, an L50 of 6, and an N50 of 109 Mb. Peer review made sure more detail was added on data acquisition and processing to enhance reproducibility. The result is potentially useful data for studying the genetic mechanisms involved in loss of tail regeneration.

      This evaluation refers to version 1 of the preprint
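
      The N50 and L50 figures quoted in assessments like the one above are standard assembly contiguity statistics: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly, and L50 is the number of sequences in that set. A minimal sketch of the calculation follows; the function name and the toy lengths are illustrative assumptions and are not taken from the paper.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Walk from the longest sequence down, accumulating length
    # until at least half of the total assembly size is covered.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Hypothetical toy scaffold lengths (arbitrary units), not real data.
toy_lengths = [50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)  # 40 2
```

      A higher N50 together with a lower L50 indicates that fewer, longer sequences make up most of the assembly.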

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.
  4. SMARTER-database: a tool to integrate SNP array datasets for sheep and goat breeds

    This article has 18 authors:
    1. Paolo Cozzi
    2. Arianna Manunza
    3. Johanna Ramirez-Diaz
    4. Valentina Tsartsianidou
    5. Konstantinos Gkagkavouzis
    6. Pablo Peraza
    7. Anna Maria Johansson
    8. Juan José Arranz
    9. Fernando Freire
    10. Szilvia Kusza
    11. Filippo Biscarini
    12. Lucy Peters
    13. Gwenola Tosser-Klopp
    14. Gabriel Ciappesoni
    15. Alexandros Triantafyllidis
    16. Rachel Rupp
    17. Bertrand Servin
    18. Alessandra Stella
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      This paper presents the SMARTER database, a collection of tools and scripts to gather, standardize, and share with the scientific community a comprehensive dataset of genomic data and metadata information on worldwide small ruminant populations. It has come out of the EU multi-actor (12 country) H2020 project called SMARTER: SMAll RuminanTs breeding for Efficiency and Resilience. It brings together genotypes for about 12,000 sheep and 6,000 goats, alongside phenotypic and geographic information. The paper provides insight into how the database was put together, presenting the code for the SMARTER frontend, backend and API, alongside instructions for users. Peer review tested the platform and provided suggestions on improving the metadata. The project provides valuable information on sheep and goat populations around the world that can be an essential tool for ruminant researchers, enabling them to generate new insights, store new genotypes, and drive progress in the field.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.
  5. NucBalancer: streamlining barcode sequence selection for optimal sample pooling for sequencing

    This article has 2 authors:
    1. Saurabh Gupta
    2. Ankur Sharma
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      This paper presents NucBalancer, an R pipeline and Shiny app designed for the optimal selection of barcode sequences for sample multiplexing in sequencing. It provides a user-friendly interface that aims to make this process accessible to both bioinformaticians and experimental researchers, enhancing its utility in adapting libraries prepared for one sequencing platform to be compatible with others. This is important now that the introduction of additional sequencing platforms by Element Biosciences (AVITI System) and Ultima Genomics (UG100) is increasing the diversity and capability of genomic research tools available. NucBalancer’s incorporation of dynamic parameters, including customizable red flag thresholds, allows for precise and practical barcode sequencing strategies. This adaptability is key in ensuring uniform nucleotide distribution, particularly in MGI sequencing and single-cell genomic studies, leading to more reliable and cost-effective sequencing outcomes across various experimental conditions. All the code is available under an open source license, and upon review the authors have also shared the code for the Shiny app.

      This evaluation refers to version 1 of the preprint
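
      To illustrate what checking for "uniform nucleotide distribution" across a barcode pool can mean in practice, the sketch below computes the fraction of each base at every barcode position and flags positions where a single base dominates. This is not NucBalancer's code or algorithm (the tool itself is written in R); the function names, the `max_fraction` threshold and the toy barcodes are all hypothetical.

```python
from collections import Counter


def position_fractions(barcodes):
    """Per-position fraction of A, C, G and T across equal-length barcodes."""
    if not barcodes:
        raise ValueError("need at least one barcode")
    length = len(barcodes[0])
    if any(len(b) != length for b in barcodes):
        raise ValueError("barcodes must all have the same length")
    fractions = []
    # zip(*barcodes) yields one tuple of bases per barcode position.
    for column in zip(*barcodes):
        counts = Counter(column)
        fractions.append(
            {base: counts.get(base, 0) / len(barcodes) for base in "ACGT"}
        )
    return fractions


def flag_positions(barcodes, max_fraction=0.5):
    """Return 0-based positions where any single base exceeds max_fraction.

    max_fraction is an assumed, user-chosen threshold for this sketch.
    """
    return [
        i
        for i, fracs in enumerate(position_fractions(barcodes))
        if max(fracs.values()) > max_fraction
    ]


# Hypothetical toy barcode pool, not real index sequences.
toy_pool = ["ACGT", "AGTC", "ATCA", "CCAG"]
print(flag_positions(toy_pool))  # [0]
```

      In this toy pool, position 0 is flagged because three of the four barcodes start with A, so that sequencing cycle would be dominated by one base.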

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.
  6. Building a community-driven bioinformatics platform to facilitate Cannabis sativa multi-omics research

    This article has 4 authors:
    1. Locedie Mansueto
    2. Tobias Kretzschmar
    3. Ramil Mauleon
    4. Graham J. King
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      This paper reports the establishment of the International Cannabis Genomics Research Consortium (ICGRC) web portal, which leverages the open source Tripal platform to enhance data accessibility and integration for Cannabis sativa (Cannabis) multi-omics research. The aim is to bring together the wealth of publicly available genomic, transcriptomic, proteomic, and metabolomic data sets to improve cannabis for food, fiber and medicinal traits. Tripal is a content management system for genomics data that provides ready-to-use specialized ‘omics modules for loading, visualization, and analysis, and is GMOD (Generic Model Organism Database) standards-compliant. The paper explains how the portal was put together and what data and features are available, and provides a case study for other communities wanting to create their own Tripal platform. It covers the setup and customization of the Tripal platform, how modules were re-engineered for multi-omics data integration, and the addition of many other custom features that can be reused. Peer review fixed a few minor bugs and added clarifications on how the platform will be updated.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.

    Scott C Edmunds

    There is a Cassyni webinar from the first author of this preprint which presents on this work here https://doi.org/10.52843/cassyni.y1p61f

  7. High-speed whole-genome sequencing of a Whippet: Rapid chromosome-level assembly and annotation of an extremely fast dog’s genome

    This article has 8 authors:
    1. Marcel Nebenführ
    2. David Prochotta
    3. Alexander Ben Hamadou
    4. Axel Janke
    5. Charlotte Gerheim
    6. Christian Betz
    7. Carola Greve
    8. Hanno Jörn Bolz
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      This Data Release paper presents the genome of the whippet breed of dog. It demonstrates streamlined laboratory and bioinformatics workflows with PacBio HiFi long-read whole-genome sequencing that enable the generation of a high-quality reference genome within one week. The genome study was a collaboration between an academic biodiversity institute and a medical diagnostic company. The presented way of working and workflow provide examples that can be used for a wide range of future human and non-human genome projects. The final 2.47 Gbp assembly is of high quality, with a contig N50 of 55 Mbp and a scaffold N50 of 65.7 Mbp. This reference was scaffolded into 39 chromosome-length scaffolds, and the annotation resulted in 28,383 transcripts. The results also looked at the Myostatin gene, which can be used for breeding purposes, as heterozygous animals can have an advantage in dog races. The reviewers had the authors clarify this part with additional results. Overall, this study demonstrates how rapidly animal genome research can be carried out through close collaboration and streamlined time management.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.
  8. RiboSnake – a user-friendly, robust, reproducible, multipurpose and documentation-extensive pipeline for 16S rRNA gene microbiome analysis

    This article has 9 authors:
    1. Ann-Kathrin Dörr
    2. Josefa Welling
    3. Adrian Dörr
    4. Jule Gosch
    5. Hannah Möhlen
    6. Ricarda Schmithausen
    7. Jan Kehrmann
    8. Folker Meyer
    9. Ivana Kraiselburd
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      This new software paper presents RiboSnake, a validated, automated, reproducible analysis pipeline implemented in the popular Snakemake workflow management system for microbiome analysis. Analysing 16S rRNA gene amplicon sequencing data, it uses the widely used QIIME2 tool as the basis of the workflow, as it offers a wide range of functionality. Users of QIIME2 can be overwhelmed by the number of options at their disposal, and this workflow provides a fully automated and fully reproducible pipeline that can be easily installed and maintained. It provides an easy-to-navigate output accessible to non-bioinformatics experts, alongside sets of already validated parameters for different types of samples. Reviewers requested some clarification on testing, worked examples and documentation, and this was improved to produce a convincingly easy-to-use workflow. This will hopefully open up an already very established technique to a new group of users and assist them with reproducible science.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.
  9. PhysiMeSS - a new physiCell addon for extracellular matrix modelling

    This article has 4 authors:
    1. Vincent Noël
    2. Marco Ruscone
    3. Robyn Shuttleworth
    4. Cicely K. Macnamara
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      PhysiCell is an open source multicellular systems simulator for studying many interacting cells in dynamic tissue microenvironments. As part of the PhysiCell ecosystem of tools and modules, this paper presents a PhysiCell addon, PhysiMeSS (MicroEnvironment Structures Simulation), which allows the user to accurately represent the extracellular matrix (ECM) as a network of fibres. It can specify rod-shaped microenvironment elements such as the matrix fibres (e.g. collagen) of the ECM, giving the PhysiCell user the ability to investigate physical interactions with cells and other fibres. Reviewers asked for additional clarification on a number of features, and the paper now makes clear that future releases will provide full 3D compatibility and include work on fibrogenesis, i.e. the creation of new ECM fibres by cells.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.

    Scott C Edmunds

    This is part of a series of papers on the PhysiCell Ecosystem, an open source, scalable codebase to simulate large systems of cells in 3-D tissues. See the other papers in the series here https://doi.org/10.46471/GIGABYTE_SERIES_0003

  10. Kinship analysis and pedigree reconstruction by RAD sequencing in cattle

    This article has 8 authors:
    1. Yiming Xu
    2. Wanqiu Wang
    3. Jiefeng Huang
    4. Minjie Xu
    5. Binhu Wang
    6. Yingsong Wu
    7. Yongzhong Xie
    8. Jianbo Jian
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      RAD-Seq (Restriction-site-associated DNA sequencing) is a cost-effective method for single nucleotide polymorphism (SNP) discovery and genotyping. In this study the authors performed a kinship analysis and pedigree reconstruction for two different cattle breeds (Angus and Xiangxi yellow cattle). A total of 975 cattle, including 923 offspring with 24 known sires and 28 known dams, were sampled and subjected to SNP discovery and genotyping using RAD-Seq. This produced a SNP panel with 7305 SNPs capturing the maximum difference between paternal and maternal genome information, able to distinguish between the F1 and F2 generation with 90% accuracy. Peer review helped to better highlight the practical applications of this work. The combination of the efficiency of RAD-Seq and advances in kinship analysis here can hopefully help improve breed management, local resource utilization, and conservation of livestock.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.
  11. Chromosomal-level genome assembly and single-nucleotide polymorphism sites of black-faced spoonbill Platalea minor

    This article has 20 authors:
    1. Hong Kong Biodiversity Genomics Consortium
    2. Jerome H.L. Hui
    3. Ting Fung Chan
    4. Leo L. Chan
    5. Siu Gin Cheung
    6. Chi Chiu Cheang
    7. James K.H. Fang
    8. Juan Diego Gaitan-Espitia
    9. Stanley C.K. Lau
    10. Yik Hei Sung
    11. Chris K.C. Wong
    12. Kevin Y.L. Yip
    13. Yingying Wei
    14. Wai Lok So
    15. Wenyan Nong
    16. Sean T.S. Law
    17. Paul Crow
    18. Aiko Leong
    19. Liz Rose-Jeffreys
    20. Ho Yin Yip
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      This work is part of a series of papers from the Hong Kong Biodiversity Genomics Consortium sequencing the rich biodiversity of species in Hong Kong (see https://doi.org/10.46471/GIGABYTE_SERIES_0006). This example assembles the genome of the black-faced spoonbill (Platalea minor), an emblematic wading bird from East Asia that is classified as globally endangered by the IUCN. This Data Release reports a 1.24 Gb chromosomal-level genome assembly produced using a combination of PacBio SMRT and Omni-C scaffolding technologies. BUSCO and Merqury validation were carried out, gene models were created, and peer reviewers also requested MCscan synteny analysis. This showed the genome assembly had high sequence continuity, with a scaffold length N50 of 53 Mb. Presenting data from 14 individuals, this will hopefully be a useful and valuable resource for future population genomic studies aimed at better understanding spoonbill species numbers and conservation.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.

    Scott C Edmunds

    Reviewed and published by GigaByte as part of the Hong Kong Biodiversity Genomics series of papers they are publishing https://doi.org/10.46471/GIGABYTE_SERIES_0006

  12. Multicellular, IVT-derived, unmodified human transcriptome for nanopore-direct RNA analysis

    This article has 7 authors:
    1. Caroline A. McCormick
    2. Stuart Akeson
    3. Sepideh Tavakoli
    4. Dylan Bloch
    5. Isabel N. Klink
    6. Miten Jain
    7. Sara H. Rouhanifard
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      Oxford Nanopore direct RNA sequencing (DRS) is a relatively new sequencing technology enabling measurements of RNA modifications. In vitro transcription (IVT)-based negative controls (i.e. modification-free transcripts) are a practical and targeted control for this direct sequencing, providing a baseline measurement for canonical nucleotides within a matched and biologically derived sequence context. This work presents exactly this type of long-read, multicellular, poly-A RNA-based, IVT-derived, unmodified transcriptome dataset. Review flagged that more statistical analyses of the data quality needed to be performed, and these were provided. The resulting data provide a resource to the direct RNA analysis community, helping reduce the need for expensive IVT library preparation and sequencing for human samples, and also serve as a framework for RNA modification analysis in other organisms.

      This evaluation refers to versions 1 and 2 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.
  13. PhysiCell Studio: a graphical tool to make agent-based modeling more accessible

    This article has 8 authors:
    1. Randy Heiland
    2. Daniel Bergman
    3. Blair Lyons
    4. Julie Cass
    5. Heber L. Rocha
    6. Marco Ruscone
    7. Vincent Noël
    8. Paul Macklin
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      This paper presents a new tool to make it easier to use PhysiCell, an open-source, physics-based multicellular simulation framework with a very wide user base. PhysiCell Studio is a graphical tool that makes it easier to build, run, and visualize PhysiCell models. Over time, it has evolved from being a GUI to include many additional functionalities, and it is available in desktop and cloud versions. This paper outlines the many features and functions, the design and development process behind it, and deployment instructions. Peer review improved the organisation of the various repositories and led to the addition of both requirements.txt and environment.yml files. Looking to the future, the developers are planning to add new features based on community feedback and contributions, and this paper presents the many code repositories for readers who wish to contribute to the development process.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.

    Scott C Edmunds

    See the introductory video for more on how this tool works: https://youtu.be/jkbPP1yDzME?si=ps_MvctAwfHDleXL

  14. Low-coverage whole genome sequencing for a highly selective cohort of severe COVID-19 patients

    This article has 7 authors:
    1. Renato Santos
    2. Víctor Moreno-Torres
    3. Ilduara Pintos
    4. Octavio Corral
    5. Carmen de Mendoza
    6. Vicente Soriano
    7. Manuel Corpas
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      Many studies have explored the genetic determinants of COVID-19 severity, with these GWAS using microarrays or expensive whole-genome sequencing (WGS). Low-coverage WGS data can be imputed using reference panels to enhance resolution and statistical power while maintaining much lower costs, but imputation accuracy is difficult to balance. This work demonstrates how to address these challenges utilising the GLIMPSE1 algorithm, a less resource-intensive tool that produces more accurate imputed data than its predecessors. It generates a dataset containing 79 imputed low-coverage WGS samples from patients with severe COVID-19 symptoms during the initial wave of the SARS-CoV-2 pandemic in Spain. The validation of this imputation and filtering process shows that GLIMPSE1 can be confidently used to impute variants with minor allele frequency up to approximately 2%. After peer review the authors provided clarifications and more validation, statistics and figures to help demonstrate that this approach was valid. This work showcases the viability of using low-coverage WGS imputation to generate data for the study of disease-related genetic markers, alongside a validation methodology to ensure the accuracy of the data produced. It should help inspire confidence and encourage others to deploy similar approaches to other infectious diseases, genetic disorders, or population-based genetic studies, particularly in large-scale genomic projects and resource-limited settings where sequencing at higher coverage could prove to be prohibitively expensive.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.

    Scott C Edmunds

    For more insight see this video abstract from the lead author Manuel Corpas https://youtu.be/x6oVzt_H_Pk?si=Z7AxJZ_aNczi18ar

  15. Chromosomal-level genome assembly of golden birdwing Troides aeacus (Felder & Felder, 1860)

    This article has 21 authors:
    1. Hong Kong Biodiversity Genomics Consortium
    2. Jerome H.L. Hui
    3. Ting Fung Chan
    4. Leo L. Chan
    5. Siu Gin Cheung
    6. Chi Chiu Cheang
    7. James K.H. Fang
    8. Juan D. Gaitan-Espitia
    9. Stanley C.K. Lau
    10. Yik Hei Sung
    11. Chris K.C. Wong
    12. Kevin Y.L. Yip
    13. Yingying Wei
    14. Wai Lok So
    15. Wenyan Nong
    16. Hydrogen S.F. Pun
    17. Wing Kwong Yau
    18. Colleen Y.L. Chiu
    19. Sammi S.S. Chan
    20. Kacy K.L. Man
    21. Ho Yin Yip
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      This work is part of a series of papers from the Hong Kong Biodiversity Genomics Consortium sequencing the rich biodiversity of species in Hong Kong. This example presents the genome of the golden birdwing butterfly Troides aeacus (Lepidoptera, Papilionidae), a notable and popular species in Asia that faces habitat loss due to urbanization and human activities. The lack of genomic resources impedes conservation efforts based on genetic markers, as well as better understanding of its biology. Using PacBio HiFi long reads and Omni-C, a 351 Mb genome was assembled and anchored to 30 pseudo-molecules. After reviewers requested more information on the genome quality, the assembly was shown to have high sequence continuity, with contig length N50 = 11.67 Mb and L50 = 14, and scaffold length N50 = 12.2 Mb and L50 = 13. A total of 24,946 protein-coding genes were predicted. This study presents the first chromosomal-level genome assembly of the golden birdwing T. aeacus, a potentially useful resource for further phylogenomic studies of birdwing butterfly species in terms of species diversification and conservation.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.
  16. Chromosome-level genome assembly of the common chiton, Liolophura japonica (Lischke, 1873)

    This article has 25 authors:
    1. Hong Kong Biodiversity Genomics Consortium
    2. Project Coordinator and Co-Principal Investigators
    3. Jerome H.L. Hui
    4. Ting Fung Chan
    5. Leo L. Chan
    6. Siu Gin Cheung
    7. Chi Chiu Cheang
    8. James K.H. Fang
    9. Juan D. Gaitan-Espitia
    10. Stanley C.K. Lau
    11. Yik Hei Sung
    12. Chris K.C. Wong
    13. Kevin Y.L. Yip
    14. Yingying Wei
    15. DNA extraction, library preparation and sequencing
    16. Franco M.F. Au
    17. Wai Lok So
    18. Genome assembly and gene model prediction
    19. Wenyan Nong
    20. Gene family annotation
    21. Ming Fung Franco Au
    22. Samples collectors
    23. Tin Yan Hui
    24. Brian K.H. Leung
    25. Gray A. Williams
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      This work is part of a series of papers from the Hong Kong Biodiversity Genomics Consortium sequencing the rich biodiversity of species in Hong Kong. This example assembles the genome of the common chiton, Liolophura japonica (Lischke, 1873). Chitons are marine molluscs found worldwide, from cold waters to the tropics, that play important ecological roles in the environment, but to date only a few genome assemblies are available for them. The data were produced using PacBio HiFi reads and Omni-C sequencing data, and the resulting genome assembly is around 609 Mb in size. From this, 28,010 protein-coding genes were predicted. After review improved the methodological details, the quality metrics look near chromosome-level, with a scaffold N50 length of 37.34 Mb and a 96.1% BUSCO score. This high-quality genome should hopefully be a valuable resource for gaining new insights into the environmental adaptations of L. japonica to life in the intertidal zone and for future investigations into the evolutionary biology of Polyplacophorans and other molluscs.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.
  17. Whole genome assembly and annotation of the King Angelfish (Holacanthus passer) gives insight into the evolution of marine fishes of the Tropical Eastern Pacific

    This article has 5 authors:
    1. Remy Gatins
    2. Carlos F. Arias
    3. Carlos Sánchez
    4. Giacomo Bernardi
    5. Luis F. De León
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      The King Angelfish (Holacanthus passer) is a great example of the Holacanthus angelfishes, which are some of the most iconic marine fishes of the Tropical Eastern Pacific. However, very limited genomic resources currently exist for the genus, and these authors have assembled and annotated the nuclear genome of the species and used it to examine the demographic history of the fish. They used nanopore long reads to assemble a compact 583 Mb reference with a contig N50 of 5.7 Mb and a 97.5% BUSCO score. On scrutinising the data, the BUSCO score was high compared to the initial N50s, providing some useful lessons learned on how to get the most out of ONT data. The analysis suggests that the demographic history of H. passer was likely shaped by historical events associated with the closure of the Isthmus of Panama, rather than by the more recent last glacial maximum. This data provides a genomic resource to improve our understanding of the evolution of Holacanthus angelfishes, and facilitates research into local adaptation, speciation, and introgression of marine fishes. In addition, this genome can help improve the understanding of the evolutionary history and population dynamics of marine species in the Tropical Eastern Pacific.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.
  18. Chromosomal-level genome assembly of the long-spined sea urchin Diadema setosum (Leske, 1778)

    This article has 22 authors:
    1. Hong Kong Biodiversity Genomics Consortium
    2. Project Coordinator and Co-Principal Investigators
    3. Jerome H.L. Hui
    4. Ting Fung Chan
    5. Leo L. Chan
    6. Siu Gin Cheung
    7. Chi Chiu Cheang
    8. James K.H. Fang
    9. Juan D. Gaitan-Espitia
    10. Stanley C.K. Lau
    11. Yik Hei Sung
    12. Chris K.C. Wong
    13. Kevin Y.L. Yip
    14. Yingying Wei
    15. DNA extraction, library preparation and sequencing
    16. Wai Lok So
    17. Genome assembly and gene model prediction
    18. Wenyan Nong
    19. Sample collector, animal culture and logistics
    20. Apple P.Y. Chui
    21. Thomas H.W. Fong
    22. Ho Yin Yip
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      This work is part of a series of papers from the Hong Kong Biodiversity Genomics Consortium sequencing the rich biodiversity of species in Hong Kong. This example assembles the genome of the long-spined sea urchin Diadema setosum (Leske, 1778). Using PacBio HiFi long reads and Omni-C data, the assembled genome size was 886 Mb, consistent with the size of other sea urchin genomes. The assembly was anchored to 22 pseudo-molecules/chromosomes, and a total of 27,478 genes, including 23,030 protein-coding genes, were annotated. Peer review added more to the conclusion and future perspectives. The data will hopefully provide a valuable resource and foundation for a better understanding of the ecology and evolution of sea urchins.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations and appears in 2 lists.

    Scott C Edmunds

    See the other papers in this series on Hong Kong Biodiversity Genomics here https://doi.org/10.46471/GIGABYTE_SERIES_0006

  19. Genome assembly of the edible jelly fungus Dacryopinax spathularia (Dacrymycetaceae)

    This article has 25 authors:
    1. Hong Kong Biodiversity Genomics Consortium
    2. Project Coordinator and Co-Principal Investigators
    3. Jerome H.L. Hui
    4. Ting Fung Chan
    5. Leo L. Chan
    6. Siu Gin Cheung
    7. Chi Chiu Cheang
    8. James K.H. Fang
    9. Juan Diego Gaitan-Espitia
    10. Stanley C.K. Lau
    11. Yik Hei Sung
    12. Chris K.C. Wong
    13. Kevin Y.L. Yip
    14. Yingying Wei
    15. DNA extraction, library preparation and sequencing
    16. Tze Kiu Chong
    17. Sean T.S. Law
    18. Genome assembly and gene model prediction
    19. Wenyan Nong
    20. Genome analysis and quality control
    21. Wenyan Nong
    22. Sample collector and logistics
    23. Tze Kiu Chong
    24. Sean T.S. Law
    25. Ho Yin Yip
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      This work is part of a series of papers from the Hong Kong Biodiversity Genomics Consortium sequencing the rich biodiversity of species in Hong Kong. This example presents the first whole genome assembly of Dacryopinax spathularia, an edible mushroom-forming fungus that is used in the food industry to produce natural preservatives. Using PacBio and Omni-C data, a 29.2 Mb genome was assembled, with a scaffold N50 of 1.925 Mb and a 92.0% BUSCO score demonstrating its quality (peer review pushed the authors to provide more detail and QC statistics to make this case more convincing). These data provide a useful resource for further phylogenomic studies in the family Dacrymycetaceae and for investigations into the biosynthesis of glycolipids with potential applications in the food industry.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations. Appears in 2 lists.
  20. Genome assembly of the milky mangrove Excoecaria agallocha

    This article has 26 authors:
    1. Hong Kong Biodiversity Genomics Consortium
    2. Project Coordinator and Co-Principal Investigators
    3. Jerome H.L. Hui
    4. Ting Fung Chan
    5. Leo L. Chan
    6. Siu Gin Cheung
    7. Chi Chiu Cheang
    8. James K.H. Fang
    9. Juan Diego Gaitan-Espitia
    10. Stanley C.K. Lau
    11. Yik Hei Sung
    12. Chris K.C. Wong
    13. Kevin Y.L. Yip
    14. Yingying Wei
    15. DNA extraction, library preparation and sequencing
    16. Sean T.S. Law
    17. Wai Lok So
    18. Genome assembly and gene model prediction
    19. Wenyan Nong
    20. Genome analysis and quality control
    21. Wenyan Nong
    22. Sample collector and logistics
    23. David T.W. Lau
    24. Sean T.S. Law
    25. Shing Yip Lee
    26. Ho Yin Yip
    This article has been curated by 1 group:
    • Curated by GigaByte

      Editors Assessment:

      This work is part of a series of papers from the Hong Kong Biodiversity Genomics Consortium sequencing the rich biodiversity of species in Hong Kong. This example assembles the genome of the milky mangrove Excoecaria agallocha, also known as the blind-your-eye mangrove because its toxic milky latex can cause blindness when it comes into contact with the eyes. Growing in the brackish water of tropical mangrove forests from India to Australia, these trees are an extremely important habitat for a diverse variety of aquatic species, and are the sole food source for the larvae of the mangrove jewel bug. Using PacBio HiFi long reads and Omni-C technology, a 1,332.45 Mb genome was assembled, with 1,402 scaffolds and a scaffold N50 of 58.95 Mb. After feedback the annotations were improved, predicting a very high number (73,740) of protein-coding genes. The data presented here provide a valuable resource for further investigation into the biosynthesis of the phytochemical compounds in its milky latex, which may have many medicinal and pharmacological properties, and for improving the understanding of the biology and genome architecture evolution of the Euphorbiaceae family and of mangrove species adapted to high levels of salinity.

      This evaluation refers to version 1 of the preprint

    Reviewed by GigaByte

    This article has 2 evaluations. Appears in 2 lists.

    Scott C Edmunds

    First in a series of papers from the Hong Kong Biodiversity Genomics Consortium, a joint effort of eight universities funded by the Hong Kong University Grants Committee (UGC). This project aims to sequence the genomes of animals, plants, and fungi in the local territory with a state-of-the-art genome sequencer and to form a local network and hub for biodiversity genomic research. You can read more in this blog: http://gigasciencejournal.com/blog/hong-kong-biodiversity-genomics-consortium/