GigaByte

GigaByte is an open access and open science journal published by GigaScience Press, BGI's Open Access and Open Data Publishing division. As with our sister journal GigaScience, we publish all reusable and shareable research objects, such as data, software tools and workflows, from data-driven research.


Featured lists

Endorsed by GigaByte

Preprints that have undergone Editor’s Assessment by GigaByte.

Curated by Scott C Edmunds

This list contains 50 articles. Last updated Nov 30, 2025.

Latest preprint reviews

EMImR: a Shiny application for identifying transcriptomic and epigenomic changes

This article has 5 authors:
1. Hiba Ben Aribi
2. Careen Naitore
3. Farah Ayadi
4. Souheila Guerbouj
5. Olaitan I. Awe
This article has been curated by 1 group:
- Curated by GigaByte
  
  Editors Assessment:
  
  Coded and written up as part of the African Society for Bioinformatics and Computational Biology (ASBCB) Omicscodeathons, EMImR is a novel Shiny application for identifying and correlating transcriptomic and epigenomic changes, built from a combination of Bioconductor and CRAN packages. Case studies on publicly available GEO sequencing data from human blood cell samples of multiple sclerosis patients demonstrate how the tool works, and documentation and videos are provided. Peer review and the study highlight the usefulness of the tool for analyzing transcriptomic and epigenomic data.
  
  This evaluation refers to version 1 of the preprint
Reviewed by GigaByte

This article has 2 evaluations. Appears in 2 lists. Latest version Oct 30, 2025. Latest activity Nov 28, 2025.
Aedes mosquito distribution across urban and peri-urban areas of Kinshasa city, Democratic Republic of Congo

This article has 15 authors:
1. Victoire Nsabatien
2. Josue Zanga
3. Nono Mvuama
4. Arsene Bokulu
5. Hyacinthe Lukoki
6. Glodie Diza
7. Dorcas Kantin
8. Leon Mbashi
9. Christelle Bosulu
10. Narcisse Basosila
11. Erick Bukaka
12. Fiacre Agossa
13. Jonas Nagahuedi
14. Jean-Claude Palata
15. Emery Metelo
This article has been curated by 1 group:
- Curated by GigaByte
  
  Editors Assessment:
  
  In the Democratic Republic of Congo (DRC), Aedes mosquitoes are the principal vectors of the arboviruses that cause yellow fever, chikungunya and dengue in the human population. However, systematic surveillance data on these species remain limited, hindering entomological and modelling research and control strategies. This paper is one of a series of Data Release papers in GigaByte, supported by TDR and the WHO, describing datasets hosted in GBIF to tackle data gaps on vectors of human disease. To address this data deficiency, the paper presents a geo-referenced dataset of 6,577 entomological occurrence records collected in 2024 throughout urban and peri-urban areas of Kinshasa. The data were collected using larval dipping, human landing catches, Prokopack aspirators, and BG-Sentinel traps. Data auditing and peer review found the data well validated, but requested some additional fields and methodological details. This work and the extremely useful data provided represent an important step towards building a pan-African resource for Aedes mosquito data collection.
  
  This evaluation refers to version 1 of the preprint
Reviewed by GigaByte

This article has 2 evaluations. Appears in 2 lists. Latest version Oct 7, 2025. Latest activity Sep 17, 2025.
Whole genome sequencing and assembly of the house sparrow, Passer domesticus

This article has 13 authors:
1. Vikas Kumar
2. Gopesh Sharma
3. Sankalp Sharma
4. Samvrutha Prasad
5. Shailesh Desai
6. Toral Vaishnani
7. Dalia Vishnudasan
8. Gopinathan Maheswaran
9. Kaomud Tyagi
10. Inderjeet Tyagi
11. Polavarapu B Kavi Kishor
12. Gyaneshwer Chaubey
13. Prashanth Suravajhala
This article has been curated by 1 group:
- Curated by GigaByte
  
  Editors Assessment:
  
  This paper presents the genome sequencing of the house sparrow (Passer domesticus), with genome assembly and annotation carried out using in silico approaches, producing a potentially valuable resource for understanding passerine evolution, biology, ethnology, geography, and demography. The final genome assembly was generated using short read sequencing and a computational workflow that included Shovill, SPAdes, MaSuRCA, and BUSCO benchmarking, producing a 922 MB reference genome with 24,152 genes. The first draft was significantly smaller than this, but peer review provided suggestions on how to improve the assembly quality, and after a few attempts an assembly with a reasonable size and BUSCO score was achieved. These openly available data could serve as a valuable resource for studying adaptation, divergence, and speciation of birds.
  
  This evaluation refers to version 2 of the preprint
Reviewed by GigaByte

This article has 2 evaluations. Appears in 2 lists. Latest version Jul 21, 2025. Latest activity Sep 2, 2025.
Chevreul: an R bioconductor package for exploratory analysis of full-length single cell sequencing

This article has 3 authors:
1. Kevin Stachelek
2. Bhavana Bhat
3. David Cobrinik
This article has been curated by 1 group:
- Curated by GigaByte
  
  Editors Assessment:
  
  This paper presents Chevreul, a new open-source R Bioconductor (meta-)package for processing and integration of scRNA-seq data from cDNA end-counting, full-length short-read or long-read protocols. It comes with an R Shiny app for easy visualization, formatting, and exploratory analysis of scRNA-seq data processed in the SingleCellExperiment Bioconductor or Seurat formats. The name of the tool is inspired by the colour theorist Michel-Eugène Chevreul and the optical illusion of the same name. To demonstrate the use of Chevreul, the authors provide a sample analysis showing how users can visualize a wide range of parameters, enabling transparent and reproducible scRNA-seq analyses. Peer review also pushed the authors to provide extensive guidance materials to assist with use. The R package and integrated Shiny application are freely available under an open-source MIT license in Bioconductor and on their GitHub page here: https://github.com/cobriniklab/chevreul
  
  This evaluation refers to version 1 of the preprint
Reviewed by GigaByte

This article has 2 evaluations. Appears in 2 lists. Latest version Jun 24, 2025. Latest activity Jul 30, 2025.
Chromosome-level genome assembly of the lemon sole, Microstomus kitt (Pleuronectiformes: Pleuronectidae)

This article has 15 authors:
1. Marcel Nebenführ
2. David Prochotta
3. Maria A. Nilsson
4. Menno J. de Jong
5. Tunca D. Yazici
6. Fabienne Langefeld
7. Malambo Muloongo
8. Helena Woköck
9. Jakob Jilg
10. Sina C. Bender
11. Marvin M. Zangl
12. Juan-Manuel Ortega Guatame
13. Kimberley Williams
14. Moritz Sonnewald
15. Axel Janke
This article has been curated by 1 group:
- Curated by GigaByte
  
  Editors Assessment:
  
  This Data Release paper presents the first genome assembly of the lemon sole (Microstomus kitt), a commercially important flatfish found in European coastal waters. Interestingly, this work was carried out in a university course setting involving the students. The resulting chromosome-level genome was assembled using long-read PacBio HiFi sequencing and the Hi-C technique. The 628 Mbp reference (which is consistent with other Pleuronectidae fish species) is assembled into 24 chromosome-length scaffolds with high completeness, achieving a scaffold N50 of 27.2 Mbp. Peer review and data curation led the authors to clarify a few points and share all of the data and results in an open and well-curated manner. The annotated genome of the lemon sole, with its high continuity, should therefore provide important reference data for future population genetic analyses and conservation strategies for this organism.
  
  This evaluation refers to version 1 of the preprint
Reviewed by GigaByte

This article has 2 evaluations. Appears in 2 lists. Latest version May 27, 2025. Latest activity May 30, 2025.
Chromosome-level genome assemblies of five Sinocyclocheilus species

This article has 6 authors:
1. Chao Bian
2. Ruihan Li
3. Yuqian Ouyang
4. Junxing Yang
5. Xidong Mu
6. Qiong Shi
This article has been curated by 1 group:
- Curated by GigaByte
  
  Editors Assessment:
  
  Sinocyclocheilus is a genus of freshwater cavefish endemic to the Karst regions of Southwest China. They have diverse traits in morphology, behavior, and physiology typical of cavefish, which make them interesting models for studying cave adaptation and phylogenetic evolution. The manuscript presents chromosome-level genome assemblies of five Sinocyclocheilus species and an analysis of the allotetraploid origin of these species. For S. grahami (the golden-line barbel), PacBio and Hi-C sequencing technologies produced a final chromosome-level genome assembly of 1.6 Gb with a contig N50 of 738.5 kb and a scaffold N50 of 30.7 Mb, with 93.1% of the assembled genome sequences and 93.8% of the predicted genes anchored onto 48 chromosomes. The authors then conducted a homologous comparison to obtain chromosome-level genome assemblies for four other Sinocyclocheilus species: S. maitianheensis, S. rhinocerous, S. anshuiensis, and S. anophthalmus, with over 82% of the genome sequences anchored on the constructed chromosomes. Peer review provided clarification on the assembly strategy and prompted more benchmarking. These data have the potential to contribute to species conservation and the exploitation of potential economic and ecological values of diverse Sinocyclocheilus members.
  
  This evaluation refers to version 1 of the preprint
Reviewed by GigaByte

This article has 2 evaluations. Appears in 2 lists. Latest version May 9, 2025. Latest activity May 14, 2025.
Efficiently constructing complete genomes with CycloneSEQ to fill gaps in bacterial draft assemblies

This article has 19 authors:
1. Hewei Liang
2. Yuanqiang Zou
3. Mengmeng Wang
4. Tongyuan Hu
5. Haoyu Wang
6. Wenxin He
7. Yanmei Ju
8. Ruijin Guo
9. Junyi Chen
10. Fei Guo
11. Tao Zeng
12. Yuliang Dong
13. Yuning Zhang
14. Bo Wang
15. Chuanyu Liu
16. Xin Jin
17. Wenwei Zhang
18. Xun Xu
19. Liang Xiao
This article has been curated by 1 group:
- Curated by GigaByte
  
  Editors Assessment:
  
  With the recent official launch of BGI’s new CycloneSEQ sequencing platform, which delivers long reads using novel nanopores, this paper presents benchmarking data and validation studies comparing short-read and long-read data from other platforms and hybrid assemblies. The study tests the performance of the new platform in sequencing diverse microbial genomes, presenting raw and processed data to enable others to scrutinise and verify the work. Being openly peer-reviewed, and having scripts and protocols shared for the first time, helps provide transparency in this benchmarking process to increase trust in this new technology. On top of benchmarking typed strains, the technology was also tested with complex microbial communities, yielding complete metagenome-assembled genomes (MAGs) that were not achieved by short- or long-read assemblies alone. By directly reading DNA molecules without fragmentation, the study demonstrates that CycloneSEQ delivers long-read data with impressive length and accuracy, closing gaps that short-read technologies alone cannot bridge. Future work is expanding to real samples and fine-tuning the balance between short-read and long-read data for even faster, higher-quality assemblies.
  
  This evaluation refers to version 1 of the preprint
Reviewed by GigaByte

This article has 3 evaluations. Appears in 2 lists. Latest version Apr 25, 2025. Latest activity Apr 28, 2025.
Genome assembly and annotation of Acropora pulchra from Mo’orea French Polynesia

This article has 4 authors:
1. Trinity Conn
2. Jill Ashey
3. Ross Cunning
4. Hollie M. Putnam
This article has been curated by 1 group:
- Curated by GigaByte
  Editors Assessment:
  
  Acropora pulchra is a species of small-polyped stony coral in the family Acroporidae from the Indo-Pacific. This Data Release is the first study in stony corals to present the DNA methylome in tandem with a high-quality genome, assembled using PacBio long-read HiFi sequencing of an A. pulchra specimen from Mo’orea, French Polynesia. DNA methylation was also called and quantified from this single-molecule sequencing data, and additional short-read Illumina RNASeq data was used for gene annotation. This produced an assembly of 518 Mbp, with 174 scaffolds, a scaffold N50 of 17 Mbp, and 40,518 protein-coding genes called. Peer review requested some improved benchmarking, and it is impressive to see from the results that the genome assembly represents the most complete and contiguous stony coral genome assembly to date. As A. pulchra is an important indicator species, these data will hopefully serve as a resource to the coral and wider scientific community. Further quantification of the genome-wide methylation is needed to aid the study of epigenetics in non-model organisms, and specifically future analyses of methylation in coral.
  
  This evaluation refers to version 1 of the preprint
Reviewed by GigaByte

This article has 2 evaluations. Appears in 2 lists. Latest version Apr 10, 2025. Latest activity Apr 13, 2025.
CompactTree: a lightweight header-only C++ library and Python wrapper for ultra-large phylogenetics

This article has 1 author:
1. Niema Moshiri
This article has been curated by 1 group:
- Curated by GigaByte
  
  Editors Assessment:
  
  As volumes of viral and bacterial sequence data grow exponentially, the field of computational phylogenetics now demands resources to manage the burgeoning scale of this input data. This study introduces CompactTree, a C++ library designed for ultra-large phylogenetic trees with millions of tips. To address these scalability issues while maintaining ease of incorporation into external code bases, CompactTree is a header-only library that achieves enhanced performance through minimal dependencies, optimized node representation, and memory-efficient tree structure schemes, resulting in significantly reduced memory footprints and improved processing times. Peer review requested some more detail on the functionality and some real-world examples, demonstrating the current utility of the tool. Although it primarily supports the (text-based) Newick format, the increased extensibility and scalability hold promise for multiple biological and epidemiological applications supporting more complex formats such as Nexus and NeXML. The tool is open source (GPLv3 licensed) and available on GitHub: https://niema.net/CompactTree
  
  This evaluation refers to version 1 of the preprint
Reviewed by GigaByte

This article has 2 evaluations. Appears in 2 lists. Latest version Mar 7, 2025. Latest activity Mar 23, 2025.
Draft genome of the endangered visayan spotted deer (Rusa alfredi), a Philippine endemic species

This article has 8 authors:
1. Ma. Carmel F. Javier
2. Albert C. Noblezada
3. Persie Mark Q. Sienes
4. Robert S. Guino-o
5. Nadia Palomar-Abesamis
6. Maria Celia D. Malay
7. Carmelo S. del Castillo
8. Victor Marco Emmanuel N. Ferriols
This article has been curated by 1 group:
- Curated by GigaByte
  
  Editors Assessment:
  
  The Visayan spotted deer (Rusa alfredi) is a small, endangered, primarily nocturnal species of deer found in the rainforests of the Visayan Islands in the Philippines. The present study reports the first draft genome assembly for the species, addressing a critical gap in genomic data for this IUCN-redlisted cervid. Generated using Illumina sequencing, the resulting genome assembly spans 2.52 Gb, has a BUSCO completeness score of 95.5%, and encompasses 24,531 annotated genes. Phylogenetic analysis suggests a close evolutionary relationship between R. alfredi and Cervus species, indicating that the genus Rusa is sister to Cervus. Peer review teased out more benchmarking results and the annotation files, demonstrating that this genomic resource is useful and usable for advancing population genetics and evolutionary studies, thereby informing conservation strategies and enhancing breeding programs for the critically threatened species. Providing whole genome sequences for other native species of Rusa could further provide genomic resources for detecting hybrids, which will also help the management and monitoring of these species, especially for the reintroduction of captive populations into the wild.
  
  This evaluation refers to version 1 of the preprint
Reviewed by GigaByte

This article has 2 evaluations. Appears in 2 lists. Latest version Feb 24, 2025. Latest activity Mar 17, 2025.

