tidygenclust: Clustering for Population Genetics in R

Eirlys E. Tysall
Anahit Hovhannisyan
Evelyn J. Carter
Cecilia Padilla-Iglesias
Margherita Colucci
Andrea Vittorio Pozzi
Michela Leonardi
Aramish Fatima
Ondrej Pelanek
Nile P. Stephenson
Andrea Manica


Abstract

Background

Population structure analysis is crucial for evolutionary research and medical genomics. Clustering methods, broadly categorized as model-based (e.g. ADMIXTURE) or non-model-based (e.g. SCOPE), differ in their methodology and computational efficiency. Recently, fastmixture, a model-based approach, has improved scalability and performance, while replicate alignment tools such as Clumppling extend previous methods by also aligning the modes across K values. However, all existing tools are standalone programs that generate numerous untracked text files and offer limited plot customisation.
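To illustrate why replicate alignment is needed: independent clustering runs can return the same clusters under different column labels, so their ancestry (Q) matrices must be relabelled before they can be compared or plotted together. The following is a minimal sketch of that idea, assuming small K and using a brute-force search over column permutations. It is not Clumppling's algorithm, and the names `align_columns`, `run_a` and `run_b` are hypothetical.

```python
from itertools import permutations


def align_columns(reference, replicate):
    """Reorder the cluster columns of `replicate` to best match `reference`.

    Both arguments are Q matrices given as lists of rows (one row per
    individual, one column per cluster). Every permutation of the columns
    is tried, so this is only practical for small K.
    """
    k = len(reference[0])
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(k)):
        # Sum of squared differences after relabelling the replicate's clusters
        cost = sum(
            (ref_row[j] - rep_row[perm[j]]) ** 2
            for ref_row, rep_row in zip(reference, replicate)
            for j in range(k)
        )
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    # Apply the best relabelling to every individual's ancestry proportions
    aligned = [[row[p] for p in best_perm] for row in replicate]
    return best_perm, aligned


# Two toy runs at K = 3 describing the same structure with shuffled labels
run_a = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]]
run_b = [[0.1, 0.0, 0.9], [0.7, 0.2, 0.1], [0.2, 0.8, 0.0]]

perm, aligned_b = align_columns(run_a, run_b)
print(perm)       # column of run_b assigned to each column of run_a
print(aligned_b)  # run_b with its columns reordered to match run_a
```

For larger K, an assignment solver would replace the exhaustive search; the toy data here are invented purely for illustration.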

Results

We introduce an R package, tidygenclust, which brings the functionality of the original ADMIXTURE, fastmixture and Clumppling software into R, enabling a streamlined and integrated workflow. By integrating with tidypopgen, a package designed to handle large SNP datasets, these tools preserve metadata, simplify data handling, and return results as customisable ggplot2 objects for flexible visualisation.

Conclusions

The R package tidygenclust advances population genetic analysis by combining computational efficiency with reproducible workflows and user-friendly plotting. The source code and instructions are available at https://github.com/EvolEcolGroup/tidygenclust.

Version published to 10.1101/2025.07.29.667403 on bioRxiv
Jul 31, 2025

Related articles

TaxoFlow: The Tutorial. An Educational Nextflow Pipeline for Metagenomics Taxonomic Profiling

This article has 2 authors:
1. Jeferyd Yepes-García
2. Laurent Falquet
This article has no evaluations. Latest version: Dec 22, 2025

Understanding Pathways in Bioinformatics, Genomics, and Health Applications

This article has 1 author:
1. Diptarup Mallick
This article has no evaluations. Latest version: Jan 19, 2026

MiCoReCa (Microbiome Community Resource Catalogue) - Towards Centralized Curation And Integration Of Microbiome Bioinformatics Resources

This article has 8 authors:
1. Vivek Ashokan
2. Clara Emery
3. Agnès Barnabé
4. Valentin Loux
5. Christina Pavloudi
6. Paul Zierep
7. Nikolaos Strepis
8. Bérénice Batut
This article has no evaluations. Latest version: Jan 6, 2026
