MULTICLUST – Fast Multinomial Clustering of multiallelic genotypes to infer genetic population structure

Abstract

Identifying population structure from multilocus genotype data is key to downstream genetic analyses, including genome-wide association studies (GWAS), genetic genealogy and phylogenetics, in the fields of conservation, forensics, evolution and more. While inference of population structure has been dominated by Bayesian methods, maximum likelihood methods offer some benefits in terms of reliability, consistency and efficiency. Here we extend the methods of Tang et al. [53] and Alexander et al. [2] to handle multi-allelic (e.g., SNP, STR, allozyme) and polyploid loci with missing data, in order to infer genetic admixture proportions and subpopulation allele frequencies. Comparative analyses of our method, MULTICLUST, and STRUCTURE [42] on both simulated and empirical data indicate that MULTICLUST produces comparable, fast, reproducible, and accurate estimates of population admixture proportions and allele frequencies. MULTICLUST is implemented in the C programming language and is publicly accessible via www.github.com/arunsethuraman/multiclust .
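
For orientation, the objective that maximum likelihood admixture methods of this kind typically maximize can be sketched as below. This is the standard multinomial admixture log-likelihood written in notation assumed here for illustration, not taken from the article: $I$ individuals, $L$ loci, $A_l$ alleles at locus $l$, and $K$ subpopulations; $n_{ila}$ is the number of copies of allele $a$ observed at locus $l$ in individual $i$; $\eta_{ik}$ is the admixture proportion of individual $i$ from subpopulation $k$; and $p_{kla}$ is the frequency of allele $a$ at locus $l$ in subpopulation $k$.

```latex
\ell(\eta, p) \;=\; \sum_{i=1}^{I} \sum_{l=1}^{L} \sum_{a=1}^{A_l}
    n_{ila} \, \log\!\Bigl( \sum_{k=1}^{K} \eta_{ik} \, p_{kla} \Bigr),
\qquad
\sum_{k=1}^{K} \eta_{ik} = 1, \quad
\sum_{a=1}^{A_l} p_{kla} = 1, \quad
\eta_{ik} \ge 0, \; p_{kla} \ge 0 .
```

Under this formulation, ploidy enters only through the counts (for an individual with $m$ observed allele copies at locus $l$, $\sum_a n_{ila} = m$), and a missing allele copy simply contributes no count, so it drops out of the sum. Whether MULTICLUST parameterizes the model exactly this way is an assumption.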
