SAGA (Simplified Association Genome-wide Analyses): A User-Friendly Pipeline to Democratize Genome-Wide Association Studies

Abstract

Genome-wide association studies (GWAS) have enabled clinicians and researchers to identify genetic variants linked to complex traits and diseases (1–3). However, GWAS still face several challenges, particularly regarding accessibility and reproducibility (4–6). Conducting these analyses often requires substantial bioinformatics expertise for data preprocessing, software installation, and scripting (7–10). We therefore developed SAGA (“Simplified Association Genome-wide Analyses”), a BASH-based, open-source, fully automated pipeline that integrates three widely adopted tools, PLINK (11), GMMAT (12), and SAIGE (13), for accessible, robust, and reproducible GWAS. After installation, users only need to provide genotype and phenotype files in standard formats. The pipeline automates preprocessing, association testing, and visualization, and outputs summary statistics, Manhattan plots, and quantile-quantile plots. SAGA thus enables robust GWAS for users without scripting experience, expanding access to complex genetic analyses.
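
To make the visualization outputs concrete, the following is a minimal Python sketch of what a quantile-quantile plot and a Manhattan-plot significance line encode. It is not SAGA's code: the function names `qq_points` and `significant_hits`, the dictionary layout of the summary statistics, and the conventional genome-wide threshold of 5e-8 are assumptions made for illustration only.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs for a QQ plot.

    Expected values come from uniform quantiles (i - 0.5) / n, which is
    the distribution of p-values when no variant is associated.
    """
    n = len(p_values)
    observed = sorted(-math.log10(p) for p in p_values)
    expected = sorted(-math.log10((i - 0.5) / n) for i in range(1, n + 1))
    return list(zip(expected, observed))


def significant_hits(summary, threshold=5e-8):
    """Keep variants whose p-value falls below the threshold.

    `summary` is a hypothetical list of dicts with a "p" key; the 5e-8
    default is the conventional genome-wide line, assumed here.
    """
    return [row for row in summary if row["p"] < threshold]


if __name__ == "__main__":
    toy = [
        {"snp": "rs_example_1", "p": 1e-9},
        {"snp": "rs_example_2", "p": 0.2},
        {"snp": "rs_example_3", "p": 0.7},
    ]
    for expected, observed in qq_points([row["p"] for row in toy]):
        print(f"expected={expected:.3f} observed={observed:.3f}")
    print([row["snp"] for row in significant_hits(toy)])
```

Points that rise above the diagonal (observed greater than expected) in the upper tail of a QQ plot suggest either true associations or inflation, for example from population structure; this is one reason mixed-model tools such as GMMAT and SAIGE are commonly used alongside PLINK.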
