A genotype-phenotype transformer to assess and explain polygenic risk

Abstract

Genome-wide association studies have linked millions of genetic variants to biomedical phenotypes, but their utility has been limited by a lack of mechanistic understanding and by widespread epistatic interactions. Recently, Transformer models have emerged as a powerful machine learning architecture with potential to address these and other challenges. Accordingly, here we introduce the Genotype-to-Phenotype Transformer (G2PT), a framework for modeling hierarchical information flow among variants, genes, multigenic systems, and phenotypes. As proof-of-concept, we use G2PT to model the genetics of TG/HDL (the ratio of triglycerides to high-density lipoprotein cholesterol), an indicator of metabolic health. G2PT predicts this trait via attention to 1,395 variants underlying at least 20 systems, including immune response and cholesterol transport, with accuracy exceeding that of state-of-the-art methods. It implicates 40 epistatic interactions, including epistasis between APOA4 and CETP in phospholipid transfer, a target pathway for cholesterol modification. This work positions hierarchical graph transformers as a next-generation approach to polygenic risk.
