Attentive-SPIDNA: Attention-based neural networks for population genetics

Théophile Sanchez
Pierre Jobic
Cyril Regan
Paul Verdu
Guillaume Charpiat
Flora Jay


Abstract

Artificial neural networks (ANNs) have recently offered new perspectives for solving inference problems from high-dimensional data in numerous scientific fields, but it is not yet clear which architectures are best suited to genomic data. Here, we present a new ANN architecture integrating attention mechanisms to infer effective population size history from genomic data. Built upon our previous exchangeable architecture SPIDNA, Attentive-SPIDNA adds attention layers that allow it to compute more expressive and complex features from combinations of haplotypes. The contribution of each haplotype to the features is learned automatically and depends on its content and its affinity with the other haplotypes. Likewise, we use this mechanism to automatically perform a voting scheme that aggregates predictions from different genomic regions. This new architecture outperforms approximate Bayesian computation and previously published neural networks while relying directly on raw genetic data and remaining invariant to haplotype permutation in the input. As a proof of concept, we use this architecture to infer the effective population size history of 54 populations from the HGDP dataset (Bergström et al., 2020). This application highlights the ability of the network to handle data with a varying number of haplotypes and to quickly perform predictions for datasets that include numerous populations. The proposed mechanism could therefore be integrated into various neural networks solving population genetics tasks.
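
To make the permutation-invariance property concrete, here is a minimal sketch of attention-based aggregation over haplotypes. It is not the Attentive-SPIDNA implementation: the function `attention_pool`, the toy binary haplotype vectors, and the absence of learned projections are all assumptions made for illustration. Each haplotype attends to every other haplotype through scaled dot-product affinities, and the attended features are then averaged, so the output does not depend on the order of the haplotypes and the function accepts any number of them.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_pool(haplotypes):
    """Self-attention over haplotype feature vectors, then mean pooling.

    Hypothetical simplification: no learned projections are used, so the
    attention weights depend only on pairwise dot products (affinities).
    """
    n, d = len(haplotypes), len(haplotypes[0])
    scale = math.sqrt(d)
    attended = []
    for h_i in haplotypes:
        # Weight of each haplotype h_j in the new features of h_i.
        weights = softmax([dot(h_i, h_j) / scale for h_j in haplotypes])
        attended.append([
            sum(w * h_j[k] for w, h_j in zip(weights, haplotypes))
            for k in range(d)
        ])
    # Averaging over haplotypes removes any dependence on their order.
    return [sum(row[k] for row in attended) / n for k in range(d)]


# Toy data (assumed): 5 haplotypes, each with 8 binary sites.
random.seed(0)
haps = [[float(random.randint(0, 1)) for _ in range(8)] for _ in range(5)]
shuffled = haps[:]
random.shuffle(shuffled)

pooled = attention_pool(haps)
pooled_shuffled = attention_pool(shuffled)
# The two results agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(pooled, pooled_shuffled))
```

In a trained network the affinities would come from learned query, key and value projections rather than raw dot products, but the invariance argument is the same: the attention weights are computed per pair of haplotypes, and the final average is symmetric in its inputs.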

Version published to 10.64898/2026.04.15.718687 on bioRxiv
Apr 18, 2026

