Inferring binding specificities of human transcription factors with the wisdom of crowds

Nikita Gryzunov
Dmitry Penzar
Vasilii Kamenets
Valery Vyaltsev
Ivan Kozin
Irina A. Eliseeva
Vladimir Nozdrin
Ilya E. Vorontsov
Sergey Bushuev
Vadim Strekalovskikh
Arsenii Zinkevich
Gregory Andrews
Matwej Bedarew
Ido Blass
Dmitry Frolov
Iuliia Lariushina
Jill Moore
Yaron Orenstein
German Roev
Danil Salimov
Noam Shimshoviz
Ido Tziony
Zhiping Weng
IBIS Consortium
GRECO-BIT/Codebook Consortium
Philipp Bucher
Bart Deplancke
Oriol Fornes
Jan Grau
Ivo Grosse
Arttu Jolma
Fedor A. Kolpakov
Vsevolod J. Makeev
Timothy R. Hughes
Ivan V. Kulakovskiy

Abstract

DNA motif discovery, and in particular the computational modeling of transcription factor binding motifs, has been a focal point of algorithmic bioinformatics for several decades. Here, we report the results of the largest open community challenge in Inferring BInding Specificities (IBIS), in which participants from all over the world were invited to construct binding specificity models from multi-assay experimental data for poorly studied human transcription factors. The submissions were rigorously tested against a rich held-out dataset. Benchmarking demonstrated a consistent advantage of properly designed deep learning models over traditional positional weight matrices and other machine learning methods. Yet the positional weight matrices performed surprisingly well out of the box, falling only slightly behind the best deep learning models. A post-challenge assessment of a selection of other deep learning methods further reinforced this finding. IBIS highlights the power of benchmarking in finding adequate DNA motif representations, emphasizes the pros and cons of various machine learning methods applied to DNA motif modeling, and establishes a rich dataset, benchmarking protocols, and a computational framework for fair cross-platform evaluation of future models of transcription factor binding motifs in DNA sequences.

Version published to 10.1101/2025.11.16.688692 on bioRxiv
Nov 17, 2025

