Exploring structural diversity across the protein universe with The Encyclopedia of Domains

Andy M. Lau
Nicola Bordin
Shaun M. Kandathil
Ian Sillitoe
Vaishali P. Waman
Jude Wells
Christine A. Orengo
David T. Jones

Abstract

The AlphaFold Protein Structure Database (AFDB) contains more than 214 million predicted protein structures composed of domains, which are independently folding units found in multiple structural and functional contexts. Identifying domains can enable many functional and evolutionary analyses but has remained challenging because of the sheer scale of the data. Using deep learning methods, we have detected and classified every domain in the AFDB, producing The Encyclopedia of Domains. We detected nearly 365 million domains, over 100 million more than can be found by sequence methods, covering more than 1 million taxa. Reassuringly, 77% of the nonredundant domains are similar to known superfamilies, greatly expanding representation of their domain space. We uncovered more than 10,000 new structural interactions between superfamilies and thousands of new folds across the fold space continuum.

Version published to 10.1126/science.adq4946, Nov 1, 2024
Version published to 10.1101/2024.03.18.585509 on bioRxiv, Mar 19, 2024
