Ultra-fast and highly sensitive protein structure alignment with segment-level representations and block-sparse optimization

Abstract

Deep learning models for protein structure prediction have driven explosive growth in 3D structure data. As a result, traditional methods for geometric structure alignment are too slow to search modern structure libraries effectively. In this study, we introduce SPfast, a fully geometric method for structure-based alignment that accelerates search by more than two orders of magnitude while increasing sensitivity by 21% and 5% compared with Foldseek and TM-align, respectively. We exploit this speed to conduct more than 100 billion pairwise comparisons between bona fide uncharacterized proteins and a large-scale annotated structure library, uncovering new biological insights into type III secretion in pathogenic bacteria and identifying novel toxin-antitoxin systems. The putative SPfast-based functional assignments are supported by orthogonal evidence, including shared genomic context and high-confidence AlphaFold3 complex modelling.
