Benchmarking protein sequence and structure search methods for remote homology detection

Yuan Liu
Yingquan Zhou
Yan Huang
Hongyi Xin
Xiaoyong Pan
Hong-Bin Shen

Abstract

Background: Similarity-based search over protein sequences and structures underpins protein annotation, evolutionary analysis, large-scale functional inference, and the exploration of the protein “dark space”. The rapid growth of sequence and predicted structure databases has spurred diverse search methods, yet their evaluation remains limited to fold-level similarity and relies on inconsistent benchmarking protocols.

Results: We present a unified benchmark for protein sequence and structure search. Using this framework, we evaluate 13 representative methods spanning sequence alignment, structure alignment, and representation-based approaches across multiple biologically relevant scenarios. Our results show pronounced, context-dependent differences among methods. Structure alignment methods excel at detecting fold-level and geometric similarity, while representation-based search approaches are better at capturing functional similarity under low sequence identity and are more robust to predicted structures. Notably, all evaluated methods show limited effectiveness on intrinsically disordered proteins.

Conclusions: This benchmark establishes a standardized framework for evaluating protein similarity search methods, providing a practical resource for method selection and a foundation for developing next-generation approaches that can address diverse homology search challenges.

Version published to 10.21203/rs.3.rs-8796067/v1 on Research Square
Feb 24, 2026

Related articles

Metagenomic-scale analysis of the predicted protein structure universe

This article has 11 authors:
1. Martin Steinegger
2. Jingi Yeo
3. Yewon Han
4. Nicola Bordin
5. Andy Lau
6. Shaun Kandathil
7. Hyunbin Kim
8. Eli Levy Karin
9. Milot Mirdita
10. David Jones
11. Christine Orengo
This article has no evaluations
Latest version Mar 31, 2026

Alignment of RNA Secondary Structures with Arbitrary Pseudoknots using Structural Sequences

This article has 4 authors:
1. Luca Tesei
2. Francesca Levi
3. Michela Quadrini
4. Emanuela Merelli
This article has no evaluations
Latest version Mar 23, 2026

Multi-Functional Peptide Discover with Amino Acid-level Fusion of Sequence Information and Structure Feature

This article has 5 authors:
1. Zhiqiang Liang
2. Qiyu Wang
3. Xuhui Liao
4. Liwei Xiao
5. Junjie Chen
This article has no evaluations
Latest version Apr 6, 2026
