Characterising Protein Search Drift using exhaustive protein search and Alphafold2

Abstract

In this paper we present the first exhaustive analysis of iterative protein search drift and show how drift may affect downstream modelling. Assembling and extracting evolutionary information from families of related proteins is a core challenge in the study of molecular evolution. For instance, iterative protein search is a common first step in a wide variety of bioinformatics tools and pipelines, and the output of such searches often forms the input to modelling tools such as AlphaFold2. Here we characterise profile drift: the tendency for some searches to become contaminated with sequences from outside the intended evolutionary family. We observe that drift occurs in nearly 15% of searches and has measurable impacts on downstream predictive tasks such as structure prediction.