CONSTRUCT: an algorithmic tool for identifying functional or structurally important regions in protein tertiary structure

Abstract

Motivation

Evolutionary rates in protein-coding genes vary widely, reflecting functional and/or structural constraints. Essential or highly expressed proteins tend to evolve more slowly, and within a protein, different amino acid sites experience distinct selective pressures. Accurately modeling this variation is critical for identifying functionally and/or structurally important amino acid sites. Standard methods assume that substitution rates are independent across sites, so the most conserved sites they report are often scattered throughout the protein tertiary structure. This is biologically unrealistic, as functional sites tend to cluster in 3D space.

Results

Here, we developed CONSTRUCT, an improved strategy for detecting functionally and structurally important regions in protein tertiary structure. Given a set of orthologous sequences, CONSTRUCT first estimates site-specific substitution rates using the Rate4Site model. Each rate is then weighted by the rates of neighboring amino acid sites within a spatial window whose size is chosen to give the strongest spatial correlation. To refine cluster detection, CONSTRUCT can represent each amino acid site by either its Cα atom or its center of mass, the latter accounting for side-chain orientation. Extensive simulations and validation on 14 functionally characterized proteins of diverse sizes, interspecies conservation levels, and taxonomic origins demonstrated the robustness of CONSTRUCT. The results highlight CONSTRUCT as a powerful tool for guiding site-directed mutagenesis experiments aimed at elucidating protein function.
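
To make the neighbor-weighting step in the Results more concrete, the following is a minimal Python sketch of the general idea, not the CONSTRUCT implementation. It assumes that per-site rates (for example, from Rate4Site) and one 3D coordinate per site (a Cα atom or a center of mass) are already available. The function names `neighbor_means`, `pearson`, and `pick_window` are hypothetical, and so are the unweighted average and the use of a Pearson correlation between each site's rate and the mean rate of its neighbors as the window-selection criterion; the abstract does not specify the actual weighting scheme or correlation statistic.

```python
import math

def neighbor_means(coords, rates, window, include_self=True):
    """Mean rate of the sites within `window` (same units as coords) of each site.

    Assumption: a plain unweighted average; the real weighting may differ.
    Returns None for a site that has no qualifying neighbor.
    """
    means = []
    for i, ci in enumerate(coords):
        picked = [rates[j] for j, cj in enumerate(coords)
                  if (include_self or j != i) and math.dist(ci, cj) <= window]
        means.append(sum(picked) / len(picked) if picked else None)
    return means

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)

def pick_window(coords, rates, candidates):
    """Return the candidate window with the strongest rate/neighbor-rate correlation.

    Assumption: "spatial correlation" is approximated here by correlating each
    site's own rate with the mean rate of its neighbors (self excluded).
    """
    def score(window):
        means = neighbor_means(coords, rates, window, include_self=False)
        pairs = [(r, m) for r, m in zip(rates, means) if m is not None]
        if len(pairs) < 2:
            return float("-inf")
        return pearson([p[0] for p in pairs], [p[1] for p in pairs])
    return max(candidates, key=score)

# Toy data (invented for illustration): two spatially separated groups of
# sites, one slowly evolving and one fast evolving.
coords = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (20, 0, 0), (21, 0, 0), (20, 1, 0)]
rates = [0.1, 0.2, 0.15, 2.0, 2.2, 1.9]

window = pick_window(coords, rates, candidates=[2.0, 50.0])
smoothed = neighbor_means(coords, rates, window)
print(window, smoothed)
```

On this toy input, the small window keeps the two groups apart, so each smoothed rate reflects only its own spatial cluster; a window large enough to span the whole structure averages both groups together and loses that signal.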

Availability and implementation

The CONSTRUCT program and documentation are freely available at https://github.com/Rcoppee/CONSTRUCT.
