ELEN – Predicting Loop Quality in Protein Structure Models
Abstract
Typically, sequences designed de novo are assessed in silico using deep learning-based protein structure prediction methods prior to wet-lab testing. While these deep learning (DL) models excel at predicting well-ordered regions, accurate prediction of loop regions, which are often flexible and crucial for protein function, remains a significant challenge. To address this, we introduce the Equivariant Loop Evaluation Network (ELEN), a local model quality assessment (MQA) method tailored to evaluating the accuracy of protein loops at the per-residue level. ELEN jointly predicts three quality metrics, each defined by comparing a predicted structure with its experimental reference: local Distance Difference Test (lDDT), Contact Area Difference Score (CAD-score), and Root Mean Squared Deviation (RMSD). Learning these metrics simultaneously enables ELEN to capture complementary structural insights, providing a richer assessment of model accuracy. The network operates at all-atom resolution and employs 3D equivariant group convolutions to learn the local geometric environment of each atom. By incorporating sequence embeddings from large language models (LLMs), such as SaProt, we enhance the sequence and evolutionary awareness of the model. Furthermore, by informing ELEN with per-residue physicochemical features, the model achieves competitive accuracy relative to state-of-the-art MQA methods on the Continuous Automated Model EvaluatiOn (CAMEO) benchmark. Although ELEN was primarily developed for assessing loop quality, its architecture also demonstrates strong potential for general MQA tasks. We used ELEN to analyze three sets of redesigned enzymes in detail, including identifying flexible or disordered regions and assessing the structural effects of single-residue mutations. We show that for all three sets ELEN successfully identifies poor design positions and thus serves as a powerful tool for advancing both the study and modeling of loops in protein structures.
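To make the per-residue lDDT target concrete, the following is a minimal sketch of how such a label could be computed from a predicted and a reference structure. It is not taken from ELEN: it assumes a simplified Cα-only variant (the standard lDDT is all-atom), and the function name `per_residue_lddt` and its parameters are hypothetical. The 15 Å inclusion radius and the 0.5, 1, 2, and 4 Å tolerance thresholds follow the usual lDDT definition.

```python
import numpy as np

def per_residue_lddt(pred, ref, cutoff=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Simplified C-alpha-only lDDT per residue (illustrative, not ELEN's code).

    pred, ref: (N, 3) arrays of C-alpha coordinates with matching residue order.
    Returns an (N,) array of scores in [0, 1]; NaN where a residue has no
    reference neighbours within the inclusion radius.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)

    # Pairwise distance matrices; lDDT needs no superposition of the structures.
    d_pred = np.linalg.norm(pred[:, None, :] - pred[None, :, :], axis=-1)
    d_ref = np.linalg.norm(ref[:, None, :] - ref[None, :, :], axis=-1)

    # Consider only pairs that are close in the reference, excluding self-pairs.
    mask = (d_ref < cutoff) & ~np.eye(len(ref), dtype=bool)

    # For each pair, the fraction of thresholds under which the distance is preserved.
    diff = np.abs(d_pred - d_ref)
    preserved = np.mean([diff < t for t in thresholds], axis=0)

    counts = mask.sum(axis=1)
    scores = (preserved * mask).sum(axis=1) / np.maximum(counts, 1)
    return np.where(counts > 0, scores, np.nan)

# Toy usage: a straight five-residue chain with 3.8 A spacing,
# where the last residue is displaced in the "prediction".
ref = np.array([[3.8 * i, 0.0, 0.0] for i in range(5)])
pred = ref.copy()
pred[4, 1] += 10.0
scores = per_residue_lddt(pred, ref)
```

Because the score depends only on internal distances, it is unchanged by rigid-body rotation or translation of the predicted structure, which is one reason it is a convenient local quality measure alongside RMSD, which requires superposition.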