Integrating AlphaFold2 Models and Clinical Data to Improve the Assessment of Short Linear Motifs (SLiMs) and Their Variants’ Pathogenicity
Abstract
Short Linear Motifs (SLiMs) are functionally relevant protein regions that mediate reversible protein-protein interactions. Variants that disrupt SLiMs can lead to numerous Mendelian diseases. Although various bioinformatic tools have been developed to identify SLiMs, most suffer from low specificity. In our previous work, we demonstrated that integrating sequence variant information with structural analysis can enhance the prediction of true functional SLiMs while simultaneously generating tolerance matrices that indicate, for each motif position, whether each of the 19 possible single amino acid substitutions (SASs) is tolerated. However, the scarcity of representative crystallographic structures of SLiM-receptor complexes posed a significant limitation. In this study, we demonstrate that these interactions can be modeled using AlphaFold2 (AF2) to generate high-quality structures that serve as input for our MotSASi method. These AF2-derived structures show robust performance, both in reproducing known structures deposited in the Protein Data Bank (PDB) and in reflecting the deleterious effects of known sequence variants. This updated version of MotSASi expands the repertoire of high-confidence predicted SLiMs and provides a comprehensive catalog of variants located within SLiMs, along with their respective deleteriousness assessments. When compared to AlphaMissense, MotSASi demonstrates superior performance in predicting variant deleteriousness. By contributing to the accurate identification and interpretation of variants, this work aligns with ACMG/AMP standards and aims to improve diagnostic rates in clinical genomics.
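To make the idea of a tolerance matrix concrete, the following is a minimal sketch of one possible data structure: one row per motif position, each holding a call for the 19 possible substitutions at that position. It is not the MotSASi implementation. The function names, the `G2A`-style variant notation, and the example calls for the `RGD` motif are all assumptions chosen for illustration, and the tolerated/not-tolerated values are invented rather than taken from the study.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def empty_tolerance_matrix(motif):
    """Return one row per motif position; each row maps the 19 possible
    substitutes to None (no call yet), True (tolerated) or False (not)."""
    for residue in motif:
        if residue not in AMINO_ACIDS:
            raise ValueError(f"non-standard residue: {residue!r}")
    return [
        {sub: None for sub in AMINO_ACIDS if sub != wild_type}
        for wild_type in motif
    ]


def parse_sas(variant):
    """Split a variant such as 'G2A' into (wild type, 0-based index, substitute).
    The notation is an assumption made for this sketch."""
    return variant[0], int(variant[1:-1]) - 1, variant[-1]


def record(matrix, motif, variant, tolerated):
    """Store a tolerance call after checking it matches the motif sequence."""
    wild_type, index, substitute = parse_sas(variant)
    if motif[index] != wild_type:
        raise ValueError(f"{variant}: motif has {motif[index]} at that position")
    if substitute not in matrix[index]:
        raise ValueError(f"{variant}: not a valid substitution")
    matrix[index][substitute] = tolerated


def is_tolerated(matrix, variant):
    """Look up a call; None means the matrix holds no information."""
    _, index, substitute = parse_sas(variant)
    return matrix[index][substitute]


# Illustrative usage with invented calls (not results from the study).
motif = "RGD"
matrix = empty_tolerance_matrix(motif)
record(matrix, motif, "G2A", False)
record(matrix, motif, "R1K", True)
```

Using a three-valued entry (`None`, `True`, `False`) keeps "no evidence" distinct from "not tolerated", which matters when the calls are later used for variant interpretation.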
Author Summary
Proteins interact with each other in highly specific ways to carry out vital biological functions. Short Linear Motifs (SLiMs) are small regions within proteins that mediate many of these reversible interactions. Changes in SLiMs can disrupt these interactions and lead to severe genetic disorders. Identifying SLiMs accurately has been a longstanding challenge, as many computational tools suffer from low specificity. Previously, we developed a method, MotSASi, that combines sequence variation data and structural analysis to improve SLiM prediction and assess the impact of single amino acid substitutions (SASs). However, the lack of available structural data limited its application. In this study, we demonstrate that structures generated using AlphaFold2 (AF2) can overcome this limitation. These high-quality AF2 models reliably reproduce known structures and capture the harmful effects of sequence variations. By integrating AF2 models, the updated MotSASi method identifies more high-confidence SLiMs and provides detailed assessments of the variants within them. MotSASi outperforms existing tools, such as AlphaMissense, in predicting the impact of genetic variants, offering insights aligned with clinical standards. This advancement can aid in understanding disease mechanisms and improving genetic diagnostics in clinical genomics.