FiGS-MoD: Feature-informed Gibbs Sampling Motif Discovery Algorithm for Mapping Human Signaling Networks

Abstract

Motivation

Short linear motifs (SLiMs) are short sequence patterns that mediate transient protein-protein interactions, often within disordered regions of proteins. SLiMs play central roles in signaling, trafficking, and post-translational regulation, but their short length and low complexity make them difficult to identify both experimentally and computationally. Since the latest release of motif discovery tools like MEME Suite, the availability of protein-protein interaction data (e.g., BioGRID) has increased by more than fivefold, providing richer network contexts where SLiMs can be inferred from recurring patterns of interaction. Combined with recent advances in machine learning, this creates new opportunities for large-scale, high-resolution motif discovery.

Results

We present FiGS-MoD, a Feature-informed Gibbs Sampling Motif Discovery algorithm with two key innovations: (i) incorporating biased sampling informed by residue-level features, including Protein Language Model (PLM) embeddings, AlphaFold2-derived disorder and solvent accessibility, and evolutionary conservation, and (ii) replacing the traditional position-specific scoring matrix (PSSM) with a Hidden Markov model (HMM) to accommodate insertions and deletions. We applied our algorithm to 12,765 sub-networks from the human interactome and provided 221,840 human SLiM predictions and quality scores as a public resource, along with the tool itself. Our method outperformed MEME in terms of recovering known motifs from the Eukaryotic Linear Motif (ELM) database and phosphosites from PhosphoGRID. Through three case studies, we further highlight the biological relevance of our results and the generalizability of the method to diverse motif classes.
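
To illustrate the idea behind innovation (i), the following is a minimal sketch of a Gibbs site sampler whose sampling step is biased by residue-level features. It is not the FiGS-MoD implementation: it uses a plain gapless PSSM instead of the HMM described above, and it assumes a single hypothetical per-residue score in [0, 1] standing in for the PLM, disorder, accessibility, and conservation features. All function names, parameters, and the way the feature score enters the sampling weight are illustrative assumptions.

```python
import math
import random

# Assumption: sequences contain only the 20 standard amino acids.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(sequences, starts, width, skip, pseudocount=1.0):
    """Column-wise residue frequencies from every current site except sequence `skip`."""
    counts = [{a: pseudocount for a in ALPHABET} for _ in range(width)]
    for i, (seq, start) in enumerate(zip(sequences, starts)):
        if i == skip:
            continue
        for j in range(width):
            counts[j][seq[start + j]] += 1.0
    pssm = []
    for col in counts:
        total = sum(col.values())
        pssm.append({a: c / total for a, c in col.items()})
    return pssm


def site_weights(seq, feats, pssm, width, background, eps=1e-6):
    """Unnormalised sampling weight for each candidate start position.

    weight = PSSM likelihood ratio * mean feature score of the window.
    The multiplicative feature prior is an assumption made for this sketch.
    """
    weights = []
    for s in range(len(seq) - width + 1):
        log_odds = sum(
            math.log(pssm[j][seq[s + j]] / background) for j in range(width)
        )
        prior = eps + sum(feats[s:s + width]) / width  # eps keeps weights positive
        weights.append(math.exp(log_odds) * prior)
    return weights


def gibbs_motif_search(sequences, features, width, iterations=200, seed=0):
    """Return one sampled motif start per sequence after `iterations` sweeps."""
    rng = random.Random(seed)
    background = 1.0 / len(ALPHABET)  # uniform background, for simplicity
    starts = [rng.randrange(len(s) - width + 1) for s in sequences]
    for _ in range(iterations):
        for i, seq in enumerate(sequences):
            # Hold out sequence i, rebuild the model, then resample its site.
            pssm = build_pssm(sequences, starts, width, skip=i)
            w = site_weights(seq, features[i], pssm, width, background)
            starts[i] = rng.choices(range(len(w)), weights=w, k=1)[0]
    return starts


if __name__ == "__main__":
    seqs = ["MKTAYPPLPQRS", "GGSPPLPAAKLE", "TTDEPPLPWWHN"]
    # Hypothetical feature scores: uniform here, so sampling is PSSM-driven only.
    feats = [[1.0] * len(s) for s in seqs]
    print(gibbs_motif_search(seqs, feats, width=4))
```

In this sketch, windows with low feature scores (for example, buried or ordered residues) receive proportionally lower sampling weight, so the sampler spends more of its iterations on regions where SLiMs are more plausible. The sampler is stochastic, so the returned positions depend on the seed and the number of sweeps.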

Availability and Implementation

Source code and a predicted SLiM dataset using FiGS-MoD are freely available at https://github.com/Eric3939/FiGS-MoD
