G-SPRI: A Structure-Centric Graph Model for Comprehensive Prediction of Cancer Driver Events from Missense Mutations

Abstract

In silico approaches for predicting the functional impact of missense mutations are critical for interpreting personal genomes and identifying disease-related biomarkers. Existing methods rely largely on sequence-based information or intuitive structural features, and often overlook the complex biophysical patterns encoded in protein 3D structures. Here, we present G-SPRI, a multilevel framework built on a novel alpha-shape protein graph that accurately captures residue connectivity from atomic-resolution geometry and enables precise message passing around mutation sites. Using this graph representation, G-SPRI integrates wild-type structural properties and mutation-specific perturbation signals derived from the Protein Data Bank (PDB) universe to support graph-based learning for distinguishing pathogenic from benign missense variants. G-SPRI performs strongly across multiple key tasks. On the binary prediction benchmark, G-SPRI delivers improved pathogenicity prediction for individual mutations. By integrating mutation recurrence across the pan-cancer cohort, G-SPRI recovers more known cancer driver genes from more than 2.3 million mutations than state-of-the-art methods do. Furthermore, by jointly quantifying site-specific pathogenicity and co-clustering influence within higher-order structural organization units, G-SPRI provides comprehensive evidence for pinpointing likely driver mutations and structurally susceptible regions within disease genes.
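To make the idea of message passing around a mutation site concrete, the following is a minimal sketch and not the G-SPRI implementation. It assumes a plain distance-cutoff contact graph as a simplified stand-in for the alpha-shape graph described above, and a single mean-aggregation update rule. The function names `contact_graph` and `message_pass`, the 8 angstrom cutoff, the 0.5 mixing weight, and the toy coordinates are all hypothetical choices made for illustration.

```python
import math


def contact_graph(coords, cutoff=8.0):
    """Connect residues whose representative atoms lie within `cutoff` angstroms.

    Assumption: a plain distance cutoff, used here as a simplified stand-in.
    It is NOT the alpha-shape construction named in the abstract.
    """
    n = len(coords)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def message_pass(neighbors, features, rounds=1):
    """Mix each node's scalar feature with the mean of its neighbors' features."""
    current = dict(features)
    for _ in range(rounds):
        updated = {}
        for node, nbrs in neighbors.items():
            incoming = sum(current[k] for k in nbrs) / len(nbrs) if nbrs else 0.0
            # Hypothetical update rule: equal weight on self and neighborhood.
            updated[node] = 0.5 * current[node] + 0.5 * incoming
        current = updated
    return current


# Toy data (hypothetical): four residues spaced 5 angstroms apart on the x axis.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
graph = contact_graph(coords, cutoff=8.0)

# A perturbation signal of 1.0 at the mutated residue (index 1), 0.0 elsewhere.
signal = {0: 0.0, 1: 1.0, 2: 0.0, 3: 0.0}
spread = message_pass(graph, signal, rounds=1)
print(spread)
```

After one round the signal reaches only the direct neighbors of the mutated residue, which is the sense in which the graph topology determines how far a mutation's influence propagates. A graph derived from alpha shapes would change which residues count as neighbors, not the propagation step itself.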
