Localized Reactivity on Proteins as Riemannian Manifolds: A Quantum-Inspired Geometric Model for Deterministic, Metal-Aware Reactive-Site Prediction
Abstract
We introduce a unified framework for analysing molecular reactivity based on a geometric, quantum-inspired environment representation and a fully deterministic, metal-aware implementation. Proteins and ribonucleoprotein complexes are treated as configurations in ℝ³ × T, and each residue or nucleotide p is mapped to an environment vector E_p that encodes a coarse-grained, DFT-inspired density surrogate together with metal/phosphate fields, solvent exposure and local geometry.
A block-streamed, GPU-optional Python pipeline maps arbitrary PDB/mmCIF structures to fixed-dimensional environment vectors without stochastic training and scales to supramolecular assemblies: the 6Q97 tmRNA–SmpB–ribosome rescue complex (11,618 residues) can be processed in a single pass on commodity cloud hardware, demonstrating practical feasibility at ribosome scale. In a strict unbound, zero-shot setting on the Docking Benchmark 5.5 (DB5.5), a simple classifier trained on top of E_p achieves a macro-averaged area under the precision–recall curve of ~0.53 and a ROC–AUC of ~0.86 for residue-level interface vs. non-interface classification, competitive with specialised interface-prediction architectures despite using no evolutionary profiles, MSAs or task-specific retraining on DB5.5.
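As an illustration of the metrics quoted above, the following minimal sketch computes a macro-averaged ROC–AUC and precision–recall area over a set of targets. It assumes that "macro-averaged" means an unweighted mean of per-complex scores, which the abstract does not state. The function names and the toy data are hypothetical and are not taken from the authors' pipeline.

```python
from statistics import mean


def roc_auc(labels, scores):
    """ROC-AUC via the Mann-Whitney rank statistic; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative residue")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    Tied scores are ranked in input order, so heavy ties can shift the value.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive residue")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos


def macro_average(per_target, metric):
    """Unweighted mean of a metric over targets (assumed meaning of 'macro')."""
    return mean(metric(y, s) for y, s in per_target.values())


# Hypothetical toy data: target name -> (interface labels, classifier scores).
toy = {
    "complex_A": ([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]),
    "complex_B": ([0, 1], [0.2, 0.6]),
}
macro_roc = macro_average(toy, roc_auc)
macro_ap = macro_average(toy, average_precision)
```

Averaging per target, rather than pooling all residues, keeps large complexes from dominating the score; whether the reported figures were computed this way is an assumption of this sketch.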
Across mechanistically curated case studies (Rubisco, GroEL/GroES, SecA, p53–DNA and ribosomal pockets), untuned Random Forests used purely as probes under site-grouped cross-validation yield ROC–AUC values exceeding 0.95 for catalytic and anchor cores (e.g., SecA ATPase, GroES IVL), while diffuse regulatory and fitness-defined labels are substantially harder to separate. For 6Q97, a Tier 1/Tier 2 labelling scheme over tmRNA/SmpB pockets, decoding-centre rRNA, the 23S peptidyl transferase centre and helicase-like uS3/uS4/uS5 pockets, together with a curated hard-negative panel of 323 buried hydrophobic, electrostatic and stacking decoys, yields global AUCs of ~0.94 (Tier 1+2 vs. all) and ~0.98 (Tier 1+2 vs. hard negatives). These results support the view that the environment representation defines an interpretable “reactivity manifold” in which genuinely functional pockets occupy regions that cannot be mimicked by generic dense or charged environments, and that this structure remains accessible even for full ribosomes on modest hardware.
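The site-grouped cross-validation mentioned above can be pictured with the following minimal sketch, in which all residues sharing a site label are held out together so that no site contributes to both training and test folds. The greedy balancing rule, the function name and the site labels are assumptions made for illustration. The abstract does not describe the authors' actual splitting procedure.

```python
from collections import defaultdict


def site_grouped_folds(site_ids, n_folds):
    """Return (train_indices, test_indices) pairs with whole sites held out.

    site_ids[i] is the (hypothetical) site label of residue i.
    """
    by_site = defaultdict(list)
    for idx, site in enumerate(site_ids):
        by_site[site].append(idx)
    if n_folds > len(by_site):
        raise ValueError("more folds than distinct sites")

    # Greedy balancing: place the largest sites first into the smallest fold.
    folds = [[] for _ in range(n_folds)]
    for site in sorted(by_site, key=lambda s: (-len(by_site[s]), str(s))):
        smallest = min(range(n_folds), key=lambda k: len(folds[k]))
        folds[smallest].extend(by_site[site])

    splits = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in range(len(site_ids)) if i not in held_out]
        splits.append((train, sorted(fold)))
    return splits


# Hypothetical site labels for eight residues.
sites = ["secA_atp"] * 3 + ["groES_ivl"] * 2 + ["decoy_1"] * 2 + ["decoy_2"]
splits = site_grouped_folds(sites, n_folds=2)
```

Grouping by site guards against the optimistic bias that arises when neighbouring residues of one pocket land on both sides of a split. In practice, scikit-learn's `GroupKFold` offers a comparable group-aware splitter.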