BravizineDSE: A Multi-Input Deep Transformer for Receptor-Context-Aware GPCR Signaling Bias Prediction and Derivation of a Six-Feature Molecular Selectivity Rule
Abstract
G protein-coupled receptors (GPCRs) transduce signals through at least two structurally and functionally distinct pathways, G-protein activation and β-arrestin recruitment, whose selective engagement determines both therapeutic efficacy and side-effect liability. Despite the clinical importance of pathway-selective (“biased”) agonism, no publicly available computational tool has integrated receptor context as an architecturally mandatory input to bias prediction. Here we present BravizineDSE, a multi-input deep transformer trained on 126 million molecule–receptor pairs spanning 100 human GPCRs. The model simultaneously encodes a 160-token SMILES string, a 256-bit circular fingerprint, ten physicochemical descriptors, and a 48-dimensional receptor sequence feature vector through a cross-attention fusion head that renders receptor identity non-optional at inference time. On a scaffold-holdout evaluation the model achieves AUROC = 0.978 and balanced accuracy (BA) = 0.909; generalization degrades gracefully to BA = 0.751 under leave-one-publication-out (LOPO) and BA = 0.627 under leave-one-receptor-out (LORO) protocols, isolating distinct failure modes. Mechanistic interpretability analysis via SHAP across 1412 high-confidence molecule–receptor pairs spanning all five GPCR families yields a transferable, human-readable selectivity rule, Aaryan’s Rule of 6, specifying six molecular thresholds whose satisfaction score predicts DRD2 β-arrestin bias with called-only BA = 0.702 and 83.8% corpus coverage. Surrogate circularity is quantified via descriptor-only, fingerprint-only, and permuted-label control models; the permuted-label AUROC collapses to 0.513, confirming genuine label-associated signal.
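The grouped hold-out protocols and the balanced-accuracy metric named above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names `balanced_accuracy` and `leave_one_group_out` and the toy receptor labels are assumptions made for illustration. The group key would be a receptor identifier for LORO or a publication identifier for LOPO.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("balanced accuracy needs both classes present")
    return 0.5 * (tp / pos + tn / neg)


def leave_one_group_out(groups):
    """Yield (held_out_group, train_idx, test_idx) for each distinct group.

    `groups` holds one group key per sample, e.g. a receptor ID (LORO)
    or a publication ID (LOPO). Every sample of the held-out group goes
    to the test fold, so no group is shared between train and test.
    """
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for g, test_idx in by_group.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(groups)) if i not in held_out]
        yield g, train_idx, test_idx


# Toy usage with hypothetical receptor labels (not data from the paper).
receptors = ["DRD2", "DRD2", "OPRM1", "HTR2A"]
splits = list(leave_one_group_out(receptors))
```

Because each fold withholds an entire receptor or publication, a drop in BA relative to the scaffold hold-out reflects how much the model relied on information specific to that group.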
The rule identifies sp3 carbon fraction, stereocenter count, aromatic ring count, hydrogen bond acceptor count, topological polar surface area, and molecular weight as jointly sufficient to stratify DRD2 signaling selectivity, and further proposes that nitrogen substitution within an aromatic ring system constitutes a transferable selectivity-encoding motif at protein–ligand interfaces beyond the GPCR superfamily.
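A threshold rule with a satisfaction score and an abstention band could be sketched as follows. The abstract names the six features but gives neither the cutoffs nor their directions, so every threshold in `RULE`, the `low`/`high` calling band, and the convention that a call of 1 means β-arrestin-biased are hypothetical placeholders, not the published rule. Descriptor values are assumed to be precomputed by a cheminformatics toolkit.

```python
# HYPOTHETICAL cutoffs and directions: placeholders only, not the paper's values.
RULE = {
    "fsp3": (">=", 0.40),          # sp3 carbon fraction
    "stereocenters": (">=", 1),    # stereocenter count
    "aromatic_rings": ("<=", 2),   # aromatic ring count
    "hba": ("<=", 5),              # hydrogen bond acceptor count
    "tpsa": ("<=", 70.0),          # topological polar surface area
    "mol_weight": ("<=", 400.0),   # molecular weight
}


def satisfaction_score(desc, rule=RULE):
    """Count how many of the six thresholds a molecule satisfies (0 to 6)."""
    score = 0
    for name, (op, cutoff) in rule.items():
        value = desc[name]
        satisfied = value >= cutoff if op == ">=" else value <= cutoff
        score += int(satisfied)
    return score


def call(score, low=2, high=4):
    """Return 1 if score >= high, 0 if score <= low, else None (abstain).

    The band (low, high) is an assumed convention for "called-only" scoring.
    """
    if score >= high:
        return 1
    if score <= low:
        return 0
    return None


def coverage(calls):
    """Fraction of molecules for which the rule makes a call."""
    return sum(c is not None for c in calls) / len(calls)


# Toy molecule with made-up descriptor values.
example = {"fsp3": 0.5, "stereocenters": 2, "aromatic_rings": 2,
           "hba": 4, "tpsa": 60.0, "mol_weight": 380.0}
example_score = satisfaction_score(example)
```

Under this reading, called-only BA would be computed over the molecules where `call` returns 0 or 1, and coverage reports how large that subset is relative to the corpus.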