On the applicability domain of HADDOCK3 for protein-aptamer docking: documented failure modes from a 5×7 cross-target screening matrix and a 1676 aa receptor case study (P01031)
Abstract
We screened a 5-receptor × 7-aptamer (35-cell) cross-target matrix with HADDOCK3 under blind ambiguous-interaction-restraint (AIR) protocols on AlphaFold-modelled receptors. The matrix is primarily a cross-target / decoy screen rather than a benchmark of 35 cognate pairs: it contains an n = 4 K_D-calibration subset under matched assay conditions and at least six biological cognate or intended-cognate cells; the remaining cells are intentional non-target pairings used to characterise score-distribution behaviour. The screen surfaced 12 operationally distinct failure modes that collapse into five broad conceptual groups. The principal case study is P01031 (complement C5, 1676 aa, ≥ 12 structural domains): all seven panel members produced positive HADDOCK3 top-1 scores under a scale-adaptive AIR. Score-term decomposition locates the anomaly in the AIR term, which contributes +217 to +268 to the top-1 score. With the AIR term zeroed, scores fall to the range -131 to -74, i.e. the small-receptor regime. Boltz-2 cofolding chain-pair ipTM (cpi_AB) provides an independent channel: P01031 shows the lowest median cpi_AB (0.211; 0/7 above the 0.5 confident-interface threshold). To our knowledge, this is an early documented case study of a 1676 aa multi-domain receptor exhibiting this signature under a blind scale-adaptive AIR workflow; it is an n = 1 mechanistic case, not a statistical generalisation. We adapt the QSAR applicability-domain concept to in silico aptamer screening. We report an empirical Mode 1 mitigation, a pLDDT-aware AIR prefilter, with cohort Jaccard recovery of ∼10×. The n = 4 K_D-calibration Spearman ρ shift is reported as exploratory cross-method convergence, not as a calibration claim.
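As an illustration of the idea behind a pLDDT-aware AIR prefilter, the following is a minimal sketch and not the protocol used in this work. It assumes that candidate active residues are dropped when their AlphaFold pLDDT falls below a cutoff, and that the Jaccard index compares the resulting restraint set with a reference interface set. The function names, the cutoff of 70, and the toy residue numbers are all hypothetical.

```python
from typing import Dict, Iterable, Set


def plddt_prefilter(
    candidates: Iterable[int],
    plddt: Dict[int, float],
    cutoff: float = 70.0,  # assumed cutoff, not taken from the source
) -> Set[int]:
    """Keep only candidate residues whose pLDDT is at or above the cutoff.

    Residues with no pLDDT entry are treated as low confidence and dropped.
    """
    return {res for res in candidates if plddt.get(res, 0.0) >= cutoff}


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Jaccard index |a & b| / |a | b|; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy data (hypothetical): residues 10-12 are well modelled, 50-52 are not.
candidates = [10, 11, 12, 50, 51, 52]
plddt = {10: 92.0, 11: 88.5, 12: 90.1, 50: 41.0, 51: 38.2, 52: 45.7}
reference_interface = {10, 11, 12, 13}

filtered = plddt_prefilter(candidates, plddt)
jaccard_before = jaccard(set(candidates), reference_interface)
jaccard_after = jaccard(filtered, reference_interface)

print(f"unfiltered Jaccard: {jaccard_before:.3f}")
print(f"prefiltered Jaccard: {jaccard_after:.3f}")
```

In this toy case the prefilter removes the three low-confidence residues, so the restraint set overlaps the reference more tightly. The numbers illustrate the mechanics only and do not relate to the ∼10× recovery reported above.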