Preserve, don't prune: why strategic dormancy is an architectural necessity for clinical AI governance
Abstract
Current AI reasoning frameworks permanently prune low-confidence reasoning paths, discarding hypotheses that fall below a confidence threshold. This paper makes two distinct claims, one empirical and one conceptual.

Empirical claim. In the Enhanced Mycelium of Thought (EMoT) framework, disabling strategic dormancy (the capacity to preserve and later reactivate low-confidence nodes) causes catastrophic quality collapse: the quality score falls from 4.20 to 1.00 on a 5-point scale across all three complex evaluation cases, and the system produces no meaningful output. The collapse occurred in every case and on every evaluation criterion. This pattern is consistent with an architectural failure rather than performance degradation, though the small sample (n = 3) precludes definitive classification; replication with larger samples and alternative architectures is needed to confirm the distinction.

Conceptual claim. We argue, by structural analogy, that this architectural property is governance-relevant for clinical AI. Systems that prune diagnostic uncertainty are structurally aligned with administrative optimization; systems that preserve it are aligned with Clinical-first governance (Stummer, 2026b). We do not claim clinical performance; our claims concern architectural properties that a Clinical-first system would need to exhibit.

Computational pre-study. To bridge the gap between the reasoning-architecture ablation and the clinical governance argument, we present a pruning simulation on a synthetic EHR dataset (Synthea, n = 11,475 patients, 309 unique SNOMED-CT conditions). At a 1% prevalence threshold, pruning eliminates 46.9% of the diagnostic vocabulary (145 of 309 conditions) and affects 24.4% of patients. Of the 145 pruned conditions, 34 are clinically actionable (associated with active pharmacological treatment), including heart failure, asthma, and atrial fibrillation. In addition, 80% of rare conditions have diagnostic bridges to common conditions, and pruning severs those bridges. These results do not validate the governance argument, but they demonstrate that the architectural choice between pruning and preserving has measurable consequences for clinical information integrity in realistic patient populations.

The ablation uses three complex cases evaluated via LLM-as-Judge (Claude Sonnet 4, 6-criterion rubric). Results are descriptive only; EMoT underperforms on a 15-item short-answer benchmark (27% accuracy). No clinical systems were deployed, and the governance argument is by analogy rather than empirical validation.
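The prevalence-threshold pruning described in the computational pre-study can be illustrated with a minimal sketch. The abstract does not give the exact definitions used, so the code assumes that a condition's prevalence is the share of patients with at least one record of it, that a condition is pruned when its prevalence falls strictly below the threshold, and that a patient is "affected" when at least one of their conditions is pruned. The names `prune_by_prevalence` and `PruningReport` and the toy records are hypothetical and are not taken from the paper or from Synthea.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class PruningReport:
    pruned: set              # conditions whose prevalence is below the threshold
    vocab_fraction: float    # share of distinct conditions removed
    patient_fraction: float  # share of patients losing at least one condition


def prune_by_prevalence(records, threshold=0.01):
    """records maps a patient ID to the set of condition codes for that patient.

    Assumption: prevalence = patients with the condition / all patients.
    """
    n_patients = len(records)
    # Count each condition at most once per patient.
    counts = Counter(code for codes in records.values() for code in set(codes))
    pruned = {code for code, n in counts.items() if n / n_patients < threshold}
    # A patient is affected if any of their conditions is pruned.
    affected = sum(1 for codes in records.values() if pruned & set(codes))
    return PruningReport(
        pruned=pruned,
        vocab_fraction=len(pruned) / len(counts) if counts else 0.0,
        patient_fraction=affected / n_patients if n_patients else 0.0,
    )


# Toy data (hypothetical): 200 patients, one common, one moderate, one rare condition.
toy = {f"p{i}": {"hypertension"} for i in range(200)}
toy["p0"].add("heart_failure")       # 1/200 = 0.5% prevalence, below 1%
for i in range(10):
    toy[f"p{i}"].add("asthma")       # 10/200 = 5% prevalence, above 1%

report = prune_by_prevalence(toy, threshold=0.01)
print(sorted(report.pruned), report.vocab_fraction, report.patient_fraction)
```

In the toy data only `heart_failure` is pruned, which removes one of three conditions and affects one of 200 patients. The same ratio logic underlies the reported vocabulary figure: 145 of 309 conditions is 46.9%.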