Formalization of the Extended Collaborative Intelligence Index (X-CII): Definition and Synthetic Evaluation
Abstract
Human-AI collaboration is increasingly central to domains such as scientific research, creative industries, business strategy, and education, yet standardized metrics for assessing its effectiveness remain underdeveloped. This paper formalizes the Extended Collaborative Intelligence Index (X-CII), a composite metric capturing quality (Q), efficiency (E), and safety (S) in collaborative processes. X-CII aggregates these components with a Box-Cox power mean (default λ = 0.25, reducing to the geometric mean as λ → 0), which penalizes imbalance among components, and satisfies axiomatic properties including monotonicity and scale invariance under shared normalization bounds and fixed reference distributions. The safety component incorporates expected loss minimization under Signal Detection Theory (SDT) assumptions, drawing on healthcare extensions in which explainability may improve detectability (e.g., modest gains reported in related frameworks). In synthetic Monte Carlo simulations (10,000 replicates, each with n = 1,000 samples), the baseline scenario yields a median relative X-CII of 107.2% (5th-95th percentile across replicates: 103.5-111.0%) compared to the best weak single-agent baseline, and 103.8% (99.9-107.5%) against strong baselines. The neutral and adverse scenarios yield medians of 100.8% and 98.2%, respectively, against weak baselines, and 99.5% and 96.0% against strong baselines. Sensitivity analyses cover λ (0-1), AUROC shift (fixed at 0.72), correlation ρ (±0.5), and team efficiency η (0.6-1.0). Under the AUROC = 0.72 shift, the median drops to 104.3% (weak; win rate: 90%) and 101.5% (strong; win rate: 82%). Fairness diagnostics use the Equalized Odds Difference (EOD, L∞) and a TPR-FPR difference proxy. This work provides a reproducible framework for future empirical validation, with equations, code, and hyperparameters in the appendices.
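The imbalance penalty of the power mean aggregation can be illustrated with a minimal sketch. It assumes an equal-weight power mean over Q, E, and S, each already normalized to (0, 1]; the function name `power_mean` and the example scores are hypothetical, and the paper's appendices may define weights or normalization differently.

```python
import math


def power_mean(components, lam=0.25):
    """Equal-weight power mean of strictly positive component scores.

    Assumption: Q, E, S are pre-normalized to (0, 1] and equally weighted.
    For lam -> 0 the power mean converges to the geometric mean.
    """
    if any(c <= 0 for c in components):
        raise ValueError("components must be strictly positive")
    n = len(components)
    if abs(lam) < 1e-12:
        # Limiting case lam -> 0: geometric mean, computed in log space.
        return math.exp(sum(math.log(c) for c in components) / n)
    return (sum(c ** lam for c in components) / n) ** (1.0 / lam)


# Two hypothetical (Q, E, S) profiles with the same arithmetic mean of 0.7.
balanced = power_mean([0.7, 0.7, 0.7])
imbalanced = power_mean([0.9, 0.9, 0.3])
```

With λ < 1, the imbalanced profile scores below the balanced one even though both have the same arithmetic mean; setting λ = 1 recovers the arithmetic mean and removes the penalty.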