Complexity-Aware Symbolic Solvers for Sudoku: Towards Interpretability and Cognitive Alignment in Constraint Problem Solving
Discuss this preprint
Start a discussion What are Sciety discussions?Listed in
This article is not in any list yet, why not save it to one of your lists.Abstract
SAT and Constraint Programming (CP) remain the fastest exact Sudoku solvers, on average taking milliseconds to solve even hard puzzles. They are, however, black-box optimizers and hence lack symbolic interpretability. The present article describes a complementary deterministic approach that explains reasoning with four symbolic solvers: Dynamic Constraint Propagation Backtracking (DCPB), Constraint Density–Guided Search (CDGS), Logical Inference Depth (LID) Solver, and Guessing-Complexity-Optimized Solver (GCOS). Each solver is governed by formally stated complexity indicators Constraint Density Metric (CDM), Logical Inference Depth (LID), Constraint-Graph Tightness (CGT), and Guess Entropy (local/global) such that inference effort is transparently computed. Our contributions are threefold. First, we formalize reproducible symbolic complexity measures that capture inference effort and search ambiguity. Second, we implement four deterministic solvers each balancing runtime against interpretability while producing traceable reasoning traces. Third, we introduce the Solver–Complexity Alignment Index (SCAI), a new measure for the extent to which solver effort aligns with intrinsic puzzle hardness. On the Kaggle Sudoku dataset (1M puzzle–solution pairs), we find that while the CP and SAT baselines are unparalleled for runtime efficiency, symbolic solvers dominate in interpretability, reproducibility, and solver–complexity alignment. Most balanced in terms of runtime and SCAI alignment is the LID solver, while best at compressing entropy is GCOS. The work constitutes a contribution to explainable constraint solving and advances the alignment between solver effort and human intuitions on puzzle hardness. The concepts are generalizable to the broader constraint satisfaction, planning, and scheduling applications where interpretability must be as valuable as speed. Our solvers are tested on the 1 million–puzzle Kaggle Sudoku dataset (1,000,000 puzzle–solution pairs; Bryan Park), with detailed experimentation on a stratified subset (10 Easy, 10 Medium, 10 Hard).