// 02 — Evidence

Benchmark protocol

Measurable, reproducible head-to-heads against the leading open-source SAT solvers on standardized industrial and crafted instance families.

Hardware
AMD Threadripper PRO 7995WX · 96C/192T · 512 GB ECC
Baselines
5 solvers, SAT Race 2024 builds
Instance suites
SATLIB, SAT Competition Main + Crafted
Proof checking
DRAT-trim certified UNSAT proofs

Solver matrix

Reference solvers used as baselines.

v0.3 · 2026-Q2 protocol
SolverFamilyBuildStatus
KissatSAT Race 2024 reference baselineSAT-Race-2024PASS · integrated
CaDiCaLIndustrial workhorse, CDCLSAT-Race-2024PASS · integrated
CryptoMiniSatCrypto / XOR-aware reasoningSAT-Race-2024PASS · integrated
GlucoseLBD-based clause managementSAT-Race-2024PASS · integrated
MapleSATLearning-rate branchingSAT-Race-2024PASS · integrated

Methodology

Timeout policy

5,000 s wall-clock per instance, single-thread, identical CPU pinning and RSS cap across all solvers.

Instance selection

SATLIB uniform-random k-SAT, SAT Competition 2023–2024 Main and Crafted tracks, plus internal verification workloads.

Scoring

PAR-2 score on solved instances, with separate SAT/UNSAT breakdowns and per-family wall-clock distributions.

Reproducibility

Container images, seeds, raw runlogs, and DRAT proofs published per release.

Headline results — preview

Results below reflect KARIOS v0.3 against SAT Race 2024 baselines on selected DIMACS classes. Full per-instance tables and DRAT proofs are released with each tagged version.

SuiteInstancesSolved (baseline)Solved (KARIOS)PAR-2 Δ
Hardware verification (industrial)184121in progress
Crypto (CMS-aware)9647in progress
Crafted / combinatorial13268in progress
Uniform-random 3-SAT250213in progress

No fabricated numbers. Cells marked "—" will be populated with each release of the reproducibility package.