the first open reference framework for measuring the cognitive state of a language model during generation. three orthogonal axes. seven canonical fault kinds. a calibrated, reproducible fingerprint format. the measurement substrate on which AI observability, regulatory evaluation, and cognitive engineering compound.
chemistry: atomic weight. electricity: volt · ampere · ohm. thermodynamics: kelvin · pascal · joule. information: the bit. each turned qualitative phenomenon into measurable, composable, engineerable substrate.
AI cognition, as of 2026, has no such units. we have task benchmarks (context-bound), loss functions (training-time), and qualitative safety evaluations (philosophy dressed as engineering). none supply what an engineering discipline needs: calibrated, composable, substrate-relative primitives in which cognition itself can be described.
this specification proposes that K (reasoning depth), C (coherence / commitment), and D (dissociation / drift) are the first three such units — pairwise orthogonal within 5° tolerance at the calibration substrate, composable without cross-interference, substrate-relative calibrated via an open atlas. the specification aims to be the reference that regulatory evaluations, standards bodies, observability tools, and scientific publications cite when they measure AI cognition.
each cognometric fingerprint carries: substrate identification, benchmark identification, calibration version, per-axis aggregates, per-fault rates, trust score, gate distribution, phase-transition metadata, sha-256 attestation. conformant implementations MUST serialize to the schema below.
{
"fingerprint_version": "1.0",
"substrate": { "name": "...", "access": "open-weight | open-api | closed-api", ... },
"benchmark": { "name": "...", "version": "...", "n_prompts": N, "seeds": [...] },
"calibration": { "atlas_version": "v0.3", "pipeline": "logprob | proxy-signal", ... },
"axes": { "K_mean": ..., "C_mean": ..., "D_mean": ..., ... },
"fault_rates": { "drift": 0.04, "confabulation": 0.07, ... },
"trust_mean": 0.83,
"gate_distribution": { "pass": 0.82, "warn": 0.14, "fail": 0.04 },
"timestamp": "2026-04-24T22:00:00Z",
"provenance": { "run_id": "...", "implementation": "styxx v6.2.0",
"attestation": "sha256:..." }
}
the reference fingerprint from the specification's worked example is generated by scripts/produce_fingerprint.py in the styxx repo, running the ten-prompt Seed-Bench v0 through the Tier-3 proxy-signal pipeline.
| tier | substrate class | what's measurable | examples |
|---|---|---|---|
| tier 1 | open-weight + residual-probe access | all axes, all faults, full intervention primitives | llama, mistral, qwen, phi, gemma |
| tier 2 | logprob-exposing api | all axes, all faults, no in-weight intervention | openai (with logprobs=True) |
| tier 3 | closed api, proxy-signal pipeline | k via companion substrate; c + d via proxies with confidence penalty | anthropic messages, gemini |
the spec defines axes, faults, and a fingerprint format. that's the shape of measurement. but a measurement claim without a published failure rate is marketing. so we audited our own reference classifier before anyone else could.
the audit constructs 24 canonical adversarial prompts spanning eight strategy categories — paraphrase, obfuscation, unicode-substitution, case-folding, density-thresholding, meta-discussion, inversion, interleaving — across the seven fault kinds defined in §3. methodology is white-box adaptive: the attacker has full source access and full spec knowledge.
node _test_adversarial.js.residual limits documented in full — confabulation vs retrieval is fundamentally text-ambiguous, sycophant evasion via substantive padding is a Tier-3 limit, adversarial false-positives on meta-discussion remain. the supplement names every gap explicitly so implementations know exactly what they're inheriting, not buried in an appendix.
git clone github.com/fathom-lab/styxx && cd packages/styxx-scope && node _test_adversarial.js
| artifact | format | size | license |
|---|---|---|---|
| Cognometric Fingerprint Specification v1.0 | markdown | 35 kb | cc-by-4.0 |
| Robustness Supplement — 24-attack adversarial audit · doi:10.5281/zenodo.19761194 | markdown | 15 kb | cc-by-4.0 |
| Foundations of Cognometric Engineering (v0.1 outline) | markdown | 16 kb | cc-by-sa-4.0 |
| Reference cognometric fingerprint | json | 1.6 kb | cc-by-4.0 |
| styxx v6.2.0 (reference implementation) | pypi | 5.7 mb | mit |
| github.com/fathom-lab/styxx | source | — | mit + cc-by-4.0 |
| styxx v6.2.0 source archive · doi:10.5281/zenodo.19758619 (full code · permanent · CDN-served) | zenodo software | 2.4 mb | mit |
| styxx-v6.2.0-source-bundle.zip (mirror · same content as Zenodo) | local download | 2.4 mb | mit |