Plan-action gap detector.
A calibrated cross-section detector for the gap between what an agent plans and what it does. The agent declares a plan ("step 1: do X; step 2: do Y") and then acts; the detector measures the lexical overlap between the declared plan and the executed action. It exhibits a K=1 phase transition on a single feature, bigram_jaccard_overlap. Trained on n=200 paired (matched / mismatched) plan-action pairs from gpt-4o-mini; 5-fold CV AUC 0.9225 ± 0.032. It is the seventh instrument to confirm the K=1 phase-transition signature predicted by Every Mind Leaves Vitals (7-for-7), and it maps to the PFC-BG-SMA intention-action coupling circuits described in the human apathy / avolition literature.
§1 What it detects
Plan-action gap is the cognitive failure of a system that says one thing and does another. The agent declares an explicit plan in natural language, then takes an action. If the executed action's lexical content matches the plan's promised content, the gap is small. If the action drifts, the gap shows up as low bigram overlap.
The detector is structurally agnostic: plan can be a checklist, an outline, or a single sentence. Action can be a tool call, a code edit, or a downstream response. As long as both are text, the K=1 feature reads the gap.
plan: "open the README and add a line about MIT licensing"
action (mismatch): "edited config.yml and bumped the version"
action (match): "added 'MIT-licensed open source' to README.md line 14"
§2 The K=1 feature
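The K=1 feature is a plain bigram Jaccard overlap between the plan text and the action text. A minimal sketch of the computation (the function name matches the feature name in this document, but the tokenization shown here is a whitespace-lowercase assumption; styxx's internal tokenizer may differ):

```python
def bigrams(text: str) -> set[tuple[str, str]]:
    # Lowercase whitespace tokenization -- an assumption, not styxx's exact tokenizer.
    tokens = text.lower().split()
    return set(zip(tokens, tokens[1:]))

def bigram_jaccard_overlap(plan: str, action: str) -> float:
    """Jaccard similarity between the bigram sets of plan and action."""
    a, b = bigrams(plan), bigrams(action)
    if not a and not b:
        return 1.0  # two empty texts trivially agree
    return len(a & b) / len(a | b)

bigram_jaccard_overlap(
    "open the README and add a line about MIT licensing",
    "edited config.yml and bumped the version",
)  # → 0.0 (the mismatched pair shares no bigrams)
```

A matched pair (plan and action naming the same file and the same content) yields a high overlap, which is what pushes the pair to the other side of the phase transition.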
§3 Neural correlate
Plan-action gap maps onto PFC-BG-SMA intention-action coupling — the same circuit implicated in apathy and avolition in clinical literature (Parkinson's disease, depression, schizophrenia negative symptoms). Patients with damaged supplementary motor area (SMA) show preserved planning but disrupted execution. The styxx instrument reads the linguistic surface of the same intention-action decoupling.
The cross-modal hypothesis: bigram_jaccard_overlap should correlate with frontal-SMA EEG synchronization during enacted plan-action mismatch tasks. This prediction is tested in the Fathom EEG pilot.
§4 Failure modes
Plan and action in different modalities. If the plan is text and the action is a tool call with structured arguments, raw bigram_jaccard underestimates overlap. The detector includes alternate features (entity overlap, intent classification) that handle the cross-modal case, but they are noisier.
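One way an entity-overlap fallback can bridge the modality gap is to flatten the structured tool call to text and compare file-like identifiers instead of bigrams. A sketch under loud assumptions: the extraction regex and the `entity_overlap` helper are illustrative, not styxx's actual alternate feature.

```python
import json
import re

def entities(text: str) -> set[str]:
    # Crude entity extraction: dotted identifiers such as README.md or config.yml.
    # Illustrative only; styxx's entity extractor is not specified in this document.
    return set(re.findall(r"[\w-]+\.[A-Za-z]\w*", text))

def entity_overlap(plan: str, tool_call: dict) -> float:
    """Entity-level Jaccard between a text plan and a structured tool call."""
    flat = json.dumps(tool_call)  # flatten structured arguments to text
    a, b = entities(plan), entities(flat)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

call = {"tool": "edit_file", "args": {"path": "README.md", "line": 14}}
entity_overlap("open README.md and add a licensing line", call)  # → 1.0
```

The fallback is noisier precisely because entity sets are small: a single shared filename can dominate the score even when the surrounding intent diverges.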
Legitimate plan revisions look like gaps. An agent that explicitly revises ("plan was X; reconsidered after Y; now doing Z") will fire the detector even though the behavior is correct. Production callers should gate on revision markers.
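A revision gate can be as simple as suppressing the risk score when the action text explicitly announces a plan change. A minimal sketch (the marker list and the `gated_risk` helper are hypothetical; production callers would tune markers to their agent's phrasing):

```python
import re

# Illustrative revision markers -- an assumption, not a styxx-provided list.
REVISION_MARKERS = re.compile(
    r"\b(reconsidered|revised plan|changed my mind|instead I will|now doing)\b",
    re.IGNORECASE,
)

def gated_risk(action: str, risk: float) -> float:
    """Suppress plan-action risk when the action explicitly declares a revision."""
    if REVISION_MARKERS.search(action):
        return 0.0  # explicit revision: treat as benign, or route to review
    return risk

gated_risk("plan was X; reconsidered after Y; now doing Z", 0.91)  # → 0.0
```

A stricter variant would route gated cases to human review instead of zeroing them, so silent abuse of revision phrasing still surfaces.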
§5 Use it
from styxx.guardrail import plan_action_check

v = plan_action_check(
    plan="open the README and add a line about MIT licensing",
    action="edited config.yml and bumped version",
)
# v.plan_action_risk == 0.91
Plugs into fathom_reward() with a default weight of 1.2 — high enough to penalize gap explicitly during agent training, low enough to allow legitimate revisions.
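The effect of the 1.2 default weight can be read as a linear penalty on the reward signal. A sketch of the arithmetic only (the real fathom_reward() signature is not shown in this document; the combiner below is illustrative):

```python
def combined_reward(base_reward: float, plan_action_risk: float,
                    weight: float = 1.2) -> float:
    # Linear penalty: a risk of 0.91 at the default weight subtracts ~1.09,
    # enough to flip a unit reward negative for a clear plan-action gap.
    return base_reward - weight * plan_action_risk

combined_reward(1.0, 0.91)
```

At risk near zero the penalty vanishes, which is what leaves room for legitimate, explicitly marked revisions.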
Install the instrument.
One line of Python. Cognometric vitals on every response.
pip install -U styxx