
Hooks

The hook system is how EdgeVox lets you change agent-loop behaviour without patching core. A hook is a small callable that fires at one or more of six fire points inside LLMAgent.run(); it inspects the payload, optionally modifies it, and tells the loop whether to continue, replace the payload, or end the turn.

Every hook in hooks_builtin.py and hooks_slm.py is built on exactly this contract.

The six fire points

| Point | When | Typical payload | Typical use |
| --- | --- | --- | --- |
| ON_RUN_START | before any LLM call | {"task": str} | reset per-turn counters, inject memory, safety input-rail |
| BEFORE_LLM | each hop, before llm.complete | {"messages": list, "tools": list, "hop": int} | mutate system prompt, enforce token budget |
| AFTER_LLM | each hop, after parsing | {"content": str, "tool_calls": list, "hop": int} | rewrite reply, detect echoed payloads, output-rail |
| BEFORE_TOOL | per tool, pre-dispatch | ToolCallRequest | require confirmation, loop-hint, skip dispatch |
| AFTER_TOOL | per tool, post-dispatch | ToolCallResult | truncate, log episode, schema-retry enrichment |
| ON_RUN_END | once, after the turn resolves | AgentResult | persist session, audit, emit metrics |
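To make the ordering in the table concrete, here is a toy model of one turn. It is a sketch, not LLMAgent.run(): the string values of the constants and the fire_sequence helper are invented for illustration, and the "model" is just a list saying which tools each hop requests.

```python
# Placeholder values; the real constants live in edgevox.agents.hooks.
ON_RUN_START, BEFORE_LLM, AFTER_LLM = "on_run_start", "before_llm", "after_llm"
BEFORE_TOOL, AFTER_TOOL, ON_RUN_END = "before_tool", "after_tool", "on_run_end"


def fire_sequence(hops):
    """Return the fire points hit during one turn.

    `hops` is a list with one entry per LLM hop; each entry lists the tool
    names requested on that hop. An empty entry means a final reply.
    """
    fired = [ON_RUN_START]              # once, before any LLM call
    for tool_calls in hops:
        fired.append(BEFORE_LLM)        # each hop, before the completion
        fired.append(AFTER_LLM)         # each hop, after parsing
        for _ in tool_calls:
            fired.append(BEFORE_TOOL)   # per tool, pre-dispatch
            fired.append(AFTER_TOOL)    # per tool, post-dispatch
        if not tool_calls:
            break                       # no tool calls: the turn resolves
    fired.append(ON_RUN_END)            # once, after the turn resolves
    return fired


# One tool-calling hop followed by a final-reply hop.
sequence = fire_sequence([["get_time"], []])
```

In this model a turn with a single tool call visits BEFORE_LLM and AFTER_LLM twice, because the tool result is fed back for a second hop.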

Hook Protocol

python
class Hook(Protocol):
    points: frozenset[str]                       # which fire points to receive

    def __call__(
        self,
        point: str,                              # the exact fire point name
        ctx: AgentContext,                       # run context (see below)
        payload: Any,                            # point-specific shape
    ) -> HookResult | None: ...

points is a frozenset of fire-point constants from edgevox.agents.hooks. Returning None means "continue, no changes" (equivalent to HookResult.cont()).

Any callable that matches this shape works — no inheritance required. The @hook(...) decorator wraps a function:

python
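# HookResult is assumed to be exported from the same module as hook.
from edgevox.agents.hooks import hook, HookResult, BEFORE_LLM

@hook(BEFORE_LLM)
def add_system_note(ctx, payload):
    # Copy the list and the first message so the caller's objects are not mutated.
    msgs = list(payload["messages"])
    msgs[0] = {**msgs[0], "content": msgs[0]["content"] + "\nRemember: be brief."}
    payload = {**payload, "messages": msgs}
    return HookResult.replace(payload, reason="brief reminder")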
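The decorated function takes (ctx, payload) while the protocol call takes (point, ctx, payload), so a decorator of this kind has to adapt one to the other. The sketch below shows one way that could work. It is an assumption about the shape of such a wrapper, not EdgeVox's implementation; sketch_hook and log_hop are hypothetical names.

```python
import functools


def sketch_hook(*points, priority=0):
    """Hypothetical stand-in for a decorator like @hook; not EdgeVox's code."""
    def decorate(fn):
        @functools.wraps(fn)
        def adapter(point, ctx, payload):
            # Drop `point`: the wrapped function only takes (ctx, payload).
            return fn(ctx, payload)
        adapter.points = frozenset(points)   # what the Hook protocol requires
        adapter.priority = priority
        return adapter
    return decorate


@sketch_hook("before_llm", priority=80)
def log_hop(ctx, payload):
    ctx.append(payload["hop"])   # ctx is a plain list in this demo
    return None                  # None means "continue, no changes"


seen = []
outcome = log_hop("before_llm", seen, {"hop": 2})
```

The result is an ordinary callable with a points attribute, which is all the protocol asks for.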

HookResult

Three constructors for the three outcomes the loop honours:

| Constructor | action | What the loop does |
| --- | --- | --- |
| HookResult.cont() | CONTINUE | proceed with the original payload |
| HookResult.replace(payload, reason=...) | MODIFY | swap the payload in-flight |
| HookResult.end(reply, reason=...) | END_TURN | bail out, use reply as the final reply |

reason is a short string surfaced in AgentEvent.payload["reason"] and AgentResult.hook_ended — use it for debugging and audit trails.
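A dispatcher that honours these three outcomes might look roughly like the following. This is a self-contained sketch under assumptions: the HookResult dataclass is a stand-in with the same constructor names, and fire, Shout, and BlockSecrets are hypothetical, not part of EdgeVox.

```python
from dataclasses import dataclass
from typing import Any, Optional

CONTINUE, MODIFY, END_TURN = "CONTINUE", "MODIFY", "END_TURN"


@dataclass(frozen=True)
class HookResult:
    """Stand-in exposing the three constructors described above."""
    action: str
    payload: Any = None
    reason: Optional[str] = None

    @classmethod
    def cont(cls):
        return cls(CONTINUE)

    @classmethod
    def replace(cls, payload, reason=None):
        return cls(MODIFY, payload, reason)

    @classmethod
    def end(cls, reply, reason=None):
        return cls(END_TURN, reply, reason)


def fire(hooks, point, ctx, payload):
    """Run hooks in order; return (payload, ending_result_or_None)."""
    for h in hooks:
        if point not in h.points:
            continue
        result = h(point, ctx, payload)
        if result is None:               # None means "continue, no changes"
            continue
        if result.action == MODIFY:
            payload = result.payload     # later hooks see the new payload
        elif result.action == END_TURN:
            return payload, result       # stop firing; the turn is over
    return payload, None


class Shout:
    points = frozenset({"after_llm"})

    def __call__(self, point, ctx, payload):
        new = {**payload, "content": payload["content"].upper()}
        return HookResult.replace(new, reason="shout")


class BlockSecrets:
    points = frozenset({"after_llm"})

    def __call__(self, point, ctx, payload):
        if "SECRET" in payload["content"]:
            return HookResult.end("I can't share that.", reason="blocked")
        return None


hooks = [Shout(), BlockSecrets()]
normal_payload, normal_end = fire(hooks, "after_llm", None, {"content": "hello"})
blocked_payload, blocked_end = fire(hooks, "after_llm", None, {"content": "the secret is 42"})
```

In the second call Shout's replacement is what BlockSecrets inspects, and the END_TURN result carries both the final reply and the reason string.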

Where hooks live

Two layers, fired in order at every point:

  1. Agent-level: LLMAgent(..., hooks=[...]). Shared across every run() of that agent.
  2. Context-level: ctx.hooks.register(h). Scoped to one AgentContext / conversation.

Agent-level hooks fire first; ctx-level hooks see any modifications they made. This lets you ship agent-specific defaults while letting callers layer session-specific behaviour on top.
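The layering can be illustrated with a deliberately simplified model in which a hook is a plain function that returns a replacement payload or None. fire_layers, agent_default, and session_hook are hypothetical names, and the sketch ignores priorities and HookResult entirely.

```python
def fire_layers(agent_hooks, ctx_hooks, payload):
    """Fire agent-level hooks first, then ctx-level hooks.

    Each hook may return a replacement payload, or None to leave it unchanged.
    """
    for h in [*agent_hooks, *ctx_hooks]:
        replacement = h(payload)
        if replacement is not None:
            payload = replacement
    return payload


def agent_default(payload):
    # Shipped with the agent: applies to every run.
    return {**payload, "system": payload["system"] + " Be brief."}


seen_by_session_hook = []


def session_hook(payload):
    # Registered for one conversation: sees the agent-level change.
    seen_by_session_hook.append(payload["system"])
    return {**payload, "system": payload["system"] + " Answer in French."}


final = fire_layers([agent_default], [session_hook], {"system": "You are a robot."})
```

The session hook receives the prompt that already contains the agent default, so session-specific behaviour stacks on top of it rather than replacing it.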

Hook-owned state

Hooks that need per-turn state (fingerprint counters, retry budgets) store it under ctx.hook_state[id(self)]. Keying by id(self) gives each instance its own bag, so two LoopDetectorHook() objects on one context never share counts.

python
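from edgevox.agents.hooks import ON_RUN_START, AFTER_TOOL

class MyHook:
    points = frozenset({ON_RUN_START, AFTER_TOOL})

    def __call__(self, point, ctx, payload):
        if point == ON_RUN_START:
            # Fresh bag at the start of every turn, private to this instance.
            ctx.hook_state[id(self)] = {"seen": 0}
            return None
        # AFTER_TOOL: count tool results seen this turn.
        bag = ctx.hook_state[id(self)]
        bag["seen"] += 1
        ...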

This replaced the old ctx.session.state["__xxx__"] magic-key pattern, which leaked hook internals into the user-visible ctx.state dict and caused silent collisions between two instances of the same hook class.
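The isolation that id(self) keying gives can be checked with a stand-in context. FakeContext and CountingHook are hypothetical, and the constant values are placeholders rather than the real ones from edgevox.agents.hooks.

```python
ON_RUN_START, AFTER_TOOL = "on_run_start", "after_tool"  # placeholder values


class FakeContext:
    """Minimal stand-in exposing only the hook_state dict."""

    def __init__(self):
        self.hook_state = {}


class CountingHook:
    points = frozenset({ON_RUN_START, AFTER_TOOL})

    def __call__(self, point, ctx, payload):
        if point == ON_RUN_START:
            ctx.hook_state[id(self)] = {"seen": 0}   # reset per turn
            return None
        ctx.hook_state[id(self)]["seen"] += 1
        return None


ctx = FakeContext()
first, second = CountingHook(), CountingHook()

for h in (first, second):
    h(ON_RUN_START, ctx, {"task": "demo"})

first(AFTER_TOOL, ctx, None)
first(AFTER_TOOL, ctx, None)
second(AFTER_TOOL, ctx, None)

first_seen = ctx.hook_state[id(first)]["seen"]
second_seen = ctx.hook_state[id(second)]["seen"]
```

Both instances belong to the same class and share one context, yet each keeps its own count.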

Typed ctx fields

Hooks reach the running tool registry and LLM via typed fields on AgentContext, not scratchpad keys:

| Field | Who sets it | Typical use |
| --- | --- | --- |
| ctx.tool_registry | LLMAgent.run() | schema lookup for error-repair hooks |
| ctx.llm | LLMAgent.run() | tokenizer-exact estimate_tokens, compaction |
| ctx.interrupt | caller | read cancel_token, subscribe to events |
| ctx.memory, ctx.artifacts, ctx.blackboard | caller | long-term memory, artifact store, shared state |

ctx.state is now user-only scratch. Hooks must not write framework plumbing there.

Built-in hooks (hooks_builtin.py)

| Hook | Fires at | Purpose |
| --- | --- | --- |
| SafetyGuardrailHook(blocklist=…) | ON_RUN_START | block-list / allow-list input rail |
| PlanModeHook(confirm=[…], approver=…) | BEFORE_TOOL | require confirmation before sensitive tools |
| TokenBudgetHook(max_context_tokens=…) | BEFORE_LLM | hard context-window cap with tokenizer-exact count |
| ToolOutputTruncatorHook(max_chars=…) | AFTER_TOOL | truncate oversized tool results |
| MemoryInjectionHook(memory_store) | BEFORE_LLM | append facts/episodes to system prompt (idempotent per turn) |
| NotesInjectorHook(notes) | BEFORE_LLM | inject the tail of a NotesFile |
| ContextCompactionHook(compactor) | ON_RUN_START | LLM-summarise middle turns when over budget |
| EpisodeLoggerHook(memory_store) | AFTER_TOOL | record tool outcomes as episodes |
| AuditLogHook(path) | AFTER_LLM, AFTER_TOOL, ON_RUN_END | JSONL event log for offline replay |
| PersistSessionHook(session_store, session_id) | ON_RUN_END | save Session to disk |
| TimingHook() | before/after LLM + tool | collect wall-clock timings |
| EchoingHook() | all six | print every fire point (for debugging) |

SLM-hardening hooks (hooks_slm.py)

| Hook | Fires at | Purpose |
| --- | --- | --- |
| LoopDetectorHook(hint_after=1, break_after=2) | ON_RUN_START, BEFORE_TOOL | fingerprint identical (tool, args) calls; hint on the 2nd, end-turn on the 3rd |
| EchoedPayloadHook(fallback=…) | AFTER_LLM | substitute a human-readable fallback when the model echoes a tool-result payload (markdown-fence aware) |
| SchemaRetryHook(max_retries_per_tool=1) | ON_RUN_START, AFTER_TOOL | rewrite argument-shape errors into a human-readable schema hint so the next hop can retry |

Compose the bundle with default_slm_hooks(); it is what you want on any model under 4B parameters that has not been specifically fine-tuned for tool calling.
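The fingerprinting idea behind loop detection can be sketched in a few lines. This is not LoopDetectorHook's implementation: LoopDetectorSketch, its check method, and the JSON-based fingerprint are assumptions chosen to reproduce the "hint on the 2nd, end-turn on the 3rd" behaviour for hint_after=1 and break_after=2.

```python
import json
from collections import Counter


class LoopDetectorSketch:
    """Counts identical (tool, args) calls within one turn."""

    def __init__(self, hint_after=1, break_after=2):
        self.hint_after = hint_after
        self.break_after = break_after
        self._seen = Counter()

    def reset(self):
        """Call at the start of each turn."""
        self._seen.clear()

    def check(self, tool, args):
        # sort_keys makes the fingerprint independent of argument order.
        fingerprint = (tool, json.dumps(args, sort_keys=True))
        prior = self._seen[fingerprint]      # identical calls seen before this one
        self._seen[fingerprint] += 1
        if prior >= self.break_after:
            return "end"                     # would map to HookResult.end(...)
        if prior >= self.hint_after:
            return "hint"                    # would map to a MODIFY carrying a hint
        return "continue"


detector = LoopDetectorSketch()
verdicts = [detector.check("get_weather", {"city": "Hanoi", "unit": "C"}) for _ in range(3)]
other = detector.check("get_weather", {"city": "Hue", "unit": "C"})
```

A call with different arguments gets its own fingerprint, so it starts again from "continue".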

Hook order

Within one fire point, hooks fire in priority order — higher priority first, ties broken by registration order. Declare priority as a class or instance attribute, or pass it to register():

python
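from edgevox.agents.hooks import ON_RUN_START

class MySafetyHook:
    points = frozenset({ON_RUN_START})
    priority = 100  # safety tier

# Agent-level: priority is read from the attribute.
agent.register_hook(MySafetyHook())
# or context-level, with priority passed explicitly (MyObserver is any hook)
ctx.hooks.register(MyObserver(), priority=0)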

Recommended scale (convention, not enforced):

| Priority | Tier | Examples |
| --- | --- | --- |
| 100 | Safety / rails | SafetyGuardrailHook, PlanModeHook, future LlamaGuard |
| 80 | Input-shape | MemoryInjectionHook, NotesInjectorHook, TokenBudgetHook |
| 60 | Detection | LoopDetectorHook, EchoedPayloadHook |
| 40 | Mutation | SchemaRetryHook, ToolOutputTruncatorHook |
| 0 | Observability (default) | AuditLogHook, TimingHook, EpisodeLoggerHook |

The @hook(point, priority=…) decorator accepts the same kwarg. HookRegistry.at(point) returns hooks in firing order — useful for introspection.
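The ordering rule (higher priority first, ties broken by registration order) falls out of a stable sort. The sketch below assumes a default priority of 0 and uses hypothetical names (firing_order, Named); it does not show how HookRegistry is actually implemented.

```python
def firing_order(hooks):
    """Sort by descending priority; equal priorities keep registration order."""
    # sorted() is stable, so ties preserve the order of the input list.
    return sorted(hooks, key=lambda h: -getattr(h, "priority", 0))


class Named:
    """Tiny stand-in hook that only carries a name and a priority."""

    def __init__(self, name, priority=0):
        self.name = name
        self.priority = priority


registered = [
    Named("audit"),                  # priority 0, registered first
    Named("guard", priority=100),
    Named("timing"),                 # priority 0, registered after audit
    Named("memory", priority=80),
]
order = [h.name for h in firing_order(registered)]
```

Here the two observability hooks keep their registration order relative to each other, while the safety and input-shape hooks move ahead of them.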

Testing hooks

tests/harness/conftest.py ships a ScriptedLLM that returns pre-declared responses, so hook behaviour can be exercised deterministically:

python
from tests.harness.conftest import ScriptedLLM, reply, call

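def test_my_hook_short_circuits():
    # MyHook here stands for a hook that returns HookResult.end(..., reason="blocked").
    llm = ScriptedLLM([reply("should not reach")])
    agent = LLMAgent("t", "", "", llm=llm, hooks=[MyHook()])
    result = agent.run("trigger")
    # hook_ended carries the reason passed to HookResult.end().
    assert result.hook_ended == "blocked"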
