Classical fuzzing — libFuzzer, AFL, honggfuzz — treats a target as a byte sink with a coverage signal. You feed it inputs, you measure branch coverage, you mutate the inputs that hit new edges, and the corpus grows until the target either crashes or reaches a saturation plateau. The loop has been refined for a decade and it is the reason every browser, kernel, and TLS stack worth shipping has a corpus of millions of inputs sitting next to its test suite.
Agentic systems do not fit that model cleanly. An agent target is not a function from bytes to bytes; it is a planner that emits tool calls, reads memory, mutates state, and produces side effects. The interesting failure is rarely a segfault — it is the agent obediently calling send_email to an attacker-controlled address because a string four levels deep in a retrieved document said it should. We needed a fuzz loop, but the operators, the coverage signal, and the determinism contract all had to be rebuilt from scratch.
This post is the engineering walkthrough of the mutator engine that ships in AgentGuardian Open Source today: five deterministic operators in strategies/fuzz.py, a coverage signal defined over plan depth and tool-call entropy, and a stub-mode contract that makes scan outputs hash-stable for external signing. The same engine is what lets the 96-probe corpus, sharded across the ten OWASP ASI 2026 categories, behave like a real coverage-guided fuzzer instead of a static checklist.
Why agentic fuzzing needs a different shape
libFuzzer assumes three things that do not survive contact with an agent target. First, the target is fast and pure: millions of executions per second, no global state. An agent execution takes one to thirty seconds and writes to memory, vector stores, and external APIs by design. Second, coverage is a syntactic property of the program counter: SanitizerCoverage instruments basic blocks, and a new edge is unambiguous. For an agent, the equivalent of an edge is a plan transition or a tool call, and "new" is fuzzy, because the same tool with different arguments may or may not count. Third, a classical crash is a clear ground truth. An agent failure is a policy violation, which is something a judge model has to evaluate against a written rule.
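To make the "what counts as new" question concrete, here is a minimal sketch of one way a coverage key over plan depth and tool-call entropy could be built. It is not the shipped implementation: the names tool_call_entropy, coverage_key, and is_new_coverage, the 0.5-bit bucket width, and the decision to ignore tool arguments entirely are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import FrozenSet, Sequence, Set, Tuple

# Hypothetical coverage key: (plan depth, entropy bucket, set of tools used).
CoverageKey = Tuple[int, int, FrozenSet[str]]


def tool_call_entropy(tool_names: Sequence[str]) -> float:
    """Shannon entropy, in bits, of the tool-name distribution in one trace."""
    if not tool_names:
        return 0.0
    counts = Counter(tool_names)
    total = len(tool_names)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def coverage_key(
    plan_depth: int, tool_names: Sequence[str], bucket: float = 0.5
) -> CoverageKey:
    """Collapse a trace into a hashable key.

    Assumption: arguments are ignored, so the same tool called with
    different arguments does NOT count as new coverage in this sketch.
    """
    entropy_bucket = int(tool_call_entropy(tool_names) / bucket)
    return (plan_depth, entropy_bucket, frozenset(tool_names))


def is_new_coverage(seen: Set[CoverageKey], key: CoverageKey) -> bool:
    """Return True and record the key if this trace reached a new bucket."""
    if key in seen:
        return False
    seen.add(key)
    return True
```

Bucketing the entropy is the design choice that matters here: without it, every tiny shift in the tool mix would look like a new edge and the corpus would never plateau.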
The implication is that you cannot lift libFuzzer wholesale. You keep its loop — corpus, mutate, dispatch, measure, prioritise — and replace every piece in between. You also keep its discipline: determinism is non-negotiable. A fuzz output that is not reproducible cannot be triaged, cannot be regressed, and cannot anchor a signed evidence pack.
The mutator engine has to be deterministic end-to-end. Same seed, same input, same output bytes — across machines, across releases.
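A minimal sketch of what that contract can look like in Python, assuming a hypothetical mutator signature of (text, rng) -> text. The helper names derive_rng, fuzz_round, digest_of, and insert_marker are illustrative and are not taken from the AgentGuardian source. The sketch derives every random stream from SHA-256 rather than Python's built-in hash(), which is salted per process for strings, and it iterates the corpus in sorted order so that dictionary insertion order cannot leak into the output.

```python
import hashlib
import random
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical mutator signature: input text plus a dedicated RNG.
Mutator = Callable[[str, random.Random], str]
Result = Tuple[str, str, str]  # (probe_id, operator name, mutated payload)


def derive_rng(seed: int, operator: str, probe_id: str) -> random.Random:
    """Build an independent RNG per (seed, operator, probe) triple."""
    material = f"{seed}:{operator}:{probe_id}".encode("utf-8")
    digest = hashlib.sha256(material).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


def insert_marker(text: str, rng: random.Random) -> str:
    """Toy mutator used only to exercise the loop."""
    offset = rng.randrange(len(text) + 1)
    return text[:offset] + "<!>" + text[offset:]


def fuzz_round(
    seed: int, corpus: Dict[str, str], mutators: Sequence[Tuple[str, Mutator]]
) -> List[Result]:
    """Run every mutator against every probe in a fixed, sorted order."""
    results: List[Result] = []
    for probe_id in sorted(corpus):
        for name, mutate in mutators:
            rng = derive_rng(seed, name, probe_id)
            results.append((probe_id, name, mutate(corpus[probe_id], rng)))
    return results


def digest_of(results: Sequence[Result]) -> str:
    """Hash the results with length-prefixed fields so boundaries are unambiguous."""
    h = hashlib.sha256()
    for record in results:
        for field in record:
            data = field.encode("utf-8")
            h.update(len(data).to_bytes(8, "big"))
            h.update(data)
    return h.hexdigest()
```

Two calls to fuzz_round with the same seed and corpus should produce the same digest. One caveat on "across releases": the Python documentation only guarantees that random() stays reproducible for a given seed across versions, so an engine that leans on randrange or choice would need to pin the interpreter or implement its own sampling on top of random().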
The five shipped operators
AgentGuardian Open Source ships five mutators in the _MUTATORS tuple inside src/agent_guardian/strategies/fuzz.py. They are intentionally small, intentionally orthogonal, and intentionally cheap. Cheap matters: the fuzz loop runs every mutator against every selected probe in the corpus, and a slow operator turns a ten-minute smart scan into an hour-long one.
1. oversize
Concatenates the input to itself between five and twenty times. This is the simplest possible length-amplification primitive and it hits more bugs than its triviality suggests. Context-window truncation, system-prompt eviction, and naive token budgeting all fail visibly under repeated input. The operator is also a cheap denial-of-wallet probe — long inputs drive long completions, which drive bills.
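As a rough illustration, the operator can be as small as the sketch below. The function signature, the parameter names, and the reading of "five to twenty times" as the total number of copies in the output are assumptions, not the shipped code.

```python
import random


def oversize(
    text: str, rng: random.Random, min_repeat: int = 5, max_repeat: int = 20
) -> str:
    """Repeat the input between min_repeat and max_repeat times (inclusive).

    The repeat count comes from the caller-supplied RNG, so the same
    seed always yields the same output length.
    """
    repeats = rng.randint(min_repeat, max_repeat)
    return text * repeats
```

Passing the RNG in, rather than reaching for the module-level random functions, is what keeps an operator like this compatible with a same-seed, same-bytes contract.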
2. control_chars
Injects null bytes, BOM, CR/LF, tab, and right-to-left override characters at a random offset. The target classes are policy filters that strip on a regex, JSON parsers that accept