A2A Signed-Message Replay: Why a Valid Signature Is Not a Safe Message

In August 2025, security researcher Johann Rehberger ran what he called the Month of AI Bugs: one disclosure a day, every day, against shipping AI systems. Buried in that month was a class of finding that did not get the headline treatment but quietly worried the people who run multi-agent platforms in production — replayed tool calls between agents. The original call was issued by a legitimate supervisor, signed correctly, and delivered to a peer that verified the signature. The attacker simply captured the wire payload and resent the same bytes. Every check passed. The downstream side effect fired twice.

This is the disquieting property of cryptographic identity in agent-to-agent (A2A) protocols. A signature answers a narrow question — did the holder of this key produce this exact byte sequence? It does not answer freshness, ordering, or single-use semantics. The distance between "signed" and "safe" is the entire attack surface this post is about, and it is the surface that OWASP's Agentic Top 10 and MITRE's ATLAS catalogue both spent 2025 and 2026 formalising.

What A2A v0.3 actually signs

The Agent2Agent (A2A) protocol, promoted to a Linux Foundation project during 2025, gives multi-agent meshes a common envelope for inter-agent calls — an Agent Card schema, a JSON-RPC over HTTP transport, and an optional JWS-based signature for the Agent Card itself. Implementations that opt in get a credible answer to the discovery-time question "is this peer who it claims to be?" The card is signed, the key chains back to an issuer, the verifier can decide whether to trust the chain.

What the v0.3 specification deliberately leaves to the application layer is the rest of the message-safety story: nonce uniqueness, signed timestamp freshness, scope binding to a specific task, and idempotency keys for side-effecting intents. These are not protocol bugs — they are scope decisions. A transport that mandated all of these would not fit the variety of meshes the working group is targeting. But scope decisions in a security-relevant protocol have a way of becoming the things adversaries operate inside. The default behaviour of most A2A adapters we have read through 2026 is to verify the Agent Card on discovery and then accept any well-formed envelope from that peer for the lifetime of the session. That is the gap.

The replay primitive and the idempotency illusion

A replay attack against an A2A message needs three preconditions and no cryptographic capability whatsoever. First, visibility on the message channel — a sidecar, a logging proxy, a misconfigured OpenTelemetry span exporter, or any of the indirect-injection channels Lakera Guard now catalogues as the "hidden threat" surface of modern AI systems (Lakera, 2025). Second, a receiving agent that does not enforce single-use semantics on inbound envelopes. Third, a downstream side effect — a payment, a ticket transition, a database write, an email — gated only by the inbound A2A identity check.
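The three preconditions can be seen in a few lines. The sketch below is illustrative only: a shared-secret HMAC stands in for the JWS signature, and `KEY`, `sign`, `naive_receive`, and `ledger` are hypothetical names, not part of any A2A SDK. The receiver checks the signature and nothing else, so resending the captured bytes fires the side effect a second time.

```python
import hashlib
import hmac
import json

# Assumption: an HMAC over the body stands in for a JWS signature.
KEY = b"demo-key-not-for-production"

def sign(payload: dict) -> tuple[bytes, str]:
    """Serialise the payload and return (wire_bytes, hex_signature)."""
    body = json.dumps(payload, sort_keys=True).encode()
    return body, hmac.new(KEY, body, hashlib.sha256).hexdigest()

ledger: list[dict] = []  # hypothetical downstream side effect

def naive_receive(body: bytes, sig: str) -> bool:
    """Verify the signature only, then perform the side effect."""
    expected = hmac.new(KEY, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, sig):
        return False
    ledger.append(json.loads(body))
    return True

body, sig = sign({"intent": "settle_invoice", "invoice": "INV-1", "amount": 100})
first = naive_receive(body, sig)   # legitimate delivery
second = naive_receive(body, sig)  # attacker resends the captured bytes
```

Both calls return `True` and `ledger` ends up with two entries, even though no key was compromised and no byte was altered.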

Teams hear "replay" and reach for the word idempotency. The instinct is understandable; the framing is wrong. HTTP idempotency keys exist to make retries safe for the sender. Replay protection exists to make duplicate delivery detectable and rejectable by the receiver. A treasury agent that treats two byte-identical settlement instructions as "the same legitimate request retried" is doing exactly what idempotency promises, and exactly the wrong thing when the second copy came from an attacker: the duplicate is acknowledged as a success instead of being refused and flagged. Nonces, signed timestamps, and per-intent scope claims close the gap because they make the second copy distinguishable from the first to the verifier, not just to the sender.
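One way to make the distinction concrete is to compare a receiver that deduplicates on a sender-chosen idempotency key with one that deduplicates on a nonce inside the signed body. The sketch assumes the idempotency key travels as an unsigned header, which is a common HTTP pattern but an assumption here; HMAC again stands in for JWS, and both receiver classes are hypothetical.

```python
import hashlib
import hmac
import json

KEY = b"demo-key-not-for-production"  # assumption: HMAC stands in for JWS

def sign(body: bytes) -> str:
    return hmac.new(KEY, body, hashlib.sha256).hexdigest()

def valid(body: bytes, sig: str) -> bool:
    return hmac.compare_digest(sign(body), sig)

class IdempotentReceiver:
    """Deduplicates on a key supplied outside the signed body."""
    def __init__(self):
        self.results = {}
        self.effects = 0

    def receive(self, idem_key: str, body: bytes, sig: str) -> str:
        if not valid(body, sig):
            return "rejected"
        if idem_key in self.results:
            return self.results[idem_key]  # retry: cached result, no new effect
        self.effects += 1
        self.results[idem_key] = "settled"
        return "settled"

class NonceReceiver:
    """Deduplicates on a nonce that the signature covers."""
    def __init__(self):
        self.seen = set()
        self.effects = 0

    def receive(self, body: bytes, sig: str) -> str:
        if not valid(body, sig):
            return "rejected"
        nonce = json.loads(body)["nonce"]
        if nonce in self.seen:
            return "rejected"
        self.seen.add(nonce)
        self.effects += 1
        return "settled"

body = json.dumps({"intent": "settle_invoice", "invoice": "INV-1", "nonce": "n-001"}).encode()
sig = sign(body)

idem = IdempotentReceiver()
idem.receive("key-A", body, sig)  # original request
idem.receive("key-A", body, sig)  # honest retry: deduplicated
idem.receive("key-B", body, sig)  # attacker picks a fresh unsigned key

guard = NonceReceiver()
guard.receive(body, sig)           # original request
replay = guard.receive(body, sig)  # same signed nonce: refused
```

The idempotent receiver performs the side effect twice because the attacker controls the deduplication key; the nonce receiver performs it once because changing the nonce would invalidate the signature.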

Where this lives in the taxonomies

Signed-message replay sits at the intersection of two industry frameworks that practitioners are now expected to cite by name in board papers and in audit responses. The OWASP Top 10 for Agentic Applications places it under ASI07 (Insecure Inter-Agent Communication), risk language aimed at AppSec and platform owners. OWASP's broader 2025 LLM Top 10 update is already explicit that techniques marketed as safety features "do not actually solve the core vulnerability" of injection-class attacks; they merely ground the model (OWASP, 2025). The same caveat applies to A2A signing: it grounds identity, it does not secure the message.

MITRE ATLAS v5.1.0, refreshed in November 2025 with fourteen new agent-specific techniques contributed jointly with Zenity Labs, carries the corresponding adversarial framing as AML.T0057 (LLM Trusted Output Replay) (MITRE ATLAS). The ATLAS entry treats it as a known adversarial technique that reuses previously-authorised LLM or agent output to drive downstream actions. The Cloud Security Alliance's Agentic AI Red Teaming Guide (May 2025) catalogues the same defect class under its multi-agent exploitation category, alongside agent untraceability and confused-deputy delegation — the three failure modes that recur in CSA's 62-page test manual on inter-agent trust. One control surface, three reference taxonomies. Carry all three tags on the finding and the risk team, the detection team, and the auditor can each filter on the one they care about.

Defensive controls a verifier must add

Four controls, applied together, close the replay class on top of A2A's identity layer. None of them require changing the protocol; all of them require the receiving agent to do work the protocol does not mandate.

—Per-message nonce inside the signed payload. Persist a seen-set keyed by sender DID with a TTL aligned to the freshness window. Reject any envelope whose nonce has been observed. This single control defeats the cheapest attack, byte-for-byte replay.
—Signed iat/exp claims with a window measured in seconds, not hours. For an internal mesh, 30 seconds is generous. Reject envelopes with an expired exp or an iat too far in the future for plausible clock skew.
—Scope binding inside the JWS payload. The signature must cover the specific intent, the resource identifier, and any amount or quantity field. A signature good for one settle_invoice is then not good for another, even between the same agents.
—Mutual TLS at the message bus with certificate pinning to the Agent Card's declared key. This raises the cost of on-the-wire capture from "configure a logger" to "compromise a pinned private key". It does not remove the need for the first three controls, and it does nothing about a logger or sidecar that sits behind TLS termination, but it removes the easiest precondition for replay.
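The first three controls can be sketched as a single verifier. This is a minimal illustration under stated assumptions, not a reference implementation: HMAC stands in for JWS, the claim names `iss`, `nonce`, `iat`, `exp`, and `scope` are chosen for the example, `MAX_SKEW` is an assumed tolerance, and the seen-set is an in-memory dict where a real deployment would need shared, persistent storage.

```python
import hashlib
import hmac
import json
import time

KEY = b"demo-key-not-for-production"  # assumption: HMAC stands in for JWS
MAX_WINDOW = 30  # seconds between iat and exp, per the freshness control
MAX_SKEW = 5     # assumed tolerated clock skew, in seconds

def sign(claims: dict) -> tuple[bytes, str]:
    body = json.dumps(claims, sort_keys=True).encode()
    return body, hmac.new(KEY, body, hashlib.sha256).hexdigest()

class Verifier:
    def __init__(self):
        self._seen = {}  # (sender, nonce) -> time after which the entry can be dropped

    def verify(self, body: bytes, sig: str, expected_scope: dict, now=None) -> str:
        now = time.time() if now is None else now
        expected = hmac.new(KEY, body, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, sig):
            return "bad-signature"
        claims = json.loads(body)
        try:
            iss, nonce, iat, exp, scope = (
                claims[k] for k in ("iss", "nonce", "iat", "exp", "scope")
            )
        except KeyError:
            return "malformed"
        # Freshness: short window, bounded skew.
        if exp <= now:
            return "expired"
        if iat > now + MAX_SKEW:
            return "issued-in-future"
        if exp - iat > MAX_WINDOW:
            return "window-too-long"
        # Scope binding: the signed intent must match what the receiver is doing.
        if scope != expected_scope:
            return "scope-mismatch"
        # Single use: purge stale entries, then check and record the nonce.
        self._seen = {k: t for k, t in self._seen.items() if t > now}
        if (iss, nonce) in self._seen:
            return "replayed"
        self._seen[(iss, nonce)] = exp + MAX_SKEW
        return "accepted"

now = 1_000_000
scope = {"intent": "settle_invoice", "resource": "INV-1", "amount": 100}
body, sig = sign({"iss": "did:example:supervisor", "nonce": "n-1",
                  "iat": now, "exp": now + 30, "scope": scope})
v = Verifier()
results = [
    v.verify(body, sig, scope, now=now),                       # first delivery
    v.verify(body, sig, scope, now=now + 1),                   # byte replay
    v.verify(body, sig, dict(scope, resource="INV-2"), now=now + 1),  # other invoice
    v.verify(body, sig, scope, now=now + 31),                  # after the window
]
```

Keeping each nonce only until its `exp` plus skew is what lets the seen-set stay small: once the entry is purged, the same envelope is rejected as expired instead.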

Two additional controls show up in mature deployments. Capability tokens scoped per-task move authority off the channel and onto a short-lived bearer that the receiver redeems exactly once — concretely closing the confused-deputy variant where a captured envelope is replayed against a different downstream tool. Append-only, tamper-evident audit logs of every accepted envelope give the detection team a deterministic after-the-fact signal that the seen-set was bypassed or that two distinct-but-equivalent envelopes were accepted. The CSA Agentic AI Red Teaming Guide tests for both of these explicitly under its agent untraceability category — the testable proposition is that a replayed envelope leaves a distinguishable trace in the log even when it was wrongly accepted by the verifier.
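A tamper-evident log of accepted envelopes can be approximated with a hash chain. The sketch below is one possible shape, with hypothetical class and field names; it stores records in memory and chains SHA-256 digests, whereas a production log would also need durable, access-controlled storage.

```python
import hashlib
import json
from collections import Counter

class AuditLog:
    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(record: dict) -> str:
        # Hash every field except the stored hash itself.
        material = {k: v for k, v in record.items() if k != "hash"}
        return hashlib.sha256(json.dumps(material, sort_keys=True).encode()).hexdigest()

    def append(self, envelope: bytes, decision: str) -> dict:
        prev = self._entries[-1]["hash"] if self._entries else self.GENESIS
        record = {
            "seq": len(self._entries),
            "envelope_sha256": hashlib.sha256(envelope).hexdigest(),
            "decision": decision,
            "prev": prev,  # links this record to the one before it
        }
        record["hash"] = self._digest(record)
        self._entries.append(record)
        return record

    def verify_chain(self) -> bool:
        """Return False if any record was altered, removed, or reordered."""
        prev = self.GENESIS
        for record in self._entries:
            if record["prev"] != prev or record["hash"] != self._digest(record):
                return False
            prev = record["hash"]
        return True

    def duplicates(self) -> list[str]:
        """Envelope digests that were accepted more than once."""
        counts = Counter(
            r["envelope_sha256"] for r in self._entries if r["decision"] == "accepted"
        )
        return [digest for digest, n in counts.items() if n > 1]

log = AuditLog()
env = b'{"intent":"settle_invoice","invoice":"INV-1"}'
a = log.append(env, "accepted")
b = log.append(env, "accepted")  # the wrongly accepted replay
```

The two records share an envelope digest but differ in `seq`, `prev`, and `hash`, so the replay leaves its own trace, and `duplicates()` gives the detection team a deterministic query for it.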

Defence in depth, in order

Nonce, signed timestamp, scope binding, mTLS, capability tokens, audit log. Each control can fail independently of the others; together they raise the cost of the attack from a one-line curl to a cryptographic compromise of the issuing key. That layering is what closes the gap between "signed" and "safe."

How the CSA guide tests this category

The Cloud Security Alliance's Agentic AI Red Teaming Guide is the most operational document we have on what a real test of this defect class looks like. It is not a policy paper. It organises agent risk into twelve categories, gives concrete test recipes for each, and expects the tester to produce reproducible evidence. For signed-message replay specifically, the guide walks through three test patterns. The first is the byte-replay case described above — capture, resend, observe duplicate side effects. The second is the mutation case — change a field the signature does not cover and observe whether the verifier still accepts. The third, the one most production teams fail, is the cross-context case — replay an envelope captured in one transaction against an unrelated transaction on the same agent pair. That last test exercises the scope-binding control specifically, and it is the one that shows up cleanly in CI because it produces a deterministic failure signal.
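The three patterns translate into small probes that each return `True` when the receiver is vulnerable. This is a sketch and not the guide's own tooling: the envelope layout, `WeakReceiver`, and the probe names are hypothetical, HMAC stands in for JWS, and dict envelopes stand in for captured wire bytes. The receiver here is deliberately weak so that every probe reports a finding.

```python
import hashlib
import hmac
import json

KEY = b"demo-key-not-for-production"  # assumption: HMAC stands in for JWS

def sign(signed: dict) -> str:
    body = json.dumps(signed, sort_keys=True).encode()
    return hmac.new(KEY, body, hashlib.sha256).hexdigest()

def make_envelope(task_id: str, amount: int) -> dict:
    signed = {"iss": "did:example:supervisor", "intent": "settle_invoice"}
    # Deliberate weakness: task_id and amount travel outside the signature.
    return {"signed": signed, "sig": sign(signed), "task_id": task_id, "amount": amount}

class WeakReceiver:
    """Hypothetical system under test: checks the signature and nothing else."""
    def __init__(self):
        self.effects = []

    def receive(self, env: dict, current_task: str) -> bool:
        if not hmac.compare_digest(sign(env["signed"]), env["sig"]):
            return False
        self.effects.append((current_task, env["amount"]))
        return True

def probe_byte_replay(factory, env) -> bool:
    receiver = factory()
    receiver.receive(env, env["task_id"])
    return receiver.receive(env, env["task_id"])  # accepted twice is a finding

def probe_mutation(factory, env) -> bool:
    receiver = factory()
    mutated = dict(env, amount=env["amount"] * 100)  # change an unsigned field
    return receiver.receive(mutated, env["task_id"])

def probe_cross_context(factory, env) -> bool:
    receiver = factory()
    return receiver.receive(env, "task-unrelated")  # replay into another task

env = make_envelope("task-42", 100)
findings = {
    "byte_replay": probe_byte_replay(WeakReceiver, env),
    "mutation": probe_mutation(WeakReceiver, env),
    "cross_context": probe_cross_context(WeakReceiver, env),
}
```

Because each probe builds a fresh receiver from `factory`, the results do not depend on run order, and a CI job can fail the build whenever any value in `findings` is `True`.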

Microsoft's AI Red Team made a similar point in its January 2025 paper Lessons from Red-Teaming 100 Generative AI Products: red-teaming an agent is a system-level practice, not a model-level benchmark, and the most productive findings tend to be the boring mechanical ones — protocol-layer defects, missing freshness checks, over-broad tool scopes — rather than novel jailbreaks. Replay testing sits exactly in that category. It is mechanical, it is automatable, and it produces a clear pass/fail signal per envelope schema.

A checklist for the security architect

Architects reviewing an A2A deployment can get most of the way to a credible answer with a short checklist drawn from the controls above. None of these questions require reading the protocol specification; all of them produce a yes-or-no answer that is auditable.

—Does every inbound envelope carry a nonce inside the signed payload, and does the verifier persist a seen-set keyed by sender DID?
—Are iat and exp signed claims with a freshness window measured in seconds, and is clock skew bounded explicitly?
—Does the JWS payload bind the signature to the specific intent and the resource identifiers — not just to the agent pair?
—Is the message bus mTLS with certificate pinning to the Agent Card key, or is a logger or sidecar a plausible capture vector?
—Does the audit log produce a distinguishable trace for two byte-identical envelopes, and is that log append-only and tamper-evident?

A deployment that answers "yes" to all five has closed the cheap variants of this attack. A deployment that answers "no" on any of the first three has a critical-severity finding waiting to be written up. That is language a board paper can take: the mesh accepts identity-verified but indefinitely replayable instructions on critical side-effecting paths. It is also language an internal audit team can map to ISO/IEC 42001 clause 8.1 (operational planning and control) and NIST AI RMF's MANAGE 2.3 (incident response readiness for AI system risks) without further translation.

Identity check is not a freshness check is not a scope check. The A2A protocol answers the first; verifiers that conflate the three accept replays for as long as the signing key lives.

Operationalising replay testing in CI

The practical takeaway is short. Replay testing belongs in continuous integration, not in an annual pen-test. The test inputs are stable — the protocol envelope schema does not change between runs — and the failure signal is deterministic, which makes the test a natural fit for a CI gate. Teams that have moved this into CI report two second-order benefits: protocol regressions get caught before they ship, and the security finding becomes a build-system finding, which tends to get fixed inside a sprint rather than inside a quarter.

For teams operationalising this surface today, the open-source AgentGuardian CLI ships a probe family that exercises the three CSA test patterns above against an A2A mesh and produces a deterministic pass/fail with an AIVSS-weighted score; the enterprise platform adds continuous scanning, a signed evidence pack, and a runtime policy decision point that can refuse to forward a replayed envelope at the gateway. Both are tools, not arguments. The argument — that a valid signature is not a safe message — is the part worth carrying out of this post regardless of what you run it with.
