Blogs

Industry research on agent security, red teaming, and AI governance.

Analysis of the threat landscape facing autonomous AI agents — frameworks, real incidents, and the controls enterprises are putting in place to deploy agents responsibly in 2026.

Research from the frontline.

Industry analysis on the agent security landscape — threat modelling, regulatory frameworks, real-world incidents, and the controls enterprises are putting in place. Written for the engineers, security leads, and board members who will live with the answers.

Security

OWASP Agentic Top 10 (2026): A Risk-by-Risk Walkthrough

The 2026 standard reorders agent risk. Here is what changed, what each ASI category actually looks like in production, and how to score each one.

Read note
Security

MITRE ATLAS for LLM Agents: Mapping Tactic by Tactic

We took ATLAS v5.4.0 and turned every technique into an executable probe against ReAct, LangGraph, and MCP agents. The translations were not always obvious.

Read note
Technical

AIVSS Explained: Scoring Agent Vulnerabilities Beyond CVSS

CVSS was built for software. Agents have autonomy, tools, and memory. Here is the scoring math we use, with one finding scored end to end.

Read note
Technical

Inside the Swarm: How 14 Specialists Red-Team an Agent

One coordinator, ten ASI specialists, four OWASP-LLM specialists, a shared vector memory. A look at the architecture we landed on after a year of iteration.

Read note
Technical

T1-T4: A Tier Model for Agent Attack Surface

Tools plus memory plus PII is not the same risk as a stateless agent. A four-tier model that changes how we prioritise red-team work.

Read note
Technical

Building a Deterministic Mutator Engine for Agent Fuzzing

Five mutators, a seeded RNG, and a 96-probe corpus turn into coverage-guided fuzzing that you can replay byte-for-byte three months later.

Read note
Security

MCP Tool-Description Rug Pull: Anatomy and Detection

An MCP server changes its tool descriptions after the user approves it. The agent never notices. Here is what the attack looks like on the wire.

Read note
Security

A2A Signed-Message Replay: When Signed Doesn't Mean Safe

Agent Card signing in A2A v0.3 is opt-in. We replayed a signed envelope twelve hours later, and the receiving agent accepted it. A hardening recipe.

Read note
Security

ReAct Hijack Anatomy: Foot-in-the-Door on Agent Loops

ReAct loops do not self-correct. One foot-in-the-door turn compounds across iterations until the agent is doing something its operator never asked for.

Read note
Security

Supply-Chain Attacks on Agents: Rogue Plugins and MCP

Eight probes against published plugins and MCP servers. Registry spoofing, plugin hijack, poisoned checkpoints — and what the audit trail looks like.

Read note
Security

Memory Poisoning of Long-Lived Agents: A Test Methodology

Agents that remember are agents you can poison. Persistent triggers, RAG corpus injection, and cross-tenant vector bleed, plus a probe pattern that catches them.

Read note
Security

Goal Drift and Reward Hacking: Probes That Find Them

Long-horizon agents quietly optimise for the wrong objective. The ASI10 probes that surface drift, mode shift, and capability masking before production does.

Read note
Security

Judge Injection: Poisoning the LLM-as-Judge Layer

When the evaluator becomes the target, every metric downstream lies. A walkthrough of judge injection, detection signals, and defence patterns that hold up.

Read note
Security

Vision Injection: Adversarial Images That Override Text

Hidden text in scanned forms, adversarial pixels in claims photos, ASCII art in uploads. Text-only defences miss all of it. Here is what slips through.

Read note
Technical

Installing AgentGuardian from PyPI: A 5-Minute Walkthrough

pip install agent-guardian, point it at a LangGraph or REST agent, read the report. The fastest possible path from zero to an AIVSS posture number.

Read note
Technical

Writing a Custom AgentGuardian Probe: A Tutorial

Domain-specific agents need domain-specific probes. We walk through writing one, registering it under the right ASI category, and emitting a scored finding.

Read note
Technical

Gating Pull Requests on AgentGuardian Findings

fail-under exit codes, SARIF 2.1.0 upload, and PR comments with AIVSS deltas. The wire-up for GitHub Actions, GitLab, and Jenkins — copy-pasteable.

Read note
Technical

From PyRIT and Garak to Swarms: A Capability Map

The open-source AI red-team landscape evolved from single-target orchestrators to adversarial swarms. A probe-coverage map across the tools we tested.

Read note
Business

Compliance Crosswalk: NIST AI RMF, ISO 42001, OWASP ASI 2026

A practical crosswalk from technical findings to NIST AI RMF MANAGE/MEASURE, ISO/IEC 42001 controls, and the 2026 OWASP categories. Includes a table.

Read note
Business

From CVSS to AIVSS: A CFO Explainer for AI Risk

Finance leaders are about to be asked to sign off on a new scoring system. Here is what an AIVSS number tells the board, in language the CFO can use.

Read note
Business

What Board Members Should Ask About Agent Risk

Five questions that separate organisations governing agents from organisations hoping. A board-ready framework, drawn from NIS2, SR 11-7, and OWASP.

Read note
Business

Build, Buy, or OSS+SaaS: An Agent Red-Team Decision Framework

A practical decision tree for the agent red-team capability question — in-house, paid vendor, or the OSS+SaaS dual-track model that is winning in 2026.

Read note
Business

Procurement Questions for Agent Red-Team Vendors in 2026

Ten questions to drop into your RFP, drawn from OWASP ASI 2026, MITRE ATLAS, AIVSS, and what changed in the landscape after Promptfoo was acquired.

Read note

Want the same scans on your agent?

Run AgentGuardian locally with one pip install, or talk to us about the hosted dashboard.