Blogs

Industry research on agent security, red teaming, and AI governance.

Analysis of the threat landscape facing autonomous AI agents — frameworks, real incidents, and the controls enterprises are putting in place to deploy agents responsibly in 2026.

Research from the frontline.

Industry analysis on the agent security landscape — threat modelling, regulatory frameworks, real-world incidents, and the controls enterprises are putting in place. Written for the engineers, security leads, and board members who will live with the answers.

Security | Jun 2, 2026

OWASP Agentic Top 10 (2026): A Risk-by-Risk Walkthrough

The 2026 standard reorders agent risk. Here is what changed, what each ASI category actually looks like in production, and how to score it.

Security | Jun 2, 2026

MITRE ATLAS for LLM Agents: Mapping Tactic by Tactic

We took ATLAS v5.4.0 and turned every technique into an executable probe against ReAct, LangGraph, and MCP agents. The translations were not always obvious.

Technical | Jun 2, 2026

AIVSS Explained: Scoring Agent Vulnerabilities Beyond CVSS

CVSS was built for software. Agents have autonomy, tools, and memory. Here is the scoring math we use, with one finding scored end to end.

Technical | Jun 2, 2026

Inside the Swarm: How 14 Specialists Red-Team an Agent

One coordinator, ten ASI specialists, four OWASP-LLM specialists, a shared vector memory. A look at the architecture we landed on after a year of iteration.

Technical | Jun 2, 2026

T1-T4: A Tier Model for Agent Attack Surface

An agent with tools, memory, and PII access does not carry the same risk as a stateless agent. A four-tier model that changes how we prioritise red-team work.

Technical | Jun 2, 2026

Building a Deterministic Mutator Engine for Agent Fuzzing

Five mutators, a seeded RNG, and a 96-probe corpus turn into coverage-guided fuzzing that you can replay byte-for-byte three months later.
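As a rough illustration of the seeded-RNG idea in this teaser, the sketch below shows how a fixed seed makes a mutation chain replayable. It is not AgentGuardian's engine: the three mutators, the `mutate` function, and its `rounds` parameter are hypothetical names chosen for the example, and byte-for-byte replay also assumes the same interpreter version is pinned.

```python
import random
from typing import Callable, List

# A mutator takes a probe string and a seeded RNG and returns a variant.
Mutator = Callable[[str, random.Random], str]


def swap_case(text: str, rng: random.Random) -> str:
    """Flip the case of one randomly chosen character."""
    if not text:
        return text
    i = rng.randrange(len(text))
    return text[:i] + text[i].swapcase() + text[i + 1:]


def duplicate_word(text: str, rng: random.Random) -> str:
    """Repeat one randomly chosen space-separated word."""
    words = text.split(" ")
    i = rng.randrange(len(words))
    return " ".join(words[: i + 1] + words[i:])


def insert_zero_width(text: str, rng: random.Random) -> str:
    """Insert a zero-width space at a random position."""
    i = rng.randrange(len(text) + 1)
    return text[:i] + "\u200b" + text[i:]


# Hypothetical mutator set; the article describes five, which are not listed here.
MUTATORS: List[Mutator] = [swap_case, duplicate_word, insert_zero_width]


def mutate(probe: str, seed: int, rounds: int = 3) -> str:
    """Apply `rounds` mutations chosen by an RNG seeded with `seed`."""
    rng = random.Random(seed)  # private RNG, so global random state is untouched
    for _ in range(rounds):
        probe = rng.choice(MUTATORS)(probe, rng)
    return probe
```

Storing the pair (probe, seed) alongside each finding is enough to regenerate the exact mutated input later, which is the property that makes a fuzzing run replayable.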

Security | Jun 2, 2026

MCP Tool-Description Rug Pull: Anatomy and Detection

An MCP server changes its tool descriptions after the user approves it. The agent never notices. Here is what the attack looks like on the wire.
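One common way to detect this kind of change is to pin a hash of each tool definition at approval time and compare it on every later listing. The sketch below assumes tools arrive as plain dictionaries with at least a `name` key; `pin_tools` and `drifted_tools` are hypothetical helper names, not part of any MCP SDK or of AgentGuardian.

```python
import hashlib
import json
from typing import Dict, List


def tool_fingerprint(tool: Dict) -> str:
    """Hash a canonical JSON encoding of the full tool definition."""
    canonical = json.dumps(tool, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def pin_tools(tools: List[Dict]) -> Dict[str, str]:
    """Record fingerprints at the moment the user approves the server."""
    return {tool["name"]: tool_fingerprint(tool) for tool in tools}


def drifted_tools(pinned: Dict[str, str], current: List[Dict]) -> List[str]:
    """Return names of tools that are new or whose definition changed."""
    now = pin_tools(current)
    return sorted(name for name in now if pinned.get(name) != now[name])
```

A client could refuse to call any tool returned by `drifted_tools` until the user re-approves it. This sketch does not flag tools that were removed after approval.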

Security | Jun 2, 2026

A2A Signed-Message Replay: When Signed Doesn't Mean Safe

Agent Card signing in A2A v0.3 is opt-in. We replayed a signed envelope twelve hours later, and the receiving agent accepted it. A hardening recipe.
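A valid signature proves who sent a message, not when it was sent. A generic way to close that gap is to sign a nonce and an issue time along with the body, then reject stale or repeated messages. The sketch below uses HMAC for brevity and is not the A2A wire format; `ReplayGuard`, the five-minute window, and the field layout are assumptions made for illustration.

```python
import hashlib
import hmac
import time
from typing import Dict, Optional


class ReplayGuard:
    """Verify a signature, then enforce freshness and one-time nonces."""

    def __init__(self, key: bytes, max_age: float = 300.0) -> None:
        self._key = key
        self._max_age = max_age            # assumed acceptance window, in seconds
        self._seen: Dict[str, float] = {}  # nonce -> issue time

    def sign(self, body: bytes, nonce: str, issued_at: float) -> str:
        """Sign the nonce and issue time together with the body."""
        message = f"{nonce}|{issued_at}|".encode("utf-8") + body
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def accept(self, body: bytes, nonce: str, issued_at: float,
               signature: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        expected = self.sign(body, nonce, issued_at)
        if not hmac.compare_digest(expected, signature):
            return False  # tampered or wrongly signed
        if abs(now - issued_at) > self._max_age:
            return False  # too old (or too far in the future)
        # Drop nonces that the freshness check would reject anyway.
        self._seen = {n: t for n, t in self._seen.items()
                      if now - t <= self._max_age}
        if nonce in self._seen:
            return False  # replay inside the acceptance window
        self._seen[nonce] = issued_at
        return True
```

With this check, an envelope replayed twelve hours later fails the freshness test, and one replayed within the window fails the nonce test. A production system would keep the nonce cache in shared storage so that every receiving instance sees it.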

Security | Jun 2, 2026

ReAct Hijack Anatomy: Foot-in-the-Door on Agent Loops

ReAct loops do not self-correct. One foot-in-the-door turn compounds across iterations until the agent is doing something its operator never asked for.

Security | Jun 2, 2026

Supply-Chain Attacks on Agents: Rogue Plugins and MCP

Eight probes against published plugins and MCP servers. Registry spoofing, plugin hijack, poisoned checkpoints — and what the audit trail looks like.

Security | Jun 2, 2026

Memory Poisoning of Long-Lived Agents: A Test Methodology

Agents that remember are agents you can poison. Persistent triggers, RAG corpus injection, and cross-tenant vector bleed, plus a probe pattern that catches them.

Security | Jun 2, 2026

Goal Drift and Reward Hacking: Probes That Find Them

Long-horizon agents quietly optimise for the wrong objective. The ASI10 probes that surface drift, mode shift, and capability masking before production does.

Security | Jun 2, 2026

Judge Injection: Poisoning the LLM-as-Judge Layer

When the evaluator becomes the target, every metric downstream lies. A walkthrough of judge injection, detection signals, and defence patterns that hold up.

Security | Jun 2, 2026

Vision Injection: Adversarial Images That Override Text

Hidden text in scanned forms, adversarial pixels in claims photos, ASCII art in uploads. Text-only defences miss all of it. Here is what slips through.

Technical | Jun 2, 2026

Installing AgentGuardian from PyPI: A 5-Minute Walkthrough

pip install agent-guardian, point it at a LangGraph or REST agent, read the report. The fastest possible path from zero to an AIVSS posture number.

Technical | Jun 2, 2026

Writing a Custom AgentGuardian Probe: A Tutorial

Domain-specific agents need domain-specific probes. We walk through writing one, registering it under the right ASI category, and emitting a scored finding.

Technical | Jun 2, 2026

Gating Pull Requests on AgentGuardian Findings

fail-under exit codes, SARIF 2.1.0 upload, and PR comments with AIVSS deltas. The wire-up for GitHub Actions, GitLab, and Jenkins — copy-pasteable.

Technical | Jun 2, 2026

From PyRIT and Garak to Swarms: A Capability Map

The open-source AI red-team landscape evolved from single-target orchestrators to adversarial swarms. A probe-coverage map across the tools we tested.

Business | Jun 2, 2026

Compliance Crosswalk: NIST AI RMF, ISO 42001, OWASP ASI 2026

A practical crosswalk from technical findings to NIST AI RMF MANAGE/MEASURE, ISO/IEC 42001 controls, and the 2026 OWASP categories. Includes a mapping table.

Business | Jun 2, 2026

From CVSS to AIVSS: A CFO Explainer for AI Risk

Finance leaders are about to be asked to sign off on a new scoring system. Here is what an AIVSS number tells the board, in language the CFO can use.

Business | Jun 2, 2026

What Board Members Should Ask About Agent Risk

Five questions that separate organisations governing agents from organisations hoping. A board-ready framework, drawn from NIS2, SR 11-7, and OWASP.

Business | Jun 2, 2026

Build, Buy, or OSS+SaaS: An Agent Red-Team Decision Framework

A practical decision tree for the agent red-team capability question — in-house, paid vendor, or the OSS+SaaS dual-track model that is winning in 2026.

Business | Jun 2, 2026

Procurement Questions for Agent Red-Team Vendors in 2026

Ten questions to drop into your RFP, drawn from OWASP ASI 2026, MITRE ATLAS, AIVSS, and what changed in the landscape after Promptfoo was acquired.
