Adversarial red-teaming for the agents you ship.
One pip install. A 14-specialist swarm fires 96 probes against your agent and writes a SARIF report your CI can fail on. Apache-2.0, deterministic by default, no LLM key required.
pip install agent-guardianv1.0.0Install from PyPI.
AgentGuardian ships as a single PyPI package with a deterministic offline mode. No LLM key, no outbound network, no signup. Currently version 1.0.0 — formerly tracked on PyPI as rc2.
$ pip install agent-guardian
$ agent-guardian --version
agent-guardian 1.0.0
$ agent-guardian probes list
96 probes across asi01..asi10--version and probes list verify the install and the probe corpus before the first scan.
Stub mode runs the entire swarm against deterministic fixtures with no LLM key and no network. Output is byte-stable, hash-comparable, and safe for CI runners without secrets.
Switch to live mode by exporting an ANTHROPIC_API_KEY or OPENAI_API_KEY — the same CLI, the same probes, real model judgements on the same fixtures.
First scan in five minutes.
Point the CLI at a local HTTP endpoint, read the five report artefacts, and you have an AIVSS posture number with per-finding severity rolled up by tier weights.
# 1. install
$ pip install agent-guardian
# 2. point at your agent — any HTTP endpoint, MCP server, or framework adapter
$ agent-guardian scan --target http://localhost:8080 --mode fast
# 3. read the five artefacts
$ ls reports/
report.html report.pdf report.sarif report.json evidence/
# 4. gate CI on the AIVSS posture number
$ agent-guardian scan --target http://localhost:8080 --fail-under 70Run against your endpoint.
--target http://localhost:8080 --mode fast walks every probe in the asi01..asi10 corpus, with target-fingerprinting recon to pick relevant attack vectors.
Read the five artefacts.
report.html, report.pdf, report.sarif, report.json, and an evidence/ directory with per-probe transcripts.
Read the posture number.
AIVSS is a deterministic 0–100 score. Per-finding severity rolls up via tier weights — T1 controls dominate, T4 nits are floor noise. The same target on the same probes always returns the same number.
The repo and the 96-probe corpus.
The probe corpus, the mutator engine, and the specialist swarm all live in the OSS tree. Nothing is hidden behind a binary — every probe trigger, mutator, and judge predicate is a Python module you can read and fork.
src/agent_guardian/probes/Ten directories asi01..asi10, 96 probes total. Manifest at probes/_meta/version.yaml.
src/agent_guardian/strategies/fuzz.pyFive deterministic mutators: oversize, control_chars, truncate, type_confusion, encoding.
src/agent_guardian/agents/Coordinator, ten ASI specialists, four OWASP-LLM specialists, an evaluator, and a shared VectorMemory for cross-probe context.
src/agent_guardian/scoring/aivss.pyThe full AIVSS formula in one file — vector parsing, severity bucketing, tier-weighted rollup, posture compositing.
src/agent_guardian/adapters/First-class adapters for langgraph, crewai, openai-agents, autogen, adk, strands, and plain http.
src/agent_guardian/judges/Per-probe success predicates: regex, structured-JSON, semantic similarity, tool-call observation, plus a JDG meta-judge for cross-probe consistency.
Every framework you actually use.
Adapters normalise wildly different agent runtimes into one probe surface. Write a probe once, run it against LangGraph, CrewAI, an MCP server, or a raw HTTP endpoint without rewriting the trigger.
MCP servers and RAG applications are covered via the http adapter with target-fingerprinting recon. The custom adapter SDK requires one Python class, three methods, and a fixture for stub-mode determinism.
CI gating with --fail-under.
Wire AgentGuardian into a pull-request check the same way you wire unit tests. One threshold flag, one exit code, one SARIF upload — and the entire scan stays hermetic.
# .github/workflows/agent-guardian.yml
name: agent-guardian
on: [pull_request]
jobs:
scan:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- run: pip install agent-guardian
- run: agent-guardian scan \
--target http://localhost:8080 \
--fail-under 70
- uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: reports/report.sarif--fail-under 70 exits non-zero when the AIVSS posture drops below the threshold. Promote the threshold over time as findings get fixed.
Upload to GitHub Code Scanning, GitLab vulnerability reports, or Sonar. Every finding carries the ASI category, MITRE ATLAS technique ID, and AIVSS sub-score.
Stub mode is the CI default — deterministic fixtures, no LLM key, no outbound network. CI runs are byte-stable across machines and across re-runs.
Three taxonomies on every finding.
Every AgentGuardian finding lands with the OWASP ASI category, the MITRE ATLAS technique ID, the CSA Agentic AI Red Teaming Guide bucket, and an AIVSS sub-score — not one of them, all of them.
Top 10 for Agentic Applications — the primary taxonomy. Every probe is filed under one ASI category.
Technique ID on every finding. Tactic groupings (reconnaissance, initial access, persistence, exfiltration) align with classic adversary modelling.
CSA Agentic AI Red Teaming Guide category on every finding — the bridge between probe corpus and cloud-security review workflows.
Deterministic posture score. Formula in src/agent_guardian/scoring/aivss.py — read it, fork it, audit it.
Probes, adapters, and docs.
Apache-2.0 with patent grant. No CLA — DCO sign-off only: git commit -s. The contribution flow is documented, the issue tracker tags first-timers, and design discussion happens on Discord.
Contribute a probe.
Choose the ASI category. Write the trigger and the success predicate. Add a stub fixture so the probe is deterministic in CI. Open a PR with DCO sign-off.
See CONTRIBUTING.md · probes/_meta/version.yaml
Contribute an adapter.
One Python class. Three methods: connect(), send(), fingerprint(). One stub fixture for the determinism contract.
Good-first-issue tag · adapter coverage track
Contribute docs.
Quickstarts, adapter walkthroughs, mutator hardening notes, regulator-mapping explainers — all live under docs/. RELEASING.md documents the cut, sign, and publish flow.
CONTRIBUTING.md · RELEASING.md
What is not in OSS, and why Enterprise.
The OSS distribution is the engine — the swarm, the probes, the mutators, the scoring, the reports. Enterprise adds the hosted control plane, the policy engine, the regulator-pack templates, and the advanced mutator catalogue.
AgentGuardian is not a runtime guardrail. It is not a managed SaaS in the OSS distribution. It is not a guarantee of absence of vulnerabilities. It is a deterministic, auditable adversarial-swarm framework that exercises your agent against published taxonomies and writes a posture number you can defend.
Public roadmap — 1.x and 2.0.
The roadmap is on GitHub. Every milestone ties back to a published taxonomy update or a concrete interop target. NIST AI RMF and EU AI Act are explicitly not mapped today — they land in 2.0 with the externalised scoring backend.
Probe corpus to ASI v2.
Expand the corpus as OWASP publishes ASI v2 categories. Add CSA Agentic-RT mappings as the guide updates. Inspect AI interop for shared eval workflows.
Adversarial-image probes.
A probe family for multimodal agents — vision injection, OCR poisoning, steganographic payloads. MITRE ATLAS v5.5 sync as the techniques publish.
Pluggable judge interface.
Pluggable judges with calibration tests. Externalised scoring backends. NIST AI RMF and EU AI Act mapping land here — today they are explicitly not mapped.
Three things only AgentGuardian ships in OSS.
Only OSS tool shipping a multi-agent swarm by default.
Fourteen specialists with a shared VectorMemory across probes. Other OSS scanners ship a single judge — AgentGuardian ships a coordinated adversarial team.
Only OSS entry mapping every finding to all three.
OWASP ASI 2026, MITRE ATLAS v5.4.0, and the CSA Agentic AI Red Teaming Guide — together, on every finding, not one of them at a time.
Only OSS tool publishing the formula with a stub mode.
Hash-stable, signable outputs for the same target on the same probes. Reproducibility is a feature of the scoring layer, not an afterthought.
One pip install. Five report artefacts.
Install AgentGuardian, point it at a local endpoint, read the SARIF, gate CI on the AIVSS posture number. Five minutes from pip install to first finding.
Need hosted dashboards, Cedar 4.5 policy enforcement, KMS-signed evidence, and regulator packs? Explore AgentGuardian Enterprise.