From CVSS to AIVSS: A CFO Explainer for AI Risk

Most finance leaders already have a column in the quarterly risk pack for CVSS. It is the Common Vulnerability Scoring System: a 0–10 number that lets you compare a Windows kernel bug to an OpenSSL bug without being a security engineer. It has been part of board-level risk reporting since CVSS v2 landed in 2007, and CVSS v4.0 was published by FIRST in November 2023. Your CISO uses it; your auditors expect it; your cyber insurer benchmarks against it.

AIVSS — the AI Vulnerability Scoring System, currently tracked under OWASP — is the equivalent number for agentic AI systems. It is new, the methodology is still moving, and most CFOs have not yet seen it on a slide. This explainer is the version that should land on the audit committee deck: what changes, what a single AIVSS score actually tells you, and how to wire it into the same quarterly reporting muscle that already consumes CVSS.

Why CVSS works on a board slide

CVSS is a known quantity for one reason: it normalises wildly different failure modes into a single comparable number. A 9.8 is “drop everything” whether the root cause is a buffer overflow, a misconfigured S3 bucket, or an outdated TLS library. That property — comparability across heterogeneous risk — is what makes it survive the trip from a security ticket to a board paper.

Underneath the score is a fixed model. In CVSS v3.1 the base metrics are attack vector, attack complexity, privileges required, user interaction, scope, and impact on confidentiality, integrity, and availability. The model is closed, public, and cited in NIST SP 800-30 risk guidance. A CFO does not need to know the sub-metrics; the CFO needs to know that the same input always produces the same number, and the number is comparable across vendors.

What changes when an agent can act, not just leak

CVSS was designed for software that processes input and returns output. Its three impact metrics — confidentiality, integrity, availability — assume the worst case is the loss, alteration, or unavailability of data. Agentic AI breaks that assumption in three places that matter to finance:

—Autonomy. An agent decides which tool to invoke. A traditional bug doesn't choose; an agent does, and that choice is the attack surface.
—Tool-use scope. The blast radius of an agent compromise is not the data the agent reads — it is every action the agent's tools can execute, including writes to financial systems, messages to customers, and irreversible API calls.
—Non-determinism. The same input does not always produce the same behaviour. CVSS's attack-complexity metric does not capture probabilistic exploitation: a jailbreak that works 30% of the time is still operationally severe.

AIVSS keeps the CVSS shape — a single number, deterministic from published inputs — but adds factors that capture those three properties. The OWASP AIVSS working draft adds ten agentic risk amplification factors on top of the CVSS base: autonomy, tool scope, multi-agent interactions, non-determinism, self-modification, memory persistence, dynamic identity, oversight reduction, adaptability, and adversarial attack surface. The implementation we ship is a 0–100 posture score with the same property CVSS has: same input, same number, and the formula is published so external auditors can recompute it without trusting the vendor.
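
To make "same input, same number" concrete, here is a minimal Python sketch of a deterministic posture score. The formula (an equal blend of the rescaled CVSS base and the mean of the ten factor ratings), the 0.0 to 1.0 rating scale, the reading that a higher score means more exposure, and the name `posture_score` are all assumptions made for illustration. This is not the OWASP AIVSS formula and not the shipped implementation.

```python
# Illustrative sketch only: this is NOT the published OWASP AIVSS formula.
# The ten factor names follow the list in the text; everything else is assumed.
FACTORS = (
    "autonomy", "tool_scope", "multi_agent_interactions", "non_determinism",
    "self_modification", "memory_persistence", "dynamic_identity",
    "oversight_reduction", "adaptability", "adversarial_attack_surface",
)


def posture_score(cvss_base, factors):
    """Return a 0-100 integer from a CVSS base score (0-10) and ten ratings (0.0-1.0).

    Assumed formula: half the score comes from the CVSS base, half from the
    mean factor rating. There is no randomness and no hidden state, so the
    same inputs always give the same number.
    """
    if not 0.0 <= cvss_base <= 10.0:
        raise ValueError("cvss_base must be between 0 and 10")
    missing = [name for name in FACTORS if name not in factors]
    if missing:
        raise ValueError(f"missing factor ratings: {missing}")
    if any(not 0.0 <= factors[name] <= 1.0 for name in FACTORS):
        raise ValueError("each factor rating must be between 0 and 1")
    amplification = sum(factors[name] for name in FACTORS) / len(FACTORS)
    return round(50 * (cvss_base / 10) + 50 * amplification)
```

The property that matters for an auditor is not the particular weighting but that the function is pure: anyone holding the inputs and the published formula version can rerun it and compare the result with the number on the slide.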

What a single AIVSS number tells the board

On the audit-committee slide, you do not want fourteen sub-metrics. You want one number per agent and a tier label. Concretely:

Agent                          Tier   AIVSS   Trend (QoQ)
customer-refund-agent           T1     78      ▼ 6
collections-outreach-agent      T1     71      ▼ 4
internal-it-helpdesk-agent      T2     54      ▼ 12
sales-research-assistant        T3     38      ▲ 3
marketing-brief-generator       T4     22      ▼ 1

That table is what your CISO should hand you each quarter. It is the AI equivalent of “number of unpatched high and critical findings (CVSS ≥ 7.0)” in your current cyber pack. The unit of measurement is the agent, not the model; the tier reflects what the agent is allowed to touch; and the trend column lets the audit committee see whether the AI risk surface is improving or eroding quarter over quarter.

Working with the CISO on tier weights

AIVSS is multiplied by a tier weight before it lands on the slide. The tier is not a technical question; it is a policy decision the CFO and CISO own jointly. The tier methodology that has converged in practice:

Tier	What the agent can reach	Examples
`T1`	Tools that move money, write to systems of record, touch PII or customer-identifying data, plus persistent memory.	Refunds, collections, KYC review, claims adjudication.
`T2`	Tools and persistent memory, but no direct customer-facing or monetary action.	Internal helpdesk, knowledge-base maintenance, vendor triage.
`T3`	Tools only, no persistent memory.	Research assistants, internal analytics queries.
`T4`	Prompt-only. No tool execution, no memory.	Drafting and summarisation copilots.
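
The table above can be read as a decision rule. The sketch below is one way to encode it in Python so that tiering is applied consistently across an agent inventory. The class, field, and function names are hypothetical. The table does not cover every combination (for example, a money-moving agent without persistent memory), so the sketch assumes the stricter tier in those gaps; that choice is a policy assumption, not part of the methodology described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    has_tools: bool
    has_persistent_memory: bool
    # True if any tool moves money, writes to a system of record,
    # or touches PII or customer-identifying data.
    high_impact_tools: bool


def classify_tier(caps: AgentCapabilities) -> str:
    """Map an agent's reach to T1-T4, following the tier table.

    Assumption: combinations the table does not list fall into the
    stricter neighbouring tier.
    """
    if caps.high_impact_tools:
        return "T1"  # assumed T1 even without persistent memory
    if caps.has_tools and caps.has_persistent_memory:
        return "T2"
    if caps.has_tools or caps.has_persistent_memory:
        return "T3"  # memory without tools is assumed T3, not T4
    return "T4"
```

For example, a research assistant with query tools and no memory, `AgentCapabilities(has_tools=True, has_persistent_memory=False, high_impact_tools=False)`, classifies as `T3`.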

The CFO question on tiering is not technical — it is financial: what does it cost us to keep an agent at T1 rather than T2? The honest answer is that T1 demands a tighter red-team cadence, a runtime policy decision point in front of every tool call, evidence-pack archival, and named human reviewers. That has a real budget line. T2 is roughly half the cost; T3 and T4 are close to noise. Most teams discover, the first time they do this exercise, that they have classified far too many agents as T1 because nobody asked the question.

Putting AIVSS next to operational-loss data

AIVSS is most useful when it sits next to numbers the CFO already trusts. Three pairings work consistently:

—AIVSS vs MTTR (mean time to remediate). Two agents at AIVSS 70 are not equal if one ships fixes in three days and the other in six weeks. The trend column on MTTR is the operational-discipline signal.
—AIVSS vs fraud-loss. For T1 customer-facing agents, AIVSS movement should be reconcilable with the fraud-loss line. A widening gap — score improving, loss flat — typically means the wrong vulnerabilities are being fixed.
—AIVSS vs claims-leakage or chargeback rate. For insurance, refunds, and payments agents, the score change should explain part of the leakage variance. If it doesn't, the score is not yet calibrated against the real failure mode.

AIVSS is not a forecast of loss. It is a leading indicator of where the loss is most likely to come from next quarter.
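
The fraud-loss pairing above lends itself to a simple automated flag. The following Python sketch marks a quarter where the score improved but the loss line did not. It assumes that a lower score means less exposure and that "flat" means the loss fell by less than 5%; both the direction and the `flat_band` default are illustrative assumptions, as is the function name.

```python
def score_loss_diverges(score_prev, score_now, loss_prev, loss_now, flat_band=0.05):
    """Return True when the score improved but the loss line did not follow.

    Assumptions: lower score = less exposure; a loss reduction smaller than
    flat_band (as a fraction of last quarter's loss) counts as flat.
    """
    score_improved = score_now < score_prev
    loss_not_improving = loss_now >= loss_prev * (1 - flat_band)
    return score_improved and loss_not_improving
```

A flagged quarter is a prompt for a conversation with the CISO about which findings were closed, not evidence of a problem on its own.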

Procurement: requiring AIVSS in third-party assessments

Most enterprise CFOs already require a SOC 2 Type II report and a current CVSS-scored penetration-test summary from material vendors. The minimal AI equivalent looks like this in the procurement questionnaire:

AI Vendor Assessment — required artefacts

1. List of AI agents in scope, with tier (T1–T4) and AIVSS posture
   score per agent at most 90 days old.
2. Red-team summary mapped to OWASP Agentic Top 10 (ASI01–ASI10) and
   MITRE ATLAS technique IDs.
3. Evidence pack (signed, timestamped) for each T1/T2 agent in scope.
4. Statement of conformance against ISO/IEC 42001 controls A.6 and A.8,
   and NIST AI RMF GOVERN-1.7 and MEASURE-2.7.

The point is not to make procurement adversarial. The point is to make the AI risk a vendor introduces comparable. Two vendors that both claim “enterprise-grade AI security” produce wildly different AIVSS profiles once you actually look. That comparison is hard to make from marketing pages; it is straightforward from a scored evidence pack.
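
Items 1 and 3 of the questionnaire can be checked mechanically before anyone reads the vendor's narrative. The Python sketch below assumes the vendor submits one record per agent as a dictionary; the field names (`name`, `tier`, `scored_on`, `evidence_pack_signed`) are hypothetical and would need to match whatever submission format procurement actually agrees with the vendor.

```python
from datetime import date, timedelta

MAX_SCORE_AGE = timedelta(days=90)  # "at most 90 days old" from item 1
VALID_TIERS = {"T1", "T2", "T3", "T4"}


def check_submission(agents, today):
    """Return a list of gaps in a vendor submission; an empty list means none found.

    Each agent record is a dict with hypothetical keys: name, tier,
    scored_on (datetime.date), evidence_pack_signed (bool).
    """
    gaps = []
    for agent in agents:
        name = agent.get("name", "<unnamed>")
        tier = agent.get("tier")
        if tier not in VALID_TIERS:
            gaps.append(f"{name}: missing or invalid tier")
        scored_on = agent.get("scored_on")
        if scored_on is None or today - scored_on > MAX_SCORE_AGE:
            gaps.append(f"{name}: AIVSS score missing or older than 90 days")
        if tier in {"T1", "T2"} and not agent.get("evidence_pack_signed", False):
            gaps.append(f"{name}: no signed evidence pack for a {tier} agent")
    return gaps
```

This only checks that the artefacts are present and current. Verifying the signature on each evidence pack and reviewing the red-team mapping remain separate steps.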

Audit trail: KMS-signed evidence packs

The third leg of the CVSS-to-AIVSS analogy is evidence. A CVSS number on a board slide without a vulnerability-management workflow underneath it is decoration. The same is true for AIVSS. What unlocks the score for external auditors is the evidence pack behind it.

A useful evidence pack contains five things, all machine-verifiable:

—Findings file in SARIF 2.1.0 — the same format your code-scanning tools already emit, so existing audit pipelines consume it without bespoke work.
—Posture score in JSON with the published AIVSS formula version and the per-finding inputs, so an auditor can recompute the score independently.
—PDF/A-3 report for human review, with the JSON sidecar embedded.
—ECDSA-P384 signature against a per-tenant KMS key, RFC 3161 timestamp, hash-chain anchor.
—Mapping table: each finding tagged with OWASP Agentic Top 10 (ASI01–ASI10) and MITRE ATLAS techniques.

That bundle is what an external auditor or a regulator can use without calling the engineering team. It is also the artefact that satisfies the evidence requirements of ISO/IEC 42001 (the AI management-system standard published December 2023) and the documentation expectations of NIST AI RMF and the EU AI Act’s high-risk system obligations. AgentGuardian Open Source produces this bundle locally for every scan; the Enterprise tier archives it under WORM storage with a seven-year retention window and attaches it to the audit-committee pack automatically.
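
To illustrate what "machine-verifiable" means for the hash-chain anchor, here is a minimal Python sketch that recomputes a chain over a sequence of evidence packs and compares it with the recorded values. The chain layout (SHA-384 over the previous link's hex digest followed by the pack bytes, starting from an all-zero anchor) is an assumption for illustration and is not AgentGuardian's actual format. Verifying the ECDSA-P384 signature and the RFC 3161 timestamp is deliberately left out, because it needs key material and a cryptography library beyond the standard library.

```python
import hashlib

# Assumed starting anchor: 96 hex zeros, the length of a SHA-384 hex digest.
GENESIS = "0" * 96


def link_hash(prev_hash, pack_bytes):
    """Hash of one chain link: SHA-384(previous link hex digest + pack bytes)."""
    return hashlib.sha384(prev_hash.encode("ascii") + pack_bytes).hexdigest()


def verify_chain(packs, recorded_hashes):
    """Return True if every recorded hash matches the recomputed chain.

    packs: evidence-pack contents as bytes, oldest first.
    recorded_hashes: the hex digests stored alongside them, same order.
    """
    if len(packs) != len(recorded_hashes):
        return False
    prev = GENESIS
    for pack, recorded in zip(packs, recorded_hashes):
        prev = link_hash(prev, pack)
        if prev != recorded:
            return False
    return True
```

Because each link depends on the one before it, altering or removing an archived pack changes every later hash, which is what lets an auditor detect tampering without trusting the archive operator.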

How to put it in a quarterly risk pack

One slide. Five rows. The rows are: total agents in production by tier; weighted AIVSS posture score, current and trend; number of T1/T2 agents with an open finding above the AIVSS materiality threshold; mean time to remediate T1 findings; evidence-pack coverage (percentage of T1/T2 agents with a signed pack less than 90 days old).
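
The five rows can be derived from a single agent inventory. The Python sketch below shows one way to do that. The tier weights, the materiality threshold of 70, the use of a weight-normalised average for the posture score, and all class and field names are hypothetical; the real weights and threshold are the policy decisions the CFO and CISO set jointly.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

TIER_WEIGHT = {"T1": 1.0, "T2": 0.5, "T3": 0.25, "T4": 0.1}  # hypothetical
MATERIALITY = 70.0        # hypothetical materiality threshold
PACK_MAX_AGE_DAYS = 90    # "less than 90 days old"


@dataclass
class Agent:
    name: str
    tier: str
    aivss: float
    worst_open_finding: float = 0.0   # 0.0 when nothing is open
    remediation_days: list = field(default_factory=list)  # closed findings
    pack_signed_on: Optional[date] = None


def slide_rows(agents, previous_weighted, today):
    """Compute the five slide rows from an agent inventory."""
    if not agents:
        raise ValueError("agent inventory is empty")
    high_tier = [a for a in agents if a.tier in ("T1", "T2")]
    total_weight = sum(TIER_WEIGHT[a.tier] for a in agents)
    weighted = sum(a.aivss * TIER_WEIGHT[a.tier] for a in agents) / total_weight
    t1_days = [d for a in agents if a.tier == "T1" for d in a.remediation_days]
    fresh = [a for a in high_tier if a.pack_signed_on is not None
             and (today - a.pack_signed_on).days < PACK_MAX_AGE_DAYS]
    return {
        "agents_by_tier": dict(Counter(a.tier for a in agents)),
        "weighted_aivss": round(weighted, 1),
        "weighted_aivss_trend": round(weighted - previous_weighted, 1),
        "material_open_t1_t2": sum(a.worst_open_finding > MATERIALITY for a in high_tier),
        "t1_mttr_days": sum(t1_days) / len(t1_days) if t1_days else None,
        "pack_coverage_pct": 100 * len(fresh) / len(high_tier) if high_tier else None,
    }
```

Keeping the calculation in one function means the slide can be regenerated from the inventory each quarter, and a change in weights or threshold shows up as a reviewable change rather than a silent edit to a spreadsheet.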

That is the same shape as the cyber slide that already exists in your pack. It uses metrics the audit committee already knows how to interrogate. And it replaces the current placeholder — usually a vague paragraph about “responsible AI” — with a number the CFO can defend.

The CFO’s job is not to score agents. The CFO’s job is to insist that the AI risk reported to the board is denominated in the same units as every other risk on the page. CVSS did that for software vulnerabilities nearly two decades ago. AIVSS is doing it for agentic AI now. The sooner it gets on the slide, the sooner the conversation moves from sentiment to numbers.
