Advanced

Tags: security, zero-trust, spiffe, identity, capability-tokens

Zero Trust & Identity-First Agent Security

An AI agent with a long-lived API key and no oversight is a serious security risk. This pattern removes standing credentials entirely and requires a fresh, single-use token for every action an agent takes.

Used in

Zero Trust Agent Security

Diagram: Zero Trust Agent Security. Components:

Agent Workload (Layer 04 · Agentic): zero-secret container; executes without hardcoded keys

SPIRE Agent (service): attestation and SVID fetch; verifies the agent node identity

SPIRE Server (service): issues SPIFFE IDs; trust anchor issuing certificates

Workload Identity (SVID): cryptographic, short-lived; secure X.509/JWT identity token

Token Exchange (service): RFC 8693 broker; trades an SVID for an access token

Capability Token (storage): single-use, tight scope; grants temporary permission to act

Policy Engine (service): OPA / Cedar evaluation; validates the requested action against policy

Action Proxy (service): approval gate and audit; intercepts requests to external tools

Human Approver (Layer 04 · Agentic): high-risk review; blocks sensitive actions until acknowledged

External Tool (service): SaaS / API; target resource being modified

Audit Log (storage): per-action trail; immutable record of agent actions

Revoker (service): instant kill switch; revokes JWT on bad behavior

The "infinite session" problem — and how Zero Trust fixes it

Workload Identity with SPIFFE / SPIRE

Capability Tokens

Policy Engine & Approval Proxy

When to Use This Pattern

Trade-offs

📐Why long-lived credentials and autonomous agents are a dangerous combination

The "infinite session" problem — and how Zero Trust fixes it

The security model most teams start with looks like this: create a service account, generate an API key, and store it in an environment variable in the agent's runtime. The agent uses this key for everything (reading data, writing data, calling external APIs), and the key stays valid indefinitely unless someone remembers to rotate it.

This creates what security practitioners call the "infinite session" problem. A compromised agent, or a misbehaving one, has full access to everything that key grants — for as long as the key exists. There is often no record of which specific actions the agent took. There is no limit on what it can do next. If the key leaks — and long-lived keys have a way of showing up in logs, error messages, and version control — an attacker gains persistent access.

Zero Trust agent security is built on a different principle: no agent should have standing permission to do anything. Instead of a stored credential, each agent has a cryptographic identity — a verifiable proof of who it is, derived from its runtime environment rather than a secret it carries. When it needs to take an action, it exchanges that identity for a token that is valid only for that one specific action and expires in seconds. A policy engine reviews every request. High-risk actions require a human to confirm before they proceed.

The result: a compromised agent's blast radius shrinks to the single token it currently holds (one action, expiring in moments), plus whatever further requests policy would still approve before its identity is revoked.

🔧Cryptographic proof of who an agent is, with no stored secrets

Workload Identity with SPIFFE / SPIRE

The first step is giving each agent a verifiable identity: not a username and password that any process with the right environment variable can use, but a cryptographic credential tied to the specific workload running at a specific time in a specific place. SPIFFE (Secure Production Identity Framework for Everyone) is an open standard for this. Each agent gets a SPIFFE ID, a URI like `spiffe://company.org/agents/data-analyst`, backed by a short-lived X.509 certificate or JWT. Identity is verified from runtime signals, such as which Kubernetes pod the agent is running in or what hardware measurements the host machine produces. The agent carries no passwords or API keys, so there are no secrets to manage. The credential is renewed automatically before it expires, typically within minutes to an hour depending on how SPIRE is configured.
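To make the SPIFFE ID format concrete, here is a minimal sketch of how a service might split an ID into its trust domain and workload path before making any decision about it. The checks are a simplified subset of what the SPIFFE ID specification requires, and the `SpiffeId`, `parse_spiffe_id`, and `is_allowed` names are hypothetical helpers, not part of any SPIFFE library.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class SpiffeId:
    trust_domain: str  # e.g. "company.org"
    path: str          # e.g. "/agents/data-analyst"


def parse_spiffe_id(uri: str) -> SpiffeId:
    """Split a SPIFFE ID into trust domain and path, rejecting obviously malformed input."""
    parts = urlsplit(uri)
    if parts.scheme != "spiffe":
        raise ValueError("scheme must be 'spiffe'")
    # The trust domain is a bare name: no user info and no port.
    if not parts.netloc or "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError("trust domain must be a bare host name")
    if parts.query or parts.fragment:
        raise ValueError("query and fragment are not allowed")
    return SpiffeId(trust_domain=parts.netloc, path=parts.path)


def is_allowed(spiffe_id: SpiffeId, trust_domain: str, path_prefix: str) -> bool:
    """Check the ID against an expected trust domain and a path prefix."""
    prefix = path_prefix.rstrip("/")
    # Match on whole path segments so "/agents-evil" does not pass for "/agents".
    in_prefix = spiffe_id.path == prefix or spiffe_id.path.startswith(prefix + "/")
    return spiffe_id.trust_domain == trust_domain and in_prefix
```

For example, `parse_spiffe_id("spiffe://company.org/agents/data-analyst")` yields a trust domain of `company.org` and a path of `/agents/data-analyst`. Parsing the ID only tells you what the workload claims to be; the proof comes from validating the SVID that carries it.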

Diagram: Workload Identity with SPIFFE / SPIRE. Agent Pod (no baked secrets), SPIRE Agent (node attestor), SPIRE Server (SPIFFE CA), SVID (X.509 or JWT identity), Rotation (minutes, auto-renew).

🏛️

SPIRE Server

The certificate authority at the center of the identity system. Receives attestation evidence from agents and issues SPIFFE identity documents after verifying that the agent is genuinely the workload it claims to be.

🪪

SPIRE Agent

A daemon running on each host machine that collects attestation evidence (Kubernetes pod spec, TPM measurements, process metadata) and uses it to fetch identity documents on behalf of workloads on that host.

🔐

X.509 / JWT SVID

The identity document issued to an agent: a short-lived certificate or JWT containing the SPIFFE ID. It is rotated automatically on a short schedule, so a leaked credential stops being useful as soon as it expires.

🚫

Zero-Secret Container

The agent container contains no API keys, no passwords, and no environment variable secrets. All credentials are derived at runtime from the attested identity. There is nothing useful to steal from the container image or its configuration.

🔧Single-use permissions that expire in seconds

Capability Tokens

Having an identity is not the same as having permission to act. An agent knowing that it is "data-analyst-42" does not mean it should be allowed to delete production data. Capability tokens separate the question of identity (who are you?) from authorization (what are you allowed to do right now?). When the agent needs to call a tool, it presents its SPIFFE identity to a token exchange service. The service checks whether this identity is allowed to perform this specific action on this specific resource at this moment. If yes, it issues a capability token — a signed JWT containing exactly those permissions, valid for seconds, bound to a single use. The tool accepts the token, verifies it, performs the action, and the token is consumed. Stolen tokens are useless almost immediately.
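The exchange step described above can be sketched as an RFC 8693 request body. The parameter names and the `urn:ietf:params:oauth:...` identifiers come from the RFC; the `build_exchange_request` helper, the example resource URL, and the scope string are assumptions made for illustration, and a real broker may expect additional or different parameters.

```python
from urllib.parse import urlencode

# Identifiers defined by RFC 8693 (OAuth 2.0 Token Exchange).
TOKEN_EXCHANGE_GRANT = "urn:ietf:params:oauth:grant-type:token-exchange"
JWT_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:jwt"
ACCESS_TOKEN_TYPE = "urn:ietf:params:oauth:token-type:access_token"


def build_exchange_request(jwt_svid: str, resource: str, scope: str) -> str:
    """Return the form-encoded body an agent would POST to the broker's token endpoint."""
    params = {
        "grant_type": TOKEN_EXCHANGE_GRANT,
        "subject_token": jwt_svid,             # the agent's identity (a JWT-SVID)
        "subject_token_type": JWT_TOKEN_TYPE,
        "requested_token_type": ACCESS_TOKEN_TYPE,
        "resource": resource,                  # the one resource this token should cover
        "scope": scope,                        # the one action being requested
    }
    return urlencode(params)


# Hypothetical values: one read on one report.
body = build_exchange_request(
    jwt_svid="<jwt-svid-from-spire>",
    resource="https://tools.example.internal/reports/q3",
    scope="reports:read",
)
```

The body would be sent with `Content-Type: application/x-www-form-urlencoded`. Asking for a single resource and a single scope per request is what keeps the resulting token narrow.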

Diagram: Capability Tokens. Workload SVID (agent identity), Token Exchange (RFC 8693 broker), Scoped Claims (tool + resource + action), Short TTL (seconds / minutes), Single-Use (consumed on first call).

🎟️

Scoped Claims

The token's payload specifies exactly one action on exactly one resource — "read this specific file", "call this API endpoint with this HTTP method". Nothing broader than what was explicitly requested.

⏱️

Short TTL

Tokens expire after seconds to minutes. Even if a token is intercepted in transit, the window for misuse is extremely short, often too short for an attacker to do anything useful with it.

🔁

Token Exchange Broker

The service that receives a SPIFFE identity assertion and issues a scoped capability token in return, implementing the RFC 8693 token exchange standard. It enforces which identities are allowed to request which capabilities.

💥

Single-Use

After the token is used for the action it was issued for, it cannot be reused. The tool server marks it as consumed. Replay attacks — intercepting a valid token and trying to use it again — fail immediately.
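The scoped claims, short TTL, and single-use properties can be sketched together in one small example. This is a simplified illustration using only the standard library: it signs an HS256 JWT with a shared key and tracks used token IDs in memory. The `mint` and `verify_and_consume` names, the claim names `tool`, `resource`, and `action`, and the 30-second default are assumptions; a production broker would more likely use asymmetric keys, a vetted JWT library, and a shared store for consumed IDs.

```python
import base64
import hashlib
import hmac
import json
import secrets
import time
from typing import Optional

SIGNING_KEY = secrets.token_bytes(32)  # hypothetical key shared by broker and tool
_consumed = set()                      # token IDs already used (in-memory only)


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str) -> bytes:
    return hmac.new(SIGNING_KEY, signing_input.encode(), hashlib.sha256).digest()


def mint(subject: str, tool: str, resource: str, action: str,
         ttl_s: int = 30, now: Optional[float] = None) -> str:
    """Issue a token for exactly one action on one resource."""
    now = time.time() if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {"sub": subject, "tool": tool, "resource": resource, "action": action,
              "iat": int(now), "exp": int(now) + ttl_s, "jti": secrets.token_hex(16)}
    signing_input = _b64(json.dumps(header).encode()) + "." + _b64(json.dumps(claims).encode())
    return signing_input + "." + _b64(_sign(signing_input))


def verify_and_consume(token: str, tool: str, resource: str, action: str,
                       now: Optional[float] = None) -> dict:
    """Check signature, expiry, scope, and prior use; then mark the token as consumed."""
    now = time.time() if now is None else now
    try:
        head, body, sig = token.split(".")
        sig_bytes = _unb64(sig)
    except ValueError:
        raise PermissionError("malformed token")
    if not hmac.compare_digest(_sign(head + "." + body), sig_bytes):
        raise PermissionError("bad signature")
    claims = json.loads(_unb64(body))
    if now >= claims["exp"]:
        raise PermissionError("token expired")
    if (claims["tool"], claims["resource"], claims["action"]) != (tool, resource, action):
        raise PermissionError("scope mismatch")
    if claims["jti"] in _consumed:
        raise PermissionError("token already used")
    _consumed.add(claims["jti"])
    return claims
```

A second call to `verify_and_consume` with the same token raises `PermissionError`, which is the replay protection described above. The in-memory set only works for a single process; with several tool replicas, the consumed IDs would need to live somewhere all of them can see.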

🔧Evaluating every request and gating the high-risk ones

Policy Engine & Approval Proxy

Capability tokens control what an agent is technically permitted to do. The policy engine adds a business-reasoning layer: should this agent be allowed to do this, given the current context? The difference matters. An agent might hold a valid token to write to a database, but a policy might say that write operations affecting more than 1,000 rows always need a human review. Or that certain operations are only allowed during business hours. These are rules that change over time and belong in a policy language, not hardcoded in agent logic. A policy engine evaluates these rules on every request, and an approval proxy inserts a human checkpoint for actions that exceed a risk threshold.
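The two example rules above can be sketched as a small decision function. This is plain Python standing in for a real policy language, written only to show the three possible outcomes; in practice the rules would be expressed in OPA's Rego or in Cedar and evaluated by the engine. The `Request` fields, the business-hours window, and the choice to apply the hours rule to writes are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_APPROVAL = "needs_approval"


@dataclass(frozen=True)
class Request:
    agent_type: str
    action: str         # e.g. "read" or "write"
    rows_affected: int
    at: datetime        # when the request was made, in local time


MAX_UNREVIEWED_ROWS = 1000     # threshold from the example in the text
BUSINESS_HOURS = range(9, 17)  # hypothetical window: 09:00 to 16:59


def evaluate(req: Request) -> Decision:
    """Return the policy decision for one request."""
    if req.action == "write" and req.at.hour not in BUSINESS_HOURS:
        return Decision.DENY
    if req.action == "write" and req.rows_affected > MAX_UNREVIEWED_ROWS:
        return Decision.NEEDS_APPROVAL
    return Decision.ALLOW
```

The useful property is the third outcome: a request can be neither allowed nor denied outright, which is what lets the approval proxy hold it for a human.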

Diagram: Policy Engine & Approval Proxy. Agent Request (carries capability token), OPA / Cedar (policy evaluation), Action Proxy (mediates and audits), Audit Log (tamper-evident), Human Approver (high-risk human-in-the-loop review), External Tool (real API call).

⚖️

OPA / Cedar

Declarative policy languages for writing authorization rules in a structured, reviewable format — "agents of type X can write to database Y unless the operation affects more than 1,000 rows." Rules live outside application code and can be updated without a deployment.

🛡️

Action Proxy

Every request from an agent to a tool passes through this proxy. It verifies the capability token, runs the policy evaluation, logs the action, and either forwards the request or blocks it with a reason.

👤

Human Approver

For actions the policy marks as high-risk — deleting data, making financial transactions, calling external APIs — the proxy holds the request and sends an approval notification to a human. The action only proceeds when a human approves it.

📊

Tamper-Evident Audit Log

Every action, whether allowed, blocked, or approved, is written to an append-only log with a cryptographic hash chain linking entries together. Past entries cannot be deleted or modified without breaking the chain, so tampering is detectable.
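The hash chain behind a tamper-evident log can be sketched in a few lines. Each entry stores the hash of the previous entry, so changing or removing an earlier record breaks every hash after it. The `append` and `verify` helpers and the record fields are hypothetical, and the list stands in for whatever append-only storage a real deployment would use.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with this record."""
    payload = prev_hash + json.dumps(record, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def append(log: list, record: dict) -> None:
    """Add a record, linking it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash, "hash": _digest(prev_hash, record)})


def verify(log: list) -> bool:
    """Recompute the chain from the start; any edit or removal in the middle breaks it."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

One limitation of this sketch: dropping entries from the end of the log leaves a shorter chain that still verifies. Catching that requires recording the latest hash somewhere the log writer cannot change.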

🎯Signs this is the right architecture for your situation

When to Use This Pattern

Agents operate across organizational or tenant boundaries where a credential leak would have cross-tenant consequences

Regulatory requirements demand a per-action audit trail showing exactly which agent did what, when, and with what authorization

You are replacing long-lived service account keys that cannot be tightly scoped or rotated frequently

Multi-agent systems where parent agents spawn sub-agents and need to safely delegate a subset of their own authority

⚖️What you gain — and what it costs

Trade-offs

Benefit: No stored secrets means there are no long-lived credentials to leak, steal, or forget to rotate

Cost: Running SPIFFE/SPIRE adds operational complexity: certificate rotation, attestation setup, and high availability for the SPIRE server

Benefit: Each action is scoped to exactly what was requested, so a compromised agent can do very little damage

Cost: Token exchange adds a network round-trip to every tool call, increasing latency on hot paths

Benefit: Cryptographic identity creates a trustworthy, verifiable audit trail that holds up to scrutiny

Cost: Tools that accept only static API keys need a compatibility shim before they can participate in this model

Benefit: The architecture works consistently across cloud providers and on-premises environments

Cost: Writing OPA or Cedar policies is a new skill, so expect a learning curve and an ongoing policy review process