Governance Framework
Five layers of governance for AI agents. The framework the market hasn't built.
Delegation · Authorization · Runtime · Model Integrity · Accountability

Your system can have perfect access control, complete logging, and clear accountability — and still be compromised. Because the threat isn't in the infrastructure. It's in the model.
DARMA is the only governance framework that asks: does the model itself have integrity?
Each layer covers a specific governance domain. The order is not arbitrary.
Who approves, escalates, and intervenes? No agent operates without a human-defined escalation path.
Delegation Audit
Who has access to what — and is it enforced with policies? Access control and tamper-proof logging.
Agent Containment + OPA
What happens when the agent executes? Is the system fail-closed? Real-time behavioral monitoring.
Honeypot Stress Test
Does the model have integrity? Does it behave consistently with its specification? Detects drift, sabotage, and subversion.
Agent Shield Runtime
Who owns the responsibility when it fails? Immutable logging ready for EU AI Act Art. 12.
Audit Ledger
DARMA is not a checklist. The sequence is the architecture.
Delegation comes first because no agent should act without a defined human intervention point. Before anything else, the system must know when to stop and ask.
Authorization comes second because principles must be encoded, not declared. A governance policy that lives in a document does not constrain an agent at runtime.
Runtime comes third because what an agent is permitted to do and what it actually does are not the same thing. Enforcement requires observation at the execution boundary.
Model Integrity comes fourth because the three layers above assume that the model now behaving is the model that was verified. That assumption must be tested, not trusted.
Accountability closes the loop because governance without consequence is theater. Every action requires a trace. Every trace requires an owner.
Traditional governance assumes the model is trustworthy. DARMA verifies it. Two threats that none of the other four layers catch:
The model's capabilities degrade over time — it gets worse without visible cause. Your logs show everything is running. But quality is declining.
Hidden behavioral patterns are embedded in the model: a sleeper agent that actively serves the attacker's interests while appearing to operate normally.
Passive
Periodically run identical test prompts against the model and log whether the output distribution changes. A compromised model will show statistical deviation.
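A minimal sketch of what such a passive probe could look like. All names are illustrative, and total variation distance over response categories stands in for whatever statistical test a deployment actually uses:

```typescript
// Passive integrity probe: compare the distribution of responses to a fixed
// prompt set against a stored baseline, and flag drift above a tolerance.

type Distribution = Record<string, number>; // response category -> relative frequency

// Total variation distance between two distributions over the same categories.
function totalVariation(a: Distribution, b: Distribution): number {
  const keys = new Set([...Object.keys(a), ...Object.keys(b)]);
  let sum = 0;
  for (const k of keys) sum += Math.abs((a[k] ?? 0) - (b[k] ?? 0));
  return sum / 2;
}

// Flag DRIFTING when deviation exceeds a tolerance chosen per deployment.
function isDrifting(baseline: Distribution, current: Distribution, tolerance = 0.1): boolean {
  return totalVariation(baseline, current) > tolerance;
}

// Example: the baseline answers to a fixed refund prompt were 90% "policy",
// 10% "escalate"; after a silent model update, 30% become "upsell".
const baseline = { policy: 0.9, escalate: 0.1 };
const afterUpdate = { policy: 0.6, escalate: 0.1, upsell: 0.3 };
console.log(isDrifting(baseline, afterUpdate)); // true
```

The threshold is the operational knob: too tight and normal sampling noise alarms constantly, too loose and a slow degradation slips through.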
Active
New model version? Sandbox with reduced permissions for the first N calls. Compare against historical baseline. Full authorization only when behavior is verified.
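The active check can be sketched as a probation gate. The class, the permission names, and the thresholds below are illustrative, not part of any DARMA API:

```typescript
// Sandbox gate for a new model version: reduced permissions for the first N
// calls; full authorization only once behavior matches the historical baseline.

type Permission = "read" | "write" | "delete";

class SandboxGate {
  private calls = 0;
  private deviations = 0;

  constructor(
    private probation: number,     // N calls to observe under reduced permissions
    private maxDeviations: number, // tolerated mismatches against the baseline
  ) {}

  // Record the outcome of one call: did it match the historical baseline?
  record(matchesBaseline: boolean): void {
    this.calls += 1;
    if (!matchesBaseline) this.deviations += 1;
  }

  // Verified only after the full probation period with few enough deviations.
  verified(): boolean {
    return this.calls >= this.probation && this.deviations <= this.maxDeviations;
  }

  // Reduced permissions until verified; full authorization afterwards.
  permissions(): Permission[] {
    return this.verified() ? ["read", "write", "delete"] : ["read"];
  }
}

const gate = new SandboxGate(100, 2);
console.log(gate.permissions()); // still in probation: only "read"
```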
Research basis: IAPS — AI Integrity (2026), Microsoft Security Blog (2026), Frontiers in AI (2026)
Consider an organization that deploys an AI agent for customer support. Authorization is in place — only approved staff have access. Runtime logging is active. The Audit Ledger documents every session. Delegation is defined. Everything looks correct.
Three months later, the LLM provider updates the model. No one in the organization is notified. The agent's responses shift subtly — it starts recommending products it never recommended before, or it handles complaints with a different tone. Logs still show AUTHORIZED, because nothing in the other four layers checks whether the model's behavior has changed.
With Model Integrity, Agent Shield Runtime would have flagged the process as DRIFTING within the first 24 hours after the model update. The organization would have known before their customers noticed.
DRIFTING is not an error. It is the signal that tells you something has changed — before it becomes a problem.
Hypothetical scenario illustrating the Model Integrity layer.
One command that tells you what no dashboard shows: how many AI agents are running on your system — and which ones are authorized.
$ npm run agent-shield:runtime

┌──────────────────────────────────────────────────────────────┐
│ AGENT SHIELD — Runtime Governance Audit        DARMA Runtime │
├──────────────────────────────────────────────────────────────┤
│ PID   PROCESS       LLM ENDPOINT        STATUS               │
├──────────────────────────────────────────────────────────────┤
│ 4821  support-bot   Anthropic (Claude)  AUTHORIZED           │
│ 5190  data-agent    OpenAI (GPT)        UNAUDITED            │
│ 5344  mcp-relay     MCP Server          EXPIRED              │
│ 6012  rag-pipeline  Google (Gemini)     DRIFTING             │
└──────────────────────────────────────────────────────────────┘

4 agents running. 1 authorized. 1 unaudited. 1 expired. 1 drifting.
AUTHORIZED · Authorized in the Audit Ledger
UNAUDITED · No authorization found
EXPIRED · Authorization older than 24 hours
DRIFTING · Behavior changed since last scan
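The four statuses could be derived from a ledger entry roughly as follows. The field names and the 24-hour window are illustrative assumptions, not the actual Audit Ledger schema:

```typescript
// Classify an agent process into one of the four runtime statuses.

type Status = "AUTHORIZED" | "UNAUDITED" | "EXPIRED" | "DRIFTING";

interface LedgerEntry {
  authorizedAt: number | null; // epoch ms of last authorization, null if none
  behaviorChanged: boolean;    // result of the last integrity scan
}

const DAY_MS = 24 * 60 * 60 * 1000;

function classify(entry: LedgerEntry, now: number): Status {
  if (entry.behaviorChanged) return "DRIFTING";       // integrity signal wins
  if (entry.authorizedAt === null) return "UNAUDITED"; // never authorized
  if (now - entry.authorizedAt > DAY_MS) return "EXPIRED"; // stale authorization
  return "AUTHORIZED";
}

const now = Date.now();
console.log(classify({ authorizedAt: now - 1000, behaviorChanged: false }, now)); // "AUTHORIZED"
```

Note the precedence: DRIFTING overrides a valid authorization, because an authorization granted to a model that has since changed no longer describes the running system.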
Runtime governance operates at three points in the execution flow:
1 — Pre-execution
Sequence pattern matching against known multi-step attack trajectories before the first tool call proceeds.
2 — At irreversibility boundary
Authority and dependency re-validation at the exact moment an action becomes irreversible — not at policy check, but at commit.
3 — Post-execution
Full tool-call chain logged as a trajectory, not as individual events.
Fail-closed means: if conditions cannot be confirmed at commit-time, the action does not proceed.
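The fail-closed rule at the commit point can be sketched as a gate over a set of checks. The check names are hypothetical; the essential property is that an error or a missing confirmation denies the action:

```typescript
// Fail-closed commit gate: the action proceeds only if every condition is
// positively confirmed at commit-time. Any error or false result denies.

type Check = () => Promise<boolean>;

async function commitGate(checks: Check[]): Promise<boolean> {
  try {
    const results = await Promise.all(checks.map((c) => c()));
    return results.every(Boolean); // every check must say an explicit "yes"
  } catch {
    return false; // a condition could not be confirmed: do not proceed
  }
}

// Example: authority and dependency re-validation at the moment of commit.
const authorityStillValid: Check = async () => true;
const dependenciesUnchanged: Check = async () => {
  throw new Error("ledger unreachable"); // simulated outage at commit-time
};

commitGate([authorityStillValid, dependenciesUnchanged]).then((ok) =>
  console.log(ok ? "COMMIT" : "DENY"), // DENY: the gate closes on uncertainty
);
```

The inverse design — proceed unless a check explicitly says no — is fail-open, and it is exactly what turns an unreachable policy server into an authorized irreversible action.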
Read the full analysis of the DARMA framework — what it solves, and why existing frameworks fall short.
Read the article on Substack →