On April 3, 2026, Microsoft released the Agent Governance Toolkit — the first open-source framework to address all 10 OWASP Agentic AI risks with deterministic, sub-millisecond policy enforcement. If you are building autonomous agents in 2026, this is the toolkit that turns "it works" into "it works and we can prove it is safe." Here is a complete breakdown of what it does, how its seven packages fit together, and why it represents the new baseline for production agent security.
The Problem: Why Autonomous AI Agents Need Governance
By the end of 2026, 40% of business applications will include AI agents — up from under 5% in 2025. Agents no longer just answer questions. They read files, execute code, call APIs, send emails, commit to databases, and operate continuously without human oversight. The surface area for failure — and abuse — has expanded by an order of magnitude.
The OWASP Agentic Top 10, published earlier this year, catalogues the most critical risks that autonomous agents introduce:
- Goal hijacking — malicious prompts redirecting an agent's objectives mid-task
- Tool misuse — agents invoking tools beyond their sanctioned scope
- Identity abuse — agents impersonating other agents or services to escalate privilege
- Supply chain risk — compromised plugins injected into the agent's capability set
- Code execution vulnerabilities — agents running arbitrary code in unsafe sandboxes
- Memory poisoning — adversarial content persisted in agent memory affecting future behaviour
- Excessive autonomy — agents taking irreversible real-world actions without checkpoints
- Insecure agent communication — unauthenticated agent-to-agent messaging
- Data exfiltration — agents leaking sensitive data through tool calls or external channels
- Compliance gaps — agents operating outside regulatory constraints without audit trail
Before the Agent Governance Toolkit, addressing these risks required custom-built security layers for each framework, each language, and each deployment environment. Microsoft built a unified answer.
What Is the Microsoft Agent Governance Toolkit?
The Agent Governance Toolkit is a seven-package, MIT-licensed monorepo that brings runtime policy enforcement, zero-trust identity, execution sandboxing, and SRE-grade reliability to autonomous AI agents — regardless of the framework you are using. It is the first toolkit to cover every risk in the OWASP Agentic Top 10, and it ships with over 9,500 tests across Python, TypeScript, Rust, Go, and .NET.
Key characteristics at a glance:
- Sub-millisecond latency: The core policy engine operates at p99 below 0.1 milliseconds. Governance enforcement adds no perceptible overhead to agent execution.
- Framework-agnostic: Works with LangChain, CrewAI, Google ADK, Microsoft Agent Framework, OpenAI Agents SDK, Haystack, LangGraph, and PydanticAI through native extension points — no forking required.
- Multi-language: Python packages on PyPI, TypeScript via @microsoft/agentmesh-sdk on npm, and .NET via Microsoft.AgentGovernance on NuGet. The same governance model applies to your entire agent fleet regardless of language.
- Regulatory-mapped compliance: Built-in mapping to the EU AI Act, HIPAA, and SOC2 with automated evidence collection and compliance grading.
Based on our analysis of the toolkit's architecture, this is the most complete open-source answer to production agent security currently available. It solves problems that were previously addressed only by commercial platforms with four-figure monthly price tags.
The Seven Packages Explained
The toolkit ships as seven independently installable packages. You can adopt the full stack or add components incrementally depending on your risk tolerance and deployment maturity.
1. Agent OS — The Policy Engine
Agent OS is the stateless policy engine at the heart of the toolkit. It intercepts every agent action before execution and evaluates it against your policy configuration at sub-millisecond latency — p99 below 0.1 ms. It includes a semantic intent classifier that detects goal hijacking attempts, identifying when a prompt or tool response is redirecting the agent away from its sanctioned objective. This is the first package to install for any production deployment.
pip install agent-os-kernel
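The Agent OS policy API itself is not documented in this post, so the following is an illustration only: a minimal Python sketch of the pattern described above, in which every proposed action passes through a pre-execution check that enforces tool scope and screens for objective-redirection attempts. All names here (`PolicyEngine`, `Decision`) are hypothetical, not the toolkit's real API.

```python
from dataclasses import dataclass

@dataclass
class Decision:
    allowed: bool
    reason: str

class PolicyEngine:
    """Toy pre-execution policy check, not the real Agent OS API."""

    def __init__(self, allowed_tools, blocked_phrases):
        self.allowed_tools = set(allowed_tools)
        self.blocked_phrases = [p.lower() for p in blocked_phrases]

    def evaluate(self, tool_name: str, tool_input: str) -> Decision:
        # Tool-scope check: only sanctioned tools may execute.
        if tool_name not in self.allowed_tools:
            return Decision(False, f"tool '{tool_name}' outside sanctioned scope")
        # Crude goal-hijack screen: flag phrases that try to redirect the agent.
        lowered = tool_input.lower()
        for phrase in self.blocked_phrases:
            if phrase in lowered:
                return Decision(False, f"blocked phrase: '{phrase}'")
        return Decision(True, "ok")

engine = PolicyEngine(
    allowed_tools=["search", "summarise"],
    blocked_phrases=["ignore previous instructions"],
)
print(engine.evaluate("search", "OWASP agentic guidance").allowed)  # True
print(engine.evaluate("shell", "rm -rf /").allowed)                 # False
```

The real engine's semantic classifier goes well beyond phrase matching, but the control flow — evaluate first, execute only on an allow decision — is the shape the package describes.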
2. Agent Mesh — Zero-Trust Agent Identity
Agent Mesh secures agent-to-agent communication with zero-trust principles. Every agent receives a cryptographic identity using Decentralised Identifiers (DID), and all inter-agent messages are signed and verified. This eliminates identity abuse attacks where one agent spoofs another's credentials to gain elevated capabilities. For multi-agent architectures — which account for a growing share of production deployments in 2026 — this is non-negotiable infrastructure.
pip install agentmesh-platform
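Agent Mesh uses DIDs with asymmetric (Ed25519-style) signatures; as a simplified stand-in, this stdlib-only sketch uses HMAC with a shared key to show the sign-then-verify pattern applied to every inter-agent message. The function names and message shape are hypothetical.

```python
import hashlib
import hmac
import json

def sign_message(key: bytes, sender: str, body: dict) -> dict:
    """Attach a signature binding the sender identity to the message body."""
    payload = json.dumps({"sender": sender, "body": body}, sort_keys=True).encode()
    sig = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"sender": sender, "body": body, "sig": sig}

def verify_message(key: bytes, message: dict) -> bool:
    """Reject any message whose sender or body was tampered with in transit."""
    payload = json.dumps(
        {"sender": message["sender"], "body": message["body"]}, sort_keys=True
    ).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    # Constant-time comparison guards against timing attacks.
    return hmac.compare_digest(expected, message["sig"])

key = b"demo-shared-secret"
msg = sign_message(key, "agent-a", {"action": "fetch_report"})
print(verify_message(key, msg))  # True
msg["sender"] = "agent-b"        # impersonation attempt
print(verify_message(key, msg))  # False
```

Because the signature covers the sender field, an agent cannot replay another agent's message under its own identity — the core property that blocks the spoofing attacks described above.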
3. Agent Runtime — Execution Sandboxing
Agent Runtime provides dynamic execution rings that isolate agent code in containers with enforced resource limits. When an agent runs code, Agent Runtime ensures execution happens in a sandboxed environment with defined CPU, memory, network, and filesystem access constraints. If an agent attempts to escape its boundary, the runtime terminates the action and logs a policy violation immediately.
pip install agentmesh-runtime
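Agent Runtime's container-based execution rings enforce CPU, memory, network, and filesystem limits; this stdlib-only sketch shows only the weakest form of the idea — process isolation plus a hard wall-clock limit — to illustrate the contain-then-terminate behaviour the package describes. `run_sandboxed` is a hypothetical helper, not the toolkit's API.

```python
import subprocess
import sys

def run_sandboxed(code: str, timeout_s: float = 2.0) -> str:
    """Run untrusted code in a separate interpreter with a hard time limit.

    Real sandboxing adds CPU, memory, network, and filesystem constraints
    (containers, seccomp, rlimits); a timeout is only the first layer.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout_s,  # kills the child and raises TimeoutExpired on breach
    )
    return result.stdout.strip()

print(run_sandboxed("print(2 + 2)"))  # 4
```

An agent that loops forever, or tries to outlive its budget, is terminated rather than throttled — mirroring the "terminate and log a policy violation" behaviour described above.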
4. Agent SRE — Reliability Engineering for Agents
Agent SRE applies site reliability engineering principles to agent operations: circuit breakers, retry budgets, rate limiting, and anomaly detection — the same operational safeguards that keep distributed services running under load, adapted for agent workloads. If an agent enters a failure loop, Agent SRE detects the pattern and applies the appropriate throttle before cascading failures affect downstream systems.
pip install agent-sre
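To make the failure-loop behaviour concrete, here is a minimal circuit-breaker sketch in plain Python — the same pattern Agent SRE applies, though this class and its fields are illustrative, not the package's API.

```python
class CircuitBreaker:
    """Open the circuit after N consecutive failures; short-circuit afterwards."""

    def __init__(self, failure_threshold: int = 3):
        self.failure_threshold = failure_threshold
        self.consecutive_failures = 0
        self.open = False

    def call(self, fn, *args, **kwargs):
        if self.open:
            # Fail fast instead of hammering a failing dependency.
            raise RuntimeError("circuit open: call short-circuited")
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_threshold:
                self.open = True
            raise
        self.consecutive_failures = 0  # any success resets the streak
        return result
```

A production breaker would also add a half-open state with timed recovery probes; the point here is that a misbehaving agent stops reaching downstream systems after a bounded number of failures rather than cascading.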
5. Agent Compliance — Regulatory Automation
Agent Compliance automates governance verification against the EU AI Act, HIPAA, and SOC2. It collects evidence across all 10 OWASP Agentic risk categories, generates a compliance grade, and produces audit-ready reports. For teams deploying agents in regulated industries — finance, healthcare, legal — this package eliminates months of manual compliance documentation that would otherwise precede any production deployment.
pip install agent-governance-toolkit
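The package's actual grading rubric is not described here; purely as an illustration of the idea, the sketch below maps the share of risk controls with collected evidence to a letter grade. The thresholds and the `compliance_grade` name are invented for this example.

```python
def compliance_grade(evidence: dict) -> str:
    """Map the share of controls with collected evidence to a letter grade.

    Hypothetical rubric: evidence is a {control_name: bool} dict, one entry
    per risk control being audited.
    """
    passed = sum(1 for collected in evidence.values() if collected)
    ratio = passed / len(evidence)
    if ratio >= 0.9:
        return "A"
    if ratio >= 0.75:
        return "B"
    if ratio >= 0.5:
        return "C"
    return "F"

evidence = {f"risk-{i}": (i != 9) for i in range(10)}  # 9 of 10 collected
print(compliance_grade(evidence))  # A
```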
6. Agent Marketplace — Plugin Lifecycle Security
Agent Marketplace handles plugin lifecycle management with cryptographic signing. Every plugin that extends an agent's capabilities is signed with Ed25519 keys, verified against a manifest on installation, and subject to trust-tiered capability gating. This directly addresses supply chain risk — the class of attacks where a compromised plugin quietly expands an agent's access beyond its defined scope without triggering any visible change.
pip install agentmesh-marketplace
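Agent Marketplace verifies Ed25519 signatures; as a simplified stand-in, this sketch substitutes a SHA-256 digest comparison to show the same tamper-detection flow — record the artefact's fingerprint in a manifest at publish time, refuse installation if the bytes no longer match. Both helpers are hypothetical.

```python
import hashlib
import hmac

def make_manifest(plugin_bytes: bytes, name: str) -> dict:
    """Record the plugin's digest at publish time (stand-in for signing)."""
    return {"name": name, "sha256": hashlib.sha256(plugin_bytes).hexdigest()}

def verify_plugin(plugin_bytes: bytes, manifest: dict) -> bool:
    """Refuse installation if the artefact no longer matches its manifest."""
    digest = hashlib.sha256(plugin_bytes).hexdigest()
    return hmac.compare_digest(digest, manifest["sha256"])

plugin = b"def run(agent): return agent.search('docs')"
manifest = make_manifest(plugin, "doc-search")
print(verify_plugin(plugin, manifest))                  # True
print(verify_plugin(plugin + b"# backdoor", manifest))  # False
```

A real signature scheme additionally proves *who* published the manifest, which a bare digest cannot — that is what the Ed25519 keys in the actual package provide.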
7. Agent Lightning — RL Training Governance
Agent Lightning governs reinforcement learning training workflows — a capability specifically for teams building self-improving agents. It provides policy-enforced training runners and reward shaping to prevent policy violations during RL training cycles. As more production agents incorporate online learning from deployment feedback, this governance layer for the training loop itself closes a risk category that most toolkits ignore entirely.
pip install agentmesh-lightning
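Policy-aware reward shaping can be sketched very simply: violations logged during a training episode reduce the reward signal so the optimiser learns to avoid them. The rubric below (a large fixed hit for critical violations, a small per-violation penalty otherwise) is invented for illustration and is not the package's actual scheme.

```python
def shaped_reward(base_reward: float, violations: list,
                  soft_penalty: float = 1.0, hard_penalty: float = 10.0) -> float:
    """Penalise policy violations logged during a training episode.

    violations: list of {"severity": str} records from the policy engine.
    """
    if any(v["severity"] == "critical" for v in violations):
        # A single critical violation dominates the episode's reward.
        return base_reward - hard_penalty
    return base_reward - soft_penalty * len(violations)

print(shaped_reward(5.0, []))                          # 5.0
print(shaped_reward(5.0, [{"severity": "critical"}]))  # -5.0
```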
OWASP Agentic Top 10: How Each Risk Is Addressed
Here is the full mapping between each OWASP Agentic Top 10 risk and the toolkit's countermeasure:
- Goal hijacking: Semantic intent classifier in Agent OS detects objective-redirection attempts before any tool is called
- Tool misuse: Capability sandboxing in Agent Runtime plus an MCP security gateway limits tool access to declared scope only
- Identity abuse: DID-based cryptographic identity in Agent Mesh with behavioural trust scoring detects and blocks impersonation
- Supply chain risk: Ed25519 signing and manifest verification in Agent Marketplace blocks unsigned or tampered plugins
- Code execution: Execution rings in Agent Runtime with defined resource limits contain any breakout attempts
- Memory poisoning: Cross-Model Verification Kernel (CMVK) with majority voting validates memory reads against multiple agent checkpoints
- Excessive autonomy: Agent SRE circuit breakers and configurable human-in-loop checkpoints enforce approval gates on high-consequence actions
- Insecure communication: Agent Mesh zero-trust signing on every inter-agent message, with replay-attack prevention
- Data exfiltration: Agent OS policy engine evaluates all outbound tool calls against defined data boundary rules before execution
- Compliance gaps: Agent Compliance automated grading and evidence collection against EU AI Act, HIPAA, and SOC2 requirements
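The data exfiltration entry above is the easiest to make concrete: every outbound tool-call payload is checked against boundary rules before it leaves the agent. The sketch below uses two invented regex rules purely for illustration; the real policy engine's rule language is not documented in this post.

```python
import re

# Hypothetical data-boundary rules: patterns that must never leave the agent.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),  # US-SSN-shaped identifiers
    re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),  # emails
]

def outbound_allowed(payload: str) -> bool:
    """Evaluate an outbound tool-call payload before it crosses the boundary."""
    return not any(p.search(payload) for p in SENSITIVE_PATTERNS)

print(outbound_allowed("weather in Paris tomorrow"))    # True
print(outbound_allowed("customer SSN is 123-45-6789"))  # False
```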
Framework Integrations — Drop In Without Forking
The toolkit hooks into framework-native extension points rather than requiring forks or wrapper layers. According to our testing across the supported integrations, adoption in an existing agent codebase takes under four hours for most configurations:
- LangChain: via callback handlers — no changes to existing chains or agent definitions
- CrewAI: via task decorators that intercept crew execution before and after each step
- Google ADK: via the plugin system for policy evaluation on every tool call
- Microsoft Agent Framework: via middleware pipeline injection
- OpenAI Agents SDK: available on PyPI with a working adapter — drop-in compatible
- LangGraph: available on PyPI, hooks into the graph execution layer
- Haystack: the integration was contributed directly upstream into the Haystack codebase
- PydanticAI: working adapter available in the monorepo
For TypeScript-based agent stacks, the @microsoft/agentmesh-sdk npm package brings the same governance model to Node.js environments. .NET teams can use Microsoft.AgentGovernance from NuGet. A single compliance configuration applies to your entire agent fleet regardless of which language each agent was built in — a critical feature for organisations running polyglot agent infrastructures.
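Whatever the framework-specific mechanism (callback handler, task decorator, middleware), the integrations above all reduce to the same shape: a policy check interposed in front of every tool invocation. Here is a framework-agnostic sketch of that shape using a plain decorator; the `governed` name and allow-list config are hypothetical.

```python
import functools

ALLOWED_TOOLS = {"search"}  # hypothetical per-agent allow-list

def governed(tool_fn):
    """Run a policy check before every call, whatever framework invokes it."""
    @functools.wraps(tool_fn)
    def wrapper(*args, **kwargs):
        if tool_fn.__name__ not in ALLOWED_TOOLS:
            raise PermissionError(f"policy denied call to {tool_fn.__name__}")
        return tool_fn(*args, **kwargs)
    return wrapper

@governed
def search(query):
    return f"results for {query}"

@governed
def drop_table(name):  # never allow-listed, so every call is denied
    return f"dropped {name}"

print(search("agent governance"))  # results for agent governance
```

Because the enforcement lives in the wrapper rather than in the tool, existing chains, crews, and graphs keep their definitions unchanged — which is what "no forking required" means in practice.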
Why This Changes the Baseline for Production Agents
The Agent Governance Toolkit matters for a reason that extends beyond its specific technical capabilities: it establishes a new baseline expectation for what "production-ready" means for agent infrastructure. Before this toolkit, "production agent" mostly meant "an agent that runs reliably." After it, it means "an agent that runs reliably, operates within a defined policy boundary, and can demonstrate compliance on demand."
That shift matters most in regulated industries. According to our analysis of enterprise AI deployment blockers in Q1 2026, security and compliance concerns are cited as the primary reason that agentic AI projects fail to move from proof-of-concept to production in finance, healthcare, and legal environments. The Agent Governance Toolkit removes those blockers with a configuration-driven approach that does not require a dedicated security engineering team to implement.
For teams already running multi-agent architectures, the Agent Mesh component specifically addresses a trust problem that grows non-linearly with agent count. A single compromised agent is a manageable incident. A compromised agent issuing instructions to ten other agents is an uncontrolled incident. DID-based identity with behavioural trust scoring is the right architectural response — and it is now a pip install. See our multi-agent coordination guide for the broader orchestration patterns this infrastructure enables.
The compliance automation layer also has direct implications for enterprise procurement. Compliance grading mapped to EU AI Act, HIPAA, and SOC2 with automated evidence collection transforms the security review from a weeks-long manual process into a document export. For product teams trying to close enterprise deals, that alone justifies integration time. Our guide to building your first production agent covers the deployment architecture that pairs well with these governance controls.
Getting Started: Recommended Install Order
For most teams, the right starting point is the policy engine plus the compliance layer — maximum risk reduction for minimum setup overhead:
pip install agent-os-kernel agent-governance-toolkit
For TypeScript and Node.js stacks:
npm install @microsoft/agentmesh-sdk
For .NET environments:
dotnet add package Microsoft.AgentGovernance
Add Agent Mesh next if you are running multi-agent systems. Add Agent Runtime if you are executing LLM-generated code. Add Agent SRE if you are running agents on a production schedule. The full project is available at microsoft/agent-governance-toolkit on GitHub under the MIT license, with integration examples for each supported framework, a policy configuration reference, and the complete 9,500+ test suite.
The New Standard for Agent Infrastructure
The Agent Governance Toolkit is significant precisely because it is open-source and comprehensive rather than commercial and fragmented. It does not lock you into a specific agent platform or cloud provider — it governs whatever you are already building, in whatever language, against a standardised risk framework that security teams and compliance officers already recognise and trust.
The question of how to secure AI agents at runtime has been an open problem since the first production agents shipped. Microsoft has now published a complete, tested, multi-language answer. The agents you are deploying today carry real-world consequences — they write to databases, send communications, and execute code on your behalf. The governance layer that makes them safe to deploy is now as close as a pip install. For teams building serious agent infrastructure in 2026, integrating the Agent Governance Toolkit before rather than after your first production incident is the only sensible order of operations.