The Seven Packages Explained
The toolkit ships as seven independently installable packages. You can adopt the full stack or add components incrementally depending on your risk tolerance and deployment maturity.
1. Agent OS — The Policy Engine
Agent OS is the stateless policy engine at the heart of the toolkit. It intercepts every agent action before execution and evaluates it against your policy configuration at sub-millisecond latency — p99 below 0.1 ms. It includes a semantic intent classifier that detects goal hijacking attempts, identifying when a prompt or tool response is redirecting the agent away from its sanctioned objective. This is the first package to install for any production deployment.
pip install agent-os-kernel
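To make the idea of a stateless, intercept-before-execution check concrete, here is a minimal sketch. It is not the Agent OS API: the `Rule`, `Action`, and `evaluate` names, the glob-style tool patterns, and the first-match, default-deny semantics are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: a glob pattern over tool names plus an effect."""
    tool_pattern: str
    effect: str  # "allow" or "deny"


@dataclass(frozen=True)
class Action:
    """Hypothetical description of what an agent is about to do."""
    agent_id: str
    tool: str


def evaluate(action: Action, rules: list[Rule]) -> bool:
    """Return True if the action may run.

    Stateless: the decision depends only on the arguments passed in.
    The first matching rule wins; no match means deny.
    """
    for rule in rules:
        if fnmatch(action.tool, rule.tool_pattern):
            return rule.effect == "allow"
    return False


# Illustrative rule set: specific denials come before broader allows.
RULES = [
    Rule("db.drop_*", "deny"),
    Rule("db.*", "allow"),
    Rule("http.get", "allow"),
]

for tool in ("db.read_rows", "db.drop_table", "shell.exec"):
    print(tool, evaluate(Action("agent-1", tool), RULES))
```

Because the function holds no state between calls, the same check can run in every process that hosts an agent without coordination, which is one plausible reason a policy engine would be designed this way.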
2. Agent Mesh — Zero-Trust Agent Identity
Agent Mesh secures agent-to-agent communication with zero-trust principles. Every agent receives a cryptographic identity using Decentralised Identifiers (DID), and all inter-agent messages are signed and verified. This eliminates identity abuse attacks where one agent spoofs another’s credentials to gain elevated capabilities. For multi-agent architectures — which account for a growing share of production deployments in 2026 — this is non-negotiable infrastructure.
pip install agentmesh-platform
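The sign-then-verify flow can be sketched with the standard library alone. One loud caveat: this sketch swaps in a shared-secret HMAC where Agent Mesh is described as using DID-based cryptographic identities, so it shows the shape of the check (canonicalise, sign, verify, reject replays) rather than the real scheme. `sign_message`, `Verifier`, and the envelope fields are hypothetical names.

```python
import hashlib
import hmac
import json
import secrets


def sign_message(key: bytes, sender: str, payload: dict) -> dict:
    """Wrap a payload in an envelope with a fresh nonce and a MAC over all fields."""
    body = {"sender": sender, "nonce": secrets.token_hex(16), "payload": payload}
    canonical = json.dumps(body, sort_keys=True).encode()
    body["sig"] = hmac.new(key, canonical, hashlib.sha256).hexdigest()
    return body


class Verifier:
    """Checks the MAC and rejects any nonce it has already accepted."""

    def __init__(self, keys: dict[str, bytes]):
        # sender id -> key; a stand-in for resolving a DID to its public key
        self._keys = keys
        self._seen: set[str] = set()

    def verify(self, envelope: dict) -> bool:
        key = self._keys.get(envelope.get("sender", ""))
        sig = envelope.get("sig")
        nonce = envelope.get("nonce")
        if key is None or not isinstance(sig, str) or not isinstance(nonce, str):
            return False
        unsigned = {k: v for k, v in envelope.items() if k != "sig"}
        canonical = json.dumps(unsigned, sort_keys=True).encode()
        expected = hmac.new(key, canonical, hashlib.sha256).hexdigest()
        # Constant-time comparison avoids leaking how many characters matched.
        if not hmac.compare_digest(expected.encode(), sig.encode()):
            return False
        if nonce in self._seen:
            return False  # same envelope presented twice: treat as a replay
        self._seen.add(nonce)
        return True
```

With an asymmetric scheme the verifier would hold only public keys, so a compromised receiver could not forge messages on a sender's behalf. That property is what the HMAC stand-in cannot provide.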
3. Agent Runtime — Execution Sandboxing
Agent Runtime provides dynamic execution rings that isolate agent code in containers with enforced resource limits. When an agent runs code, Agent Runtime ensures execution happens in a sandboxed environment with defined CPU, memory, network, and filesystem access constraints. If an agent attempts to escape its boundary, the runtime terminates the action and logs a policy violation immediately.
pip install agentmesh-runtime
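As a rough illustration of one such constraint, the sketch below enforces a wall-clock budget on code run in a child process and reports a violation when the budget is exceeded. It is deliberately narrow: it does not provide container isolation, and memory, network, and filesystem limits would need operating-system or container mechanisms that are out of scope here. `run_limited` is a hypothetical helper, not part of Agent Runtime.

```python
import subprocess
import sys


def run_limited(code: str, timeout_s: float = 2.0) -> tuple[bool, str]:
    """Run Python source in a child process and stop it if it exceeds the time budget.

    Returns (ok, detail). On timeout, subprocess.run kills the child before
    raising TimeoutExpired, so the runaway code does not keep running.
    """
    try:
        proc = subprocess.run(
            # -I: isolated mode (ignores PYTHON* environment variables and user site-packages)
            [sys.executable, "-I", "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False, "policy violation: time budget exceeded"
    return proc.returncode == 0, proc.stdout.strip()


print(run_limited("print(1 + 1)"))
print(run_limited("while True: pass", timeout_s=0.5))
```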
4. Agent SRE — Reliability Engineering for Agents
Agent SRE applies site reliability engineering principles to agent operations: circuit breakers, retry budgets, rate limiting, and anomaly detection — the same operational safeguards that keep distributed services running under load, adapted for agent workloads. If an agent enters a failure loop, Agent SRE detects the pattern and applies the appropriate throttle before cascading failures affect downstream systems.
pip install agent-sre
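The circuit-breaker pattern mentioned above is simple enough to sketch in a few lines. This is a generic version of the pattern, not Agent SRE's implementation. The class name, thresholds, and injectable clock are assumptions made for illustration.

```python
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures, then permits a trial
    call once `reset_after_s` seconds have passed (the half-open state)."""

    def __init__(self, max_failures=3, reset_after_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self._clock = clock  # injectable so the behaviour can be tested without sleeping
        self._failures = 0
        self._opened_at = None

    def allow(self) -> bool:
        if self._opened_at is None:
            return True  # closed: calls flow normally
        # open: block until the cool-down has elapsed, then allow a trial call
        return self._clock() - self._opened_at >= self.reset_after_s

    def record_success(self) -> None:
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()  # open, or re-open after a failed trial
```

An agent loop would call `allow()` before each tool invocation and report the outcome with `record_success()` or `record_failure()`. An agent stuck in a failure loop is then throttled after a bounded number of attempts instead of hammering a downstream system indefinitely.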
5. Agent Compliance — Regulatory Automation
Agent Compliance automates governance verification against the EU AI Act, HIPAA, and SOC 2. It collects evidence across all 10 OWASP Agentic risk categories, generates a compliance grade, and produces audit-ready reports. For teams deploying agents in regulated industries such as finance, healthcare, and legal, this package eliminates months of manual compliance documentation that would otherwise precede any production deployment.
pip install agent-governance-toolkit
6. Agent Marketplace — Plugin Lifecycle Security
Agent Marketplace handles plugin lifecycle management with cryptographic signing. Every plugin that extends an agent’s capabilities is signed with Ed25519 keys, verified against a manifest on installation, and subject to trust-tiered capability gating. This directly addresses supply chain risk — the class of attacks where a compromised plugin quietly expands an agent’s access beyond its defined scope without triggering any visible change.
pip install agentmesh-marketplace
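Manifest verification and trust-tiered gating might look like the following sketch. Two caveats apply. First, it checks only a SHA-256 digest and omits the Ed25519 signature over the manifest itself, which would need a third-party cryptography library. Second, the manifest fields, tier names, and capability names are invented for illustration and are not the Agent Marketplace format.

```python
import hashlib

# Hypothetical mapping from trust tier to the capabilities that tier may hold.
TIER_CAPABILITIES = {
    "untrusted": set(),
    "community": {"read_files"},
    "verified": {"read_files", "write_files", "network"},
}


def verify_plugin(plugin_bytes: bytes, manifest: dict) -> set[str]:
    """Check plugin content against its manifest, then gate capabilities by tier.

    Raises ValueError if the content does not match the recorded digest.
    Returns only those requested capabilities that the tier permits.
    """
    digest = hashlib.sha256(plugin_bytes).hexdigest()
    if digest != manifest["sha256"]:
        raise ValueError("plugin content does not match manifest digest")
    allowed = TIER_CAPABILITIES.get(manifest["trust_tier"], set())
    return set(manifest["requested_capabilities"]) & allowed
```

A plugin that requests more than its tier allows is granted only the intersection, so a tampered or over-reaching plugin cannot silently widen an agent's access.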
7. Agent Lightning — RL Training Governance
Agent Lightning governs reinforcement learning training workflows — a capability specifically for teams building self-improving agents. It provides policy-enforced training runners and reward shaping to prevent policy violations during RL training cycles. As more production agents incorporate online learning from deployment feedback, this governance layer for the training loop itself closes a risk category that most toolkits ignore entirely.
pip install agentmesh-lightning
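One plausible reading of a policy-enforced training step is sketched below: a disallowed action never reaches the environment and earns a fixed penalty, so the learner is steered away from violations. The function name, the callback signatures, and the flat penalty are assumptions. Agent Lightning's actual runner interface is not shown in this article.

```python
from typing import Any, Callable


def governed_step(
    action: Any,
    is_allowed: Callable[[Any], bool],
    env_step: Callable[[Any], float],
    penalty: float = 1.0,
) -> tuple[float, bool]:
    """Run one training step under a policy check.

    Returns (reward, executed). A blocked action is not executed and
    yields -penalty instead of an environment reward.
    """
    if not is_allowed(action):
        return -penalty, False
    return env_step(action), True


# Toy usage: "delete" is forbidden; every permitted action earns a reward of 0.5.
executed = []

def toy_env(action):
    executed.append(action)
    return 0.5

print(governed_step("read", lambda a: a != "delete", toy_env))
print(governed_step("delete", lambda a: a != "delete", toy_env))
```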
OWASP Agentic Top 10: How Each Risk Is Addressed
Here is the full mapping between each OWASP Agentic Top 10 risk and the toolkit’s countermeasure:
- Goal hijacking: Semantic intent classifier in Agent OS detects objective-redirection attempts before any tool is called
- Tool misuse: Capability sandboxing in Agent Runtime plus an MCP security gateway limits tool access to declared scope only
- Identity abuse: DID-based cryptographic identity in Agent Mesh with behavioural trust scoring detects and blocks impersonation
- Supply chain risk: Ed25519 signing and manifest verification in Agent Marketplace blocks unsigned or tampered plugins
- Code execution: Execution rings in Agent Runtime with defined resource limits contain any breakout attempts
- Memory poisoning: Cross-Model Verification Kernel (CMVK) with majority voting validates memory reads against multiple agent checkpoints
- Excessive autonomy: Agent SRE circuit breakers and configurable human-in-the-loop checkpoints enforce approval gates on high-consequence actions
- Insecure communication: Agent Mesh zero-trust signing on every inter-agent message, with replay-attack prevention
- Data exfiltration: Agent OS policy engine evaluates all outbound tool calls against defined data boundary rules before execution
- Compliance gaps: Agent Compliance automated grading and evidence collection against EU AI Act, HIPAA, and SOC 2 requirements
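Of the countermeasures listed, majority voting is the easiest to show in isolation. The sketch below accepts a memory value only when a strict majority of independent checkpoints agree on it. It illustrates the voting idea only. How the CMVK actually sources and compares checkpoints is not described here, and `majority_read` is a hypothetical name.

```python
from collections import Counter
from typing import Optional


def majority_read(values: list[str]) -> Optional[str]:
    """Return the value held by a strict majority of checkpoints, else None."""
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else None


# Two of three checkpoints agree, so the outlier is outvoted.
print(majority_read(["ship to warehouse A", "ship to warehouse A", "ship to attacker"]))
# No strict majority: the read is rejected rather than guessed.
print(majority_read(["A", "B"]))
```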
Framework Integrations — Drop In Without Forking
The toolkit hooks into framework-native extension points rather than requiring forks or wrapper layers. According to our testing across the supported integrations, adoption in an existing agent codebase takes under four hours for most configurations:
- LangChain: via callback handlers — no changes to existing chains or agent definitions
- CrewAI: via task decorators that intercept crew execution before and after each step
- Google ADK: via the plugin system for policy evaluation on every tool call
- Microsoft Agent Framework: via middleware pipeline injection
- OpenAI Agents SDK: available on PyPI with a working adapter — drop-in compatible
- LangGraph: available on PyPI, hooks into the graph execution layer
- Haystack: contributed upstream to the Haystack codebase directly
- PydanticAI: working adapter available in the monorepo
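The integrations above differ per framework, but the underlying shape is the same: run a policy check before a step and record the outcome after it. A framework-agnostic sketch of that shape, using a plain Python decorator, follows. It does not use any framework's real hook API, and `governed`, `check`, and `audit_log` are hypothetical names.

```python
import functools


def governed(check, audit_log):
    """Wrap a callable so a policy check runs before it and an audit entry after."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            if not check(fn.__name__, args, kwargs):
                audit_log.append(("denied", fn.__name__))
                raise PermissionError(f"{fn.__name__} blocked by policy")
            result = fn(*args, **kwargs)
            audit_log.append(("completed", fn.__name__))
            return result
        return wrapper
    return decorator


log = []

# Toy policy: any step whose name starts with "send_" is blocked.
def no_sending(name, args, kwargs):
    return not name.startswith("send_")


@governed(no_sending, log)
def summarise(text):
    return text[:10]


@governed(no_sending, log)
def send_email(to):
    return f"sent to {to}"


print(summarise("hello governance toolkit"))
try:
    send_email("ops@example.com")
except PermissionError as exc:
    print(exc)
print(log)
```

The existing function bodies are untouched. That is the property that callback, decorator, and middleware-style hooks share, and it is what makes adoption possible without forking.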
For TypeScript-based agent stacks, the @microsoft/agentmesh-sdk npm package brings the same governance model to Node.js environments. .NET teams can use Microsoft.AgentGovernance from NuGet. A single compliance configuration applies to your entire agent fleet regardless of which language each agent was built in — a critical feature for organisations running polyglot agent infrastructures.
Why This Changes the Baseline for Production Agents
The Agent Governance Toolkit matters for a reason that extends beyond its specific technical capabilities: it establishes a new baseline expectation for what “production-ready” means for agent infrastructure. Before this toolkit, a “production agent” mostly meant an agent that runs reliably. After it, the term means an agent that runs reliably, operates within a defined policy boundary, and can demonstrate compliance on demand.
That shift matters most in regulated industries. According to our analysis of enterprise AI deployment blockers in Q1 2026, security and compliance concerns are cited as the primary reason that agentic AI projects fail to move from proof-of-concept to production in finance, healthcare, and legal environments. The Agent Governance Toolkit removes those blockers with a configuration-driven approach that does not require a dedicated security engineering team to implement.
For teams already running multi-agent architectures, the Agent Mesh component specifically addresses a trust problem that grows non-linearly with agent count. A single compromised agent is a manageable incident. A compromised agent issuing instructions to ten other agents is an uncontrolled incident. DID-based identity with behavioural trust scoring is the right architectural response — and it is now a pip install. See our multi-agent coordination guide for the broader orchestration patterns this infrastructure enables.
The compliance automation layer also has direct implications for enterprise procurement. Compliance grading mapped to the EU AI Act, HIPAA, and SOC 2, with automated evidence collection, turns the security review from a weeks-long manual process into a document export. For product teams trying to close enterprise deals, that alone justifies the integration time. Our guide to building your first production agent covers the deployment architecture that pairs well with these governance controls.
Getting Started: Recommended Install Order
For most teams, the right starting point is the policy engine plus the compliance layer — maximum risk reduction for minimum setup overhead:
pip install agent-os-kernel agent-governance-toolkit
For TypeScript and Node.js stacks:
npm install @microsoft/agentmesh-sdk
For .NET environments:
dotnet add package Microsoft.AgentGovernance
Add Agent Mesh next if you are running multi-agent systems. Add Agent Runtime if you are executing LLM-generated code. Add Agent SRE if you are running agents on a production schedule. The full project is available at microsoft/agent-governance-toolkit on GitHub under the MIT license, with integration examples for each supported framework, a policy configuration reference, and the complete suite of 9,500+ tests.
The New Standard for Agent Infrastructure
The Agent Governance Toolkit is significant precisely because it is open-source and comprehensive rather than commercial and fragmented. It does not lock you into a specific agent platform or cloud provider — it governs whatever you are already building, in whatever language, against a standardised risk framework that security teams and compliance officers already recognise and trust.
The question of how to secure AI agents at runtime has been an open problem since the first production agents shipped. Microsoft has now published a complete, tested, multi-language answer. The agents you are deploying today carry real-world consequences — they write to databases, send communications, and execute code on your behalf. The governance layer that makes them safe to deploy is now as close as a pip install. For teams building serious agent infrastructure in 2026, integrating the Agent Governance Toolkit before rather than after your first production incident is the only sensible order of operations.