What is the OpenAI Agents SDK used for?

The OpenAI Agents SDK is a Python framework for building autonomous AI agents that can use tools, execute code, manage files, and complete multi-step tasks. The April 2026 update adds native sandbox execution across seven cloud providers and a standardized harness via AGENTS.md, alongside a portable Workspace Manifest and long-running agent state.

How does sandbox execution improve AI agent security?

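Sandbox execution separates the harness layer (which holds credentials and orchestration logic) from the compute layer (where agent-generated code runs). Because generated code executes with no credentials in scope, the blast radius of a prompt injection or a compromised dependency is limited to the sandbox.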

Which sandbox providers does the OpenAI Agents SDK support?

The SDK supports seven integrated cloud providers: E2B (microVM sandboxes), Vercel (serverless), Daytona (dev-containers), Cloudflare (edge Workers), Modal (GPU-capable), Blaxel (agent-native with observability), and Runloop (SOC 2 Type II enterprise).

What is the AGENTS.md harness in the OpenAI SDK?

AGENTS.md is a standardized convention for configuring AI agent behavior. Originally popularized by Claude Code, it is now natively supported in OpenAI's Agents SDK. Drop an AGENTS.md file in a directory and the agent picks up custom rules, constraints, and context automatically.

OpenAI Agents SDK April 2026 Update: What Changed and How to Use It

TL;DR

The OpenAI Agents SDK April 2026 update adds sandbox execution, tracing, guardrails, and state snapshots. This is a complete guide with code examples for production AI agents.

The OpenAI Agents SDK received its most significant infrastructure update on April 15, 2026 — and if you are building autonomous AI agents in production, this is the release that changes your deployment architecture. The update introduces four interconnected capabilities: native sandbox execution across seven cloud providers, a model-native harness with standardized agent primitives, a portable Workspace Manifest abstraction for cloud storage, and long-running agent state with automatic snapshotting and rehydration. Together, these features push the Agents SDK from a helpful orchestration library into something that looks like production-grade agent infrastructure.

According to our analysis of the release notes and early developer feedback, this update directly addresses the three most painful gaps in agentic applications built before 2026: credentials leaking into model execution environments, agents dying mid-task when containers restart, and the impossibility of moving agent workloads between cloud providers without rewriting integration code. Each of those problems now has a first-class solution. Here is everything developers need to know.

The Four Pillars of the April 2026 SDK Update

OpenAI frames this as a “next evolution” rather than a breaking change, but the scope is substantial. The four additions each address a distinct layer of the agent execution stack.

1. Native Sandbox Execution

The headline feature is native support for running agent-generated code inside isolated sandboxes — controlled compute environments where the agent can execute shell commands, write files, install packages, and interact with tools without touching the host machine or production infrastructure. Developers can bring their own sandbox or choose from seven integrated providers:

E2B — microVM sandboxes optimized for short-lived coding tasks, with millisecond cold starts
Vercel — serverless sandbox environments with automatic scaling and edge proximity
Daytona — dev-container-based workspaces with persistent filesystem state
Cloudflare — Worker-based execution at the network edge for minimal latency
Modal — GPU-capable sandboxes for compute-intensive agent tasks, including ML inference
Blaxel — agent-native execution with observability built in from the ground up
Runloop — enterprise-focused, SOC 2 Type II compliant sandboxes for regulated industries

The SDK also ships with two local execution backends: UnixLocalSandboxClient for direct local development and DockerSandboxClient for containerized local environments. This gives teams a consistent development-to-production story without switching abstraction layers at deploy time.
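To make the "consistent development-to-production story" concrete, here is a minimal sketch of the general pattern: calling code depends on a small backend interface, and the backend behind it can be swapped. The names `SandboxBackend`, `LocalSubprocessBackend`, `ExecResult`, and `run_agent_step` are hypothetical and are not the SDK's `UnixLocalSandboxClient` or `DockerSandboxClient` API, whose signatures this article does not show. The local backend below provides only a scratch directory, not real isolation.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from typing import Protocol, Sequence


@dataclass
class ExecResult:
    exit_code: int
    stdout: str
    stderr: str


class SandboxBackend(Protocol):
    """Hypothetical interface for illustration; not the SDK's client API."""

    def run(self, argv: Sequence[str], timeout: float = 30.0) -> ExecResult: ...


class LocalSubprocessBackend:
    """Runs commands in a throwaway working directory on the local machine.

    This is a scratch directory only. It does NOT isolate the process.
    """

    def __init__(self) -> None:
        self._workdir = tempfile.TemporaryDirectory()

    def run(self, argv: Sequence[str], timeout: float = 30.0) -> ExecResult:
        proc = subprocess.run(
            list(argv),
            cwd=self._workdir.name,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
        return ExecResult(proc.returncode, proc.stdout, proc.stderr)

    def close(self) -> None:
        self._workdir.cleanup()


def run_agent_step(backend: SandboxBackend, code: str) -> ExecResult:
    # Depends only on the interface, so swapping the backend
    # (local, container, hosted) does not change this function.
    return backend.run([sys.executable, "-c", code])


backend = LocalSubprocessBackend()
result = run_agent_step(backend, "print(2 + 3)")
backend.close()
print(result.exit_code, result.stdout.strip())
```

A container-based or hosted backend would implement the same `run` method, which is the property that lets a team move from local development to a cloud provider without touching `run_agent_step`.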

2. Model-Native Harness

The harness is the standardized interface between the agent's reasoning layer and its execution environment. Before this update, every team built their own harness from scratch: how the model calls tools, how context is managed, how the agent reads custom instructions. The new harness standardizes these primitives at the SDK level:

Tool use via MCP — native Model Context Protocol integration means agents can discover and call tools from any MCP server without custom adapter code
Skills via progressive disclosure — agents load capability modules on demand rather than holding all available tools in context at once, reducing token consumption on complex tasks
Custom instructions via AGENTS.md — the same convention Claude Code popularized, now natively supported in OpenAI's SDK; drop an AGENTS.md file in a directory and the agent picks up custom rules, constraints, and context automatically
Code execution via the shell tool — the agent runs bash commands in the sandboxed environment with structured output and error handling
File edits via apply_patch — surgical file modification using a diff-based patching system that prevents overwrite errors on large files
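Progressive disclosure is easiest to see in code. The sketch below is one possible shape for it, assuming a skill is a short summary plus a body that is fetched only on request. `Skill`, `SkillRegistry`, `index_prompt`, and `activate` are hypothetical names chosen for illustration, not SDK identifiers.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Skill:
    name: str
    summary: str                    # one line, always visible to the model
    load_body: Callable[[], str]    # full instructions, fetched on demand


class SkillRegistry:
    def __init__(self, skills: list[Skill]) -> None:
        self._skills = {s.name: s for s in skills}
        self._loaded: dict[str, str] = {}

    def index_prompt(self) -> str:
        # Only names and one-line summaries go into the base context.
        return "\n".join(f"- {s.name}: {s.summary}" for s in self._skills.values())

    def activate(self, name: str) -> str:
        # The full body is loaded, then cached, only when requested.
        if name not in self._loaded:
            self._loaded[name] = self._skills[name].load_body()
        return self._loaded[name]


calls = []


def load_pdf_skill() -> str:
    calls.append("pdf")
    return "Full PDF extraction instructions..."


registry = SkillRegistry([
    Skill("pdf-extract", "Pull text and tables out of PDFs", load_pdf_skill),
    Skill("sql-report", "Query a warehouse and summarize results",
          lambda: "Full SQL reporting instructions..."),
])
base_context = registry.index_prompt()
body = registry.activate("pdf-extract")
registry.activate("pdf-extract")  # second call is served from the cache
```

The token saving comes from `index_prompt` staying small regardless of how long each skill body is; only activated skills add their full text to the context.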

The AGENTS.md support is particularly significant for teams already using Claude Code or other agent systems that follow this convention. The same instruction files that guide Anthropic's coding agent can now configure OpenAI's Agents SDK — a meaningful step toward interoperability between competing frameworks operating in the same repository.
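As a rough illustration of how directory-scoped instruction files can be gathered, the sketch below walks from a working directory up to a repository root and concatenates every AGENTS.md it finds. The helper names `collect_agents_md` and `build_instructions` are hypothetical, and the ordering (outermost file first, so nested files can refine it) is an assumption; the article does not state how the SDK resolves nested files.

```python
import tempfile
from pathlib import Path


def collect_agents_md(start: Path, root: Path) -> list[Path]:
    """Return AGENTS.md files between root and start, outermost first."""
    start, root = start.resolve(), root.resolve()
    found = []
    current = start
    while True:
        candidate = current / "AGENTS.md"
        if candidate.is_file():
            found.append(candidate)
        # Stop at the repository root, or at the filesystem root as a guard.
        if current == root or current.parent == current:
            break
        current = current.parent
    return list(reversed(found))


def build_instructions(start: Path, root: Path) -> str:
    files = collect_agents_md(start, root)
    return "\n\n".join(p.read_text(encoding="utf-8") for p in files)


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "services" / "api").mkdir(parents=True)
    (repo / "AGENTS.md").write_text("Use type hints.", encoding="utf-8")
    (repo / "services" / "api" / "AGENTS.md").write_text(
        "Never edit migrations.", encoding="utf-8"
    )
    instructions = build_instructions(repo / "services" / "api", repo)

print(instructions)
```

Because the files are plain text in the repository, the same AGENTS.md can be read by any tool that follows the convention, which is the interoperability point made above.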

3. Workspace Manifest

The Workspace Manifest is a declarative configuration object that describes everything an agent needs to operate: which files to mount, where to write outputs, and which external storage backends to connect. The abstraction makes agent workloads portable across providers without changing application code:

AWS S3 — mount S3 buckets as agent-accessible filesystems with IAM-controlled access
Google Cloud Storage — direct integration with GCS buckets, including Workload Identity Federation support
Azure Blob Storage — Managed Identity authentication with full container access
Cloudflare R2 — zero-egress-cost storage at the edge, ideal for cost-sensitive high-throughput pipelines

In practice, the Manifest means a team can define their agent workspace configuration once and deploy identically to E2B in development, Vercel in staging, and Modal on GPU instances in production — swapping the underlying compute with a single config change rather than rewriting storage integration code for each provider.
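The "single config change" idea can be sketched with an immutable configuration object. Everything here is illustrative: `WorkspaceManifest`, `Mount`, their field names, the bucket URI, and the provider strings are assumptions, not the SDK's actual manifest schema.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Mount:
    source: str             # placeholder URI, e.g. "s3://example-bucket/inputs"
    target: str             # path inside the sandbox
    read_only: bool = True


@dataclass(frozen=True)
class WorkspaceManifest:
    provider: str                       # which compute backend to run on
    mounts: tuple[Mount, ...] = ()
    output_dir: str = "/workspace/out"


dev = WorkspaceManifest(
    provider="e2b",
    mounts=(Mount("s3://example-bucket/inputs", "/workspace/in"),),
)

# Moving to another provider changes one field; storage config is untouched.
prod = replace(dev, provider="modal")

print(prod)
```

Keeping the manifest frozen means each environment gets its own explicit copy, so a staging tweak cannot silently leak into production.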

4. Long-Running Agent State

The most architecturally significant addition is externalized agent state. Previously, if a sandbox container was evicted, crashed, or hit a provider timeout mid-task, the agent's progress was lost and the task had to restart from scratch. The new SDK introduces automatic snapshotting of agent state to external storage, with rehydration into a fresh container on failure.

The mechanism: as an agent progresses through a multi-step task, the SDK periodically writes checkpoints to the connected storage backend. If the container is lost, the orchestrator detects the interruption, provisions a new container, restores the last checkpoint, and continues execution from that point. For the first time, long-running tasks such as code review across large repositories, document processing pipelines, and multi-hour research workflows become reliable enough to deploy in production without custom recovery logic.
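The checkpoint-and-rehydrate loop described above can be sketched in a few lines. This is not the SDK's implementation: `FileCheckpointStore` and `run_task` are hypothetical, a local JSON file stands in for external storage, and the sketch checkpoints after every step rather than periodically.

```python
import json
import tempfile
from pathlib import Path


class FileCheckpointStore:
    """Stand-in for external storage; keeps state in one JSON file."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def save(self, state: dict) -> None:
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(state), encoding="utf-8")
        tmp.replace(self.path)  # rename so readers never see a partial file

    def load(self) -> dict:
        if self.path.exists():
            return json.loads(self.path.read_text(encoding="utf-8"))
        return {"next_step": 0, "results": []}


def run_task(steps, store, crash_at=None):
    state = store.load()                       # rehydrate, or start fresh
    for i in range(state["next_step"], len(steps)):
        if crash_at == i:
            raise RuntimeError("simulated container loss")
        state["results"].append(steps[i]())
        state["next_step"] = i + 1
        store.save(state)                      # checkpoint after each step
    return state["results"]


steps = [lambda: "cloned", lambda: "analyzed", lambda: "reported"]

with tempfile.TemporaryDirectory() as d:
    store = FileCheckpointStore(Path(d) / "checkpoint.json")
    try:
        run_task(steps, store, crash_at=2)     # first container dies at step 2
    except RuntimeError:
        pass
    resumed_from = store.load()["next_step"]
    results = run_task(steps, store)           # fresh run resumes from there

print(resumed_from, results)
```

The second call to `run_task` never repeats the first two steps, which is the behavior that makes multi-hour tasks tolerable to run on evictable compute. Steps with external side effects would additionally need to be idempotent, since a crash can land between a step finishing and its checkpoint being written.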

Why the Harness-Compute Separation Matters for Security

One of the most important but understated benefits of this architecture is what it does to your security posture. In the pre-sandbox model, agents typically ran in the same environment that held API keys, database credentials, and deployment tokens — because that was the environment they had access to. Agent-generated code that behaved unexpectedly could exfiltrate credentials or make unauthorized API calls in that shared environment.

The harness-compute split addresses this directly. The harness layer — which holds credentials, manages memory, and handles orchestration — runs in your secure environment. The compute layer — where model-generated code executes — runs in the isolated sandbox with no credentials in scope. Even if an agent's generated code contains malicious instructions from prompt injection, supply-chain attacks in dependencies, or unexpected tool behavior, the blast radius is limited to the sandbox. The credentials never enter the execution environment.
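The principle of keeping credentials out of scope can be demonstrated with nothing more than the standard library. In this sketch the parent process plays the harness and holds a fake secret, while generated code runs in a child process whose environment is built from an allowlist. `run_untrusted`, `ALLOWED_ENV`, and `DEMO_API_KEY` are hypothetical names, and an environment allowlist is not a sandbox on its own: a real deployment would also isolate the filesystem and network, as the providers above do.

```python
import os
import subprocess
import sys

# The secret lives only in the harness process (fake value for the demo).
os.environ["DEMO_API_KEY"] = "sk-demo-not-a-real-key"

# Assumption: the minimal variables the child interpreter needs to start.
ALLOWED_ENV = ("PATH", "LANG", "SYSTEMROOT")


def run_untrusted(code: str) -> str:
    # Build the child's environment from an allowlist instead of inheriting
    # everything, so harness secrets are never visible to generated code.
    child_env = {k: os.environ[k] for k in ALLOWED_ENV if k in os.environ}
    proc = subprocess.run(
        [sys.executable, "-c", code],
        env=child_env,
        capture_output=True,
        text=True,
        timeout=30,
    )
    return proc.stdout.strip()


seen = run_untrusted(
    "import os; print(os.environ.get('DEMO_API_KEY', 'missing'))"
)
print(seen)
```

An allowlist is the safer default here: a denylist has to anticipate every secret name in advance, while an allowlist fails closed when a new credential is added to the harness.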

According to The New Stack's analysis of the release, this is the primary architectural motivation: “separating harness and compute helps keep credentials out of environments where model-generated code executes.” For any team running agents that process untrusted input — customer documents, external web content, or third-party data feeds — this separation is not optional. It is the minimum viable security posture for production agentic systems in 2026.

Sandbox Provider Comparison for Production Use

Provider	Best For	Cold Start	GPU Support	Compliance
E2B	Short-lived coding tasks, rapid iteration	<200ms	No	Basic
Modal	Compute-intensive tasks, ML inference	~1s	Yes (A100, H100)	SOC 2
Runloop	Enterprise, regulated industries	<500ms	No	SOC 2 Type II
Vercel	Web-adjacent tasks, serverless scale	~100ms	No	SOC 2
Daytona	Persistent dev environments, CI/CD	~2s	No	GDPR
Cloudflare	Edge tasks, minimal latency	<50ms	No	ISO 27001
Blaxel	Agent-native workflows, built-in observability	<300ms	No	SOC 2


