On April 13, 2026, Cloudflare opened what it called Agents Week — a five-day series of product launches and infrastructure announcements targeting the fastest-growing problem in AI development: how to build agents that actually work in production. By the end of the week, the company had shipped nine distinct products spanning compute, memory, storage, networking, and developer tooling. Taken together, they form what Cloudflare is calling its Agent Cloud: a full-stack infrastructure platform for building, deploying, and scaling autonomous AI agents.
This guide covers every major announcement, explains the technical problems each one solves, and gives developers the practical information needed to decide where Cloudflare’s stack fits into their own agent architecture.
Why Cloudflare Is Going All-In on Agents
The positioning behind Agents Week is not subtle. Cloudflare believes the AI industry is in the middle of a transition from “AI that answers” to “AI that gets things done.” Answering questions requires a model and a prompt. Getting things done requires persistent state, durable compute, reliable memory across sessions, secure access to private infrastructure, and the ability to coordinate across tools and services over long time horizons.
Cloudflare’s existing platform already had most of the primitives these agents need: Workers for serverless compute, Durable Objects for stateful coordination, KV for global key-value storage, and R2 for object storage, running on a global network spanning over 330 cities. Agents Week is the company formalizing these primitives into a coherent agent platform, filling in the remaining gaps — memory, containers, Git-compatible storage — and publishing unified developer tooling around the complete stack.
Agent Memory: The Flagship Announcement
The most technically significant release from Agents Week is Agent Memory, currently in private beta. It is a managed service that gives AI agents persistent, retrievable memory across sessions without consuming context window space.
The Context Rot Problem
Every developer who has built a multi-turn AI agent has encountered context rot. The context window is finite — even at one million tokens, there is a ceiling. As conversations grow, older information gets pushed out. If you try to solve this by stuffing everything into context, the model’s ability to focus on what matters degrades. If you aggressively prune, you lose information the agent will need later. Neither option is satisfying, and for agents running over days or weeks, neither option scales.
Agent Memory solves this with a separate retrieval layer. Instead of keeping raw conversation history in context, the service extracts facts, preferences, and key events from conversations as they happen, stores them in a structured memory profile, and retrieves only what is relevant when the agent needs it. The result is an agent that gets smarter over time without its context window growing proportionally.
How the Ingestion Pipeline Works
When a conversation arrives for ingestion, it passes through a multi-stage extraction pipeline. The pipeline identifies information worth remembering — user preferences, stated goals, key facts, previous decisions — verifies the extracted memories against what is already stored to avoid redundancy or contradiction, classifies each memory by type and relevance, and writes the final set to the agent’s memory profile.
Cloudflare runs this extraction pipeline in the background, so ingestion does not block the agent’s response path. Memories accumulate over time and are continuously refined as new conversations arrive. The key design principle is that the agent’s working knowledge improves with usage, rather than growing stale or ballooning into an unmanageable context blob.
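The stages described above can be sketched as a minimal in-memory pipeline. Everything in this sketch — the function names, the heuristic extraction, the classification labels, the profile shape — is illustrative only; Cloudflare's actual service uses model-driven extraction and a richer memory schema.

```javascript
// Minimal in-memory sketch of the described ingestion flow:
// extract -> verify against existing memories -> classify -> write.
// All names and shapes here are illustrative, not the real Agent
// Memory API; the real pipeline uses model-driven extraction.

// Stage 1: extraction. A real pipeline would use a model; this
// sketch pulls out sentences that look like preferences or facts.
function extractCandidates(messages) {
  return messages
    .filter((m) => m.role === "user")
    .flatMap((m) => m.content.split(". "))
    .filter((s) => /\b(prefer|always|never|my|use)\b/i.test(s))
    .map((text) => ({ text: text.trim().replace(/\.$/, "") }));
}

// Stage 2: verification. Drop candidates that duplicate what is
// already stored, so the profile does not accumulate redundancy.
function verify(candidates, profile) {
  const known = new Set(profile.map((m) => m.text.toLowerCase()));
  return candidates.filter((c) => !known.has(c.text.toLowerCase()));
}

// Stage 3: classification by type, used later to scope retrieval.
function classify(memory) {
  const type = /\b(prefer|always|never)\b/i.test(memory.text)
    ? "preference"
    : "fact";
  return { ...memory, type };
}

// Stage 4: write the surviving memories to the profile.
function ingest(messages, profile) {
  const fresh = verify(extractCandidates(messages), profile).map(classify);
  profile.push(...fresh);
  return fresh;
}

const profile = [];
const added = ingest(
  [{ role: "user", content: "I prefer TypeScript for new services. The deploy is at 5pm." }],
  profile
);
```

Running the same preference through `ingest` a second time adds nothing, because the verification stage filters it out — the property that keeps the profile from ballooning as conversations repeat themselves.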
How Retrieval Works: Parallel Methods and Result Fusion
The retrieval architecture is built around one core insight: no single retrieval method works well across all query types. Keyword search works for specific named entities. Semantic vector search works for conceptual queries. Topic-based lookup works for thematic retrieval. Agent Memory runs all three in parallel and fuses the results.
The retrieval pipeline begins with concurrent query analysis and embedding generation. The query analyzer produces three distinct outputs: ranked topic keys for the memory profile, full-text search terms expanded with synonyms, and a HyDE document. HyDE — Hypothetical Document Embeddings — generates a passage representing what a perfect answer would look like, which improves semantic retrieval accuracy by anchoring the embedding search to the shape of an ideal result rather than the query itself. These three signals feed parallel search branches whose outputs are merged, ranked by relevance, and returned to the agent as a concise, prioritized list.
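Cloudflare has not published its fusion algorithm, but the merge-and-rank step can be illustrated with reciprocal rank fusion (RRF), a standard way to combine ranked lists from heterogeneous retrievers. The memory IDs and branch results below are made up for the example.

```javascript
// Illustrative fusion of three parallel retrieval branches using
// reciprocal rank fusion (RRF). Cloudflare's actual fusion method
// is unpublished; RRF is a common choice for this problem.

// Each branch returns memory IDs in its own relevance order.
const keywordHits = ["m7", "m2", "m9"];  // full-text search branch
const semanticHits = ["m2", "m4", "m7"]; // HyDE-anchored vector branch
const topicHits = ["m4", "m2"];          // topic-key lookup branch

// RRF: each list contributes 1 / (k + rank) per item, so memories
// that rank well across several branches float to the top. k damps
// the advantage of any single first-place ranking (k = 60 is the
// constant from the original RRF paper).
function fuse(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

const fused = fuse([keywordHits, semanticHits, topicHits]);
// m2 wins: it appears in all three branches, beating m7's single
// first-place finish in the keyword branch.
```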
Developer Integration
Developers access Agent Memory through five core operations: ingest, remember, recall, list, and forget. The service is accessible via a binding from any Cloudflare Worker, or via REST API for agents running on other infrastructure:
```javascript
// Bind to Agent Memory in your Worker
const memory = env.AGENT_MEMORY;

// After a conversation turn, extract and store memories
await memory.ingest({ sessionId: userId, messages: conversationTurn });

// Before a new turn, retrieve relevant context
const recalled = await memory.recall({ query: userMessage, limit: 10 });
```
All stored memories are fully exportable at any time. Cloudflare has committed to complete data portability — the knowledge your agents accumulate on the platform can leave with you if requirements change.
Compute: Sandboxes GA and Dynamic Workers
Two distinct compute primitives emerged from Agents Week, targeting different execution profiles and workload types.
Sandboxes Move to General Availability
Container-based Sandboxes — previously in beta — reached general availability during Agents Week. A Sandbox is a persistent, isolated Linux environment with a shell, a filesystem, and the ability to run background processes. An agent can clone a repository, install packages, run builds, execute tests, and iterate in a Sandbox with the same tight feedback loop a human developer gets at a local terminal.
Sandboxes are the appropriate primitive for long-running, stateful agent tasks: code generation and testing, data processing pipelines, research agents that need to install tools and maintain a working directory, and any workflow where the agent needs to persist state across multiple steps in a single session.
Dynamic Workers: Code Execution in Milliseconds
Dynamic Workers are a new isolate-based runtime that lets AI agents execute code generated on the fly in a secure, sandboxed environment. The critical differentiator from Sandboxes is startup time and horizontal scale: Dynamic Workers start in milliseconds — approximately 100 times faster than containers — use a fraction of the memory, and scale to millions of concurrent executions with no warm-up latency or concurrency ceiling.
The right use case for Dynamic Workers is any scenario where an agent generates short-lived code and needs to execute it immediately: evaluating expressions, running data transformations, testing generated code snippets, or serving agent-built UI components. Durable Object Facets extend Dynamic Workers further by allowing each isolate to instantiate its own isolated SQLite database, enabling stateful dynamic code platforms at scale without manual database provisioning.
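The generate-then-execute pattern itself can be illustrated locally with JavaScript's `Function` constructor. This is a shape illustration only: `new Function` provides no isolation, whereas Dynamic Workers run generated code in sandboxed V8 isolates with real security boundaries.

```javascript
// Local illustration of the generate-then-execute pattern that
// Dynamic Workers serve at scale. NOTE: `new Function` offers no
// isolation; the actual product runs code in sandboxed isolates.

// Pretend an agent produced this snippet in response to the task
// "sum the order totals and format to two decimal places".
const generatedSource = `
  return orders
    .reduce((sum, o) => sum + o.total, 0)
    .toFixed(2);
`;

// Compile the generated source into a callable with an explicit
// parameter list, then execute it against the agent's data.
function runGenerated(source, args) {
  const fn = new Function(...Object.keys(args), source);
  return fn(...Object.values(args));
}

const result = runGenerated(generatedSource, {
  orders: [{ total: 10.5 }, { total: 4.25 }],
});
// result === "14.75"
```

In the hosted version, the same round trip — receive generated code, execute, return the result — happens in a fresh isolate per request, which is what makes millisecond startup and massive concurrency matter.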
Artifacts: Git-Compatible Versioned Storage for Agents
Artifacts is a new storage primitive designed for code and structured data produced by agents. It is Git-compatible — any standard Git client can interact with it directly. The service supports creating tens of millions of repositories, forking from any remote, and sharing outputs via URL with no additional hosting infrastructure.
The practical value is in closing a gap that plagues most agent architectures: when an agent produces a file, a patch, a dataset, or a codebase, it needs somewhere durable and versioned to put it. Artifacts fills that gap. A coding agent can push generated code to an Artifact, hand the URL to a review agent downstream, and the entire workflow becomes auditable and reproducible without any custom storage plumbing. For multi-agent pipelines where one agent’s output is another agent’s input, Artifacts provides the shared, persistent handoff layer.
Connectivity: Cloudflare Mesh and Browser Run
Cloudflare Mesh
Cloudflare Mesh is a secure private networking layer that connects AI agents to private infrastructure without manual tunnel configuration or VPN setup. It integrates with Cloudflare One’s Zero Trust platform and Workers VPC, allowing an agent to query a private database, call an internal API, or coordinate with another agent on a private subnet, with access control enforced at the edge rather than at the application layer.
For enterprises building agents that need to operate inside private network perimeters — accessing internal knowledge bases, ERP systems, or databases that cannot be exposed to the public internet — Mesh is the practical answer to what would otherwise require significant security and networking engineering to configure safely.
Browser Run
Cloudflare’s Browser Rendering product has been renamed Browser Run and received substantial upgrades. Browser Run now includes Live View (real-time visual inspection of browser sessions during development), Human in the Loop (the ability to pause an automated session and hand control to a human for specific sensitive interactions), full Chrome DevTools Protocol access, session recordings for debugging and audit trails, and 4x higher concurrency limits for agents running parallel browser workflows.
Browser Run is the infrastructure layer for web-browsing agents: research and data-gathering agents, form-filling automation, web scraping pipelines, and any agent workflow that requires interacting with a website as a user would. The Human in the Loop feature is particularly useful for agent workflows that encounter CAPTCHAs, two-factor authentication, or other interactions that require human judgment before proceeding.
AI Gateway: 70-Plus Models Through a Single API
Cloudflare AI Gateway, expanded during Agents Week, now provides unified access to over 70 AI models from more than a dozen providers — including OpenAI, Anthropic, Google, Groq, xAI, Alibaba Cloud, and ByteDance — through a single API endpoint with unified billing. For agent developers managing multi-model routing logic, AI Gateway removes the overhead of maintaining separate API clients, credential sets, and per-provider billing cycles.
Combined with Workers and the Agent Memory service, AI Gateway allows developers to build model-agnostic agents that route to the best available model for each subtask, with memory and state persisted independently of the model choice. This separation of model selection from memory and compute is one of the cleaner architectural benefits of building on Cloudflare’s integrated stack.
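A model-agnostic routing layer on top of a single gateway endpoint can be as simple as a lookup from subtask type to model identifier. The routing table below is illustrative — the task taxonomy and model names are assumptions for the sketch, not part of AI Gateway's API.

```javascript
// Illustrative per-subtask model routing behind one gateway
// endpoint. The task taxonomy and model identifiers are made up;
// the point is that routing logic lives in the agent while
// credentials and billing stay unified at the gateway.

const ROUTES = {
  classify: "small-fast-model",    // cheap model for high-volume work
  plan: "large-reasoning-model",   // stronger model for planning steps
  codegen: "code-model",           // code-tuned model for generation
};

function pickModel(subtask, fallback = "small-fast-model") {
  return ROUTES[subtask] ?? fallback;
}

// Only the model field changes between subtasks; the client,
// endpoint, and credentials stay the same across providers.
function buildRequest(subtask, prompt) {
  return {
    model: pickModel(subtask),
    messages: [{ role: "user", content: prompt }],
  };
}

const req = buildRequest("plan", "Break this migration into steps.");
```

Because memory and state live in Agent Memory and Durable Objects rather than in any one provider's context, swapping an entry in the routing table does not disturb what the agent already knows.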
Email Service and Registrar API
Two additional releases filled out the Agents Week announcement list. The Cloudflare Email Service entered public beta, giving agents the ability to send, receive, and process email natively from Workers without a third-party email provider. Agents that need to notify users, process inbound requests, or integrate with email-driven workflows can now do this entirely within the Cloudflare stack.
The Cloudflare Registrar API moved to beta, allowing agents — and the developers who build them — to search, check availability, and register domain names at cost directly from code. This is a narrow but useful primitive for agents that provision infrastructure or run SaaS onboarding flows where domain registration is a required workflow step.
The Complete Agent Stack
What Cloudflare assembled across Agents Week is a coherent answer to the infrastructure question that derails most agent projects at production scale: where does the agent actually run, what does it remember, where does its output live, and how does it securely access the systems it needs?
In Cloudflare’s model, the answer is: compute in Sandboxes or Dynamic Workers, memory in Agent Memory, code and data in Artifacts, private network access via Mesh, web access via Browser Run, email via Email Service, and model access via AI Gateway — all on top of the existing Workers, Durable Objects, KV, and R2 platform. For developers already running on Cloudflare, Agents Week is an upgrade of the infrastructure they already pay for. For developers stitching together separate managed services — a hosted vector database for memory, a container platform for compute, a Git host for artifacts — Cloudflare’s pitch is consolidation: one platform, one billing relationship, and a network layer that ties all the primitives together.
What Developers Should Do Now
The most actionable item from Agents Week is the Agent Memory private beta. If your agent workflows involve multi-session state, user preferences, or knowledge accumulation over time, apply for access immediately. Context rot is one of the most common failure modes in production agent systems, and a managed solution from a provider operating at Cloudflare’s infrastructure scale is worth serious evaluation against the self-managed alternatives.
For teams evaluating compute options, the GA of Sandboxes removes beta risk from container-based agent workloads. Dynamic Workers are worth prototyping for any use case where agents generate and immediately execute code, particularly at the concurrency levels that would overwhelm conventional container-based runtimes.
The broader takeaway from Agents Week is directional: Cloudflare has made a clear commitment to being the infrastructure layer for the agent era, and the April 2026 releases represent a substantial step toward a fully integrated agent platform. The company is not competing on model quality or frontier AI research — it is competing on the unglamorous but necessary infrastructure that makes agent systems reliable, scalable, and secure in production. Developers building serious agent applications in 2026 should have Cloudflare’s stack on their evaluation list, not necessarily as a replacement for specialized tools, but as a platform that increasingly handles the undifferentiated heavy lifting so that engineering effort can stay focused on the agent logic itself.
Written by
Anup Karanjkar
Expert contributor at WOWHOW. Writing about AI, development, automation, and building products that ship.