Technology Radar

Cloudflare Code Mode

Tags: inference, workflow, open-source
Trial

Code Mode is a Cloudflare technique (and the @cloudflare/codemode package) that cuts agent token usage by having the LLM write code against a typed SDK, then executing that code in an isolated sandbox, instead of calling individual MCP tools one by one. The Cloudflare API MCP server uses Code Mode to expose the entire Cloudflare API through just two tools in under 1,000 tokens, a 99.9% reduction from the 1.17M tokens a traditional per-endpoint MCP server would require.

Why It's in Trial

Code Mode addresses a real and growing problem: as MCP servers expose more tools, the token overhead of listing and invoking them becomes a significant cost and latency multiplier. The technique is novel but already in production on Cloudflare's own MCP server:

  • Concrete efficiency gains: 81% token reduction for complex batch operations, 32% for simple tasks (Cloudflare-measured). Because Code Mode's token cost is fixed, the savings grow with the size of the API; the Cloudflare API case reaches 99.9%.
  • Open package: @cloudflare/codemode is published on npm. The pattern is portable, not Cloudflare-locked.
  • Secure execution via Dynamic Workers: AI-generated code runs in V8 isolates, not containers, with millisecond cold starts, a megabyte-scale memory footprint, and roughly 100x faster startup than equivalent container-based sandboxes.
  • Presented at MCP Dev Summit North America (April 2-3, 2026): Matt Carey's session "Every API Is a Tool for Agents" demonstrated the technique to the MCP practitioner community at the Linux Foundation event in NYC.
  • Broad applicability: The Figma presentation from the summit explicitly frames this as an alternative to raw MCP tool use — useful for any large API surface, not just Cloudflare services.

Code Mode stays in Trial (not Adopt) because the technique is newly published, Dynamic Workers are in open beta, and community adoption outside Cloudflare itself has not yet been measured.

When NOT to Use Code Mode

  • Small, stable APIs (< ~20 endpoints): the overhead of generating type definitions outweighs the savings
  • Agents without a sandboxed execution environment: Code Mode requires a secure runtime to execute LLM-generated code; running arbitrary AI code outside an isolate is unsafe
  • Non-TypeScript ecosystems: the package is TypeScript-first; other runtimes need adaptation
  • High-trust, low-risk tool use: if your MCP server has 3 tools, raw tool calling is simpler

How It Works

Traditional MCP:  [1 tool definition per endpoint] × N endpoints = N × ~100 tokens
Code Mode:        TypeScript types (one-time, ~1000 tokens) + execute(code) = fixed cost
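
The comparison above can be turned into a small piece of arithmetic. The sketch below assumes the round figures quoted in this entry (about 100 tokens per tool definition, about 1,000 tokens fixed for Code Mode); the function names are illustrative and not part of any package.

```typescript
// Rough token-cost model for the two approaches.
// Both constants are the approximate figures quoted in this entry, not measurements.
const TOKENS_PER_TOOL_DEFINITION = 100;
const CODE_MODE_FIXED_TOKENS = 1000;

// Traditional MCP: one tool definition per endpoint, so cost grows linearly.
function traditionalMcpTokens(endpoints: number): number {
  return endpoints * TOKENS_PER_TOOL_DEFINITION;
}

// Code Mode: a fixed cost regardless of how many endpoints sit behind the SDK.
function codeModeTokens(_endpoints: number): number {
  return CODE_MODE_FIXED_TOKENS;
}

// Fractional reduction from switching to Code Mode.
// A negative value means Code Mode costs more than raw tools.
function reduction(endpoints: number): number {
  const before = traditionalMcpTokens(endpoints);
  return (before - codeModeTokens(endpoints)) / before;
}

for (const n of [3, 10, 100, 1000]) {
  console.log(`${n} endpoints: ${(reduction(n) * 100).toFixed(1)}% reduction`);
}
```

Under these assumed figures the break-even sits at 10 endpoints: a 3-tool server is cheaper with raw tool calls, while the reduction climbs towards 100% as the endpoint count grows. Real per-tool costs vary, so treat the numbers as an order-of-magnitude guide only.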

The @cloudflare/codemode package:

  1. Generates TypeScript interfaces from your tool definitions (more token-efficient than OpenAPI for LLMs)
  2. Exposes two tools to the agent: search(query) for discovery, execute(code) for action
  3. Executes the returned code in a Dynamic Worker — a V8 isolate instantiated at runtime with only the APIs the code needs
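
Step 1 can be illustrated with a minimal sketch of turning tool definitions into a TypeScript interface. This is not the package's actual generator: the ToolDefinition shape, the renderSdk helper, and the listZones/purgeCache tools are all hypothetical and chosen only to show the idea.

```typescript
// Hypothetical, simplified tool definition.
// Real MCP tools describe their inputs with JSON Schema; this is a stand-in.
type ParamType = "string" | "number" | "boolean";

interface ToolDefinition {
  name: string;
  description: string;
  params: Record<string, { type: ParamType; required?: boolean }>;
}

// Render one tool as a documented method signature.
function renderMethod(tool: ToolDefinition): string {
  const fields = Object.keys(tool.params)
    .map((key) => {
      const p = tool.params[key];
      return `${key}${p.required ? "" : "?"}: ${p.type}`;
    })
    .join("; ");
  return `  /** ${tool.description} */\n  ${tool.name}(input: { ${fields} }): Promise<unknown>;`;
}

// Combine all tools into a single interface the LLM can write code against.
function renderSdk(interfaceName: string, tools: ToolDefinition[]): string {
  return `interface ${interfaceName} {\n${tools.map(renderMethod).join("\n")}\n}`;
}

// Example tools (names are made up for illustration).
const tools: ToolDefinition[] = [
  {
    name: "listZones",
    description: "List zones in an account.",
    params: { accountId: { type: "string", required: true }, page: { type: "number" } },
  },
  {
    name: "purgeCache",
    description: "Purge cached files for a zone.",
    params: { zoneId: { type: "string", required: true } },
  },
];

console.log(renderSdk("Api", tools));
```

The rendered declaration is what the model would read in place of per-tool JSON schemas; the code it then writes against that interface is what gets passed to execute(code).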

The Dynamic Worker sandbox communicates with host services via Cap'n Web RPC bridges, which operate transparently across the security boundary. Credential injection happens on the way out — agent code never sees secrets directly.
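
The "credentials on the way out" idea can be sketched without any Cloudflare APIs. In the example below, every name (ApiBridge, createBridge, Transport, the "/zones" path) is hypothetical, and a plain closure stands in for the RPC bridge. A closure in the same process is not a security boundary; in the real system the isolation comes from the isolate and the bridge described above.

```typescript
// A request as it leaves the host, after credentials are attached.
interface OutboundRequest {
  path: string;
  headers: Record<string, string>;
}

// Transport the host uses to reach the real API (stubbed so the sketch runs offline).
type Transport = (req: OutboundRequest) => Promise<string>;

// The only surface exposed to agent-written code: no token, no headers.
interface ApiBridge {
  get(path: string): Promise<string>;
}

// Host side: close over the secret and attach it to each outbound call.
function createBridge(secret: string, transport: Transport): ApiBridge {
  return {
    get: (path) => transport({ path, headers: { Authorization: `Bearer ${secret}` } }),
  };
}

// Stand-in for LLM-generated code: it composes calls but only ever sees the bridge.
async function agentCode(api: ApiBridge): Promise<string[]> {
  return Promise.all([api.get("/zones"), api.get("/accounts")]);
}

// Demo with a recording transport instead of a real network call.
const seen: OutboundRequest[] = [];
const fakeTransport: Transport = async (req) => {
  seen.push(req);
  return `ok:${req.path}`;
};

const bridge = createBridge("example-token", fakeTransport);
const resultPromise = agentCode(bridge);
resultPromise.then((results) => console.log(results, `outbound calls: ${seen.length}`));
```

The design point is that the token is attached by the host after the call has left the agent's code, so nothing the generated code can read or log contains the secret.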

Dynamic Workers Sandbox

Property         Value
Isolation        V8 isolate (same as Cloudflare Workers)
Cold start       Milliseconds
Memory           Megabytes per isolate
vs. containers   ~100x faster startup, ~100x more memory-efficient
Security extras  MPK (Memory Protection Keys), Spectre mitigations, dynamic risk-based cordoning
Availability     Open beta, paid Workers plan
Pricing          $0.002/unique Worker/day (waived in beta)

MCP vs. Code Mode vs. Skills (Summit Framing)

The Cloudflare MCP Dev Summit session framed three patterns for exposing APIs to agents:

Pattern        Token cost                     Flexibility            When to use
Raw MCP tools  O(N endpoints)                 Per-operation          Small, stable APIs
Code Mode      Fixed (~1,000 tokens)          Arbitrary composition  Large, dynamic APIs
Skills         Near-zero (instructions only)  Opinionated workflows  Teaching agents Cloudflare idioms

These aren't mutually exclusive — Cloudflare ships all three. The Skills plugin (github.com/cloudflare/skills) complements Code Mode by giving agents intent-level context.

Key Characteristics

Property         Value
Package          @cloudflare/codemode
Runtime          Cloudflare Dynamic Workers (V8 isolates)
Token reduction  32–81% typical; up to 99.9% for large APIs
Presented        MCP Dev Summit NA, April 2-3, 2026, NYC
License          Open-source (Apache 2.0)
Requires         Cloudflare Workers paid plan (for Dynamic Workers sandbox)

Further Reading