# Caching with Cloudflare Workers: When to Use What
How caching behaves when Workers fetch from orange-clouded origins on different zones.
**TL;DR:** `cf` caching options (`cacheTtl`, `cacheEverything`, `cacheTtlByStatus`) are ignored for cross-zone orange-clouded origins. Use the Cache API or KV instead.
## Which caching approach to use

| Scenario | Approach | Effort |
|---|---|---|
| Non-CF or same zone | fetch() + cf options | Low |
| Cross-zone, simple needs | Cache API | Low |
| Cross-zone, global consistency | KV | High |
| Cross-zone, coordinated writes | KV + Durable Objects | Very high |
## Why cross-zone caching is different

From How the Cache works:
> “First, `fetch` checks to see if the URL matches a different zone. If it does, it reads through that zone’s cache (or Worker). Otherwise, it reads through its own zone’s cache, even if the URL is for a non-Cloudflare site.”
Requests to cross-zone orange-clouded origins route to that zone’s edge, not your zone’s cache.
From Cache using fetch:
> “Workers operating on behalf of different zones cannot affect each other’s cache. You can only override cache keys when making requests within your own zone… or requests to hosts that are not on Cloudflare. When making a request to another Cloudflare zone (for example, belonging to a different Cloudflare customer), that zone fully controls how its own content is cached within Cloudflare; you cannot override it.”
This is by design - one zone cannot control another’s cache.
## How it works

Three layers influence caching behavior:
- Origin response headers (`Cache-Control`, `Expires`)
- Cloudflare zone settings (Cache Rules, Edge TTL, Browser TTL)
- Worker `cf` options (`cacheTtl`, `cacheEverything`, `cacheTtlByStatus`)
## Default behavior (no cf options, no Cache Rules)

When fetching from same-zone or non-CF origins without `cf` options, default caching applies:
| Scenario | Cached? | Why |
|---|---|---|
| Static file extension (.js, .css, .png, etc.) | Yes | Default cacheable extensions |
| HTML, JSON, or other content | No | Not in default extension list |
| Non-default type with `Cache-Control: public, max-age=3600` | No | Cloudflare caches by extension, not MIME type - need `cf` options |
| Origin returns `Cache-Control: no-store` or `private` | No | Explicitly non-cacheable |
| Response has `Set-Cookie` header | Depends | With `cacheTtl`: cached, cookie removed. With `cacheEverything` alone: not cached, cookie preserved. See docs |
`Cache-Control` headers control how long something is cached, not whether it gets cached. Non-default types need `cf` options or a Cache Rule.
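To make the extension-based rule concrete, the decision can be sketched like this. The extension list below is a small assumed subset for illustration - Cloudflare’s actual default list is much longer:

```typescript
// Illustrative subset only - Cloudflare's real default-extension list is
// much longer (see the Default Cache Behavior docs).
const DEFAULT_CACHEABLE_EXTENSIONS = new Set([
  "js", "css", "png", "jpg", "gif", "svg", "ico", "woff2",
]);

// Sketch of the decision: default cacheability is keyed off the path's
// extension, not the response's Content-Type.
function cachedByDefault(path: string): boolean {
  const lastSegment = path.split("/").pop() ?? "";
  const dot = lastSegment.lastIndexOf(".");
  if (dot === -1) return false; // no extension - not cached by default
  return DEFAULT_CACHEABLE_EXTENSIONS.has(
    lastSegment.slice(dot + 1).toLowerCase(),
  );
}
```

Under this model, `/app.js` is cached by default while `/api/data.json` and extension-less HTML routes are not - regardless of what `Cache-Control` the origin sends.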
## With cf options (same zone or non-CF origin)

| cf Option | Effect |
|---|---|
| `cacheEverything: true` | Cache regardless of file extension (respects origin’s `Cache-Control` for TTL) |
| `cacheTtl: 3600` | Force cache for 1 hour (implicit `cacheEverything`, ignores origin headers) |
| `cacheTtlByStatus: { "200-299": 3600 }` | Override TTL by status code (does not implicitly enable `cacheEverything`) |
`cacheTtl` implicitly enables `cacheEverything` - the docs state it’s “equivalent to setting two Page Rules: Edge Cache TTL and Cache Level (to Cache Everything).”

`cacheTtlByStatus` does not implicitly enable `cacheEverything`. It only overrides TTL for responses that would already be cached (default cacheable extensions or when paired with `cacheEverything`). To cache non-default content types with status-based TTLs, combine both:
```ts
const response = await fetch(request, {
  cf: {
    cacheEverything: true,
    cacheTtlByStatus: { "200-299": 3600, "404": 60, "500-599": 0 },
  },
});
```

**TTL control:** `cacheEverything` alone respects the origin’s `Cache-Control` for TTL. `cacheTtl`/`cacheTtlByStatus` override it.
## Cross-zone behavior

When fetching cross-zone orange-clouded origins, the request goes to the origin zone’s edge - your `cf` options are ignored. The origin zone’s Cache Rules and settings apply instead.
To enable caching, the origin zone must configure it via:
- Cache Rules (for non-default content types)
- A Worker with `cf` options
- For default cacheable extensions, caching happens automatically; `Cache-Control` headers control TTL
## cf options compatibility

| cf Option | Non-CF / Same Zone | Cross-Zone |
|---|---|---|
| `cacheTtl`, `cacheEverything`, `cacheTtlByStatus` | Yes | No |
| `image` | Yes | Yes |
| `polish`, `minify`, `mirage` | Yes | No (origin zone’s settings apply) |
## Custom cache keys

The `cacheKey` option lets you control what makes two requests “the same” for caching purposes. The value is a string that becomes the cache key identifier.
```ts
// Example: Cache based on normalized URL (strip tracking params)
const url = new URL(request.url);
["utm_source", "utm_medium", "utm_campaign", "fbclid"].forEach((p) =>
  url.searchParams.delete(p),
);
const normalizedKey = url.toString();

const response = await fetch(request, {
  cf: { cacheTtl: 3600, cacheKey: normalizedKey },
});
```

```ts
// Example: Separate cache entries by device type
const device = request.headers.get("CF-Device-Type") || "desktop";
const deviceCategory = device === "desktop" ? "desktop" : "mobile";
const cacheKey = `${request.url}-${deviceCategory}`;

const response = await fetch(request, {
  cf: { cacheTtl: 3600, cacheKey },
});
```

```ts
// Example: Include language in cache key
const lang = request.headers.get("Accept-Language")?.split(",")[0] || "en";
const cacheKey = `${request.url}-${lang}`;

const response = await fetch(request, {
  cf: { cacheTtl: 3600, cacheKey },
});
```

You can build arbitrarily complex cache keys by constructing the string yourself - include/exclude query params, add headers, cookies, or any request property. Cache Rules offer a no-code alternative for the same functionality.
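One more normalization worth knowing (a sketch, not from the examples above): sorting query parameters, so `?a=1&b=2` and `?b=2&a=1` share a single cache entry instead of fragmenting the cache:

```typescript
// Sort query params so that parameter order doesn't fragment the cache.
function normalizeUrlForCacheKey(rawUrl: string): string {
  const url = new URL(rawUrl);
  const sorted = [...url.searchParams.entries()].sort(([a], [b]) =>
    a.localeCompare(b),
  );
  url.search = new URLSearchParams(sorted).toString();
  return url.toString();
}
```

Pass the result as `cacheKey` the same way as in the examples above.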
## Zone settings that affect caching

Several zone-level settings can prevent caching from working as expected:
**Development Mode** temporarily suspends edge caching for up to 3 hours. When enabled, all caching is bypassed - including Cache API operations. Check this first if caching suddenly stops working.
**Cache Rules** with “Bypass cache” can override Worker caching behavior. However, Workers can override Cache Rules if the appropriate compatibility flags are enabled:
| Flag | Purpose | Auto-enabled |
|---|---|---|
| `cache_api_compat_flags` | Enables compatibility flag functionality for Cache API | compatibility_date >= 2025-04-19 |
| `cache_api_request_cf_overrides_cache_rules` | Allows Cache API to override Cache Rules | compatibility_date >= 2025-05-19 |
If your Worker has an earlier compatibility date, add these flags manually in `wrangler.toml`:

```toml
compatibility_date = "2024-01-01"
compatibility_flags = [
  "cache_api_compat_flags",
  "cache_api_request_cf_overrides_cache_rules"
]
```

**Paused Cloudflare** - If Cloudflare is paused on the zone, traffic goes directly to origin and all Cloudflare services (including caching) are bypassed.
**Cloudflare Access** - Routes protected by Access return `CF-Cache-Status: DYNAMIC` and won’t be cached at the edge, regardless of Worker caching logic or `Cache-Control` headers. To enable caching, exclude the path from your Access application policy.
## Tiered Cache and Cache Reserve

Two additional caching features affect how content is stored and retrieved:
**Tiered Cache** reduces origin load by having upper-tier data centers serve as intermediaries. When a lower-tier colo has a cache miss, it checks upper-tier colos before going to origin.
- Works with `fetch()` and `cf` options (same zone/non-CF)
- Does NOT work with the Cache API - the Cache API is per-colo only
- Enabled at the zone level, not per-request
**Cache Reserve** provides persistent storage for cached content, preventing eviction during traffic spikes or for infrequently accessed content.
- Extends cache retention beyond standard TTL limits
- Useful for large files or content with long TTLs
- Billed based on storage and operations
## Cross-zone caching solutions

When `cf` options don’t work (cross-zone fetches), you have two choices: the Cache API for simplicity, or KV for global consistency.
### Cache API

Since `caches.default` shares the same namespace as your zone’s CDN cache, you can work around cross-zone limitations by storing fetched responses locally with `cache.put()`. The cross-zone fetch still happens on cache miss, but subsequent requests in that colo hit your local cache instead.
**TTL control:** To honor the origin’s TTL, preserve the `Cache-Control` header. To override it, set your own.
```ts
async function fetchWithCache(
  request: Request,
  originUrl: string,
  ctx: ExecutionContext,
): Promise<Response> {
  const cache = caches.default;
  const cacheKey = new Request(originUrl, { method: "GET" });

  // Check cache first
  let cached = await cache.match(cacheKey);

  if (cached) {
    // Handle cache bypass (e.g., browser refresh)
    const cacheControl = request.headers.get("Cache-Control");
    const shouldBypass = cacheControl?.includes("no-cache");

    if (shouldBypass) {
      // Cancel the body stream to avoid resource leaks
      if (cached.body) {
        await cached.body.cancel();
      }
      cached = undefined;
    } else {
      return cached;
    }
  }

  // Fetch from origin
  const originResp = await fetch(originUrl);

  // Don't cache error responses
  if (!originResp.ok) {
    return originResp;
  }

  // Prepare response for caching
  const headers = new Headers(originResp.headers);
  headers.delete("Set-Cookie"); // Cache API rejects responses with Set-Cookie

  if (!headers.has("Cache-Control")) {
    headers.set("Cache-Control", "public, max-age=3600");
  }

  const response = new Response(originResp.body, {
    status: originResp.status,
    headers,
  });

  // Store in cache using waitUntil (non-blocking, doesn't delay response)
  ctx.waitUntil(cache.put(cacheKey, response.clone()));

  return response;
}
```

### Handling large responses

Workers have a 128 MB memory limit, and `response.clone()` buffers the entire body into memory. For large responses, use `body.tee()` instead:
```ts
async function fetchLargeWithCache(
  originUrl: string,
  ctx: ExecutionContext,
): Promise<Response> {
  const cache = caches.default;
  const cacheKey = new Request(originUrl, { method: "GET" });

  const cached = await cache.match(cacheKey);
  if (cached) return cached;

  const originResp = await fetch(originUrl);
  if (!originResp.ok || !originResp.body) return originResp;

  // tee() creates two streams from one - avoids buffering the entire body
  const [stream1, stream2] = originResp.body.tee();

  const headers = new Headers(originResp.headers);
  headers.delete("Set-Cookie");
  if (!headers.has("Cache-Control")) {
    headers.set("Cache-Control", "public, max-age=3600");
  }

  const responseToCache = new Response(stream1, {
    status: originResp.status,
    headers,
  });
  ctx.waitUntil(cache.put(cacheKey, responseToCache));

  return new Response(stream2, { status: originResp.status, headers });
}
```

### Caching R2 objects

When serving files from R2, use `tee()` to stream the response while caching:
```ts
async function serveR2WithCache(
  request: Request,
  env: Env,
  ctx: ExecutionContext,
): Promise<Response> {
  const url = new URL(request.url);
  const key = url.pathname.slice(1); // Remove leading slash

  const cache = caches.default;
  // Use the request URL as cache key
  const cacheKey = new Request(url.toString(), { method: "GET" });

  // Check cache first
  const cached = await cache.match(cacheKey);
  if (cached) {
    return cached;
  }

  // Fetch from R2
  const obj = await env.R2.get(key);
  if (!obj) {
    return new Response("Not found", { status: 404 });
  }

  // tee() the R2 stream - one for cache, one for response
  const [stream1, stream2] = obj.body.tee();

  const headers = new Headers();
  headers.set(
    "Content-Type",
    obj.httpMetadata?.contentType || "application/octet-stream",
  );
  headers.set("Content-Length", String(obj.size));
  headers.set("Cache-Control", "public, max-age=3600");
  headers.set("ETag", obj.httpEtag);

  const responseToCache = new Response(stream1, { status: 200, headers });

  // Store in cache without blocking the response
  ctx.waitUntil(cache.put(cacheKey, responseToCache));

  return new Response(stream2, { status: 200, headers });
}
```

## Observability

Understanding how cache status is reported helps with debugging and analytics.
### CF-Cache-Status header

When using the Cache API with `caches.default`:
| Operation | CF-Cache-Status | Notes |
|---|---|---|
| `cache.match()` returns cached response | HIT | Cloudflare adds this automatically |
| `cache.match()` returns `undefined` | N/A | No response to add header to |
| `cache.put()` | N/A | Storage operation, no response |
The `CF-Cache-Status: HIT` header is automatically added by Cloudflare when you retrieve a cached response via `cache.match()`. You don’t need to add your own header.
```ts
const cached = await cache.match(cacheKey);
if (cached) {
  // cached.headers.get('CF-Cache-Status') === 'HIT'
  // This is added automatically by Cloudflare, not by your code
  return cached;
}
```

### Logpush integration

Cache API operations generate separate log entries in the HTTP requests dataset with `ClientRequestSource` set to `edgeWorkerCacheAPI` (value 6). These entries have `WorkerSubrequest: true` and link back to the parent request via `ParentRayID`.
| Cache Operation | EdgeResponseStatus | CacheCacheStatus |
|---|---|---|
| `cache.match()` HIT | 200 | hit |
| `cache.match()` MISS | 504 | miss |
| `cache.put()` | 204 | unknown |
The 504 on MISS is documented behavior:
> “`cache.match` generates a 504 error response when the requested content is missing or expired. The Cache API does not expose this 504 directly to the Worker script, instead returning `undefined`. Nevertheless, the underlying 504 is still visible in Cloudflare Logs.”
Filtering Logpush data:

- End-user traffic only: filter on `ClientRequestSource = 1` (eyeball)
- Cache API calls only: filter on `ClientRequestSource = 6` (edgeWorkerCacheAPI)
- Cache hit rate: filter `ClientRequestSource = 6`, then compare `CacheCacheStatus = hit` vs `miss`
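Putting those filters together, the hit-rate calculation can be sketched like this. The record shape below is an assumption based on the fields described above; `cache.put()` entries report `unknown` and are excluded from the denominator:

```typescript
// Minimal shape of an HTTP requests dataset entry, as described above.
interface HttpRequestLogEntry {
  ClientRequestSource: number; // 6 = edgeWorkerCacheAPI
  CacheCacheStatus: string; // "hit" | "miss" | "unknown" | ...
}

function cacheApiHitRate(entries: HttpRequestLogEntry[]): number {
  // Only Cache API operations, not eyeball traffic
  const ops = entries.filter((e) => e.ClientRequestSource === 6);
  const hits = ops.filter((e) => e.CacheCacheStatus === "hit").length;
  const misses = ops.filter((e) => e.CacheCacheStatus === "miss").length;
  const total = hits + misses; // put() entries ("unknown") excluded
  return total === 0 ? 0 : hits / total;
}
```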
## KV for global consistency

KV provides global replication with eventual consistency. Changes may take up to 60 seconds or more to be visible in other locations. Using KV as a cache means building your own caching layer - you’re responsible for cache key generation, TTL management, invalidation, and purging.
### Cache API vs KV trade-offs

| Aspect | Cache API | KV |
|---|---|---|
| Consistency | Per-colo (different colos may have different content) | Global (eventually consistent across all colos) |
| TTL management | Automatic via Cache-Control headers | Manual via expirationTtl |
| Invalidation | cache.delete() per-colo only | KV.delete() propagates globally |
| Purge tooling | Built-in via Cloudflare dashboard/API | Roll your own or use cache-kv-purger |
| Value size limit | No hard limit (but cloning limited by 128 MB Worker memory) | 25 MB (chunking required for larger) |
### When to use KV over Cache API

- Single global cache - Fewer cold cache misses, global invalidation
- Programmatic invalidation - Need to purge specific items globally (not just per-colo)
- Cross-Worker sharing - Multiple Workers need to share cached data
- Metadata-driven purging - Need to find and purge items by tags/metadata
### Basic implementation

```ts
async function fetchWithKV(
  originUrl: string,
  env: Env,
  ctx: ExecutionContext,
): Promise<Response> {
  const cacheKey = new URL(originUrl).pathname;

  // Check KV first - body stored as arrayBuffer, metadata separately
  const { value, metadata } = await env.CACHE_KV.getWithMetadata<
    ArrayBuffer,
    { contentType: string; status: number; cachedAt: number }
  >(cacheKey, { type: "arrayBuffer" });

  if (value && metadata) {
    // Optional: refresh TTL on hit without rewriting value
    const age = Date.now() - metadata.cachedAt;
    if (age > 1800000) {
      // 30 min
      ctx.waitUntil(refreshTTL(env, cacheKey, metadata));
    }
    return new Response(value, {
      status: metadata.status,
      headers: { "Content-Type": metadata.contentType },
    });
  }

  // Fetch from origin
  const response = await fetch(originUrl);

  if (!response.ok) {
    return response;
  }

  // Store in KV: body as value, headers as metadata
  const body = await response.arrayBuffer();
  const contentType =
    response.headers.get("Content-Type") || "application/octet-stream";

  // Use waitUntil for non-blocking write
  ctx.waitUntil(
    env.CACHE_KV.put(cacheKey, body, {
      expirationTtl: 3600,
      metadata: { contentType, status: response.status, cachedAt: Date.now() },
    }),
  );

  return new Response(body, {
    status: response.status,
    headers: { "Content-Type": contentType },
  });
}

async function refreshTTL(
  env: Env,
  key: string,
  metadata: object,
): Promise<void> {
  // KV doesn't support TTL refresh without rewriting - must read and write
  const { value } = await env.CACHE_KV.getWithMetadata(key, {
    type: "arrayBuffer",
  });
  if (value) {
    await env.CACHE_KV.put(key, value, {
      expirationTtl: 3600,
      metadata: { ...metadata, cachedAt: Date.now() },
    });
  }
}
```

### What you’re building

KV caching requires implementing what the Cache API gives you automatically:
- Cache key generation - Deterministic keys from request params (e.g., `video:sample.mp4:w=1280:h=720`)
- Metadata storage - Headers, content-type, timestamps alongside value (KV metadata limited to 1 KB)
- TTL management - `expirationTtl` on write, tracking `cachedAt` for refresh decisions
- Chunking - Splitting files > 25 MB across multiple keys, reassembling on read
- Invalidation - By exact key, prefix/pattern (requires listing), or metadata tags
### Advanced patterns

**Cache versioning** - Atomic invalidation without listing keys

Instead of purging individual keys, increment a version number:
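A minimal sketch of the pattern, with a plain `Map` standing in for KV so the logic is self-contained (key names like `cache:version` are illustrative assumptions, not a KV convention):

```typescript
// A Map stands in for KV; in a Worker these would be KV reads/writes.
type KvLike = Map<string, string>;

function getCacheVersion(kv: KvLike): number {
  return Number(kv.get("cache:version") ?? "1");
}

function versionedGet(kv: KvLike, resource: string): string | undefined {
  return kv.get(`v${getCacheVersion(kv)}:${resource}`);
}

function versionedPut(kv: KvLike, resource: string, value: string): void {
  kv.set(`v${getCacheVersion(kv)}:${resource}`, value);
}

// One write invalidates everything: old "v1:*" entries become orphaned
// (left to expire via TTL in real KV) rather than individually deleted.
function invalidateAll(kv: KvLike): void {
  kv.set("cache:version", String(getCacheVersion(kv) + 1));
}
```

In real KV the version read adds a lookup per request, so you would typically cache it per-isolate or in the Cache API - the same trick the content-addressable section below uses for its version pointer.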
**Request coalescing** - Prevent duplicate origin fetches
When multiple requests arrive for the same uncached content:
```ts
const inFlight = new Map<string, Promise<Response>>();

async function fetchWithCoalescing(
  cacheKey: string,
  url: string,
): Promise<Response> {
  const pending = inFlight.get(cacheKey);
  if (pending) {
    return (await pending).clone();
  }

  // fetchAndCache (fetch + cache.put) is assumed to be defined elsewhere
  const promise = fetchAndCache(url);
  inFlight.set(cacheKey, promise);
  try {
    return await promise;
  } finally {
    // Delete in finally so a failed fetch doesn't leave a stale entry
    inFlight.delete(cacheKey);
  }
}
```

**Chunking** - Files > 25 MB across multiple keys

KV has a 25 MB value limit. Split larger files into chunks with a manifest:
```ts
interface ChunkManifest {
  totalSize: number;
  chunkSize: number;
  chunks: string[]; // KV keys for each chunk
  contentType: string;
}

async function storeLargeFile(
  key: string,
  data: ArrayBuffer,
  env: Env,
  ctx: ExecutionContext,
): Promise<void> {
  const CHUNK_SIZE = 20 * 1024 * 1024; // 20 MB chunks (under 25 MB limit)
  const chunks: string[] = [];

  for (let offset = 0; offset < data.byteLength; offset += CHUNK_SIZE) {
    const chunkKey = `${key}:chunk:${chunks.length}`;
    const chunk = data.slice(offset, offset + CHUNK_SIZE);
    ctx.waitUntil(env.CACHE_KV.put(chunkKey, chunk, { expirationTtl: 86400 }));
    chunks.push(chunkKey);
  }

  const manifest: ChunkManifest = {
    totalSize: data.byteLength,
    chunkSize: CHUNK_SIZE,
    chunks,
    contentType: "application/octet-stream",
  };

  ctx.waitUntil(
    env.CACHE_KV.put(key, JSON.stringify(manifest), { expirationTtl: 86400 }),
  );
}
```

For retrieval with range request support, see Media Transformation Architecture.
For production implementations of all patterns, see:
- video-resizer - Full implementation with versioning, coalescing, chunking
- Media Transformation Architecture - Detailed documentation
- cache-kv-purger - CLI for purging by tags/metadata
## Content-addressable storage with instant invalidation

KV’s eventual consistency (up to 60 seconds) makes cache invalidation challenging. Content-addressable storage sidesteps this: instead of updating content at existing keys, write new content to new keys (derived from content hashes) and update a version pointer.
Why this works: The key insight is that content-addressed keys are immutable - the same hash always returns the same content. You never update a key; you write to a new one. This means:
- No stale reads - if a key exists, its content is correct by definition
- No invalidation needed - old keys simply become orphaned, not stale
- Eventual consistency becomes irrelevant - you’re not waiting for updates to propagate, you’re waiting for new keys to appear (and the version pointer tells you when they’re ready)
This is how Workers Static Assets works internally - assets are stored by content hash in KV, and a manifest maps paths to hashes. The manifest is deployed atomically with the Worker, so there’s no propagation delay.
For centralized KV stores where you can’t redeploy Workers on every content change, use a Durable Object as the version pointer with Cache API to stay under the 1000 RPS limit.
Trade-off: You’re trading storage space for consistency. Old content accumulates until TTL expiration or explicit cleanup.
### Read and write flows

Read flow: resolve the current version (Cache API first, falling back to the Durable Object) → fetch `manifest:<version>` from KV → look up the content hash for the requested path → fetch `content:<hash>` from KV.

Write flow: write new content blobs to content-hash keys → write `manifest:<new version>` → wait for KV propagation → flip the version pointer in the Durable Object.

The wait before flipping ensures the new manifest has propagated to KV edge locations. Without this, some colos might see `v44` but fail to find `manifest:v44`.
### Caching strategy

| Layer | Strategy | Why |
|---|---|---|
| Version pointer | DO + Cache API (5-10s TTL) | Strong consistency, cached to stay under 1000 RPS |
| Manifest | KV with versioned key + in-memory | Immutable per version, safe to cache forever |
| Content | KV with content-hash key | Immutable, deduplicated, 1 year cache |
**RPS math:** With Cache API caching the version for 5 seconds across ~300 colos, you get ~60 RPS to the DO - well under the 1000 limit.
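That estimate is just colos divided by cache TTL - each colo refreshes the cached version at most about once per TTL window (both inputs are the rough assumptions from the paragraph above):

```typescript
// Worst-case steady-state request rate hitting the DO:
// one version refresh per colo per cache-TTL window.
function estimateDoRps(coloCount: number, versionCacheTtlSeconds: number): number {
  return coloCount / versionCacheTtlSeconds;
}
```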
### Implementation

**Durable Object** - Version pointer only
```ts
import { DurableObject } from "cloudflare:workers";

export class ManifestVersion extends DurableObject {
  private version: string | undefined;

  async getVersion(): Promise<string> {
    if (this.version === undefined) {
      this.version = (await this.ctx.storage.get<string>("version")) ?? "v0";
    }
    return this.version;
  }

  async setVersion(newVersion: string): Promise<string> {
    this.version = newVersion;
    await this.ctx.storage.put("version", newVersion);
    return this.version;
  }
}
```

**Reader Worker** - Cache version in Cache API, manifest in memory
```ts
interface Manifest {
  assets: Record<string, string>; // path -> content hash
  createdAt: number;
}

interface Env {
  MANIFEST_VERSION: DurableObjectNamespace<ManifestVersion>;
  CACHE_KV: KVNamespace;
}

// Per-isolate cache (survives across requests in same isolate)
let manifestCache: { version: string; data: Manifest } | null = null;

async function getCurrentVersion(
  env: Env,
  ctx: ExecutionContext,
): Promise<string> {
  const cache = caches.default;
  const cacheKey = new Request("https://internal/manifest-version");

  // Check Cache API (per-colo, short TTL)
  const cached = await cache.match(cacheKey);
  if (cached) {
    return await cached.text();
  }

  // Miss - fetch from DO (strong consistency)
  const stub = env.MANIFEST_VERSION.getByName("global");
  const version = await stub.getVersion();

  // Cache for 5s - balances freshness vs DO RPS
  ctx.waitUntil(
    cache.put(
      cacheKey,
      new Response(version, {
        headers: { "Cache-Control": "max-age=5" },
      }),
    ),
  );

  return version;
}

async function getManifest(env: Env, ctx: ExecutionContext): Promise<Manifest> {
  const version = await getCurrentVersion(env, ctx);

  // Check per-isolate cache (fastest)
  if (manifestCache?.version === version) {
    return manifestCache.data;
  }

  // Fetch from KV (versioned key = immutable, cache 1 year)
  const manifest = await env.CACHE_KV.get<Manifest>(`manifest:${version}`, {
    type: "json",
    cacheTtl: 31536000,
  });

  if (!manifest) {
    throw new Error(`Manifest ${version} not found`);
  }

  manifestCache = { version, data: manifest };
  return manifest;
}

async function getAsset(
  path: string,
  env: Env,
  ctx: ExecutionContext,
): Promise<{ body: ArrayBuffer; contentType: string } | null> {
  const manifest = await getManifest(env, ctx);
  const contentHash = manifest.assets[path];

  if (!contentHash) return null;

  // Content-addressed = immutable, cache 1 year
  const { value, metadata } = await env.CACHE_KV.getWithMetadata<
    ArrayBuffer,
    { contentType: string }
  >(`content:${contentHash}`, { type: "arrayBuffer", cacheTtl: 31536000 });

  if (!value) return null;

  return {
    body: value,
    contentType: metadata?.contentType ?? "application/octet-stream",
  };
}
```

**Publisher** - Write content, then manifest, then flip version
```ts
async function publish(
  assets: Map<string, { content: ArrayBuffer; contentType: string }>,
  env: Env,
): Promise<string> {
  const newVersion = `v${Date.now()}`;
  const manifest: Manifest = { assets: {}, createdAt: Date.now() };

  // 1. Hash all content and build manifest
  const hashes = await Promise.all(
    [...assets.entries()].map(async ([path, { content }]) => ({
      path,
      hash: await sha256(content),
    })),
  );

  for (const { path, hash } of hashes) {
    manifest.assets[path] = hash;
  }

  // 2. Check which content already exists (deduplication)
  const existenceChecks = await Promise.all(
    hashes.map(async ({ hash }) => ({
      hash,
      exists:
        (await env.CACHE_KV.get(`content:${hash}`, { type: "arrayBuffer" })) !==
        null,
    })),
  );

  const newHashes = new Set(
    existenceChecks.filter(({ exists }) => !exists).map(({ hash }) => hash),
  );

  // 3. Write new content and manifest in parallel
  const writes: Promise<void>[] = [];

  for (const [path, { content, contentType }] of assets) {
    const hash = manifest.assets[path];
    if (newHashes.has(hash)) {
      writes.push(
        env.CACHE_KV.put(`content:${hash}`, content, {
          metadata: { contentType },
          expirationTtl: 86400 * 30,
        }),
      );
    }
  }

  writes.push(
    env.CACHE_KV.put(`manifest:${newVersion}`, JSON.stringify(manifest), {
      expirationTtl: 86400 * 30,
    }),
  );

  await Promise.all(writes);

  // 4. Wait for KV propagation before flipping version
  await new Promise((r) => setTimeout(r, 5000));

  // 5. Flip version pointer (instant, strongly consistent)
  const stub = env.MANIFEST_VERSION.getByName("global");
  await stub.setVersion(newVersion);

  return newVersion;
}

async function sha256(data: ArrayBuffer): Promise<string> {
  const hash = await crypto.subtle.digest("SHA-256", data);
  return [...new Uint8Array(hash)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}
```

### Consistency guarantees

| Event | Latency |
|---|---|
| Version flipped in DO | Instant (strongly consistent) |
| Worker sees new version | ≤5-10s (Cache API TTL) |
| Worker reads new manifest/content | Instant (versioned/hashed keys are immutable) |
Effective end-to-end latency: 5-10 seconds after publish completes. The 5s wait ensures KV has propagated the new manifest before the DO points to it.
## Microservices architecture

When building multi-Worker systems where a router Worker dispatches requests to origin Workers (often on different zones), you need to decide where caching happens. This affects latency, consistency, and operational complexity.
The key question: should the router cache responses from origins, or should each origin manage its own caching?
### Centralized caching (router handles all)

Which caching mechanism to use in the router:
| Origin Type | Caching Mechanism | Why |
|---|---|---|
| Not proxied (grey-clouded, non-CF) | fetch() with cf options | Request uses your zone’s cache |
| Orange-clouded (same zone) | fetch() with cf options | Same zone, cf options work |
| Orange-clouded (cross-zone) | Cache API or KV | cf options ignored |
| Pros | Cons |
|---|---|
| Single cache management point | Router becomes bottleneck |
| Consistent behavior | Extra hop latency |
| Easier debugging | Tight coupling to origins |
| Centralized circuit breakers | Router must know how each origin behaves |
### Distributed caching (origins decide)

| Pros | Cons |
|---|---|
| Each service owns its strategy | Inconsistent behavior |
| No single point of failure | Harder to debug system-wide |
| Independent deployments | Harder invalidation |
| Better separation of concerns | Duplicate logic |
### Recommendation

Prefer distributed caching for cross-zone Worker architectures:
- Router stays stateless - routing logic only
- Origins control their own caching - using `cf` options or Cache Rules
- If the router needs caching, use the Cache API (since origins are cross-zone)
Why: origins know their own caching needs, and a stateless router has fewer failure modes.
### Origin-side caching

Origin Workers cache upstream responses with `cf` options (same zone) and signal cacheability to downstream callers via headers:
```ts
// Origin Worker - fetches from upstream API and caches at origin's edge
export default {
  async fetch(request: Request): Promise<Response> {
    // Fetch from upstream with cf options (same zone or non-CF = works)
    // cacheTtl implicitly enables cacheEverything (JSON not cached by default)
    const upstream = await fetch("https://api.example.com/data", {
      cf: { cacheTtl: 3600 },
    });

    // Worker-generated responses bypass CDN cache
    // Cache-Control tells downstream (router, browser) how long to cache
    return new Response(upstream.body, {
      status: upstream.status,
      headers: {
        "Content-Type": "application/json",
        "Cache-Control": "public, max-age=3600",
      },
    });
  },
};
```

Two levels of caching here:
- CDN cache (via `cf` options) - caches the upstream API response at this zone’s edge
- Downstream cache (via `Cache-Control` header) - tells the calling Worker/browser how long to cache this response
For details on how Cache-Control headers interact with Cloudflare, see Origin Cache Control.