# Caching with Cloudflare Workers: When to Use What
How caching behaves when Workers fetch from orange-clouded origins on different zones.
**TL;DR:** `cf` caching options (`cacheTtl`, `cacheEverything`, `cacheTtlByStatus`) are ignored for cross-zone orange-clouded origins. Use the Cache API or KV instead.
## Which caching approach to use

| Scenario | Approach | Effort |
|---|---|---|
| Non-CF or same zone | fetch() + cf options | Low |
| Cross-zone, simple needs | Cache API | Low |
| Cross-zone, global consistency | KV | High |
| Cross-zone, coordinated writes | KV + Durable Objects | Very high |
## Why cross-zone caching is different

From How the Cache works:
> “First, `fetch` checks to see if the URL matches a different zone. If it does, it reads through that zone’s cache (or Worker). Otherwise, it reads through its own zone’s cache, even if the URL is for a non-Cloudflare site.”
Requests to cross-zone orange-clouded origins route to that zone’s edge, not your zone’s cache.
From Cache using fetch:
> “Workers operating on behalf of different zones cannot affect each other’s cache. You can only override cache keys when making requests within your own zone… or requests to hosts that are not on Cloudflare. When making a request to another Cloudflare zone (for example, belonging to a different Cloudflare customer), that zone fully controls how its own content is cached within Cloudflare; you cannot override it.”
This is by design - one zone cannot control another’s cache.
## How it works

Three layers influence caching behavior:
- Origin response headers (`Cache-Control`, `Expires`)
- Cloudflare zone settings (Cache Rules, Edge TTL, Browser TTL)
- Worker `cf` options (`cacheTtl`, `cacheEverything`, `cacheTtlByStatus`)
## Default behavior (no cf options, no Cache Rules)

When fetching from same-zone or non-CF origins without `cf` options, default caching applies:
| Scenario | Cached? | Why |
|---|---|---|
| Static file extension (.js, .css, .png, etc.) | Yes | Default cacheable extensions |
| HTML, JSON, or other content | No | Not in default extension list |
| Non-default type with `Cache-Control: public, max-age=3600` | No | Cloudflare caches by extension, not MIME type - need `cf` options |
| Origin returns `Cache-Control: no-store` or `private` | No | Explicitly non-cacheable |
| Response has `Set-Cookie` header | Depends | With `cacheTtl`: cached, cookie removed. With `cacheEverything` alone: not cached, cookie preserved. See docs |
`Cache-Control` headers control how long something is cached, not whether it gets cached. Non-default types need `cf` options or a Cache Rule.
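To make the extension-based rule concrete, the decision can be sketched like this. The extension list below is a small assumed subset for illustration - Cloudflare’s actual default list is much longer:

```typescript
// Illustrative subset only - Cloudflare's real default-extension list is
// much longer (see the Default Cache Behavior docs).
const DEFAULT_CACHEABLE_EXTENSIONS = new Set([
  "js", "css", "png", "jpg", "gif", "svg", "ico", "woff2",
]);

// Sketch of the decision: default cacheability is keyed off the path's
// extension, not the response's Content-Type.
function cachedByDefault(path: string): boolean {
  const lastSegment = path.split("/").pop() ?? "";
  const dot = lastSegment.lastIndexOf(".");
  if (dot === -1) return false; // no extension - not cached by default
  return DEFAULT_CACHEABLE_EXTENSIONS.has(
    lastSegment.slice(dot + 1).toLowerCase(),
  );
}
```

Under this model, `/app.js` is cached by default while `/api/data.json` and extension-less HTML routes are not - regardless of what `Cache-Control` the origin sends.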
## With cf options (same zone or non-CF origin)

| cf Option | Effect |
|---|---|
| `cacheEverything: true` | Cache regardless of file extension (respects origin’s `Cache-Control` for TTL) |
| `cacheTtl: 3600` | Force cache for 1 hour (implicit `cacheEverything`, ignores origin headers) |
| `cacheTtlByStatus: { "200-299": 3600 }` | Override TTL by status code (does not implicitly enable `cacheEverything`) |
`cacheTtl` implicitly enables `cacheEverything` - the docs state it’s “equivalent to setting two Page Rules: Edge Cache TTL and Cache Level (to Cache Everything).”

`cacheTtlByStatus` does not implicitly enable `cacheEverything`. It only overrides TTL for responses that would already be cached (default cacheable extensions or when paired with `cacheEverything`). To cache non-default content types with status-based TTLs, combine both:
```ts
const response = await fetch(request, {
  cf: {
    cacheEverything: true,
    cacheTtlByStatus: { "200-299": 3600, "404": 60, "500-599": 0 },
  },
});
```

**TTL control:** `cacheEverything` alone respects the origin’s `Cache-Control` for TTL. `cacheTtl`/`cacheTtlByStatus` override it.
## Cross-zone behavior

When fetching cross-zone orange-clouded origins, the request goes to the origin zone’s edge - your `cf` options are ignored. The origin zone’s Cache Rules and settings apply instead.
To enable caching, the origin zone must configure it via:
- Cache Rules (for non-default content types)
- A Worker with `cf` options
- For default cacheable extensions, caching happens automatically; `Cache-Control` headers control TTL
## cf options compatibility

| cf Option | Non-CF / Same Zone | Cross-Zone |
|---|---|---|
| `cacheTtl`, `cacheEverything`, `cacheTtlByStatus` | Yes | No |
| `image` | Yes | Yes |
| `polish`, `minify`, `mirage` | Yes | No (origin zone’s settings apply) |
## Custom cache keys

The `cacheKey` option lets you control what makes two requests “the same” for caching purposes. The value is a string that becomes the cache key identifier.
```ts
// Example: Cache based on normalized URL (strip tracking params)
const url = new URL(request.url);
["utm_source", "utm_medium", "utm_campaign", "fbclid"].forEach((p) =>
  url.searchParams.delete(p),
);
const normalizedKey = url.toString();

const response = await fetch(request, {
  cf: { cacheTtl: 3600, cacheKey: normalizedKey },
});
```

```ts
// Example: Separate cache entries by device type
const device = request.headers.get("CF-Device-Type") || "desktop";
const deviceCategory = device === "desktop" ? "desktop" : "mobile";
const cacheKey = `${request.url}-${deviceCategory}`;

const response = await fetch(request, {
  cf: { cacheTtl: 3600, cacheKey },
});
```

```ts
// Example: Include language in cache key
const lang = request.headers.get("Accept-Language")?.split(",")[0] || "en";
const cacheKey = `${request.url}-${lang}`;

const response = await fetch(request, {
  cf: { cacheTtl: 3600, cacheKey },
});
```

You can build arbitrarily complex cache keys by constructing the string yourself - include/exclude query params, add headers, cookies, or any request property. Cache Rules offer a no-code alternative for the same functionality.
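One more normalization worth knowing (a sketch, not from the examples above): sorting query parameters, so `?a=1&b=2` and `?b=2&a=1` share a single cache entry instead of fragmenting the cache:

```typescript
// Sort query params so that parameter order doesn't fragment the cache.
function normalizeUrlForCacheKey(rawUrl: string): string {
  const url = new URL(rawUrl);
  const sorted = [...url.searchParams.entries()].sort(([a], [b]) =>
    a.localeCompare(b),
  );
  url.search = new URLSearchParams(sorted).toString();
  return url.toString();
}
```

Pass the result as `cacheKey` the same way as in the examples above.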
## Zone settings that affect caching

Several zone-level settings can prevent caching from working as expected:
**Development Mode** temporarily suspends edge caching for up to 3 hours. When enabled, all caching is bypassed - including Cache API operations. Check this first if caching suddenly stops working.
**Cache Rules** with “Bypass cache” can override Worker caching behavior. However, Workers can override Cache Rules if the appropriate compatibility flags are enabled:
| Flag | Purpose | Auto-enabled |
|---|---|---|
| `cache_api_compat_flags` | Enables compatibility flag functionality for Cache API | compatibility_date >= 2025-04-19 |
| `cache_api_request_cf_overrides_cache_rules` | Allows Cache API to override Cache Rules | compatibility_date >= 2025-05-19 |
If your Worker has an earlier compatibility date, add these flags manually in `wrangler.toml`:

```toml
compatibility_date = "2024-01-01"
compatibility_flags = [
  "cache_api_compat_flags",
  "cache_api_request_cf_overrides_cache_rules"
]
```

**Paused Cloudflare** - If Cloudflare is paused on the zone, traffic goes directly to origin and all Cloudflare services (including caching) are bypassed.
**Cloudflare Access** - Routes protected by Access return `CF-Cache-Status: DYNAMIC` and won’t be cached at the edge, regardless of Worker caching logic or `Cache-Control` headers. To enable caching, exclude the path from your Access application policy.
## Tiered Cache and Cache Reserve

Two additional caching features affect how content is stored and retrieved:
**Tiered Cache** reduces origin load by having upper-tier data centers serve as intermediaries. When a lower-tier colo has a cache miss, it checks upper-tier colos before going to origin.
- Works with `fetch()` and `cf` options (same zone/non-CF)
- Does NOT work with the Cache API - the Cache API is per-colo only
- Enabled at the zone level, not per-request
**Cache Reserve** provides persistent storage for cached content, preventing eviction during traffic spikes or for infrequently accessed content.
- Extends cache retention beyond standard TTL limits
- Useful for large files or content with long TTLs
- Billed based on storage and operations
## Cross-zone caching solutions

When `cf` options don’t work (cross-zone fetches), you have two choices: the Cache API for simplicity, or KV for global consistency.
### Cache API

Since `caches.default` shares the same namespace as your zone’s CDN cache, you can work around cross-zone limitations by storing fetched responses locally with `cache.put()`. The cross-zone fetch still happens on cache miss, but subsequent requests in that colo hit your local cache instead.
**TTL control:** To honor the origin’s TTL, preserve the `Cache-Control` header. To override it, set your own.
```ts
async function fetchWithCache(
  request: Request,
  originUrl: string,
  ctx: ExecutionContext,
): Promise<Response> {
  const cache = caches.default;
  const cacheKey = new Request(originUrl, { method: "GET" });

  // Check cache first
  let cached = await cache.match(cacheKey);

  if (cached) {
    // Handle cache bypass (e.g., browser refresh)
    const cacheControl = request.headers.get("Cache-Control");
    const shouldBypass = cacheControl?.includes("no-cache");

    if (shouldBypass) {
      // Cancel the body stream to avoid resource leaks
      if (cached.body) {
        await cached.body.cancel();
      }
      cached = undefined;
    } else {
      return cached;
    }
  }

  // Fetch from origin
  const originResp = await fetch(originUrl);

  // Don't cache error responses
  if (!originResp.ok) {
    return originResp;
  }

  // Prepare response for caching
  const headers = new Headers(originResp.headers);
  headers.delete("Set-Cookie"); // Cache API rejects responses with Set-Cookie

  if (!headers.has("Cache-Control")) {
    headers.set("Cache-Control", "public, max-age=3600");
  }

  const response = new Response(originResp.body, {
    status: originResp.status,
    headers,
  });

  // Store in cache using waitUntil (non-blocking, doesn't delay response)
  ctx.waitUntil(cache.put(cacheKey, response.clone()));

  return response;
}
```

### Handling large responses

Workers have a 128 MB memory limit, and `response.clone()` buffers the entire body into memory. For large responses, use `body.tee()` instead:
```ts
async function fetchLargeWithCache(
  originUrl: string,
  ctx: ExecutionContext,
): Promise<Response> {
  const cache = caches.default;
  const cacheKey = new Request(originUrl, { method: "GET" });

  const cached = await cache.match(cacheKey);
  if (cached) return cached;

  const originResp = await fetch(originUrl);
  if (!originResp.ok || !originResp.body) return originResp;

  // tee() creates two streams from one - avoids buffering the entire body
  const [stream1, stream2] = originResp.body.tee();

  const headers = new Headers(originResp.headers);
  headers.delete("Set-Cookie");
  if (!headers.has("Cache-Control")) {
    headers.set("Cache-Control", "public, max-age=3600");
  }

  const responseToCache = new Response(stream1, {
    status: originResp.status,
    headers,
  });
  ctx.waitUntil(cache.put(cacheKey, responseToCache));

  return new Response(stream2, { status: originResp.status, headers });
}
```

### Caching R2 objects

When serving files from R2, use `tee()` to stream the response while caching:
```ts
async function serveR2WithCache(
  request: Request,
  env: Env,
  ctx: ExecutionContext,
): Promise<Response> {
  const url = new URL(request.url);
  const key = url.pathname.slice(1); // Remove leading slash

  const cache = caches.default;
  // Use the request URL as cache key
  const cacheKey = new Request(url.toString(), { method: "GET" });

  // Check cache first
  const cached = await cache.match(cacheKey);
  if (cached) {
    return cached;
  }

  // Fetch from R2
  const obj = await env.R2.get(key);
  if (!obj) {
    return new Response("Not found", { status: 404 });
  }

  // tee() the R2 stream - one for cache, one for response
  const [stream1, stream2] = obj.body.tee();

  const headers = new Headers();
  headers.set(
    "Content-Type",
    obj.httpMetadata?.contentType || "application/octet-stream",
  );
  headers.set("Content-Length", String(obj.size));
  headers.set("Cache-Control", "public, max-age=3600");
  headers.set("ETag", obj.httpEtag);

  const responseToCache = new Response(stream1, { status: 200, headers });

  // Store in cache without blocking the response
  ctx.waitUntil(cache.put(cacheKey, responseToCache));

  return new Response(stream2, { status: 200, headers });
}
```

## Observability

Understanding how cache status is reported helps with debugging and analytics.
### CF-Cache-Status header

When using the Cache API with `caches.default`:
| Operation | CF-Cache-Status | Notes |
|---|---|---|
| `cache.match()` returns cached response | HIT | Cloudflare adds this automatically |
| `cache.match()` returns `undefined` | N/A | No response to add header to |
| `cache.put()` | N/A | Storage operation, no response |
The `CF-Cache-Status: HIT` header is automatically added by Cloudflare when you retrieve a cached response via `cache.match()`. You don’t need to add your own header.
```ts
const cached = await cache.match(cacheKey);
if (cached) {
  // cached.headers.get('CF-Cache-Status') === 'HIT'
  // This is added automatically by Cloudflare, not by your code
  return cached;
}
```

### Logpush integration

Cache API operations generate separate log entries in the HTTP requests dataset with `ClientRequestSource` set to `edgeWorkerCacheAPI` (value 6). These entries have `WorkerSubrequest: true` and link back to the parent request via `ParentRayID`.
| Cache Operation | EdgeResponseStatus | CacheCacheStatus |
|---|---|---|
| `cache.match()` HIT | 200 | hit |
| `cache.match()` MISS | 504 | miss |
| `cache.put()` | 204 | unknown |
The 504 on MISS is documented behavior:
> “`cache.match` generates a 504 error response when the requested content is missing or expired. The Cache API does not expose this 504 directly to the Worker script, instead returning `undefined`. Nevertheless, the underlying 504 is still visible in Cloudflare Logs.”
Filtering Logpush data:

- End-user traffic only: filter on `ClientRequestSource = 1` (eyeball)
- Cache API calls only: filter on `ClientRequestSource = 6` (edgeWorkerCacheAPI)
- Cache hit rate: filter `ClientRequestSource = 6`, then compare `CacheCacheStatus = hit` vs `miss`
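Putting those filters together, the hit-rate calculation can be sketched like this. The record shape below is an assumption based on the fields described above; `cache.put()` entries report `unknown` and are excluded from the denominator:

```typescript
// Minimal shape of an HTTP requests dataset entry, as described above.
interface HttpRequestLogEntry {
  ClientRequestSource: number; // 6 = edgeWorkerCacheAPI
  CacheCacheStatus: string; // "hit" | "miss" | "unknown" | ...
}

function cacheApiHitRate(entries: HttpRequestLogEntry[]): number {
  // Only Cache API operations, not eyeball traffic
  const ops = entries.filter((e) => e.ClientRequestSource === 6);
  const hits = ops.filter((e) => e.CacheCacheStatus === "hit").length;
  const misses = ops.filter((e) => e.CacheCacheStatus === "miss").length;
  const total = hits + misses; // put() entries ("unknown") excluded
  return total === 0 ? 0 : hits / total;
}
```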
## KV for global consistency

KV provides global replication with eventual consistency. Changes may take up to 60 seconds or more to be visible in other locations. Using KV as a cache means building your own caching layer - you’re responsible for cache key generation, TTL management, invalidation, and purging.
### Cache API vs KV trade-offs

| Aspect | Cache API | KV |
|---|---|---|
| Consistency | Per-colo (different colos may have different content) | Global (eventually consistent across all colos) |
| TTL management | Automatic via Cache-Control headers | Manual via expirationTtl |
| Invalidation | cache.delete() per-colo only | KV.delete() propagates globally |
| Purge tooling | Built-in via Cloudflare dashboard/API | Roll your own or use cache-kv-purger |
| Value size limit | No hard limit (but cloning limited by 128 MB Worker memory) | 25 MB (chunking required for larger) |
### When to use KV over Cache API

- Single global cache - Fewer cold cache misses, global invalidation
- Programmatic invalidation - Need to purge specific items globally (not just per-colo)
- Cross-Worker sharing - Multiple Workers need to share cached data
- Metadata-driven purging - Need to find and purge items by tags/metadata
### Basic implementation

```ts
async function fetchWithKV(
  originUrl: string,
  env: Env,
  ctx: ExecutionContext,
): Promise<Response> {
  const cacheKey = new URL(originUrl).pathname;

  // Check KV first - body stored as arrayBuffer, metadata separately
  const { value, metadata } = await env.CACHE_KV.getWithMetadata<
    ArrayBuffer,
    { contentType: string; status: number; cachedAt: number }
  >(cacheKey, { type: "arrayBuffer" });

  if (value && metadata) {
    // Optional: refresh TTL on hit without rewriting value
    const age = Date.now() - metadata.cachedAt;
    if (age > 1800000) {
      // 30 min
      ctx.waitUntil(refreshTTL(env, cacheKey, metadata));
    }
    return new Response(value, {
      status: metadata.status,
      headers: { "Content-Type": metadata.contentType },
    });
  }

  // Fetch from origin
  const response = await fetch(originUrl);

  if (!response.ok) {
    return response;
  }

  // Store in KV: body as value, headers as metadata
  const body = await response.arrayBuffer();
  const contentType =
    response.headers.get("Content-Type") || "application/octet-stream";

  // Use waitUntil for non-blocking write
  ctx.waitUntil(
    env.CACHE_KV.put(cacheKey, body, {
      expirationTtl: 3600,
      metadata: { contentType, status: response.status, cachedAt: Date.now() },
    }),
  );

  return new Response(body, {
    status: response.status,
    headers: { "Content-Type": contentType },
  });
}

async function refreshTTL(
  env: Env,
  key: string,
  metadata: object,
): Promise<void> {
  // KV doesn't support TTL refresh without rewriting - must read and write
  const { value } = await env.CACHE_KV.getWithMetadata(key, {
    type: "arrayBuffer",
  });
  if (value) {
    await env.CACHE_KV.put(key, value, {
      expirationTtl: 3600,
      metadata: { ...metadata, cachedAt: Date.now() },
    });
  }
}
```

### What you’re building

KV caching requires implementing what the Cache API gives you automatically:
- Cache key generation - Deterministic keys from request params (e.g., `video:sample.mp4:w=1280:h=720`)
- Metadata storage - Headers, content-type, timestamps alongside value (KV metadata limited to 1 KB)
- TTL management - `expirationTtl` on write, tracking `cachedAt` for refresh decisions
- Chunking - Splitting files > 25 MB across multiple keys, reassembling on read
- Invalidation - By exact key, prefix/pattern (requires listing), or metadata tags
### Advanced patterns

**Cache versioning** - Atomic invalidation without listing keys

Instead of purging individual keys, increment a version number:
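A minimal sketch of the pattern, with a plain `Map` standing in for KV so the logic is self-contained (key names like `cache:version` are illustrative assumptions, not a KV convention):

```typescript
// A Map stands in for KV; in a Worker these would be KV reads/writes.
type KvLike = Map<string, string>;

function getCacheVersion(kv: KvLike): number {
  return Number(kv.get("cache:version") ?? "1");
}

function versionedGet(kv: KvLike, resource: string): string | undefined {
  return kv.get(`v${getCacheVersion(kv)}:${resource}`);
}

function versionedPut(kv: KvLike, resource: string, value: string): void {
  kv.set(`v${getCacheVersion(kv)}:${resource}`, value);
}

// One write invalidates everything: old "v1:*" entries become orphaned
// (left to expire via TTL in real KV) rather than individually deleted.
function invalidateAll(kv: KvLike): void {
  kv.set("cache:version", String(getCacheVersion(kv) + 1));
}
```

In real KV the version read adds a lookup per request, so you would typically cache it per-isolate or in the Cache API - the same trick the content-addressable section below uses for its version pointer.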
**Request coalescing** - Prevent duplicate origin fetches
When multiple requests arrive for the same uncached content:
```ts
const inFlight = new Map<string, Promise<Response>>();

async function fetchWithCoalescing(
  cacheKey: string,
  url: string,
): Promise<Response> {
  const pending = inFlight.get(cacheKey);
  if (pending) {
    return (await pending).clone();
  }

  // fetchAndCache (fetch + cache.put) is assumed to be defined elsewhere
  const promise = fetchAndCache(url);
  inFlight.set(cacheKey, promise);
  try {
    return await promise;
  } finally {
    // Delete in finally so a failed fetch doesn't leave a stale entry
    inFlight.delete(cacheKey);
  }
}
```

**Chunking** - Files > 25 MB across multiple keys

KV has a 25 MB value limit. Split larger files into chunks with a manifest:
```ts
interface ChunkManifest {
  totalSize: number;
  chunkSize: number;
  chunks: string[]; // KV keys for each chunk
  contentType: string;
}

async function storeLargeFile(
  key: string,
  data: ArrayBuffer,
  env: Env,
  ctx: ExecutionContext,
): Promise<void> {
  const CHUNK_SIZE = 20 * 1024 * 1024; // 20 MB chunks (under 25 MB limit)
  const chunks: string[] = [];

  for (let offset = 0; offset < data.byteLength; offset += CHUNK_SIZE) {
    const chunkKey = `${key}:chunk:${chunks.length}`;
    const chunk = data.slice(offset, offset + CHUNK_SIZE);
    ctx.waitUntil(env.CACHE_KV.put(chunkKey, chunk, { expirationTtl: 86400 }));
    chunks.push(chunkKey);
  }

  const manifest: ChunkManifest = {
    totalSize: data.byteLength,
    chunkSize: CHUNK_SIZE,
    chunks,
    contentType: "application/octet-stream",
  };

  ctx.waitUntil(
    env.CACHE_KV.put(key, JSON.stringify(manifest), { expirationTtl: 86400 }),
  );
}
```

For retrieval with range request support, see Media Transformation Architecture.
For production implementations of all patterns, see:
- video-resizer - Full implementation with versioning, coalescing, chunking
- Media Transformation Architecture - Detailed documentation
- cache-kv-purger - CLI for purging by tags/metadata
## Content-addressable storage with instant invalidation

KV’s eventual consistency (up to 60 seconds) makes cache invalidation challenging. Content-addressable storage sidesteps this: instead of updating content at existing keys, write new content to new keys (derived from content hashes) and update a version pointer.
Why this works: The key insight is that content-addressed keys are immutable - the same hash always returns the same content. You never update a key; you write to a new one. This means:
- No stale reads - if a key exists, its content is correct by definition
- No invalidation needed - old keys simply become orphaned, not stale
- Eventual consistency becomes irrelevant - you’re not waiting for updates to propagate, you’re waiting for new keys to appear (and the version pointer tells you when they’re ready)
This is how Workers Static Assets works internally - assets are stored by content hash in KV, and a manifest maps paths to hashes. The manifest is deployed atomically with the Worker, so there’s no propagation delay.
For centralized KV stores where you can’t redeploy Workers on every content change, use a Durable Object as the version pointer with Cache API to stay under the 1000 RPS limit.
Trade-off: You’re trading storage space for consistency. Old content accumulates until TTL expiration or explicit cleanup.
### Read and write flows

Read flow: resolve the current version (Cache API first, falling back to the Durable Object) → fetch `manifest:<version>` from KV → look up the content hash for the requested path → fetch `content:<hash>` from KV.

Write flow: write new content blobs to content-hash keys → write `manifest:<new version>` → wait for KV propagation → flip the version pointer in the Durable Object.

The wait before flipping ensures the new manifest has propagated to KV edge locations. Without this, some colos might see `v44` but fail to find `manifest:v44`.
### Caching strategy

| Layer | Strategy | Why |
|---|---|---|
| Version pointer | DO + Cache API (5-10s TTL) | Strong consistency, cached to stay under 1000 RPS |
| Manifest | KV with versioned key + in-memory | Immutable per version, safe to cache forever |
| Content | KV with content-hash key | Immutable, deduplicated, 1 year cache |
**RPS math:** With Cache API caching the version for 5 seconds across ~300 colos, you get ~60 RPS to the DO - well under the 1000 limit.
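That estimate is just colos divided by cache TTL - each colo refreshes the cached version at most about once per TTL window (both inputs are the rough assumptions from the paragraph above):

```typescript
// Worst-case steady-state request rate hitting the DO:
// one version refresh per colo per cache-TTL window.
function estimateDoRps(coloCount: number, versionCacheTtlSeconds: number): number {
  return coloCount / versionCacheTtlSeconds;
}
```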
### Implementation

**Durable Object** - Version pointer only
```ts
import { DurableObject } from "cloudflare:workers";

export class ManifestVersion extends DurableObject {
  private version: string | undefined;

  async getVersion(): Promise<string> {
    if (this.version === undefined) {
      this.version = (await this.ctx.storage.get<string>("version")) ?? "v0";
    }
    return this.version;
  }

  async setVersion(newVersion: string): Promise<string> {
    this.version = newVersion;
    await this.ctx.storage.put("version", newVersion);
    return this.version;
  }
}
```

**Reader Worker** - Cache version in Cache API, manifest in memory
```ts
interface Manifest {
  assets: Record<string, string>; // path -> content hash
  createdAt: number;
}

interface Env {
  MANIFEST_VERSION: DurableObjectNamespace<ManifestVersion>;
  CACHE_KV: KVNamespace;
}

// Per-isolate cache (survives across requests in same isolate)
let manifestCache: { version: string; data: Manifest } | null = null;

async function getCurrentVersion(
  env: Env,
  ctx: ExecutionContext,
): Promise<string> {
  const cache = caches.default;
  const cacheKey = new Request("https://internal/manifest-version");

  // Check Cache API (per-colo, short TTL)
  const cached = await cache.match(cacheKey);
  if (cached) {
    return await cached.text();
  }

  // Miss - fetch from DO (strong consistency)
  const stub = env.MANIFEST_VERSION.getByName("global");
  const version = await stub.getVersion();

  // Cache for 5s - balances freshness vs DO RPS
  ctx.waitUntil(
    cache.put(
      cacheKey,
      new Response(version, {
        headers: { "Cache-Control": "max-age=5" },
      }),
    ),
  );

  return version;
}

async function getManifest(env: Env, ctx: ExecutionContext): Promise<Manifest> {
  const version = await getCurrentVersion(env, ctx);

  // Check per-isolate cache (fastest)
  if (manifestCache?.version === version) {
    return manifestCache.data;
  }

  // Fetch from KV (versioned key = immutable, cache 1 year)
  const manifest = await env.CACHE_KV.get<Manifest>(`manifest:${version}`, {
    type: "json",
    cacheTtl: 31536000,
  });

  if (!manifest) {
    throw new Error(`Manifest ${version} not found`);
  }

  manifestCache = { version, data: manifest };
  return manifest;
}

async function getAsset(
  path: string,
  env: Env,
  ctx: ExecutionContext,
): Promise<{ body: ArrayBuffer; contentType: string } | null> {
  const manifest = await getManifest(env, ctx);
  const contentHash = manifest.assets[path];

  if (!contentHash) return null;

  // Content-addressed = immutable, cache 1 year
  const { value, metadata } = await env.CACHE_KV.getWithMetadata<
    ArrayBuffer,
    { contentType: string }
  >(`content:${contentHash}`, { type: "arrayBuffer", cacheTtl: 31536000 });

  if (!value) return null;

  return {
    body: value,
    contentType: metadata?.contentType ?? "application/octet-stream",
  };
}
```

**Publisher** - Write content, then manifest, then flip version
```ts
async function publish(
  assets: Map<string, { content: ArrayBuffer; contentType: string }>,
  env: Env,
): Promise<string> {
  const newVersion = `v${Date.now()}`;
  const manifest: Manifest = { assets: {}, createdAt: Date.now() };

  // 1. Hash all content and build manifest
  const hashes = await Promise.all(
    [...assets.entries()].map(async ([path, { content }]) => ({
      path,
      hash: await sha256(content),
    })),
  );

  for (const { path, hash } of hashes) {
    manifest.assets[path] = hash;
  }

  // 2. Check which content already exists (deduplication)
  const existenceChecks = await Promise.all(
    hashes.map(async ({ hash }) => ({
      hash,
      exists:
        (await env.CACHE_KV.get(`content:${hash}`, { type: "arrayBuffer" })) !==
        null,
    })),
  );

  const newHashes = new Set(
    existenceChecks.filter(({ exists }) => !exists).map(({ hash }) => hash),
  );

  // 3. Write new content and manifest in parallel
  const writes: Promise<void>[] = [];

  for (const [path, { content, contentType }] of assets) {
    const hash = manifest.assets[path];
    if (newHashes.has(hash)) {
      writes.push(
        env.CACHE_KV.put(`content:${hash}`, content, {
          metadata: { contentType },
          expirationTtl: 86400 * 30,
        }),
      );
    }
  }

  writes.push(
    env.CACHE_KV.put(`manifest:${newVersion}`, JSON.stringify(manifest), {
      expirationTtl: 86400 * 30,
    }),
  );

  await Promise.all(writes);

  // 4. Wait for KV propagation before flipping version
  await new Promise((r) => setTimeout(r, 5000));

  // 5. Flip version pointer (instant, strongly consistent)
  const stub = env.MANIFEST_VERSION.getByName("global");
  await stub.setVersion(newVersion);

  return newVersion;
}

async function sha256(data: ArrayBuffer): Promise<string> {
  const hash = await crypto.subtle.digest("SHA-256", data);
  return [...new Uint8Array(hash)]
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}
```

### Consistency guarantees

| Event | Latency |
|---|---|
| Version flipped in DO | Instant (strongly consistent) |
| Worker sees new version | ≤5-10s (Cache API TTL) |
| Worker reads new manifest/content | Instant (versioned/hashed keys are immutable) |
Effective end-to-end latency: 5-10 seconds after publish completes. The 5s wait ensures KV has propagated the new manifest before the DO points to it.
## Microservices architecture

When building multi-Worker systems where a router Worker dispatches requests to origin Workers (often on different zones), you need to decide where caching happens. This affects latency, consistency, and operational complexity.
The key question: should the router cache responses from origins, or should each origin manage its own caching?
### Centralized caching (router handles all)

Which caching mechanism to use in the router:
| Origin Type | Caching Mechanism | Why |
|---|---|---|
| Not proxied (grey-clouded, non-CF) | fetch() with cf options | Request uses your zone’s cache |
| Orange-clouded (same zone) | fetch() with cf options | Same zone, cf options work |
| Orange-clouded (cross-zone) | Cache API or KV | cf options ignored |
| Pros | Cons |
|---|---|
| Single cache management point | Router becomes bottleneck |
| Consistent behavior | Extra hop latency |
| Easier debugging | Tight coupling to origins |
| Centralized circuit breakers | Router must know how each origin behaves |
### Distributed caching (origins decide)

| Pros | Cons |
|---|---|
| Each service owns its strategy | Inconsistent behavior |
| No single point of failure | Harder to debug system-wide |
| Independent deployments | Harder invalidation |
| Better separation of concerns | Duplicate logic |
### Recommendation

Prefer distributed caching for cross-zone Worker architectures:
- Router stays stateless - routing logic only
- Origins control their own caching - using `cf` options or Cache Rules
- If the router needs caching, use the Cache API (since origins are cross-zone)
Why: origins know their own caching needs, and a stateless router has fewer failure modes.
### Origin-side caching

Origin Workers cache upstream responses with `cf` options (same zone) and signal cacheability to downstream callers via headers:
```ts
// Origin Worker - fetches from upstream API and caches at origin's edge
export default {
  async fetch(request: Request): Promise<Response> {
    // Fetch from upstream with cf options (same zone or non-CF = works)
    // cacheTtl implicitly enables cacheEverything (JSON not cached by default)
    const upstream = await fetch("https://api.example.com/data", {
      cf: { cacheTtl: 3600 },
    });

    // Worker-generated responses bypass CDN cache
    // Cache-Control tells downstream (router, browser) how long to cache
    return new Response(upstream.body, {
      status: upstream.status,
      headers: {
        "Content-Type": "application/json",
        "Cache-Control": "public, max-age=3600",
      },
    });
  },
};
```

Two levels of caching here:
- CDN cache (via `cf` options) - caches the upstream API response at this zone’s edge
- Downstream cache (via `Cache-Control` header) - tells the calling Worker/browser how long to cache this response
For details on how Cache-Control headers interact with Cloudflare, see Origin Cache Control.