
Media Transformation Architecture

How video-resizer handles media transformation on Cloudflare Workers.

A single endpoint handles video resizing, frame extraction, spritesheet generation, and audio extraction. Configuration lives in KV, so changing settings does not require a redeploy. Responses are cached in two layers (CDN edge + KV), and params are validated before any request reaches the cdn-cgi/media transformation endpoint.


Four transformation modes:

  • Video: Resize, compress, adjust quality/playback
  • Frame: Extract stills at timestamps (jpg/png)
  • Spritesheet: Grid previews for scrubbing UIs
  • Audio: Extract m4a audio tracks

Config-driven mode enablement, strong validation, mode-specific strategies. Each mode handles its own param prep and validation (strategy pattern, nothing fancy).
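
A minimal sketch of the strategy-per-mode idea. The names prepareTransformParams, validateOptions, and createTransformationStrategy appear in the maintenance steps of this document; their signatures, the TransformOptions shape, and the exact validation rules below are assumptions for illustration, not the project's actual code.

```typescript
type Mode = "video" | "frame" | "spritesheet" | "audio";

// Hypothetical option shape, for illustration only.
interface TransformOptions {
  width?: number;
  height?: number;
  format?: string;
}

// Assumed interface: each mode validates and prepares its own params.
interface TransformationStrategy {
  validateOptions(options: TransformOptions): string[]; // error messages, empty if valid
  prepareTransformParams(options: TransformOptions): Record<string, string>;
}

// Shared helper: copy width/height into the outgoing param set when present.
function withDimensions(mode: Mode, o: TransformOptions): Record<string, string> {
  const params: Record<string, string> = { mode };
  if (o.width !== undefined) params["width"] = String(o.width);
  if (o.height !== undefined) params["height"] = String(o.height);
  return params;
}

const videoStrategy: TransformationStrategy = {
  validateOptions: (o) => (o.format === undefined ? [] : ["video does not accept format"]),
  prepareTransformParams: (o) => withDimensions("video", o),
};

const frameStrategy: TransformationStrategy = {
  validateOptions: (o) =>
    o.format === undefined || o.format === "jpg" || o.format === "png"
      ? []
      : ["frame format must be jpg or png"],
  prepareTransformParams: (o) => {
    const params = withDimensions("frame", o);
    if (o.format !== undefined) params["format"] = o.format;
    return params;
  },
};

const spritesheetStrategy: TransformationStrategy = {
  validateOptions: (o) =>
    o.width === undefined || o.height === undefined
      ? ["spritesheet requires width and height"]
      : [],
  prepareTransformParams: (o) => withDimensions("spritesheet", o),
};

const audioStrategy: TransformationStrategy = {
  validateOptions: (o) =>
    o.format === undefined || o.format === "m4a" ? [] : ["audio format must be m4a"],
  prepareTransformParams: () => ({ mode: "audio" }),
};

// The handler only talks to this factory, so a new mode means one new case here.
function createTransformationStrategy(mode: Mode): TransformationStrategy {
  switch (mode) {
    case "video": return videoStrategy;
    case "frame": return frameStrategy;
    case "spritesheet": return spritesheetStrategy;
    case "audio": return audioStrategy;
  }
}
```

With this shape, the request handler calls createTransformationStrategy(mode) and never branches on the mode itself.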


| Step | Action | Details |
|------|--------|---------|
| 1. Parse | Normalize query params | Config, derivatives, IMQuery, aliases, mode inference |
| 2. Select | Choose strategy | Video / Frame / Spritesheet / Audio |
| 3. Build | Create cdn-cgi URL | Cache version, origins, path resolution |
| 4. Cache | Fetch & store | KV variants + CDN edge caching |
| 5. Shape | Format response | Headers, filename, range request support |

| Pattern | Purpose | Benefit |
|---------|---------|---------|
| Strategy per mode | Each mode = own validation + params | Add modes without touching handler |
| Config + schema | Zod validation + runtime defaults | New modes work even if config lags |
| Translation layer | Map Akamai/short params to internal | Support multiple param conventions |
| Cache separation | Mode-scoped KV keys | No collision between variants |
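
The translation layer can be sketched as a small alias table applied before validation. The alias entries here are assumptions: the short forms w, h, t, f, and q are taken from the KV key examples in this document, and imwidth/imheight stand in for Akamai IMQuery-style names. The function name translateParams is hypothetical.

```typescript
// Hypothetical alias table: alias (lowercase) -> internal param name.
const PARAM_ALIASES = new Map<string, string>([
  ["w", "width"],
  ["h", "height"],
  ["t", "time"],
  ["f", "format"],
  ["q", "quality"],
  ["imwidth", "width"],
  ["imheight", "height"],
]);

function hasOwn(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Map aliased params to internal names. Explicit internal names win over aliases.
function translateParams(raw: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  const keys = Object.keys(raw);

  // Pass 1: copy params that are not aliases (already internal or unknown).
  for (const key of keys) {
    if (!PARAM_ALIASES.has(key.toLowerCase())) out[key] = raw[key];
  }
  // Pass 2: fill in aliased params only where no explicit value exists.
  for (const key of keys) {
    const canonical = PARAM_ALIASES.get(key.toLowerCase());
    if (canonical !== undefined && !hasOwn(out, canonical)) out[canonical] = raw[key];
  }
  return out;
}
```

Running the two passes separately keeps the result independent of the order in which the query params arrive.
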
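
| Feature | Video | Frame | Spritesheet | Audio |
|---------|-------|-------|-------------|-------|
| Dimensions | ✓ 10–2000px | ✓ 10–2000px | ✓ Required | |
| Fit modes | ✓ | ✓ | ✓ | ❌ |
| Time | ✓ 0–10m | ✓ 0–10m | ✓ 0–10m | ✓ 0–10m |
| Duration | ✓ 1–300s | | ✓ 1–300s | ✓ 1–300s |
| Format control | ❌ | ✓ jpg/png | ❌ JPEG only | ✓ m4a only |
| Quality/compression | | | | |
| Playback params | | | | |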

| Param | Values | Range/Rules |
|-------|--------|-------------|
| mode | video / frame / spritesheet / audio | Auto-set to audio if format=m4a |
| format | jpg / png / m4a | Frame: jpg/png, Audio: m4a, Video: ❌, Spritesheet: ❌ |
| time | 0s–10m | Default: 0s (start position) |
| duration | 1s–300s | Omit = full length (up to platform limits) |
| width | 10–2000 | Required for spritesheet |
| height | 10–2000 | Required for spritesheet |
| fit | contain / scale-down / cover | Video/Frame/Spritesheet only |
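
The rules in the table above can be sketched as plain range checks. This is an illustration, not the project's Zod schema: the function names (parseTimeSeconds, inferMode, validateParams), the accepted time units (s and m), and the choice to let an explicit mode take precedence over format=m4a are all assumptions.

```typescript
type Mode = "video" | "frame" | "spritesheet" | "audio";
type Params = { [key: string]: string | undefined };

const MODES: Mode[] = ["video", "frame", "spritesheet", "audio"];

// Parse "30s" or "10m" into seconds; returns null for anything else.
function parseTimeSeconds(value: string): number | null {
  const match = /^(\d+(?:\.\d+)?)(s|m)$/.exec(value);
  if (match === null) return null;
  const amount = Number(match[1]);
  return match[2] === "m" ? amount * 60 : amount;
}

// Assumption: a valid explicit mode wins; otherwise format=m4a implies audio; else video.
function inferMode(params: Params): Mode {
  const explicit = MODES.find((m) => m === params["mode"]);
  if (explicit !== undefined) return explicit;
  return params["format"] === "m4a" ? "audio" : "video";
}

// Returns a list of error messages; an empty list means the params are valid.
function validateParams(params: Params): string[] {
  const errors: string[] = [];
  const mode = inferMode(params);

  const time = params["time"];
  if (time !== undefined) {
    const t = parseTimeSeconds(time);
    if (t === null || t < 0 || t > 600) errors.push("time must be between 0s and 10m");
  }

  const duration = params["duration"];
  if (duration !== undefined) {
    const d = parseTimeSeconds(duration);
    if (d === null || d < 1 || d > 300) errors.push("duration must be between 1s and 300s");
  }

  for (const dim of ["width", "height"]) {
    const raw = params[dim];
    if (raw === undefined) {
      if (mode === "spritesheet") errors.push(`${dim} is required for spritesheet`);
    } else {
      const n = Number(raw);
      if (!Number.isInteger(n) || n < 10 || n > 2000) {
        errors.push(`${dim} must be an integer from 10 to 2000`);
      }
    }
  }

  const fit = params["fit"];
  if (fit !== undefined) {
    if (mode === "audio") errors.push("fit is not supported for audio");
    else if (["contain", "scale-down", "cover"].indexOf(fit) === -1) {
      errors.push("fit must be contain, scale-down, or cover");
    }
  }

  const format = params["format"];
  if (format !== undefined) {
    const allowed: Record<Mode, string[]> = {
      video: [],
      spritesheet: [],
      frame: ["jpg", "png"],
      audio: ["m4a"],
    };
    if (allowed[mode].indexOf(format) === -1) {
      errors.push(`format ${format} is not supported for ${mode}`);
    }
  }
  return errors;
}
```

Collecting every error instead of stopping at the first one lets a debug response report all invalid params in a single round trip.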

KV keys: mode:path:param1=value1:param2=value2

audio:rocky.mp4:duration=120s:t=30s:f=m4a
video:sample.mp4:w=1280:h=720:q=high
frame:clip.mp4:t=5s:w=640:h=360:f=png
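
A key in this format can be assembled with a few lines. This is a sketch: buildCacheKey is a hypothetical name, stripping the leading slash from the path is an assumption based on the examples above, and param order is left to the caller because the examples do not show a fixed ordering rule.

```typescript
// Build "mode:path:param1=value1:param2=value2" from an ordered list of params.
function buildCacheKey(
  mode: string,
  path: string,
  params: Array<[string, string]>,
): string {
  // Assumption: keys use the path without its leading slash, as in the examples.
  const normalizedPath = path.replace(/^\/+/, "");
  const parts = params.map(([name, value]) => `${name}=${value}`);
  return [mode, normalizedPath, ...parts].join(":");
}

const exampleKey = buildCacheKey("frame", "/clip.mp4", [
  ["t", "5s"],
  ["w", "640"],
  ["h", "360"],
  ["f", "png"],
]);
// exampleKey === "frame:clip.mp4:t=5s:w=640:h=360:f=png"
```

Because the mode is the first segment, the same source file with the same dimensions never shares a key across modes. Callers should emit params in a stable order, since the same params in a different order would produce a different key.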

cdn-cgi URL: Includes all transform params + optional version for cache busting
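
A sketch of building such a URL, assuming the Cloudflare Media Transformations layout of /cdn-cgi/media/<comma-separated options>/<source URL>. The function name buildMediaUrl, the origin host origin.example.com, and carrying the version as a v query param on the source URL are assumptions; the project may encode the version differently.

```typescript
// Build a cdn-cgi/media URL from ordered transform options and a source URL.
function buildMediaUrl(
  zoneOrigin: string,
  options: Array<[string, string]>,
  sourceUrl: string,
  version?: number,
): string {
  const optionString = options.map(([name, value]) => `${name}=${value}`).join(",");

  // Assumed cache-busting scheme: append v=<version> to the source URL.
  let source = sourceUrl;
  if (version !== undefined) {
    const separator = source.indexOf("?") === -1 ? "?" : "&";
    source += `${separator}v=${version}`;
  }
  return `${zoneOrigin}/cdn-cgi/media/${optionString}/${source}`;
}

const mediaUrl = buildMediaUrl(
  "https://cdn.erfi.dev",
  [["mode", "frame"], ["time", "4s"], ["width", "640"]],
  "https://origin.example.com/rocky.mp4", // hypothetical origin
  2,
);
```

Bumping the version changes the URL seen by the transformation endpoint, so previously cached variants are bypassed without a manual purge.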

Headers: Auto-corrected Content-Type per mode, optional Content-Disposition filename
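
Header shaping might look like the following sketch. The MIME types chosen per mode (video/mp4, image/jpeg, image/png, audio/mp4), the JPEG default for frames, and the inline disposition are assumptions, as are the function names.

```typescript
type Mode = "video" | "frame" | "spritesheet" | "audio";

// Pick a Content-Type from the mode, not from whatever the upstream returned.
function contentTypeFor(mode: Mode, format?: string): string {
  switch (mode) {
    case "video": return "video/mp4";
    case "frame": return format === "png" ? "image/png" : "image/jpeg";
    case "spritesheet": return "image/jpeg"; // spritesheets are JPEG only
    case "audio": return "audio/mp4"; // assumed MIME type for m4a
  }
}

// Build response headers; filename is optional and stripped of quotes and line breaks.
function responseHeaders(mode: Mode, format?: string, filename?: string): Record<string, string> {
  const headers: Record<string, string> = { "Content-Type": contentTypeFor(mode, format) };
  if (filename !== undefined && filename !== "") {
    const safeName = filename.replace(/["\r\n]/g, "");
    headers["Content-Disposition"] = `inline; filename="${safeName}"`;
  }
  return headers;
}
```

For example, responseHeaders("audio", "m4a", "audio.m4a") yields an audio/mp4 Content-Type plus a Content-Disposition carrying the requested filename.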


Add mode:

  1. Implement strategy (prepareTransformParams, validateOptions, updateDiagnostics)
  2. Add case to createTransformationStrategy
  3. Add to config defaults/validOptions
  4. Document

Add param:

  1. Add to config validOptions/defaults
  2. Handle in option parser
  3. Map in param mapping
  4. Validate in supporting strategies

Adjust limits:

  1. Update timeUtils validators
  2. Update docs/tests

# Upload config
node tools/config-wrapper.js upload \
  --config config/worker-config.json \
  --env production --token <token> --force
# Deploy
npm run deploy:prod # requires CLOUDFLARE_API_TOKEN
# Debug: append ?debug=true to a request URL to add diagnostics to the response

# Audio (auto mode via format=m4a)
curl -I "https://cdn.erfi.dev/rocky.mp4?format=m4a&filename=audio.m4a"
# Frame thumbnail (PNG, 640x360)
curl -I "https://cdn.erfi.dev/rocky.mp4?mode=frame&time=4s&width=640&height=360&fit=cover&format=png"
# Spritesheet (800x600, 60s window)
curl -I "https://cdn.erfi.dev/rocky.mp4?mode=spritesheet&width=800&height=600&duration=60s"
# Video (resized)
curl -I "https://cdn.erfi.dev/rocky.mp4?width=1280&height=720"

Example outputs:

  • Video: 640×360 size
  • Frame: PNG frame (image labeled "Frame at 4s", captioned "at 9s timestamp")
  • Spritesheet: 800×600 spritesheet
  • Audio: m4a audio clip

How a request flows through the system:

d2 diagram

Each mode has its own validation and parameter preparation:

d2 diagram

Two-layer caching with KV and CDN edge:

d2 diagram
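
The two-layer lookup can be sketched with a generic store interface. This assumes the lookup order edge, then KV, then transform, with both layers populated on a miss; the interface, class, and function names are hypothetical. In a real Worker the stores would be the Cache API and a KV namespace, and bodies would be streams or buffers rather than strings.

```typescript
// Minimal async key-value interface standing in for both cache layers.
interface AsyncStore {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}

// In-memory stand-in so the sketch runs without a Workers runtime.
class MemoryStore implements AsyncStore {
  private data = new Map<string, string>();
  async get(key: string): Promise<string | null> {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }
  async put(key: string, value: string): Promise<void> {
    this.data.set(key, value);
  }
}

interface CacheResult {
  body: string;
  source: "edge" | "kv" | "origin";
}

// Check edge first, then KV, and only run the transform when both miss.
async function getOrTransform(
  key: string,
  edge: AsyncStore,
  kv: AsyncStore,
  transform: () => Promise<string>,
): Promise<CacheResult> {
  const fromEdge = await edge.get(key);
  if (fromEdge !== null) return { body: fromEdge, source: "edge" };

  const fromKv = await kv.get(key);
  if (fromKv !== null) {
    await edge.put(key, fromKv); // repopulate the faster layer
    return { body: fromKv, source: "kv" };
  }

  const fresh = await transform();
  await kv.put(key, fresh);
  await edge.put(key, fresh);
  return { body: fresh, source: "origin" };
}
```

A KV hit refills the edge layer, so a cold edge location pays the KV read once and then serves from the edge.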

Multi-source storage with automatic fallback:

d2 diagram
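
Automatic fallback across sources can be sketched as an ordered loop. The Source shape, the source names used in the usage note, and treating both a null result and a thrown error as a reason to try the next source are assumptions for illustration.

```typescript
// A source resolves a path to content, or null when the object is not found.
interface Source {
  name: string;
  fetch(path: string): Promise<string | null>;
}

interface FetchResult {
  source: string;
  body: string;
}

// Try each source in priority order; move on when one misses or throws.
async function fetchWithFallback(path: string, sources: Source[]): Promise<FetchResult> {
  const failures: string[] = [];
  for (const source of sources) {
    try {
      const body = await source.fetch(path);
      if (body !== null) return { source: source.name, body };
      failures.push(`${source.name}: not found`);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      failures.push(`${source.name}: ${message}`);
    }
  }
  // Every source failed; surface the per-source reasons for diagnostics.
  throw new Error(`all sources failed for ${path}: ${failures.join("; ")}`);
}
```

A caller would pass sources in priority order, for example a bucket-backed source first and a remote HTTP origin second, and could record the winning source name in debug diagnostics.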