Brent Haskins / Applied AI

Edge Caching Isn't a Knob — It's a Product Contract

June 22, 2026 · 5 min read · By Brent Haskins

Most teams treat edge caching as a performance knob to flip after launch — set TTLs, purge on deploy, move on. That misses the point. Edge caching is a product contract between what your UI promises and what your backend can prove. This post walks through the real decisions: stale-while-revalidate as latency hiding, stale-if-error as resilience, and when to use Edge Side Includes for composed pages. Written June 2026, grounded in shipped patterns from CDN-backed SaaS products.

AI Product Engineering
Performance + UX
Product Thinking

The short answer

Edge caching is not a performance knob you tune after launch. It is a product contract between what your UI promises and what your backend can prove. Every cache directive you set (max-age, stale-while-revalidate, stale-if-error) encodes a decision about how stale the data your users see can get before it hurts retention.

Most teams treat caching as infrastructure plumbing: set a TTL, purge on deploy, move on. That works until a user sees yesterday's pricing, or a dashboard widget shows stale data while the rest of the page refreshes, or a CDN serves a broken partial response because the origin errored and the cache had nothing to fall back to. Those aren't ops incidents. They are product failures caused by treating caching as a binary on/off.

The real work is designing a caching strategy that matches your product's truthfulness requirements. For an AI-powered mortgage dashboard I shipped, we used stale-while-revalidate to hide the 600-800ms origin latency for rate predictions while keeping the UI interactive. The cache hit ratio dropped slightly, but user-perceived latency dropped 40%. That's a product win, not an infrastructure metric.

Key takeaways

Cache directives are UX decisions. stale-while-revalidate hides latency; stale-if-error provides resilience. Choose based on what your UI can tolerate, not what your ops team prefers.
Versioned cache keys beat global purges. Every deploy should not be a cache nuke. Use resource-level versioning so old content expires naturally.
Edge Side Includes (ESI) is underused for composed pages. When a page shell is cacheable but a fragment must be fresh — user-specific data, region pricing — ESI avoids client-side layout shift.
Measure user-perceived staleness, not just cache hit ratio. A 95% hit ratio with 10-second stale content may be worse than 80% with 1-second stale content.
Edge authentication (JWT verification at the CDN) changes caching strategy. If you validate tokens at the edge and key the cache on the verified user or tenant identity, you can cache authenticated responses per user without exposing data to other tenants.
Your cache policy is a product spec. Write it down alongside your acceptance criteria, not in a separate ops runbook.
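
To make the edge-authentication takeaway concrete, here is a minimal sketch of deriving a per-user cache key from a verified token. It assumes HS256-signed JWTs with a shared secret and claims named `sub` and `tenant`; the function names are hypothetical, and a real CDN would do this through its own edge runtime or token-auth feature. Expiry and algorithm checks are left out for brevity and would be required in practice.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Demo helper only: mint an HS256 token so the example is self-contained."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verified_claims(token: str, secret: bytes) -> Optional[dict]:
    """Return the claims if the signature matches, otherwise None."""
    try:
        header, payload, sig = token.split(".")
        expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(sig)):
            return None
        claims = json.loads(_b64url_decode(payload))
    except ValueError:  # malformed token, bad base64, or bad JSON
        return None
    return claims if isinstance(claims, dict) else None


def cache_key(path: str, token: str, secret: bytes) -> Optional[str]:
    """Build a tenant- and user-scoped cache key, or None to bypass the cache."""
    claims = verified_claims(token, secret)
    if claims is None or "tenant" not in claims or "sub" not in claims:
        return None  # unverified request: never serve or store a cached copy
    return f"{claims['tenant']}:{claims['sub']}:{path}"
```

The design point is that the identity in the key comes only from claims the edge has verified, never from a raw header the client controls.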

The real problem: caching as an afterthought

The default pattern is: build the product, launch, notice slow pages, add a CDN, set a TTL, forget it. That works for static marketing sites. It fails for any product with dynamic, user-specific, or time-sensitive data.

The deeper issue is that most caching literature focuses on infrastructure — PoP locations, origin shields, compression — and ignores the product layer. AWS's own documentation on CloudFront emphasizes "reducing response time" and "increasing throughput" as if those are ends in themselves. They aren't. The end is a user who trusts the data on screen.

When you treat caching as a product contract, you start asking different questions:

What is the maximum staleness a user can see before they lose trust?
Which parts of the page must be real-time, and which can be cached for minutes?
What happens when the origin is slow or down — do we serve stale or show an error?

These are product decisions, not TTL values.
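
One way to keep those answers next to the product spec is to record them as data and derive the header from them. The sketch below is an illustration under assumptions: the `StalenessBudget` class and its field names are hypothetical, and the worst-case arithmetic follows the usual reading of the directives, where the stale windows are counted from the moment the response stops being fresh.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StalenessBudget:
    """Product-level answers to the three questions, in seconds."""
    fresh_s: int                  # how long data counts as current
    revalidate_window_s: int = 0  # extra time stale data may be shown while refreshing
    error_fallback_s: int = 0     # extra time stale data may be shown if the origin fails

    def cache_control(self) -> str:
        """Render the budget as a Cache-Control header value."""
        parts = [f"max-age={self.fresh_s}"]
        if self.revalidate_window_s:
            parts.append(f"stale-while-revalidate={self.revalidate_window_s}")
        if self.error_fallback_s:
            parts.append(f"stale-if-error={self.error_fallback_s}")
        return ", ".join(parts)

    def worst_case_age_s(self, origin_healthy: bool = True) -> int:
        """Oldest data a user can see under this budget."""
        window = self.revalidate_window_s
        if not origin_healthy:
            window = max(window, self.error_fallback_s)
        return self.fresh_s + window


# Hypothetical budget for a reference-data panel.
panel = StalenessBudget(fresh_s=60, revalidate_window_s=30, error_fallback_s=3600)
print(panel.cache_control())
print(panel.worst_case_age_s(), panel.worst_case_age_s(origin_healthy=False))
```

Reviewing `worst_case_age_s` with a product owner is usually a more useful conversation than reviewing raw TTL values.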

Tradeoffs and when conventional wisdom breaks

The conventional wisdom says: set max-age to your acceptable freshness window, and use stale-while-revalidate to hide revalidation latency. That works for content that changes predictably. It breaks for content that changes based on user action.

Consider a SaaS dashboard with a "last synced" timestamp. If you cache the page for 60 seconds, users who trigger a sync will see the old timestamp for up to a minute. The fix isn't shorter TTLs; it's cache invalidation on mutation. That means your API must emit cache-invalidation signals (for example, tagging responses so the CDN can purge them selectively) or something on the write path must send purge requests for the affected resources. Both are engineering work that most teams skip.
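
A minimal sketch of invalidation on mutation, using an in-memory stand-in for the CDN. The `TaggedCache` class, the `user:<id>` tag scheme, and `handle_sync` are all hypothetical; real CDNs expose the same idea under names like cache tags or surrogate keys, each with its own purge API.

```python
from collections import defaultdict


class TaggedCache:
    """In-memory stand-in for an edge cache that supports purge-by-tag."""

    def __init__(self):
        self._entries = {}                    # cache key -> cached body
        self._keys_by_tag = defaultdict(set)  # tag -> cache keys carrying it

    def put(self, key, body, tags):
        self._entries[key] = body
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key):
        return self._entries.get(key)

    def purge_tag(self, tag):
        """Drop every entry carrying the tag; return how many were removed."""
        removed = 0
        for key in self._keys_by_tag.pop(tag, set()):
            if self._entries.pop(key, None) is not None:
                removed += 1
        return removed


def handle_sync(cache, user_id):
    """Mutation handler: after the sync succeeds, purge only that user's entries."""
    # ... perform the actual sync against the system of record (omitted) ...
    return cache.purge_tag(f"user:{user_id}")


cache = TaggedCache()
cache.put("/u/1/dashboard", "synced 09:00", tags=["user:1"])
cache.put("/u/2/dashboard", "synced 08:30", tags=["user:2"])
handle_sync(cache, 1)
print(cache.get("/u/1/dashboard"), cache.get("/u/2/dashboard"))
```

The user who triggered the sync gets a fresh page on the next request, while every other user's cached dashboard stays warm.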

Another break: Edge Side Includes. Adobe's ESI documentation shows how to compose a page from cached fragments, but the pattern assumes the CDN can fetch and assemble fragments without blocking the response. In practice, every fragment that misses the edge cache adds an origin fetch to the response path. The tradeoff is layout stability vs. response time. I've seen teams abandon ESI because it made TTFB worse, even though CLS improved. The right call depends on which metric matters more for your product.
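
To show where the TTFB cost comes from, here is a simplified sketch of ESI assembly. It handles only the self-closing `<esi:include src="..."/>` form and resolves fragments one after another; the `assemble` function, the fragment paths, and the `fetch_from_origin` callback are hypothetical, and production ESI processors do considerably more (nesting, error fallbacks, parallel fetches).

```python
import re
from typing import Callable, Dict, List, Tuple

# Matches the self-closing include form only; other ESI tags are ignored here.
ESI_INCLUDE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def assemble(
    shell: str,
    fragment_cache: Dict[str, str],
    fetch_from_origin: Callable[[str], str],
) -> Tuple[str, List[str]]:
    """Replace each include with its fragment; report which ones hit the origin."""
    origin_fetches: List[str] = []

    def resolve(match):
        src = match.group(1)
        if src in fragment_cache:
            return fragment_cache[src]   # served from the edge, no added latency
        origin_fetches.append(src)       # each of these delays the response
        return fetch_from_origin(src)

    return ESI_INCLUDE.sub(resolve, shell), origin_fetches


shell = '<header><esi:include src="/frag/user"/></header><main><esi:include src="/frag/pricing"/></main>'
html, fetched = assemble(
    shell,
    fragment_cache={"/frag/pricing": "<p>$10</p>"},
    fetch_from_origin=lambda src: f"<b>{src}</b>",
)
print(html)
print(fetched)
```

The length of the `origin_fetches` list is the number to watch: the page arrives fully composed with no layout shift, but each entry in that list is an origin round trip the user waits for before the first byte.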

How this looks in a shipped product

In the AI mortgage dashboard I mentioned, we had a rate prediction panel that hit a model endpoint with 600-800ms latency. The panel showed predicted rates for the next 30 days, updated hourly. We set:

max-age=300 (5 minutes of freshness)
stale-while-revalidate=600 (10 minutes to hide revalidation)
stale-if-error=86400 (24 hours of fallback if the model endpoint errors)

Stacked together, that policy gave us a 92% cache hit ratio on the panel endpoint, and users never saw a loading spinner. When the model endpoint had a brief outage, users saw rates up to 24 hours old with a small "rates may be outdated" badge. That badge was a product decision — we chose transparency over hiding the staleness.
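
The three directives can be read as a small state machine over the age of the cached copy. The sketch below models that reading with the values from this panel as defaults; the function name and the state labels are hypothetical, and actual CDN behavior varies by vendor, so treat it as a way to reason about the policy and not as a description of any particular cache.

```python
def serving_decision(age_s, origin_ok, max_age=300, swr=600, sie=86400):
    """Decide how a cache would answer a request for a copy that is age_s seconds old."""
    if age_s < max_age:
        return "fresh"                # serve from cache, no origin contact
    staleness = age_s - max_age       # seconds since the copy stopped being fresh
    if staleness < swr:
        return "stale-revalidating"   # serve stale now, refresh in the background
    if origin_ok:
        return "revalidate-blocking"  # too stale to serve: user waits on the origin
    if staleness < sie:
        return "stale-on-error"       # origin failing: serve stale, show the badge
    return "error"                    # nothing acceptable left to serve


for age, ok in [(120, True), (600, True), (1200, True), (1200, False), (90000, False)]:
    print(age, ok, serving_decision(age, ok))
```

Only the `stale-on-error` state should trigger the "rates may be outdated" badge, which is why the fallback path needs a way to signal the age of the data to the UI.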

What to evaluate in your own product

Map every page or API response to a staleness budget. How old can this data be before it's misleading?
Audit your cache invalidation logic. Does a user action (save, delete, sync) trigger a purge for that user's resources only?
Test your cache behavior under origin failure. Do you serve stale content or show an error? Both are valid, but one must be intentional.
Measure user-perceived latency, not just cache hit ratio. Use Real User Monitoring to track time-to-interactive for cached vs. uncached responses.
Document your caching strategy as part of your product spec. Include staleness budgets, invalidation triggers, and fallback behavior.
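
For the measurement item, a rough sketch of summarizing Real User Monitoring samples so that hit ratio, served-data age, and time-to-interactive sit side by side. The `Sample` fields and the `summarize` function are assumptions about what a RUM pipeline could record, not the schema of any specific tool.

```python
import math
from statistics import median
from typing import NamedTuple


class Sample(NamedTuple):
    cache_hit: bool
    age_s: float   # age of the data the user actually saw (0 for a miss)
    tti_ms: float  # time-to-interactive for that page view


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize(samples):
    hits = [s for s in samples if s.cache_hit]
    misses = [s for s in samples if not s.cache_hit]
    return {
        "hit_ratio": len(hits) / len(samples),
        "p95_age_s": percentile([s.age_s for s in samples], 95),
        "median_tti_hit_ms": median(s.tti_ms for s in hits) if hits else None,
        "median_tti_miss_ms": median(s.tti_ms for s in misses) if misses else None,
    }


samples = [
    Sample(True, 10, 120),
    Sample(True, 4, 100),
    Sample(True, 8, 140),
    Sample(False, 0, 900),
]
print(summarize(samples))
```

Comparing `p95_age_s` against the staleness budget for the same page shows whether a high hit ratio is being bought with data that is older than the product can tolerate.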

Closing: your cache policy is a product spec

The next time you set a cache header, ask: what does this promise the user? If you can't answer that, you're treating infrastructure as product — and your users will feel the gap.

FAQ

Questions people ask about this topic.

How do you decide between stale-while-revalidate and a shorter TTL with background refresh?

Use stale-while-revalidate when your UI can tolerate slightly stale data for a known window — think leaderboards or reference data. Use a short TTL with background refresh when freshness matters but you can hide latency via optimistic UI. The key is measuring how long users actually see stale content, not just cache hit ratios.

What's the most common edge caching mistake teams make?

Treating cache invalidation as a deployment concern instead of a product state. Most teams purge everything on every deploy, which defeats the purpose of long TTLs. The better pattern is versioned cache keys per resource so old content naturally expires while new content propagates. Your deploy script should not be your cache strategy.
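
A minimal sketch of per-resource versioning, assuming the version is derived from a hash of the content. The `versioned_key` function and the `?v=` query parameter are illustrative choices; embedding the hash in the filename is an equally common convention.

```python
import hashlib


def versioned_key(path: str, content: bytes) -> str:
    """Derive a cache key that changes only when the resource's content changes."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    return f"{path}?v={digest}"


old = versioned_key("/api/pricing-table", b'{"plan": "pro", "price": 40}')
same = versioned_key("/api/pricing-table", b'{"plan": "pro", "price": 40}')
new = versioned_key("/api/pricing-table", b'{"plan": "pro", "price": 45}')
print(old == same, old == new)
```

A deploy that leaves a resource untouched produces the same key, so its cached copy survives. A changed resource gets a new key, and the old entry ages out on its own TTL. The document that references these keys still needs a short TTL or targeted invalidation, since it is the one unversioned piece.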

When should you use Edge Side Includes over client-side composition?

Use ESI when the page shell is cacheable but a fragment — like a user-specific header or region-specific pricing — must be fresh. Client-side composition works when you can show a loading state for that fragment. ESI wins when the fragment is required for layout stability and you can't afford layout shift from async fetch.
