Brent Haskins / Applied AI

CDN Metrics Are Lying to You: What Product Engineers Should Track Instead

June 13, 2026 · 5 min read · By Brent Haskins

Most CDN dashboards report edge-level metrics that don't reflect actual user experience. Cache hit ratio and byte offload are infrastructure metrics, not UX metrics. Product engineers need to instrument from the client, track percentiles, and treat caching as a product decision. Written June 2026, based on experience shipping real-time dashboards to global user bases.

Performance + UX
Product Thinking

The short answer

CDN dashboards are built for infrastructure engineers. They report cache hit ratio, byte offload, and edge response times — numbers that look great in a slide deck but tell you almost nothing about whether your product actually feels fast to users. I've watched teams celebrate a 95% cache hit ratio while their users in Southeast Asia waited two seconds for uncached assets. That metric is a vanity number.

Real user latency depends on geography, network conditions, the proportion of uncached requests, and how your application handles cache misses. Product engineers need to instrument from the browser, track percentiles by region, and treat caching as a product decision — not an infrastructure default. The CDN is a tool, not a performance guarantee.
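To see how a 95% hit ratio can hide slow misses, here is a small back-of-the-envelope sketch. The 95% ratio and the two-second miss come from the example above; the 50 ms hit latency, the 20 requests per page, and the assumption that misses are independent are mine, chosen only for illustration.

```python
def blended_mean_ms(hit_ratio: float, hit_ms: float, miss_ms: float) -> float:
    """Average per-request latency given a hit ratio and the two latencies."""
    return hit_ratio * hit_ms + (1 - hit_ratio) * miss_ms


def chance_of_any_miss(hit_ratio: float, requests_per_page: int) -> float:
    """Probability that a page load includes at least one cache miss,
    assuming each request misses independently of the others."""
    return 1 - hit_ratio ** requests_per_page


# 95% hit ratio and 2 s misses are from the text;
# 50 ms hits and 20 requests per page are assumed values.
mean_ms = blended_mean_ms(0.95, hit_ms=50, miss_ms=2000)
any_miss = chance_of_any_miss(0.95, requests_per_page=20)

print(f"mean per-request latency: {mean_ms:.1f} ms")
print(f"page loads with at least one miss: {any_miss:.0%}")
```

Under those assumptions the per-request mean lands near 148 ms, which still looks healthy on a dashboard, while roughly 64% of page loads include at least one two-second request.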

Key takeaways

Cache hit ratio is an infrastructure metric, not a UX metric. It measures edge efficiency, not perceived speed.
Byte offload tells you how much bandwidth you saved, not how fast your pages loaded.
Track p95 and p99 TTFB and LCP from the client, segmented by region and device.
Edge computing (CloudFront Functions, Lambda@Edge) can reduce latency for dynamic content without forcing a full round trip to the origin.
Caching strategy should vary by content type: static assets can have long TTLs, API responses need short TTLs or no cache, and personalized data should bypass the cache entirely.
Use real user monitoring (RUM) alongside synthetic checks. Synthetics tell you if the site is up; RUM tells you if it's fast for actual users.

The real problem: infrastructure metrics masquerading as UX metrics

Standard CDN dashboards report metrics like cache hit ratio, edge response time, and byte offload. These are useful for capacity planning and cost optimization, but they are poor proxies for user experience. A high cache hit ratio can coexist with terrible performance if the uncached requests are slow, if the cache serves stale data, or if the user is far from the nearest edge point of presence.

As the Hydrolix article points out, "standard CDN performance metrics hide real user latency." The edge response time reported by the CDN is measured at the edge server, not at the user's browser. It doesn't account for last-mile latency, DNS resolution, TLS negotiation, or client-side rendering. A 20ms edge response can become a 600ms user experience.

Product engineers should demand metrics that reflect the user's reality: time-to-first-byte from the browser, first contentful paint, and largest contentful paint, broken down by percentile and geography. These are the numbers that tell you whether your product is actually fast.
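Once browsers are reporting timings, the breakdown by percentile and geography is a small aggregation. The sketch below uses the nearest-rank definition of a percentile; the `(region, milliseconds)` sample shape, the function names, and the sample values are all made up for illustration, not taken from any particular RUM product.

```python
import math
from collections import defaultdict


def nearest_rank(sorted_values, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def percentiles_by_region(samples, pcts=(50, 95, 99)):
    """samples: iterable of (region, metric_ms) pairs reported by browsers."""
    by_region = defaultdict(list)
    for region, value_ms in samples:
        by_region[region].append(value_ms)

    report = {}
    for region, values in by_region.items():
        values.sort()
        report[region] = {f"p{p}": nearest_rank(values, p) for p in pcts}
    return report


# Made-up TTFB samples: mostly fast, with a slow tail in each region.
samples = (
    [("us", 80)] * 95 + [("us", 400)] * 5
    + [("in", 120)] * 80 + [("in", 900)] * 20
)
report = percentiles_by_region(samples)
for region, stats in report.items():
    print(region, stats)
```

With this data the "us" median and p95 are both 80 ms and only p99 exposes the 400 ms tail, while "in" already shows 900 ms at p95. A single global average would blur both.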

Tradeoffs: when caching hurts more than helps

Caching is not always a win. For dynamic content — real-time dashboards, personalized recommendations, or any data that changes per user — caching can serve stale or incorrect data. I've seen products where a 5-minute cache TTL caused users to see outdated information, leading to support tickets and lost trust.

The tradeoff is between latency and freshness. For static assets (images, scripts, stylesheets), long TTLs are safe and beneficial. For API responses, you need to decide: can the user tolerate stale data? If not, don't cache at the edge — use edge functions to assemble responses close to the user, or stream data directly from the origin with low latency.
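One way to make that decision explicit is to pick the `Cache-Control` header per content class in a single place. This is a minimal sketch: the path conventions, the extension list, and the specific TTL values are assumptions, and the year-long TTL is only safe if static filenames are fingerprinted so that a new deploy produces a new URL.

```python
from pathlib import PurePosixPath

# Assumed set of extensions treated as static, fingerprinted assets.
STATIC_EXTENSIONS = {".js", ".css", ".png", ".jpg", ".svg", ".woff2"}


def cache_control_for(path: str, personalized: bool = False) -> str:
    """Return a Cache-Control header value for a request path."""
    if personalized:
        # Per-user data: keep it out of shared caches entirely.
        return "private, no-store"
    if path.startswith("/api/"):
        # Shared API data: browsers revalidate, the edge may hold it briefly.
        return "public, max-age=0, s-maxage=10"
    if PurePosixPath(path).suffix.lower() in STATIC_EXTENSIONS:
        # Fingerprinted static assets: long TTL, never revalidated.
        return "public, max-age=31536000, immutable"
    # Everything else (for example HTML): cacheable but always revalidated.
    return "no-cache"


for path, personalized in [
    ("/static/app.9f3c.js", False),
    ("/api/orders", False),
    ("/api/me", True),
    ("/pricing", False),
]:
    print(path, "->", cache_control_for(path, personalized))
```

Keeping the policy in one function means the freshness question gets answered per content type, rather than inherited from a CDN default.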

Edge computing changes the calculus. Instead of caching a full response, you can cache the template and fetch only the dynamic data at the edge. This gives you the speed of a cache hit with the freshness of a live request. But it adds complexity: you need to write and maintain edge functions, handle errors gracefully, and monitor cold-start latency.

How this looks in a shipped product

In a real-time dashboard I shipped, we used CloudFront for static asset delivery and API caching. The initial approach was to cache all API responses for 60 seconds. The CDN dashboard showed a 90% cache hit ratio and sub-50ms edge response times. But our RUM data told a different story: users in India and Brazil saw TTFB over 800ms on cache misses, and the dashboard displayed stale data during the cache window.

We made three changes. First, we moved dynamic API calls to WebSocket streams, bypassing the CDN cache entirely. Second, we used CloudFront Functions to rewrite cache keys per user, so personalized data was never served from a shared cache. Third, we added client-side instrumentation that sent performance metrics to our analytics pipeline, with alerts for p95 TTFB over 300ms by region.
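The idea behind per-user cache keys can be shown in a few lines. This is not the code we shipped: CloudFront Functions are written in JavaScript and work together with a cache policy, so treat the Python below as a language-neutral sketch. The key format, the `user_id` parameter, and the 16-character hash prefix are all hypothetical.

```python
import hashlib
from typing import Optional


def cache_key(path: str, query: str = "", user_id: Optional[str] = None) -> str:
    """Build a cache key that is shared for anonymous requests
    and distinct for each signed-in user."""
    base = f"{path}?{query}" if query else path
    if user_id is None:
        return base
    # Hash the user id so the raw identifier never appears in the key.
    user_part = hashlib.sha256(user_id.encode("utf-8")).hexdigest()[:16]
    return f"{base}#u={user_part}"


anon_key = cache_key("/api/feed", "page=2")
alice_key = cache_key("/api/feed", "page=2", user_id="alice")
bob_key = cache_key("/api/feed", "page=2", user_id="bob")
print(anon_key)
print(alice_key)
print(bob_key)
```

Two users requesting the same URL get different keys, so one user's response can never be served to another, while anonymous traffic still shares a single cached entry.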

The result: cache hit ratio dropped to 60%, but real user latency improved by 40%. Dynamic data now arrived over an open stream instead of waiting on a slow cache miss, and personalized responses were always fresh. The CDN dashboard looked worse, but the product felt better.

What to evaluate and watch for

Are you measuring from the client or from the edge? If you're only looking at CDN dashboards, you're missing the last mile.
Do you have percentile breakdowns by region? Averages hide the long tail of slow users.
Are you tracking cache hit ratio by content type? A high ratio for static assets is fine; a high ratio for API responses may indicate stale data.
Do you have a fallback for cache misses? If the origin is slow, the CDN can't fix it — you need to optimize the origin or use edge computing.
Are you using edge functions for dynamic content? They can reduce latency without sacrificing freshness, but monitor cold starts and error rates.
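The content-type question in the checklist above is easy to answer if your CDN logs are already parsed into records. A minimal sketch follows, assuming each record carries a `content_type` and a `cache_status` field with the value "HIT" or "MISS"; those field names and the sample counts are hypothetical, so map them to whatever your log format provides.

```python
from collections import Counter


def hit_ratio_by_type(records):
    """Return {content_type: hit_ratio} for parsed CDN log records."""
    hits, totals = Counter(), Counter()
    for rec in records:
        totals[rec["content_type"]] += 1
        if rec["cache_status"] == "HIT":
            hits[rec["content_type"]] += 1
    return {ctype: hits[ctype] / totals[ctype] for ctype in totals}


# Made-up log sample: 100 static requests and 50 API requests.
records = (
    [{"content_type": "static", "cache_status": "HIT"}] * 98
    + [{"content_type": "static", "cache_status": "MISS"}] * 2
    + [{"content_type": "api", "cache_status": "HIT"}] * 40
    + [{"content_type": "api", "cache_status": "MISS"}] * 10
)
ratios = hit_ratio_by_type(records)
overall = sum(r["cache_status"] == "HIT" for r in records) / len(records)
print(ratios, f"overall={overall:.2f}")
```

Here the overall ratio is 92%, which says nothing on its own. The split shows 98% for static assets, which is what you want, and 80% for API responses, which is the number to question if that data is supposed to be fresh.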

Closing

Stop relying on CDN dashboard averages. Add a RUM script that sends performance metrics to your analytics. Set up alerts for p95 latency by region. Treat caching as a product decision, not an infrastructure default. The CDN is a tool — use it where it helps, bypass it where it hurts, and always measure from the user's perspective.
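A regional p95 alert does not need much machinery. The sketch below uses `statistics.quantiles` from the Python standard library; the 300 ms limit is the threshold from the shipped example earlier, while the minimum sample count, the region names, and the sample data are assumptions for illustration.

```python
import statistics

P95_TTFB_LIMIT_MS = 300   # threshold used in the shipped example
MIN_SAMPLES = 20          # assumed floor; below this a p95 is too noisy


def regions_over_limit(ttfb_by_region, limit_ms=P95_TTFB_LIMIT_MS):
    """Return {region: p95_ms} for regions whose p95 TTFB exceeds the limit."""
    breaches = {}
    for region, values in ttfb_by_region.items():
        if len(values) < MIN_SAMPLES:
            continue  # skip regions without enough data to judge
        # n=20 yields cut points at 5% steps; the last one is the 95th percentile.
        p95 = statistics.quantiles(values, n=20)[-1]
        if p95 > limit_ms:
            breaches[region] = p95
    return breaches


# Made-up TTFB samples in milliseconds.
ttfb_by_region = {
    "us-east": [120] * 100,
    "ap-south": [150] * 90 + [900] * 10,
    "sa-east": [1000] * 5,  # too few samples to alert on
}
breaches = regions_over_limit(ttfb_by_region)
print(breaches)
```

Only "ap-south" is flagged here: its median looks fine, but one request in ten is slow enough to push p95 well past the limit.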

FAQ

What's the most misleading CDN metric?

Cache hit ratio. A high ratio doesn't mean fast pages — it means the edge served cached content, but it says nothing about how long uncached requests take, how stale the cache is, or what users in different regions actually experience. I've seen 95% hit ratios mask 2-second uncached loads for users far from the edge.

How should I measure real user latency instead?

Instrument from the browser using the Performance API or a RUM agent. Track time-to-first-byte (TTFB), first contentful paint (FCP), and largest contentful paint (LCP) at p50, p95, and p99 by region. Compare those against synthetic checks. If your CDN dashboard says 50ms but RUM says 500ms, the gap is in what the edge can't see: geography, routing, DNS, TLS, or the last mile.

When should I bypass CDN caching for dynamic content?

When freshness matters more than speed — real-time dashboards, personalized feeds, or any data that changes per user or session. Use edge functions to assemble dynamic responses close to the user, or stream data directly. Caching stale personalized data is worse than a slow fresh response.

Sources