ZeroTrust Architect

Originally published at cacheguard.com

HTTP Cache-Control Headers, Squid, and Why Your Gateway Cache Misses More Than You Think

Web proxy caching stores HTTP responses locally and serves them to subsequent requestors without fetching from the origin. The hit rate — the proportion of requests served from cache — determines how much bandwidth is actually saved. Understanding what drives hit rate means understanding HTTP caching semantics.

Cache-Control directives that matter at the proxy layer

HTTP caching is controlled primarily by the Cache-Control response header. The directives most relevant to a forward proxy cache are:

Cache-Control: max-age=3600          # Fresh for 3600 seconds
Cache-Control: no-cache              # May be stored, but must be revalidated with origin before every reuse
Cache-Control: no-store              # Do not cache at all
Cache-Control: private               # Cache only in a private (browser) cache, not in a shared proxy
Cache-Control: s-maxage=86400        # Proxy-specific max-age (overrides max-age for shared caches)
Cache-Control: must-revalidate       # Must revalidate on expiry, no stale serving

The private directive is one of the most common reasons proxy cache hit rates are lower than expected. Any response marked private is explicitly excluded from shared caches such as a proxy. This includes most authenticated API responses, personalised pages, and session-dependent content.
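
To make the directive rules concrete, here is a minimal Python sketch of how a shared cache might turn a Cache-Control value into a freshness lifetime. The function names `parse_cache_control` and `shared_cache_lifetime` are hypothetical, and this is not Squid's implementation. It assumes well-formed values without commas inside quoted strings, and it skips heuristic freshness entirely.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Parse a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def shared_cache_lifetime(value: str) -> Optional[int]:
    """Freshness lifetime in seconds for a shared cache.

    Returns None when a shared cache must not store the response at all.
    """
    cc = parse_cache_control(value)
    if "no-store" in cc or "private" in cc:
        return None
    if "no-cache" in cc:
        return 0  # storable, but every reuse needs revalidation
    for name in ("s-maxage", "max-age"):  # s-maxage takes precedence in shared caches
        if cc.get(name) is not None:
            return int(cc[name])
    return 0  # no explicit lifetime: this sketch treats it as immediately stale
```

With this sketch, `max-age=3600, s-maxage=86400` yields 86400 for the proxy, while `private, max-age=600` yields None even though a browser could keep the response for ten minutes.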

The Vary header

The Vary header specifies which request headers affect the cached response:

Vary: Accept-Encoding

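This means the proxy must store a separate variant for each distinct Accept-Encoding request value it sees, for example one for gzip and one for identity. A Vary: * header means no later request can be matched to the stored response, which makes it effectively uncacheable at the proxy level.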

Vary: Cookie or Vary: Authorization effectively makes every authenticated response unique — uncacheable in practice.
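
One way to picture the effect of Vary is as extra fields in the cache key. The sketch below is an illustration under assumptions, not how Squid builds its keys: `cache_key` is a hypothetical helper that hashes the method, the URL, and the values of whichever request headers the response named in Vary.

```python
import hashlib


def cache_key(method, url, request_headers, vary):
    """Build a cache key from the URL plus the request headers named in Vary.

    Returns None for 'Vary: *', because no later request can match it.
    """
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    if "*" in names:
        return None
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    parts = [method.upper(), url]
    for name in sorted(names):
        # A missing header is keyed as an empty value.
        parts.append(f"{name}={lowered.get(name, '')}")
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
```

Two requests for the same URL with `Accept-Encoding: gzip` and `Accept-Encoding: identity` get different keys, so each needs its own stored copy. With `Vary: Cookie`, every distinct cookie string produces a new key, which is why such responses almost never produce a hit.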

Cache validation: conditional requests

When a cached response expires (past max-age), the proxy does not necessarily fetch a new copy. It can revalidate using conditional request headers:

If-Modified-Since: Thu, 07 May 2026 10:00:00 GMT
If-None-Match: "abc123etag"

If the origin responds with 304 Not Modified, the cached copy is refreshed with a new TTL at almost no bandwidth cost, because a 304 carries headers but no body. This is why responses with ETag or Last-Modified headers achieve better effective hit rates than responses without them.
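
The revalidation flow can be sketched in a few lines of Python. Both function names and the shape of `entry` (a dict with lowercased `headers`, a `body`, and a `stored_at` timestamp) are assumptions made for illustration; a real cache tracks considerably more state.

```python
def conditional_headers(stored_headers):
    """Build revalidation request headers from a stored response's validators."""
    stored = {name.lower(): value for name, value in stored_headers.items()}
    headers = {}
    if "etag" in stored:
        headers["If-None-Match"] = stored["etag"]
    if "last-modified" in stored:
        headers["If-Modified-Since"] = stored["last-modified"]
    return headers


def refresh_on_304(entry, status, response_headers, now):
    """On 304, merge the new headers into the entry and restart its age.

    Returns True when the stored body can be reused, False when the caller
    must replace the entry with the full response it just received.
    """
    if status != 304:
        return False
    for name, value in response_headers.items():
        entry["headers"][name.lower()] = value
    entry["stored_at"] = now
    return True
```

A response stored without an ETag or Last-Modified header produces an empty dict from `conditional_headers`, so the proxy has nothing to revalidate with and has to fetch the full body again.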

Why HTTPS responses are uncacheable without SSL inspection

HTTPS traffic is encrypted end-to-end. A forward proxy in the flow can see the TCP connection and the TLS SNI — but not the HTTP response, its headers, or its body. Without visibility into Cache-Control, Content-Type, or response status, the proxy cannot make caching decisions.

[Client] <-- TLS --> [?? Proxy ??] <-- TLS --> [Origin]
                        ↑
               Cannot see Cache-Control
               Cannot see response body
               Cannot store anything

SSL inspection (TLS MITM) breaks the end-to-end encrypted session at the proxy. The proxy terminates the client's TLS session, reads the plaintext response, makes caching decisions, stores cacheable content, and re-encrypts before delivery.

[Client] <-- TLS-A --> [Proxy w/ SSL inspection] <-- TLS-B --> [Origin]
                               ↑
                       Sees full HTTP response
                       Applies Cache-Control logic
                       Stores cacheable content

This is why enabling SSL inspection on a proxy cache dramatically increases the cache hit rate — it makes HTTPS traffic (the overwhelming majority of modern web content) eligible for caching.

Squid's storage architecture

Squid (the most widely deployed open-source proxy cache) uses a multi-tier storage system:

  • In-memory cache (hot objects): Frequently accessed small objects kept in RAM for sub-millisecond response.
  • Disk cache (ufs or rock store): Larger objects stored on disk. Squid uses a hash-based directory structure to map URLs to disk locations.
  • Negative caching: Squid can cache error responses (404, 503) for a configurable period (the negative_ttl directive) to avoid hammering unavailable origins.

Cache size is configured separately for memory and disk:

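# RAM cache for hot objects
cache_mem 256 MB
maximum_object_size_in_memory 512 KB
# Declared before cache_dir: some Squid versions only apply it to cache_dir lines that follow
maximum_object_size 100 MB
# ufs store: path, size in MB, first-level dirs, second-level dirs
cache_dir ufs /var/spool/squid 10000 16 256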

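The three numbers on the cache_dir line are the store size in MB (10000), the number of first-level directories (16), and the number of second-level directories under each of them (256). The directory counts are tunable based on expected object count.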
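
As a rough sizing aid, the arithmetic behind those numbers can be sketched in Python. The helper names are hypothetical, the parser only handles the ufs-style field order shown above (type, path, size in MB, first-level count, second-level count), and the 13 KB mean object size is an illustrative assumption that you should replace with a figure measured on your own traffic.

```python
def parse_ufs_cache_dir(line):
    """Split a ufs-style cache_dir line into its named fields."""
    fields = line.split()
    if len(fields) < 6 or fields[0] != "cache_dir":
        raise ValueError("expected: cache_dir <type> <path> <Mbytes> <L1> <L2>")
    return {
        "type": fields[1],
        "path": fields[2],
        "size_mb": int(fields[3]),
        "l1": int(fields[4]),
        "l2": int(fields[5]),
    }


def estimate_layout(cfg, mean_object_kb=13):
    """Return (total directories, estimated objects, objects per directory)."""
    total_dirs = cfg["l1"] * cfg["l2"]
    est_objects = cfg["size_mb"] * 1024 // mean_object_kb
    return total_dirs, est_objects, est_objects / total_dirs


cfg = parse_ufs_cache_dir("cache_dir ufs /var/spool/squid 10000 16 256")
total_dirs, est_objects, per_dir = estimate_layout(cfg)
print(total_dirs, est_objects, round(per_dir))
```

For the configuration above this gives 4096 directories. If the estimated number of objects per directory grows into the thousands, raising the directory counts is the usual adjustment.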

What genuinely cannot be cached

Even with SSL inspection enabled:

  • Responses with Cache-Control: no-store or private
  • Responses to requests that carry an Authorization header (unless the response includes Cache-Control: public, s-maxage, or must-revalidate)
  • POST, PUT, DELETE, PATCH responses (unsafe, state-changing methods)
  • Responses with Set-Cookie (Squid excludes these by default)
  • Streaming responses (chunked transfer with no Content-Length)
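
The list above can be read as a checklist a proxy runs before storing anything. The Python sketch below encodes it as a single predicate. `proxy_may_store` is a hypothetical name, and the Set-Cookie and Content-Length checks mirror the conservative policy in the list rather than a protocol requirement. For authenticated requests it accepts public, s-maxage, or must-revalidate as the explicit override.

```python
UNSAFE_METHODS = {"POST", "PUT", "DELETE", "PATCH"}
AUTH_OVERRIDES = {"public", "s-maxage", "must-revalidate"}


def proxy_may_store(method, request_headers, response_headers):
    """Return (storable, reason) for a shared cache, following the list above."""
    req = {k.lower(): v for k, v in request_headers.items()}
    resp = {k.lower(): v for k, v in response_headers.items()}
    directives = {
        part.split("=", 1)[0].strip().lower()
        for part in resp.get("cache-control", "").split(",")
        if part.strip()
    }
    if method.upper() in UNSAFE_METHODS:
        return False, "unsafe method"
    if "no-store" in directives or "private" in directives:
        return False, "no-store or private"
    if "authorization" in req and not (directives & AUTH_OVERRIDES):
        return False, "authenticated request without an explicit override"
    if "set-cookie" in resp:
        return False, "Set-Cookie present (policy choice)"
    if "content-length" not in resp:
        return False, "no Content-Length (treated as streaming)"
    return True, "storable"
```

Returning a reason alongside the boolean makes it easier to see which rule is responsible when the hit rate is lower than expected.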

In practice, static assets (CSS, JS, images, fonts, software update packages) make up a large fraction of total bytes transferred and are almost always cacheable. Dynamic application responses are rarely cacheable. A realistic hit rate for a mixed-traffic office network with SSL inspection enabled is 20–50% of bytes served from cache.
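
Hit rate by request count and hit rate by bytes can differ widely, so it helps to compute both. The following Python sketch assumes you have already reduced your proxy log to `(was_hit, size_in_bytes)` pairs. That record shape is an assumption for illustration, not a Squid log format.

```python
def hit_ratios(records):
    """Return (request hit ratio, byte hit ratio) for (was_hit, size) pairs."""
    records = list(records)
    total_requests = len(records)
    total_bytes = sum(size for _, size in records)
    hit_requests = sum(1 for hit, _ in records if hit)
    hit_bytes = sum(size for hit, size in records if hit)
    request_ratio = hit_requests / total_requests if total_requests else 0.0
    byte_ratio = hit_bytes / total_bytes if total_bytes else 0.0
    return request_ratio, byte_ratio


# Three small hits and one large miss.
sample = [(True, 10_000), (True, 20_000), (True, 20_000), (False, 450_000)]
request_ratio, byte_ratio = hit_ratios(sample)
print(f"requests: {request_ratio:.0%}, bytes: {byte_ratio:.0%}")
```

In this sample three of four requests are hits, yet only 50,000 of 500,000 bytes come from cache. The byte ratio is the figure that tracks bandwidth saved.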

CacheGuard ships Squid as its web caching engine, integrated with SSL mediation so HTTPS content is cacheable. Cache size is configurable at installation time based on available disk, and the dashboard exposes real-time hit rate metrics.

https://www.cacheguard.com/what-is-web-caching/


Originally published on the CacheGuard Blog. CacheGuard is free and open source — GitHub.
