
Caching Strategies & Patterns

Master caching from browser to distributed systems with comprehensive strategies

95 minutes
8 Detailed Sections
Senior Level

Caching stores frequently accessed data in a fast-access location to reduce latency and load on backend systems. The performance gains can be dramatic: reading from an in-memory store (Redis, Memcached) typically takes 1-10ms including the network round trip, versus 100-500ms for a slow database query.

For compute-intensive operations like image resizing or report generation, caching can improve response times from seconds to milliseconds.

However, caching introduces complexity: stale data (showing outdated information), cache invalidation (one of the two hard things in computer science, according to Phil Karlton's well-known quip), memory limits (what to evict?), and consistency challenges (keeping the cache and the source of truth synchronized).

The cache hit ratio, the percentage of requests served from cache, is the critical metric. A 90% hit ratio means only 10% of requests reach the database, a 10x reduction in load.
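To make the arithmetic concrete, here is a minimal sketch that computes the hit ratio, the resulting reduction in database load, and the mean latency. The function names and the 5ms and 200ms latencies are illustrative assumptions, not measurements.

```python
def hit_ratio(hits: int, total: int) -> float:
    """Fraction of requests served from cache (0.0 to 1.0)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return hits / total


def db_load_reduction(ratio: float) -> float:
    """How many times fewer requests reach the database at this hit ratio."""
    miss = 1.0 - ratio
    return float("inf") if miss == 0 else 1.0 / miss


def mean_latency_ms(ratio: float, hit_ms: float, miss_ms: float) -> float:
    """Average latency when hits cost hit_ms and misses cost miss_ms."""
    return ratio * hit_ms + (1.0 - ratio) * miss_ms


# Assumed example numbers: 9,000 hits out of 10,000 requests,
# 5ms for a cache hit and 200ms for a database-backed miss.
ratio = hit_ratio(hits=9_000, total=10_000)
print(f"hit ratio: {ratio:.0%}")
print(f"database load reduction: {db_load_reduction(ratio):.1f}x")
print(f"mean latency: {mean_latency_ms(ratio, hit_ms=5, miss_ms=200):.1f}ms")
```

Misses still pay the full backend cost, so the mean improves long before the tail does: a given percentile only benefits once the hit ratio exceeds it.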

Cache placement matters: client-side (browser), CDN (edge), application-level (in-memory or Redis), and database-level (query cache). Each layer has different characteristics.

Browser caching uses HTTP headers (Cache-Control, ETag) to store resources locally. It is the fastest layer, since a fresh cached copy needs no network request, but each browser cache serves only one user. CDN caching places content geographically close to users, which can cut latency for global applications from around 200ms to around 20ms.
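As a rough illustration of how these headers fit together, the sketch below builds Cache-Control and ETag values for a response and answers a conditional request with 304 Not Modified. The function names and the hash-based ETag scheme are assumptions made for this example; a real server would also handle weak validators and lists of ETags in If-None-Match.

```python
import hashlib


def static_asset_headers(body: bytes, versioned: bool) -> dict:
    """Build caching headers for a response body."""
    # Assumed scheme: derive the ETag from a hash of the content.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    if versioned:
        # Versioned filenames never change content, so cache for a year.
        cache_control = "public, max-age=31536000, immutable"
    else:
        # The browser may store the copy but must revalidate before reuse.
        cache_control = "no-cache"
    return {"Cache-Control": cache_control, "ETag": etag}


def respond(body: bytes, request_headers: dict, versioned: bool = False):
    """Return (status, headers, body), using 304 when the client copy is current."""
    headers = static_asset_headers(body, versioned)
    if request_headers.get("If-None-Match") == headers["ETag"]:
        return 304, headers, b""  # client already holds this exact content
    return 200, headers, body


status, headers, _ = respond(b"body { margin: 0 }", {})
revalidated, _, payload = respond(
    b"body { margin: 0 }", {"If-None-Match": headers["ETag"]}
)
print(status, revalidated, len(payload))
```

The 304 path still costs a round trip but transfers no body, which is why long max-age values on versioned filenames are preferred wherever the build pipeline allows them.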

Application caching uses Redis or Memcached for session data, computed results, or frequently queried data. The key principle: cache what is expensive to compute or fetch and is accessed frequently.
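A common way to apply this principle is a cache-aside read: check the cache, fall back to the expensive source on a miss, then store the result with a time-to-live. The sketch below uses a small in-process dictionary as a stand-in for Redis or Memcached so that it runs without a server; TTLCache, get_or_load, and load_user are hypothetical names chosen for illustration.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Tiny in-process cache with per-entry expiry (stand-in for Redis)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries count as misses
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._data[key] = (self._clock() + ttl_seconds, value)


def get_or_load(cache: TTLCache, key: str,
                loader: Callable[[str], Any], ttl_seconds: float = 60.0) -> Any:
    """Cache-aside read: try the cache, else load and populate it."""
    value = cache.get(key)
    if value is not None:
        return value
    value = loader(key)  # the expensive call (database, remote API, ...)
    # Simplification: a loader result of None is never cached.
    cache.set(key, value, ttl_seconds)
    return value


loader_calls = []


def load_user(key: str) -> dict:
    """Hypothetical expensive lookup; records each call for the demo."""
    loader_calls.append(key)
    return {"key": key, "name": "example"}


cache = TTLCache()
get_or_load(cache, "user:42", load_user)
get_or_load(cache, "user:42", load_user)  # served from cache
print(len(loader_calls))  # prints 1
```

The TTL bounds how stale an entry can become; writes that must be visible immediately would additionally delete or overwrite the cached key.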

Key Takeaways

1. Performance Impact: A 90% cache hit ratio can reduce database load 10x and cut average latency from 500ms to roughly 50ms; P95 latency improves only once the hit ratio exceeds 95%, because the slowest requests are the misses
2. Memory Cost: Redis hosting costs $50-500/month depending on size; much cheaper than scaling databases
3. Hit Ratio Calculation: (cache hits / total requests) × 100; monitor this as the primary cache metric
4. Latency Comparison (rough orders of magnitude): in-memory cache lookup over the network about 1ms, cross-region network round trip about 50ms, database query about 200ms; the underlying hardware is much faster (RAM about 100ns, SSD random read about 0.1ms, spinning-disk seek about 10ms)
5. Browser Caching: Use Cache-Control: max-age=31536000 (one year) for immutable assets with versioned filenames
6. Common Pitfall: Caching everything. Cache only what is expensive to compute and frequently accessed
7. Solution: Use the cache-aside pattern for most use cases; use write-through when the cache must stay consistent with the database
8. Real-World: Facebook serves 99% of reads from cache (Memcached cluster with 10,000+ servers)
9. Eviction: LRU (Least Recently Used) is the most common choice; consider LFU (Least Frequently Used) for long-tail data
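To illustrate the eviction takeaway, here is a minimal LRU cache sketch built on the standard library's OrderedDict. The LRUCache class and its capacity of two entries are assumptions chosen to keep the example small; production caches such as Redis use their own eviction implementations.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key: Hashable) -> Optional[Any]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a read counts as a use
        return self._items[key]

    def put(self, key: Hashable, value: Any) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "a" is now more recently used than "b"
cache.put("c", 3)   # over capacity: evicts "b"
print(cache.get("b"), cache.get("a"), cache.get("c"))  # prints None 1 3
```

An LFU variant would track access counts instead of recency, which keeps consistently popular keys resident even when a burst of one-off reads passes through.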

Visual Diagram


┌────────────────────────────────────────────┐
│           Cache Layers & Latency           │
├────────────────────────────────────────────┤
│                                            │
│ Client (Browser):                          │
│  ┌────────────────┐ Cache-Control: 1hr     │
│  │ Static Assets  │ Latency: 0ms (local)   │
│  │ (CSS, JS, img) │ Hit ratio target: 95%  │
│  └────────────────┘                        │
│         ↓ (miss)                           │
│ CDN (Edge):                                │
│  ┌────────────────┐ Global distribution    │
│  │ HTML, API resp │ Latency: 10-50ms       │
│  │ Media files    │ Hit ratio target: 80%  │
│  └────────────────┘                        │
│         ↓ (miss)                           │
│ Application Cache (Redis):                 │
│  ┌────────────────┐ In same datacenter     │
│  │ Session, user  │ Latency: 1-10ms        │
│  │ Computed data  │ Hit ratio target: 90%  │
│  └────────────────┘                        │
│         ↓ (miss)                           │
│ Database:                                  │
│  ┌────────────────┐ Source of truth        │
│  │ All data       │ Latency: 50-500ms      │
│  │ Complex queries│ No caching             │
│  └────────────────┘                        │
│                                            │
│ Request flow (cache hit):                  │
│  Client → CDN (hit) → 20ms response        │
│                                            │
│ Request flow (cache miss all layers):      │
│  Client → CDN → App Cache → DB             │
│  0ms + 20ms + 5ms + 200ms = 225ms          │
└────────────────────────────────────────────┘
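The request flows in the diagram can be turned into a rough expected-latency estimate. The sketch below reuses the diagram's per-layer latencies (0ms, 20ms, 5ms, 200ms) and hit ratio targets, and assumes that a single class of request falls through every layer in order and that the layers hit or miss independently. Real traffic rarely behaves that way, since static assets and API calls take different paths, so treat the result as a back-of-envelope figure. LAYERS and the function names are illustrative.

```python
# (layer name, hit ratio, latency added by reaching this layer in ms).
# Values come from the diagram; the database always "hits".
LAYERS = [
    ("browser", 0.95, 0.0),
    ("cdn", 0.80, 20.0),
    ("app cache", 0.90, 5.0),
    ("database", 1.00, 200.0),
]


def expected_latency_ms(layers) -> float:
    """Mean latency when each miss falls through to the next layer."""
    expected = 0.0
    reach = 1.0    # probability that a request gets this far
    elapsed = 0.0  # latency accumulated on the way down
    for _name, ratio, cost_ms in layers:
        elapsed += cost_ms
        expected += reach * ratio * elapsed
        reach *= 1.0 - ratio
    return expected


def database_share(layers) -> float:
    """Fraction of requests that miss every cache and reach the last layer."""
    share = 1.0
    for _name, ratio, _cost_ms in layers[:-1]:
        share *= 1.0 - ratio
    return share


print(f"expected latency: {expected_latency_ms(LAYERS):.2f}ms")
print(f"requests reaching the database: {database_share(LAYERS):.2%}")
```

Under these assumptions only about one request in a thousand pays the full 225ms path, which is why stacked cache layers matter more for backend load than any single layer's hit ratio suggests.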