Caching Strategies & Patterns

Master caching from browser to distributed systems with comprehensive strategies

95 minutes

8 Detailed Sections

Senior Level

Caching stores frequently accessed data in a fast-access location to reduce latency and load on backend systems. The performance gains are dramatic: a read from an in-memory store such as Redis or Memcached typically takes 1-10ms including the network hop, versus 100-500ms for an expensive database query.

For compute-intensive operations like image resizing or report generation, caching can improve response times from seconds to milliseconds.

However, caching introduces complexity: stale data (showing outdated information), cache invalidation (one of the two hard things in computer science, in the saying attributed to Phil Karlton), memory limits (what to evict?), and consistency challenges (keeping cache and source of truth synchronized).

The cache hit ratio, the percentage of requests served from cache, is the critical metric. A 90% hit ratio means only 10% of requests reach the database, a 10x reduction in load.
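The arithmetic behind the hit ratio can be sketched as follows. This is a minimal illustration, not a benchmark: the function names are hypothetical, and the 5ms cache and 200ms database latencies are assumed values chosen to match the ranges quoted in this section.

```python
import math


def backend_load_fraction(hit_ratio: float) -> float:
    """Fraction of requests that miss the cache and reach the backend."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return 1.0 - hit_ratio


def load_reduction_factor(hit_ratio: float) -> float:
    """How many times smaller backend traffic is compared with no cache."""
    miss = backend_load_fraction(hit_ratio)
    return math.inf if miss == 0 else 1.0 / miss


def average_latency_ms(hit_ratio: float, cache_ms: float, backend_ms: float) -> float:
    """Mean latency, assuming a miss pays the cache lookup plus the backend call."""
    miss = backend_load_fraction(hit_ratio)
    return hit_ratio * cache_ms + miss * (cache_ms + backend_ms)


if __name__ == "__main__":
    # Assumed latencies: 5ms cache lookup, 200ms database query.
    for ratio in (0.50, 0.90, 0.99):
        print(
            ratio,
            round(load_reduction_factor(ratio), 1),
            round(average_latency_ms(ratio, 5.0, 200.0), 1),
        )
```

Under these assumptions the relationship is strongly non-linear: moving from a 90% to a 99% hit ratio cuts backend traffic by another factor of ten, which is why small changes in hit ratio are worth monitoring.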

Cache placement matters: client-side (browser), CDN (edge), application-level (in-memory or Redis), and database-level (query cache). Each layer has different characteristics.

Browser caching uses HTTP headers (Cache-Control, ETag) to store resources locally. It is the fastest layer because a fresh response needs no network request at all, but each browser cache serves only one user. CDN caching places content geographically close to users, which can reduce latency from around 200ms to around 20ms for global applications.
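To make the header mechanics concrete, here is a framework-free sketch of how a server might attach Cache-Control and ETag and answer a conditional request. The function names and the one-hour default are assumptions for illustration. The comparison is also simplified: a real If-None-Match header can carry a list of tags or weak validators, which this sketch does not handle.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Build a quoted ETag from a hash of the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def build_response(
    body: bytes,
    if_none_match: Optional[str] = None,
    max_age_seconds: int = 3600,  # assumed default: cache for one hour
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a cacheable resource."""
    etag = make_etag(body)
    headers = {
        "Cache-Control": f"max-age={max_age_seconds}",
        "ETag": etag,
    }
    # The client already holds this exact version: send 304 with no body.
    if if_none_match is not None and if_none_match.strip() == etag:
        return 304, headers, b""
    return 200, headers, body


if __name__ == "__main__":
    status, headers, _ = build_response(b"body { color: red }")
    print(status, headers)
    # Revalidation: the browser sends back the ETag it stored earlier.
    status, _, payload = build_response(
        b"body { color: red }", if_none_match=headers["ETag"]
    )
    print(status, len(payload))
```

While max-age has not expired, the browser does not contact the server at all. Afterwards it revalidates with If-None-Match, and an unchanged resource costs only a small 304 response instead of a full download.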

Application caching uses Redis or Memcached for session data, computed results, or frequently queried data. The key principle: cache what is both expensive to compute or fetch and frequently accessed.
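A minimal sketch of application-level caching follows, with an in-process dictionary standing in for Redis or Memcached. The class name, the `get_or_load` method, and the 60-second TTL are all hypothetical choices for illustration. A production setup would use a real cache client and handle serialization and concurrent loads.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Tiny in-process cache with per-entry expiry (stand-in for Redis)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value, or call loader() on a miss and cache the result."""
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: entry exists and has not expired
        value = loader()  # miss: fetch from the expensive source of truth
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry, for example after the underlying data changes."""
        self._store.pop(key, None)


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=60)

    def slow_report() -> dict:
        time.sleep(0.2)  # pretend this is an expensive query or computation
        return {"rows": 1234}

    cache.get_or_load("report:daily", slow_report)  # slow: runs the loader
    cache.get_or_load("report:daily", slow_report)  # fast: served from memory
```

The TTL bounds how stale an entry can get, and `invalidate` covers the case where the application knows the data changed before the TTL expires.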

Key Takeaways

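Performance Impact: A 90% cache hit ratio can reduce database load 10x and cut typical (median) latency, for example from 500ms to 50ms; tail latency such as P95 is still dominated by the 10% of requests that miss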

Memory Cost: Redis hosting costs $50-500/month depending on size; much cheaper than scaling databases

Hit Ratio Calculation: (cache hits / total requests) × 100; monitor this as the primary cache metric

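Latency Comparison (rough orders of magnitude): RAM access ~100ns, SSD read ~0.1ms, in-memory cache over the network ~1ms, HDD seek ~10ms, cross-region network round trip ~50ms, database query ~200ms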

Browser Caching: Use Cache-Control: max-age=31536000 for immutable assets with versioned filenames

Common Pitfall: Caching everything. Cache only what is expensive to compute and frequently accessed

Solution: Use the cache-aside pattern for most use cases; use write-through when the cache must stay consistent with the database

Real-World: Facebook serves 99% of reads from cache (Memcached cluster with 10,000+ servers)

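Eviction: LRU (Least Recently Used) is the usual default choice (Memcached evicts by LRU; Redis evicts nothing until maxmemory-policy is set, for example to allkeys-lru); consider LFU (Least Frequently Used) for long-tail data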
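The hit ratio formula and LRU eviction from the takeaways above can be sketched together in a few lines. This is an illustrative in-process cache built on the standard library's `OrderedDict`, not how Redis or Memcached implement eviction internally. The class and method names are assumptions.

```python
from collections import OrderedDict
from typing import Any, Hashable, Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._data: "OrderedDict[Hashable, Any]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable, default: Optional[Any] = None) -> Any:
        if key in self._data:
            self._data.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._data[key]
        self.misses += 1
        return default

    def put(self, key: Hashable, value: Any) -> None:
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used entry

    def hit_ratio_percent(self) -> float:
        """(cache hits / total requests) x 100, or 0.0 before any request."""
        total = self.hits + self.misses
        return 100.0 * self.hits / total if total else 0.0


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    cache.put("a", 1)
    cache.put("b", 2)
    cache.get("a")     # hit; "a" becomes most recently used
    cache.put("c", 3)  # over capacity: "b" is evicted, not "a"
    cache.get("b")     # miss
    print(cache.hit_ratio_percent())
```

An LFU variant would track access counts instead of recency, which keeps a one-off scan over cold keys from pushing out entries that are read often.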

Visual Diagram


┌────────────────────────────────────────────┐
│           Cache Layers & Latency           │
├────────────────────────────────────────────┤
│                                            │
│ Client (Browser):                          │
│  ┌────────────────┐ Cache-Control: 1hr     │
│  │ Static Assets  │ Latency: 0ms (local)   │
│  │ (CSS, JS, img) │ Hit ratio target: 95%  │
│  └────────────────┘                        │
│         ↓ (miss)                           │
│ CDN (Edge):                                │
│  ┌────────────────┐ Global distribution    │
│  │ HTML, API resp │ Latency: 10-50ms       │
│  │ Media files    │ Hit ratio target: 80%  │
│  └────────────────┘                        │
│         ↓ (miss)                           │
│ Application Cache (Redis):                 │
│  ┌────────────────┐ In same datacenter     │
│  │ Session, user  │ Latency: 1-10ms        │
│  │ Computed data  │ Hit ratio target: 90%  │
│  └────────────────┘                        │
│         ↓ (miss)                           │
│ Database:                                  │
│  ┌────────────────┐ Source of truth        │
│  │ All data       │ Latency: 50-500ms      │
│  │ Complex queries│ No caching             │
│  └────────────────┘                        │
│                                            │
│ Request flow (cache hit):                  │
│  Client → CDN (hit) → 20ms response        │
│                                            │
│ Request flow (cache miss all layers):      │
│  Client → CDN → App Cache → DB             │
│  0ms + 20ms + 5ms + 200ms = 225ms          │
└────────────────────────────────────────────┘
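
The request-flow arithmetic in the diagram can be reproduced with a small model. This is a sketch under simplifying assumptions: a single request is imagined to walk every layer in order, each layer's hit ratio is taken from the diagram's targets, and the 20ms CDN and 5ms application-cache latencies are the same point values the diagram uses for its 225ms total. Real traffic mixes content types that do not all pass through every layer. The `Layer` type and function names are hypothetical.

```python
from typing import List, NamedTuple


class Layer(NamedTuple):
    name: str
    latency_ms: float
    hit_ratio: float  # share of requests reaching this layer that it can answer


# Values taken from the diagram; the database always answers (hit ratio 1.0).
LAYERS: List[Layer] = [
    Layer("browser", 0.0, 0.95),
    Layer("cdn", 20.0, 0.80),
    Layer("app_cache", 5.0, 0.90),
    Layer("database", 200.0, 1.00),
]


def full_miss_latency_ms(layers: List[Layer]) -> float:
    """Latency when every cache misses and the request reaches the last layer."""
    return sum(layer.latency_ms for layer in layers)


def expected_latency_ms(layers: List[Layer]) -> float:
    """Average latency: each layer's cost weighted by the chance of reaching it."""
    expected = 0.0
    reach_probability = 1.0
    for layer in layers:
        expected += reach_probability * layer.latency_ms
        reach_probability *= 1.0 - layer.hit_ratio  # only misses continue down
    return expected


if __name__ == "__main__":
    print(full_miss_latency_ms(LAYERS))
    print(expected_latency_ms(LAYERS))
```

The gap between the worst-case path and the weighted average shows why stacked caches pay off: under this model very few requests fall through all layers, so the expensive database path contributes little to the mean.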

Table of Contents

Why Caching? Performance Gains and Trade-offs

Key Takeaways

Visual Diagram

Cache-Aside (Lazy Loading): The Default Pattern

Write-Through and Write-Behind: Consistency vs Performance

Cache Invalidation: The Hardest Problem

Distributed Caching: Redis, Memcached, and Consistency

CDN Caching: Edge Computing for Global Performance

Advanced Caching Patterns: Bloom Filters, Precomputation, and Negative Caching

Case Studies: Caching at World-Scale