Database Indexes: B-Trees, Composite Indexes & Performance

Master how indexes work under the hood with B-trees, composite index design, covering indexes, and when adding indexes makes your database slower.

90 minutes

7 Detailed Sections

Senior Level

Suppose your table has 10 million rows and you need to find one user by email. Without an index, the database has no way of knowing where that row lives, so it reads the table row by row, potentially all the way to the end.

That is a full table scan, and its cost grows linearly with table size. Small tables hide this problem, but once data grows, the same endpoint that felt fast in development can become painfully slow in production.

Indexes solve this by creating a separate, ordered lookup structure so the database can jump to matching rows instead of checking every row. Conceptually, it works like a book index: you locate the term once, then jump to the right pages.
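The difference between scanning and jumping can be observed directly in a query plan. The following is a minimal sketch using Python's built-in sqlite3 module; the `users` table, its columns, and the index name `idx_users_email` are assumptions made for illustration, and other engines word their plans differently.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the human-readable 'detail' column of each EXPLAIN QUERY PLAN row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]

# Hypothetical schema and data, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    ((f"user{i}@example.com", f"User {i}") for i in range(10_000)),
)

lookup = "SELECT id, name FROM users WHERE email = ?"
args = ("user4242@example.com",)

# Without an index on email, SQLite reports a scan of the whole table.
plan_before = query_plan(conn, lookup, args)

# After the index exists, the same query is answered by an index search.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, args)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan text varies between SQLite versions, but the first plan should describe a scan of `users` and the second a search that uses `idx_users_email`.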

The trade-off is important: indexes are not free metadata. They are real on-disk structures that consume storage, take up memory in the buffer pool, and must be kept in sync on every INSERT, UPDATE, and DELETE.
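The storage side of this trade-off is easy to check. The sketch below, again using sqlite3 with a hypothetical `users` table, compares the number of database pages before and after an index is created. It measures storage only; it does not attempt to measure buffer pool pressure or write latency.

```python
import sqlite3

def page_count(conn):
    """Number of pages currently allocated to the database."""
    return conn.execute("PRAGMA page_count").fetchone()[0]

# Hypothetical schema and data, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    ((f"user{i}@example.com", f"User {i}") for i in range(20_000)),
)
conn.commit()

pages_table_only = page_count(conn)

# The index stores its own sorted copy of every email plus a row reference.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
conn.commit()

pages_with_index = page_count(conn)
index_pages = pages_with_index - pages_table_only

print(f"table only: {pages_table_only} pages")
print(f"with index: {pages_with_index} pages (+{index_pages} for the index)")
```

Every one of those extra pages has to be written when the indexed column changes, which is where the write overhead comes from.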

So indexing is a read-vs-write decision. In most read-heavy workloads, the read-time savings are worth the write overhead.

The key is to understand when an index reduces total cost and when it quietly adds operational drag.

Key Takeaways

Full Table Scan: Database reads every single row sequentially—O(N) time complexity

Linear Growth Problem: Scan time scales with row count. If 1M rows take about 1 second, 10M rows take about 10 seconds.

Production Impact: A query that takes 3 ms against a small development dataset can take 12 seconds in production once the table has grown

Index = Separate Data Structure: Not magic, but a real sorted lookup structure stored on disk

Space-Time Tradeoff: Indexes trade disk space (often around 20-50% of table size, depending on the indexed columns) for query speed

Memory Competition: Index pages compete for buffer pool space with table data

Write Cost: Every INSERT/UPDATE/DELETE must maintain all indexes on that table

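Read-Heavy Workloads: Many web applications read far more than they write (a 10:1 read:write ratio is a common rule of thumb), so indexes usually win

Not Always Used: The query planner may skip an index when a query returns a large share of the table, because a sequential scan is cheaper. The cutoff depends on the engine and its statistics; around 30% of rows is one rough rule of thumb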

Real-World Example: Production query dropped from 6 seconds to 8ms after adding one index
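The gap between O(N) and O(log N) in the takeaways above can be made concrete by counting steps. This is a toy sketch, not a database: a sorted Python list stands in for the index, integers stand in for email values, and the function names are made up for illustration.

```python
def scan_steps(rows, target):
    """Rows examined by an unindexed scan that stops at the first match."""
    steps = 0
    for value in rows:
        steps += 1
        if value == target:
            break
    return steps

def binary_search_steps(sorted_keys, target):
    """Probes made by a binary search over sorted keys (the idea behind an index)."""
    lo, hi, steps = 0, len(sorted_keys), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return steps

for n in (1_000, 10_000, 100_000):
    keys = list(range(n))   # integers stand in for sorted email values
    target = n - 1          # worst case for the scan: the match is the last row
    print(n, scan_steps(keys, target), binary_search_steps(keys, target))
```

Each tenfold increase in rows multiplies the scan's work by ten but adds only a few probes to the sorted search.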

Visual Diagram


╔════════════════════════════════════════════════════════════╗
║     📊 FULL TABLE SCAN vs INDEX LOOKUP                    ║
╠════════════════════════════════════════════════════════════╣
║                                                            ║
║  ❌ WITHOUT INDEX (Full Table Scan - O(N)):               ║
║  ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓    ║
║  ┃ Row 1:          email = "alice@example.com"  ❌  ┃    ║
║  ┃ Row 2:          email = "bob@example.com"    ❌  ┃    ║
║  ┃ Row 3:          email = "carol@example.com"  ❌  ┃    ║
║  ┃ Row 4-46:       ... scanning ...             ❌  ┃    ║
║  ┃ Row 47:         email = "dana@example.com"   ✅  ┃    ║
║  ┃ Row 48-10M:     ... must check remaining ...  ❌  ┃    ║
║  ┃ Row 10,000,000: email = "zoe@example.com"    ❌  ┃    ║
║  ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛    ║
║  ⏱️  Time: O(N) → 12 seconds at scale                     ║
║  💾 I/O: 10 MILLION disk reads                             ║
║                                                            ║
║  ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━  ║
║                                                            ║
║  ✅ WITH INDEX (B-Tree Lookup - O(log N)):                ║
║                                                            ║
║              ╔════════════════╗                            ║
║              ║  [ROOT NODE]   ║                            ║
║              ║   Range: A-Z   ║                            ║
║              ╚═══════╦════════╝                            ║
║                      ║                                     ║
║        ╔═════════════╩══════════════╗                      ║
║        ║                            ║                      ║
║   ╔════╩═════╗               ╔═════╩═════╗                ║
║   ║   A-D    ║               ║    E-L    ║                ║
║   ╚═══╦══════╝               ╚══════╦════╝                ║
║       ║                             ║                      ║
║  ╔════╩════╗ ╔════════╗      ╔═════╩═════╗                ║
║  ║  A-B    ║ ║  C-D   ║      ║   E-G     ║                ║
║  ║         ║ ║  ⬅ Here║      ║           ║                ║
║  ╚═════════╝ ╚════╦═══╝      ╚═══════════╝                ║
║                   ║                                        ║
║                   ▼                                        ║
║            🎯 "dana@..." → Row 47                          ║
║                                                            ║
║  📍 Path: Root → A-D → C-D → Row 47                       ║
║  ⏱️  Time: O(log N) → 3 milliseconds                      ║
║  💾 I/O: Just 3-4 disk reads                               ║
║                                                            ║
║  🚀 Performance: 4000x faster (3ms vs 12s)                 ║
╚════════════════════════════════════════════════════════════╝
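The "3-4 disk reads" figure in the diagram follows from how wide B-tree nodes are. A rough sketch of the arithmetic is below; the fanout values (keys per node) are assumptions chosen for illustration, since real fanout depends on page size and key width.

```python
def btree_levels(num_keys, fanout):
    """Smallest number of levels h such that fanout ** h >= num_keys."""
    levels, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels

# Assumed fanouts; each level visited costs roughly one page read.
for rows in (10_000_000, 1_000_000_000):
    for fanout in (200, 500):
        levels = btree_levels(rows, fanout)
        print(f"{rows:>13,} rows, fanout {fanout}: about {levels} levels")
```

Under these assumptions, 10 million rows need three or four levels, and growing the table a hundredfold adds at most one more.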

Table of Contents

What Problem Do Indexes Solve?

Key Takeaways

Visual Diagram

The B-Tree: How Indexes Actually Work

Composite Indexes and Why Order Is Everything

Covering Indexes: Eliminating Table Lookups

When Indexes Hurt: Write Overhead and Memory Pressure

Handling High Write Loads: 10,000+ Operations with Multiple Indexes

Practical Advice: Index Design Strategy