Database Indexes: B-Trees, Composite Indexes & Performance
Master how indexes work under the hood with B-trees, composite index design, covering indexes, and when adding indexes makes your database slower.
Suppose your table has 10 million rows and you need to find one user by email. Without an index, the database must scan row by row until it finishes the table.
That is a full table scan, and its cost grows linearly with table size. Small tables hide this problem, but once data grows, the same endpoint that felt fast in development can become painfully slow in production.
Indexes solve this by creating a separate, ordered lookup structure so the database can jump to matching rows instead of checking every row. Conceptually, it works like a book index: you locate the term once, then jump to the right pages.
The trade-off is important: indexes are not free metadata, they are real on-disk structures that consume storage, use memory in the buffer pool, and must be updated on every INSERT, UPDATE, and DELETE.
So indexing is a read-vs-write decision. In most read-heavy workloads, the read-time savings are worth the write overhead.
The key is to understand when an index reduces total cost and when it quietly adds operational drag.
Key Takeaways
Visual Diagram
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ β π FULL TABLE SCAN vs INDEX LOOKUP β β βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ£ β β β β WITHOUT INDEX (Full Table Scan - O(N)): β β ββββββββββββββββββββββββββββββββββββββββββββββββββββ β β β Row 1: email = "alice@example.com" β β β β β Row 2: email = "bob@example.com" β β β β β Row 3: email = "carol@example.com" β β β β β Row 4-46: ... scanning ... β β β β β Row 47: email = "dana@example.com" β β β β β Row 48-10M: ... must check remaining ... β β β β β Row 10,000,000: email = "zoe@example.com" β β β β ββββββββββββββββββββββββββββββββββββββββββββββββββββ β β β±οΈ Time: O(N) β 12 seconds at scale β β πΎ I/O: 10 MILLION disk reads β β β β ββββββββββββββββββββββββββββββββββββββββββββββββββββββ β β β β β WITH INDEX (B-Tree Lookup - O(log N)): β β β β ββββββββββββββββββ β β β [ROOT NODE] β β β β Range: M-Z β β β βββββββββ¦βββββββββ β β β β β βββββββββββββββ©βββββββββββββββ β β β β β β ββββββ©ββββββ βββββββ©ββββββ β β β A-D β β E-L β β β βββββ¦βββββββ ββββββββ¦βββββ β β β β β β ββββββ©βββββ ββββββββββ βββββββ©ββββββ β β β A-B β β C-D β β E-G β β β β β β β¬ Hereβ β β β β βββββββββββ ββββββ¦ββββ βββββββββββββ β β β β β βΌ β β π― "dana@..." β Row 47 β β β β π Path: Root β A-D β C-D β Row 47 β β β±οΈ Time: O(log N) β 3 milliseconds β β πΎ I/O: Just 3-4 disk reads β β β β π Performance: 4000x faster (3ms vs 12s) β ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ