Database Design at Scale
Design robust databases that scale to handle millions of users
The relational database model has dominated for 40+ years, but NoSQL emerged around 2009 to address specific scalability and flexibility challenges.
The choice between SQL and NoSQL isn't about one being "better"βit's about matching database characteristics to your use case. Relational databases (PostgreSQL, MySQL, Oracle) excel at complex queries, transactions, and strong consistency.
They use structured tables with predefined schemas, support JOINs across tables, and provide ACID guarantees. The relational model enforces data integrity through foreign keys and constraints.
This is ideal for financial systems, inventory management, or any scenario requiring strong consistency. However, RDBMSs struggle with horizontal scalingβsharding is complex and joins across shards are expensive.
NoSQL databases sacrificed some relational features to achieve better scalability and flexibility.
Document stores (MongoDB, CouchDB) store semi-structured JSON-like documents, making them ideal for content management, user profiles, or product catalogs where each entity might have different fields.
Wide-column stores (Cassandra, HBase) organize data by rows and columns but allow different columns per row, optimized for write-heavy workloads like time-series data or messaging.
Key-value stores (Redis, DynamoDB) offer simple get/put operations with extremely high performance, perfect for caching or session storage.
Graph databases (Neo4j, Amazon Neptune) excel at relationship-heavy data like social networks or fraud detection.
Key Takeaways
Visual Diagram
ββββββββββββββββββββββββββββββββββββββββββββββ
β SQL vs NoSQL Trade-offs β
ββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β Relational (SQL): β
β βββββββββββ¬ββββββββββ¬ββββββββββ β
β β user_id β name β email β Schema β
β βββββββββββΌββββββββββΌββββββββββ€ β
β β 1 β Alice β a@ex.comβ Fixed β
β β 2 β Bob β b@ex.comβ β
β βββββββββββ΄ββββββββββ΄ββββββββββ β
β + Strong consistency (ACID) β
β + Complex queries (JOINs) β
β + Data integrity (foreign keys) β
β - Hard to scale horizontally β
β - Schema migrations expensive β
β Use: Banking, inventory, analytics β
β β
β Document Store (MongoDB): β
β {id:1, name:"Alice", email:"a@ex.com"} β
β {id:2, name:"Bob", tags:["vip"]} β
β + Flexible schema per document β
β + Horizontal scaling (sharding) β
β + Developer friendly (JSON) β
β - Limited transactions (improving) β
β - No JOINs (embed or reference) β
β Use: CMS, catalogs, user profiles β
β β
β Wide-Column (Cassandra): β
β Row key β Column families β Columns β
β user:1 β profile{name, email, ...} β
β + Massive write throughput β
β + Linear scalability β
β + Time-series optimized β
β - Limited query flexibility β
β - Eventual consistency default β
β Use: Time-series, IoT, messaging β
ββββββββββββββββββββββββββββββββββββββββββββββ