←Back to Tutorials

Database Design at Scale

Design robust databases that scale to handle millions of users

110 minutes
8Detailed Sections
Senior Level

The relational database model has dominated for 40+ years, but NoSQL emerged around 2009 to address specific scalability and flexibility challenges.

The choice between SQL and NoSQL isn't about one being "better"β€”it's about matching database characteristics to your use case. Relational databases (PostgreSQL, MySQL, Oracle) excel at complex queries, transactions, and strong consistency.

They use structured tables with predefined schemas, support JOINs across tables, and provide ACID guarantees. The relational model enforces data integrity through foreign keys and constraints.

This is ideal for financial systems, inventory management, or any scenario requiring strong consistency. However, RDBMSs struggle with horizontal scalingβ€”sharding is complex and joins across shards are expensive.

NoSQL databases sacrificed some relational features to achieve better scalability and flexibility.

Document stores (MongoDB, CouchDB) store semi-structured JSON-like documents, making them ideal for content management, user profiles, or product catalogs where each entity might have different fields.

Wide-column stores (Cassandra, HBase) organize data by rows and columns but allow different columns per row, optimized for write-heavy workloads like time-series data or messaging.

Key-value stores (Redis, DynamoDB) offer simple get/put operations with extremely high performance, perfect for caching or session storage.

Graph databases (Neo4j, Amazon Neptune) excel at relationship-heavy data like social networks or fraud detection.

Key Takeaways

1
ACID vs BASE: SQL guarantees Atomicity, Consistency, Isolation, Durability; NoSQL often uses Basically Available, Soft state, Eventual consistency
2
Schema Flexibility: NoSQL allows different fields per document; useful when schema evolves rapidly
3
Horizontal Scaling: NoSQL designed for adding nodes; SQL scales vertically (bigger servers)
4
Query Complexity: SQL supports complex JOINs; NoSQL requires denormalization or multiple queries
5
Transactions: Modern NoSQL (MongoDB 4.0+, Cosmos DB) now support multi-document transactions
6
Common Pitfall: Using NoSQL for everythingβ€”many problems still best solved with SQL
7
Solution: Polyglot persistenceβ€”use different databases for different parts of your system
8
Real-World: Uber uses MySQL for trip data (transactions), Cassandra for time-series (GPS coordinates)
9
Performance: DynamoDB single-digit millisecond latency for key-value lookups at any scale
10
GraphQL-Native: Databases like Dgraph provide native GraphQL interfaces, eliminating the need for complex ORMs or mapping layers between the API and the data tier

Visual Diagram


β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚       SQL vs NoSQL Trade-offs             β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                            β”‚
β”‚ Relational (SQL):                          β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”          β”‚
β”‚  β”‚ user_id β”‚  name   β”‚  email  β”‚ Schema   β”‚
β”‚  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€          β”‚
β”‚  β”‚    1    β”‚  Alice  β”‚ a@ex.comβ”‚ Fixed    β”‚
β”‚  β”‚    2    β”‚   Bob   β”‚ b@ex.comβ”‚          β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜          β”‚
β”‚  + Strong consistency (ACID)               β”‚
β”‚  + Complex queries (JOINs)                 β”‚
β”‚  + Data integrity (foreign keys)           β”‚
β”‚  - Hard to scale horizontally              β”‚
β”‚  - Schema migrations expensive             β”‚
β”‚  Use: Banking, inventory, analytics        β”‚
β”‚                                            β”‚
β”‚ Document Store (MongoDB):                  β”‚
β”‚  {id:1, name:"Alice", email:"a@ex.com"}   β”‚
β”‚  {id:2, name:"Bob", tags:["vip"]}         β”‚
β”‚  + Flexible schema per document            β”‚
β”‚  + Horizontal scaling (sharding)           β”‚
β”‚  + Developer friendly (JSON)               β”‚
β”‚  - Limited transactions (improving)        β”‚
β”‚  - No JOINs (embed or reference)          β”‚
β”‚  Use: CMS, catalogs, user profiles         β”‚
β”‚                                            β”‚
β”‚ Wide-Column (Cassandra):                   β”‚
β”‚  Row key β†’ Column families β†’ Columns      β”‚
β”‚  user:1 β†’ profile{name, email, ...}       β”‚
β”‚  + Massive write throughput                β”‚
β”‚  + Linear scalability                      β”‚
β”‚  + Time-series optimized                   β”‚
β”‚  - Limited query flexibility               β”‚
β”‚  - Eventual consistency default            β”‚
β”‚  Use: Time-series, IoT, messaging          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Sign in to unlock

Sign In Free