The Core Decision Framework
The “right” database depends on three dimensions:
- Data shape — structured (schema-enforced) vs flexible (schema-less)
- Access pattern — how data is read/written (random vs sequential, point vs range, graph traversal)
- Scale — throughput (requests/sec), latency, and data volume
Data Shape
| Shape | Definition | Examples |
|---|---|---|
| Structured | Fixed schema, enforced at write | Financial transactions, user accounts, orders |
| Semi-structured | JSON documents, flexible fields | User profiles, logs, events, configs |
| Unstructured | Binary files, no schema | Images, videos, PDFs, text documents |
| Graph | Nodes + edges, relationships matter | Social networks, fraud rings, recommendation graphs |
| Time-series | Timestamp-ordered, high volume | Metrics, events, IoT sensor readings |
Access Pattern Analysis
Point Lookups: “Get user with ID=123”
- SQL: `SELECT * FROM users WHERE id = 123`
- NoSQL: `db.get("user:123")`
- Optimal: Hash index (O(1)) or B+ tree (O(log n))
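As a toy illustration of the two index choices, a Python dict stands in for a hash index and a `bisect`-searched sorted list for a B+ tree (the user data is made up):

```python
import bisect

# Hash index stand-in: dict lookup is O(1) on average.
users_by_id = {101: "alice", 123: "bob", 204: "carol"}

def point_lookup_hash(user_id):
    return users_by_id.get(user_id)

# B+ tree stand-in: binary search over sorted keys is O(log n).
sorted_ids = sorted(users_by_id)

def point_lookup_tree(user_id):
    i = bisect.bisect_left(sorted_ids, user_id)
    if i < len(sorted_ids) and sorted_ids[i] == user_id:
        return users_by_id[sorted_ids[i]]
    return None

print(point_lookup_hash(123))  # bob
print(point_lookup_tree(123))  # bob
```

The hash wins on single-key lookups; the tree's payoff comes in the next section, where sorted order enables range scans.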
Range Queries: “Get all users aged 25–35”
- SQL: `SELECT * FROM users WHERE age BETWEEN 25 AND 35`
- NoSQL: Not native; often requires application-side filtering
- Optimal: B+ tree (O(log n) + range size)
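The B+ tree advantage can be sketched the same way: keep rows sorted by the indexed column, then a range query is two binary searches plus a contiguous slice (toy data, illustrative names):

```python
import bisect

# B+ tree stand-in: users sorted by age; range query = O(log n) seek + scan.
users = sorted(
    [("dave", 19), ("alice", 25), ("bob", 31), ("carol", 35), ("erin", 42)],
    key=lambda u: u[1],
)
ages = [age for _, age in users]

def age_between(lo, hi):
    start = bisect.bisect_left(ages, lo)   # first age >= lo
    end = bisect.bisect_right(ages, hi)    # one past last age <= hi
    return users[start:end]

print(age_between(25, 35))  # alice, bob, carol
```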
Aggregations: “Sum of all purchases by country”
- SQL: `SELECT country, SUM(amount) FROM purchases GROUP BY country`
- NoSQL: Usually requires full table scan + application aggregation
- Optimal: Columnar DB (OLAP) for billion-row tables
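A columnar layout stores each column as its own array, so this aggregation reads only the two columns it needs instead of whole rows. A toy sketch (made-up data):

```python
from collections import defaultdict

# Columnar stand-in: one array per column, not one record per row.
country_col = ["US", "DE", "US", "FR", "DE"]
amount_col = [10.0, 20.0, 5.0, 7.5, 2.5]

def sum_by_country(countries, amounts):
    totals = defaultdict(float)
    for country, amount in zip(countries, amounts):
        totals[country] += amount
    return dict(totals)

print(sum_by_country(country_col, amount_col))
# {'US': 15.0, 'DE': 22.5, 'FR': 7.5}
```

Real columnar engines add compression and vectorized execution on top, but the access-pattern win is this: untouched columns are never read from disk.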
Traversals: “Find all friends of friends of friends”
- SQL: 3+ JOINs, exponential complexity
- Graph DB: Linear in number of edges
- Optimal: Graph DB (O(edges)) vs SQL (O(n³))
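A breadth-first traversal over adjacency lists shows why graph engines stay linear in edges visited, where SQL needs one self-JOIN per hop (toy graph, hypothetical names):

```python
from collections import deque

# Graph DB stand-in: adjacency lists; n-hop traversal is O(edges visited).
friends = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann", "eve"],
    "dan": ["bob"],
    "eve": ["cat"],
}

def within_hops(start, hops):
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue  # don't expand past the hop limit
        for nbr in friends.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    seen.discard(start)
    return seen

print(sorted(within_hops("ann", 3)))  # ['bob', 'cat', 'dan', 'eve']
```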
Text Search: “Find all articles containing ‘machine learning’”
- SQL: `SELECT * FROM articles WHERE content LIKE '%machine%'` (slow full table scan)
- Search Engine: Inverted index, instant
- Optimal: Elasticsearch (O(k) where k = matching docs)
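A toy inverted index makes the O(k) claim concrete: each term maps to the set of documents containing it, so a query touches only matching docs instead of scanning every row (sample documents are made up):

```python
from collections import defaultdict

docs = {
    1: "machine learning for search",
    2: "deep learning basics",
    3: "machine shop safety",
}

# Build the inverted index: term -> set of doc IDs.
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(query):
    terms = query.split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())  # AND semantics
    return result

print(sorted(search("machine learning")))  # [1]
```

Real engines add tokenization, stemming, and BM25 scoring, but the index structure is the same idea.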
Similarity Search: “Find similar documents to query embedding”
- SQL: Compute distance to all N documents, sort (O(N log N))
- Vector DB: HNSW indexing, k-nearest neighbors (O(log N))
- Optimal: Vector DB for embeddings
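The brute-force baseline is easy to sketch; vector DBs replace this O(N) scan with approximate indexes like HNSW. A toy version with 3-dimensional embeddings (real embeddings have hundreds of dimensions; names are illustrative):

```python
import math

embeddings = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(query, k=2):
    # Exhaustive scan: score every vector, then sort. This is the O(N)
    # cost that HNSW-style indexes avoid.
    scored = sorted(embeddings, key=lambda d: cosine(query, embeddings[d]),
                    reverse=True)
    return scored[:k]

print(nearest([1.0, 0.0, 0.0]))  # ['doc_a', 'doc_b']
```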
Database Categories & Trade-offs
1. Relational (SQL)
Best for: Structured data + complex queries + transactions
| Property | Strength |
|---|---|
| ACID transactions | ✅ Full ACID (Atomicity, Consistency, Isolation, Durability) |
| Schema | Enforced at write (prevents garbage data) |
| Joins | Native, optimized by query planner |
| Scaling | Vertical (add RAM/CPU) + read replicas (read-only copies); sharding is manual and complex |
| Consistency | Strong (read your own writes) |
| Query flexibility | Arbitrary queries across tables |
Examples:
- PostgreSQL — Feature-rich, JSON support, full-text search extension
- MySQL — Web workloads, stable, wide tooling (WordPress, Laravel)
- MariaDB — MySQL fork, open-source, drop-in replacement
- Aurora (AWS) — Managed PostgreSQL/MySQL with auto-scaling read replicas
When to use:
- Complex business logic (invoicing, reservations, financial transfers)
- Multiple tables with cross-table queries
- Data integrity is critical (foreign keys, constraints)
When NOT to use:
- Need horizontal scaling (sharding with consistent hashing is complex)
- Write-heavy workload (> 50% writes) — use LSM-based DB
- Unstructured data (JSON documents are fine; but binary blobs are expensive)
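The transaction and constraint strengths above can be sketched with stdlib SQLite (table names are illustrative): a foreign key rejects an order referencing a missing user, and the failed transaction rolls back cleanly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite requires opting in
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY,"
             " user_id INTEGER REFERENCES users(id), amount REAL)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.commit()

try:
    with conn:  # transaction; rolled back automatically on error
        conn.execute("INSERT INTO orders VALUES (1, 999, 50.0)")  # no user 999
except sqlite3.IntegrityError:
    pass  # constraint violation: the bad order was rejected

with conn:  # a valid order commits normally
    conn.execute("INSERT INTO orders VALUES (1, 1, 50.0)")

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(order_count)  # 1 — only the valid order landed
```

PostgreSQL and MySQL enforce the same guarantees without the pragma, plus richer isolation levels.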
2. Document (NoSQL)
Best for: Flexible schema + hierarchical data + horizontal scaling
| Property | Benefit |
|---|---|
| Schema-less | No schema enforcement (add fields freely) |
| JSON storage | Native nested objects (no multi-table JOINs needed) |
| Horizontal scaling | Sharding built-in (data partitioned by ID) |
| Eventual consistency | Faster writes (trade immediate consistency) |
Examples:
- MongoDB — Document-oriented, aggregation pipeline, wide adoption
- Firestore (Google Cloud) — Managed Firebase DB, real-time sync
- CouchDB — Multi-master replication, offline-first
- RethinkDB — Real-time feeds, changefeeds
When to use:
- Flexible schema (fields vary per document)
- Hierarchical data (user + profile + preferences in one doc)
- Horizontal scaling needed (millions of users)
When NOT to use:
- Complex multi-document transactions needed (use relational DB)
- Heavy aggregations across entire collection (use OLAP/data warehouse)
3. Key-Value (In-Memory / Cache)
Best for: Simple lookups + extreme speed + caching
| Property | Benefit |
|---|---|
| O(1) lookup | Microsecond latency |
| In-memory | RAM speed, not disk |
| Horizontal scaling | Consistent hashing, sharding |
Examples:
- Redis — In-memory data structures, caching, sessions, leaderboards
- Memcached — Simple key-value cache
- DynamoDB (AWS) — Managed key-value, auto-scaling
When to use:
- Caching hot data (user sessions, frequently accessed records)
- Counters and real-time aggregations
- Rate limiting, leaderboards, distributed locks
When NOT to use:
- Primary data store for critical data (use with replication)
- Complex queries needed (use relational DB)
- Data larger than available RAM
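The cache-aside pattern behind "caching hot data" can be sketched in pure Python; `TTLCache` and the `db` dict are stand-ins for Redis (`SET key value EX ttl`) and the primary database, not a real client:

```python
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self.store[key]  # lazy expiration on read
            return None
        return value

    def set(self, key, value):
        self.store[key] = (value, time.monotonic() + self.ttl)

db = {"user:123": {"name": "bob"}}          # stand-in primary store
cache = TTLCache(ttl_seconds=60)

def get_user(key):
    user = cache.get(key)
    if user is None:        # cache miss: read through to the database
        user = db[key]
        cache.set(key, user)
    return user

print(get_user("user:123"))  # {'name': 'bob'}
```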
4. Wide-Column (NoSQL)
Best for: Write-heavy time-series + massive scale
| Property | Benefit |
|---|---|
| Column families | Group related data (e.g., user metrics, user metadata) |
| Sorted by row key | Range queries on row key (e.g., timestamp range) |
| Horizontal scaling | Built-in sharding, consistent hashing |
Examples:
- Cassandra — Facebook-origin, 1M+ writes/sec per cluster, eventual consistency
- HBase — Hadoop ecosystem, serious operations overhead
- Bigtable (Google) — Proprietary, powers Google Analytics, Search
When to use:
- Time-series data (metrics, logs, events)
- Write-heavy workload (> 100K writes/sec)
- Can tolerate eventual consistency
When NOT to use:
- Need strong consistency (use relational DB)
- Complex queries (use relational or OLAP)
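Consistent hashing, named in the scaling row, can be sketched in a few lines. Nodes and keys hash onto a ring; each key belongs to the first node clockwise, so adding or removing a node remaps only a fraction of keys. `HashRing` is an illustrative toy, not Cassandra's actual partitioner:

```python
import bisect
import hashlib

class HashRing:
    def __init__(self, nodes, vnodes=50):
        # Virtual nodes smooth out the key distribution across physical nodes.
        self.ring = []  # sorted (hash, node) pairs
        for node in nodes:
            for i in range(vnodes):
                self.ring.append((self._hash(f"{node}#{i}"), node))
        self.ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def node_for(self, key):
        h = self._hash(key)
        i = bisect.bisect(self.ring, (h,))  # first ring point past the key
        return self.ring[i % len(self.ring)][1]  # wrap around the ring

ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("sensor:42:2024-01-01")
print(owner in {"node-a", "node-b", "node-c"})  # True
```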
5. Search Engine
Best for: Full-text search + relevance ranking
| Property | Benefit |
|---|---|
| Inverted index | Instant full-text search |
| Relevance scoring | TF-IDF, BM25 scoring |
| Faceted search | Filter by category, tag, etc. |
Examples:
- Elasticsearch — Distributed Lucene, REST API, rich aggregations
- OpenSearch — AWS fork of Elasticsearch
- Solr — Mature Lucene wrapper
- Meilisearch — Developer-friendly, typo-tolerant search
When to use:
- Full-text search (users type keywords)
- Relevance ranking needed
- Fast filtering + sorting (e.g., e-commerce search)
When NOT to use:
- Structured queries only (use relational DB)
- Exact-match lookups only (use key-value cache)
6. Graph
Best for: Relationship traversal + complex connections
| Property | Benefit |
|---|---|
| Adjacency lists | Sub-millisecond traversal |
| Path queries | “Find all friends of friends” |
| Pattern matching | Cypher language for graph queries |
Examples:
- Neo4j — Leading graph DB, Cypher language, ACID transactions
- Amazon Neptune — Managed graph, Gremlin/SPARQL APIs
- TigerGraph — Large-scale graph analytics
When to use:
- Social networks (friend connections, recommendations)
- Fraud detection (find connected accounts)
- Knowledge graphs (entity relationships)
When NOT to use:
- No relationship queries (use relational DB)
- Graph too large for memory (distributed graphs have latency overhead)
7. Time-Series
Best for: Metrics + IoT + append-only sequential data
| Property | Benefit |
|---|---|
| Timestamp ordering | Data is naturally sorted |
| Compression | Delta encoding, dictionary → 40–50× compression |
| Auto-TTL | Automatic expiration of old data |
| High ingest | 1M+ metrics/sec per node |
Examples:
- InfluxDB — Purpose-built time-series, Flux query language
- Prometheus — Pull-based metrics scraping, local TSDB (15-day default retention; remote write for long-term storage)
- TimescaleDB — PostgreSQL extension, hypertable partitioning
- ClickHouse — Columnar, 100M+ events/sec
When to use:
- Metrics (CPU, memory, network)
- Events (user actions, system logs)
- High-volume sequential data
When NOT to use:
- Complex multi-dimensional queries (use OLAP)
- Primary relational data (use SQL)
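Delta encoding, the first trick in the compression row, is easy to sketch: timestamps arriving at a near-fixed interval shrink to tiny deltas. These are toy functions; real engines layer delta-of-delta, XOR float encoding, and bit-packing on top to reach the ratios quoted above.

```python
def delta_encode(values):
    # First value stored as-is, then each successive difference.
    deltas = [values[0]]
    for prev, cur in zip(values, values[1:]):
        deltas.append(cur - prev)
    return deltas

def delta_decode(deltas):
    values = [deltas[0]]
    for d in deltas[1:]:
        values.append(values[-1] + d)
    return values

timestamps = [1700000000, 1700000010, 1700000020, 1700000031, 1700000040]
encoded = delta_encode(timestamps)
print(encoded)  # [1700000000, 10, 10, 11, 9] — small ints compress well
assert delta_decode(encoded) == timestamps
```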
8. Vector
Best for: Semantic search + embeddings + similarity
| Property | Benefit |
|---|---|
| HNSW indexing | Fast nearest-neighbor search |
| Embedding storage | Store LLM embeddings natively |
| Semantic search | “Find documents similar to query” |
Examples:
- Pinecone — Fully managed vector DB
- pgvector (PostgreSQL extension) — Vector support in Postgres
- Weaviate — Open-source vector DB with GraphQL
When to use:
- LLM-powered search (embeddings from OpenAI, etc.)
- Semantic similarity (not keyword matching)
- Approximate nearest neighbors (k-NN)
When NOT to use:
- Structured queries only (use relational DB)
- Data is low-dimensional (< 10 dimensions, use B+ tree)
9. Data Warehouse (OLAP)
Best for: Analytics + aggregations + historical data
| Property | Benefit |
|---|---|
| Columnar storage | 50–100× compression, column-only scans |
| Massive parallelism | Query billion-row tables in seconds |
| Complex aggregations | GROUP BY, WINDOW functions |
Examples:
- BigQuery (Google) — Fully managed, petabyte scale
- Redshift (AWS) — PostgreSQL-based, columnar, massive scale
- Snowflake — Cloud-native, sharing, automatic scaling
- ClickHouse — Fast columnar, open-source
When to use:
- Analytics on historical data (last 1–5 years)
- Aggregations across millions/billions of rows
- Ad-hoc reporting and exploration
When NOT to use:
- Real-time transactional updates (use OLTP)
- Complex multi-table transactions (use relational DB)
10. NewSQL
Best for: SQL + horizontal scaling + strong consistency
| Property | Benefit |
|---|---|
| SQL interface | Familiar SQL syntax |
| Horizontal scaling | Partition data across nodes |
| Strong consistency | ACID across shards (distributed transactions) |
| High availability | Automatic failover, replicas |
Examples:
- Google Spanner — Globally distributed, ACID, TrueTime API (bounded clock uncertainty via GPS and atomic clocks)
- CockroachDB — PostgreSQL-compatible, auto-sharding
- TiDB — MySQL-compatible, distributed ACID
When to use:
- Need both SQL + horizontal scaling
- Global distribution (data centers across continents)
- Strong consistency is non-negotiable
When NOT to use:
- Single-datacenter (overhead not worth it)
- Throughput < 10K req/sec (PostgreSQL replicas cheaper)
Interview Decision Template
When asked “what database would you choose?”, use this structure:
Step 1: Identify Data Shape
Question: “Is the data structured (fixed schema) or flexible (JSON)?”
Answer: “The primary data is [structured/semi-structured/time-series/graph] because [reason].”
Example: “The primary data is structured because transactions must enforce referential integrity (orders must reference valid users, products, etc.), so I’d use a relational database.”
Step 2: Identify Access Patterns
Question: “What are the 3 most common queries?”
Answer: List top 3 access patterns + their complexity
Example:
- Pattern 1: “Get order details by order ID” — point lookup, O(1) hash index
- Pattern 2: “List orders placed between date X and Y, filtered by customer” — range + equality, O(log n) B+ tree
- Pattern 3: “Sum order totals by region” — aggregation, O(n) full table scan
Step 3: Identify Scale
Question: “What’s your throughput and data volume?”
Answer: State throughput (req/sec), latency target, and data size
Example: “1M users, 100K purchases/day (1 req/sec average, 50 req/sec peak), <100ms latency requirement. Data: 100GB after 5 years (manageable).”
Step 4: Identify Consistency Requirements
Question: “Can you tolerate eventual consistency or do you need strong consistency?”
Answer: State consistency model + reasoning
Example: “We need strong consistency because users must see their updated profile immediately after editing. We cannot tolerate a 5-minute eventual consistency delay.”
Step 5: Name 2 Options & Trade-offs
Answer: Choose 2 databases, compare pros/cons
Example:
Option 1: PostgreSQL (relational)
- ✅ ACID transactions (prevent duplicate orders)
- ✅ JOIN queries (fetch order + user + product data in one query)
- ✅ Constraints (foreign keys ensure data integrity)
- ❌ Scaling limited to read replicas (sharding is manual, complex)
- ❌ For 10M orders, single primary becomes bottleneck
Option 2: MongoDB (document)
- ✅ Horizontal scaling (shard by customer_id, distribute 10M orders across nodes)
- ✅ Flexible schema (add fields per order without migration)
- ❌ No multi-document ACID (if updating order + inventory in one transaction, must handle rollback manually)
- ❌ JOINs are application-level (fetch order, then fetch user separately)
Final recommendation: “Given the consistency requirement, I’d choose PostgreSQL with read replicas for reads (checkout, order history) and writes going to primary. If order volume exceeds 10M and becomes a bottleneck, I’d shard by customer_id and run multiple PostgreSQL primaries, each owning a customer range.”
Worked Example: E-Commerce Platform
Requirements
- 1M users
- 10M orders (5 years history)
- 50 req/sec average, 500 req/sec peak
- Read-heavy (80% reads: view orders, 20% writes: checkout)
- Cross-user analytics: “top 100 products by revenue”
Analysis
Data shape: Structured (orders, products, users have fixed schemas)
Access patterns:
- “Get my orders” — range + equality, O(log n)
- “Get product details” — point lookup, O(1)
- “Top 100 products by revenue” — aggregation, O(10M) full scan
Scale: 10M orders, 500 req/sec peak
Consistency: Strong (orders must be accurate)
Decision
OLTP (transactional): PostgreSQL
- Primary handles ~100 writes/sec at peak (checkout: 20% of 500 req/sec)
- Read replicas handle ~400 reads/sec at peak (view orders, product details)
- Queries 1 & 2 use B+ tree indexes on (customer_id, date) and (product_id)
Analytics (query 3): Separate ClickHouse or Redshift cluster
- Mirror orders table from PostgreSQL → ClickHouse nightly
- “Top 100 products by revenue” query runs in 100ms on billion-row table
- Reporting dashboard queries the warehouse, not production DB
Caching layer: Redis
- Cache hot products (top 100) — 1-hour TTL
- Cache user sessions
- Handle rate limiting per user
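The rate-limiting bullet can be sketched as a fixed-window counter, a toy stand-in for what Redis enables with `INCR` + `EXPIRE` (the `RateLimiter` class is illustrative):

```python
import time

class RateLimiter:
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counts = {}  # (user, window_index) -> request count

    def allow(self, user, now=None):
        now = time.time() if now is None else now
        key = (user, int(now // self.window))  # bucket by fixed window
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] <= self.limit

limiter = RateLimiter(limit=3, window_seconds=60)
results = [limiter.allow("user:123", now=1000.0) for _ in range(4)]
print(results)  # [True, True, True, False]
```

In Redis the counter and its expiry live server-side, so every application instance sees the same count.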
Architecture summary:
Users
↓
[Application]
├→ PostgreSQL (Orders, Products, Users)
│ ├→ Primary (writes)
│ └→ Read Replicas (reads)
│
├→ Redis (cache, sessions)
│
└→ ClickHouse (analytics, offline)
Quick Reference: Common Scenarios
| Scenario | Best DB | Why |
|---|---|---|
| User authentication | Redis (cache) + Postgres (authoritative) | Sessions in Redis (fast), users in Postgres (reliable) |
| Shopping cart | Redis (session) + Postgres (orders) | Cart is short-lived (session), orders are permanent |
| Real-time metrics | InfluxDB or TimescaleDB | Time-series optimized, auto-TTL, high compression |
| Recommendations | Graph (Neo4j) or Vector DB | Traversals (graph) or semantic (vector) |
| Product catalog | Elasticsearch + Postgres | Search (ES), updates (Postgres), denormalize catalog to ES |
| Analytics dashboard | BigQuery or Redshift | OLAP, columnar, petabyte scale |
| Logs & events | ClickHouse or ELK stack | High-volume sequential, aggregations |
| Chat messages | MongoDB or Firestore | Flexible schema, easy horizontal scaling |
| Financial transactions | Postgres (only) | ACID non-negotiable, no eventual consistency |
How Real Systems Use This — Polyglot Persistence at Scale
Airbnb — MySQL + Cassandra + Elasticsearch + Redis
Airbnb’s database architecture is a textbook example of polyglot persistence, designed to handle 1M+ listings, billions of searches, and complex reservations.
The Problem: Airbnb must support fast reservation lookups (strong consistency), full-text search over listings (relevance ranking), time-series metrics (availability calendars), and high-throughput data pipeline ingestion.
The Solution:
- MySQL (Amazon RDS): Core transactional database for reservations, bookings, user accounts, and listing metadata. Strong ACID guarantees ensure a confirmed booking can’t be double-booked. Primary-replica setup handles read replicas for scaling; writes go to primary.
- Cassandra: Distributed storage for user search history, availability calendars (time-series data per listing), and inbox messages. Write-heavy workload (millions of calendar updates/day), so Cassandra’s LSM write efficiency is ideal. Partition by listing_id; availability data is naturally sorted by date.
- Elasticsearch: Full-text search over listing descriptions, review text, and location data. Inverted index powers faceted search (“private rooms, pool, patio, near downtown”). Relevance scoring based on review count, booking frequency, and host rating.
- Redis: Session storage, rate limiting, real-time availability cache (hot listings), and leaderboards (trending listings by booking count).
Why This Mix: MySQL guarantees booking accuracy (critical), Cassandra absorbs massive write volume (calendars, messages), Elasticsearch powers user experience (search), Redis handles throughput (caching, sessions). No single database could handle all three requirements equally well.
Concrete numbers: Airbnb ingests ~100K bookings/day (consistent load on MySQL), ~1M calendar updates/day (perfect for Cassandra), ~1B search queries/month (ES cluster with 100+ nodes), and 10M+ concurrent sessions (Redis with cluster replication).
Lessons:
- Specialized tools beat generalists. Elasticsearch search is orders of magnitude faster than `LIKE '%pattern%'` in MySQL.
- Consistency tiers matter. Bookings need strong consistency; calendar availability can be eventually consistent.
- Hot path optimization. Redis caches hot listings; most users search the same popular destinations repeatedly.
Uber — MySQL → Cassandra → Schemaless KV Evolution
Uber’s database journey shows how architecture evolves as scale changes. They moved from MySQL (single database) to Cassandra (distributed writes) to a custom schemaless KV store (maximum flexibility and control).
Phase 1: MySQL (Early Stage) When Uber was smaller (2011-2013), a single MySQL instance held all data: rides, users, locations, payments. Queries were straightforward: “Get all active rides in a city,” “Fetch user by ID.” Replication handled reads. Sharding by city_id allowed basic horizontal scaling.
Problem: At 10K+ rides/day, the MySQL primary became a bottleneck. Write volume to the rides table exceeded what a single primary could handle, and resharding was manual and painful.
Phase 2: Cassandra (Growth Phase) As Uber scaled (2013-2016), they migrated ride data to Cassandra. Cassandra’s distributed architecture and LSM writes were perfect for time-series data (ride history is append-mostly). Partition by rider_id + date; querying “All rides by user in Q3” scans one partition per node instead of a full table scan.
Cassandra’s replication factor 3 gave them fault tolerance: losing one node didn’t impact availability. Write latency stayed low even at 100K rides/day (peak RPS > 1K writes/sec).
Problem: Cassandra’s fixed schema and strict column family structure forced schema migrations to be coordinated. Adding a new field to ride data required compaction and coordination across all nodes. Operational complexity grew with cluster size.
Phase 3: Schemaless KV (Maturity Phase)
By 2016+, Uber built custom infrastructure: UberData (internal name), a key-value store built on top of RocksDB (LSM) with Cassandra-style distributed architecture but JSON schema-free. Data: { ride_id: "abc123", user_id: "user456", origin: {...}, destination: {...}, driver_id: "driver789", status: "completed", metadata: { ... } }.
Advantages:
- No schema versioning. Add fields without coordination; old clients ignore new fields.
- Lower operational overhead. Owned and tuned for Uber’s exact access patterns (point lookups by ride_id dominate).
- Better compression. Schema-free JSON compresses better than fixed-column Cassandra format.
- Faster writes. RocksDB’s native LSM is faster than Cassandra’s Thrift protocol overhead.
Trade-off: Uber lost Cassandra’s battle-tested operational tooling. They now manage the custom KV store’s replication, failover, and disaster recovery themselves. But at their scale, the gains (lower latency, fewer operational surprises, full control) outweighed the engineering cost.
Concrete numbers: Uber stores trillions of ride records (compressing to petabytes with their KV store). Write throughput is >100K ops/sec across the cluster. Read latency is <10ms p95 (critical for user app showing “Your driver is 5 min away”).
Lessons:
- Evolution is natural. Start with proven databases (MySQL, Cassandra); optimize at scale.
- Schema flexibility matters. Schemaless systems let product teams iterate faster (new ride metadata, new driver attributes) without database migrations.
- Operational burden increases with customization. Building your own KV store is a last resort, justified only by extreme scale.
References
- 📄 Building Reliable Systems — Designing and Operating Systems for 10⁶ Events Per Day — System design trade-offs at scale
- 📖 PostgreSQL Documentation — Index Types — B+ tree, hash, GiST index choices
- 📖 MongoDB Database Design Best Practices — Sharding strategies, schema design for scaling
- 📖 Cassandra Data Model — Wide-column design patterns
- 🎥 ByteByteGo — Database Design — Visual comparisons of database trade-offs
- 📄 Designing Data-Intensive Applications — Martin Kleppmann (Book) — In-depth analysis of database internals and trade-offs
- 📖 AWS Database Decision Guide — Quick decision matrix for AWS services
- 📖 Google Cloud Databases — BigQuery, Firestore, Cloud SQL, Cloud Spanner comparison