System Design Framework
The 4-step method for any system design interview: Requirements, Capacity, Design, Deep Dive.
The 4-Step Flow
flowchart LR
A["1️⃣ Requirements<br/>5–7 min"] --> B["2️⃣ Capacity<br/>3–5 min"]
B --> C["3️⃣ High-Level Design<br/>15–20 min"]
C --> D["4️⃣ Deep Dive<br/>10–15 min"]
Step 1 – Requirements (5-7 min)
Functional Requirements (what the system does)
- List 3-5 core features only – resist scope creep
- Clarify: read-heavy or write-heavy? Real-time or async?
- Ask: mobile/web? Global or regional? API or UI?
Non-Functional Requirements (how well it does it)
| NFR | Typical Target | Questions to Ask |
|---|---|---|
| Availability | 99.9% - 99.99% | Tolerate downtime? Active-active? |
| Latency | P99 < 200ms | Which operations are latency-sensitive? |
| Throughput | X req/sec | Peak vs average? |
| Consistency | Strong / Eventual | Can users see stale data? |
| Durability | No data loss | RPO / RTO targets? |
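
An availability target is easier to reason about once it is converted into a downtime budget. The sketch below does that conversion for the two targets in the table; the function name `downtime_budget_minutes` and the 365-day year are assumptions made for illustration.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a 365-day year


def downtime_budget_minutes(availability_pct: float) -> float:
    """Allowed downtime per year, in minutes, for a given availability percentage."""
    unavailable_fraction = 1 - availability_pct / 100
    return unavailable_fraction * SECONDS_PER_YEAR / 60


for target in (99.9, 99.99):
    print(f"{target}% -> {downtime_budget_minutes(target):.1f} min/year")
```

Under these assumptions, 99.9% allows roughly 525.6 minutes (under 9 hours) of downtime per year, while 99.99% allows roughly 52.6 minutes. Quoting the budget in minutes often makes the "tolerate downtime?" question more concrete.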
Step 2 – Capacity Estimation (3-5 min)
flowchart TD
A["Users/DAU"] --> B["Requests/sec<br/>(read + write)"]
B --> C["Storage/year"]
B --> D["Bandwidth<br/>(MB/s)"]
C --> E["Infrastructure<br/>sketch"]
D --> E
Key numbers to derive:
- QPS = DAU x actions per user per day / 86,400 (seconds per day)
- Peak QPS = avg QPS x 2-10 (pick the multiplier based on how spiky traffic is)
- Storage/year = write QPS x record size x seconds/year
- Bandwidth = read QPS x response size
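
The formulas above can be sketched as a small calculator. This is a back-of-envelope illustration, not a sizing tool: the `Workload` fields, the default peak factor of 3, and the decimal units (1 TB = 1e12 bytes, 1 MB = 1e6 bytes) are all assumptions, and replication and indexes are ignored.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


@dataclass
class Workload:
    """Hypothetical inputs gathered during the requirements step."""
    dau: int                        # daily active users
    reads_per_user_per_day: float
    writes_per_user_per_day: float
    record_size_bytes: int          # size of one stored record
    response_size_bytes: int        # size of one read response
    peak_factor: float = 3.0        # assumed value within the 2-10 range


def estimate(w: Workload) -> dict:
    """Apply the QPS, peak, storage, and bandwidth formulas."""
    read_qps = w.dau * w.reads_per_user_per_day / SECONDS_PER_DAY
    write_qps = w.dau * w.writes_per_user_per_day / SECONDS_PER_DAY
    return {
        "read_qps": read_qps,
        "write_qps": write_qps,
        "peak_qps": (read_qps + write_qps) * w.peak_factor,
        "storage_tb_per_year": write_qps * w.record_size_bytes * SECONDS_PER_YEAR / 1e12,
        "bandwidth_mb_per_s": read_qps * w.response_size_bytes / 1e6,
    }


if __name__ == "__main__":
    example = Workload(dau=10_000_000, reads_per_user_per_day=20,
                       writes_per_user_per_day=2, record_size_bytes=1_000,
                       response_size_bytes=10_000)
    for name, value in estimate(example).items():
        print(f"{name}: {value:,.1f}")
```

For the made-up workload in the example (10M DAU, 20 reads and 2 writes per user per day), this works out to roughly 2,300 read QPS, 230 write QPS, 7.3 TB of new data per year, and about 23 MB/s of read bandwidth. In an interview, rounding to one significant figure is usually enough.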
Step 3 – High-Level Design (15-20 min)
flowchart LR
Client["Client"] --> CDN["CDN"]
CDN --> LB["Load Balancer"]
LB --> API["API Servers<br/>(stateless)"]
API --> Cache["Cache<br/>(Redis)"]
API --> MQ["Message Queue<br/>(Kafka)"]
API --> DB["Primary DB"]
DB --> Replica["Read Replicas"]
MQ --> Worker["Background Workers"]
Draw these components in order:
- Client -> entry point (CDN, API gateway)
- Stateless app servers behind load balancer
- Data stores (which DB? why?)
- Async components (queues, workers)
- Supporting services (cache, search, blob store)
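
To make the API -> Cache -> DB path concrete, here is a minimal sketch of a cache-aside read path with invalidation on write. The post does not say which caching pattern it intends, so cache-aside is an assumption. `TTLCache` is a hypothetical in-memory stand-in for Redis, and a plain dict stands in for the primary database.

```python
import time


class TTLCache:
    """Tiny in-memory stand-in for a cache such as Redis (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # entry expired, treat as a miss
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, self._clock() + self._ttl)

    def delete(self, key):
        self._data.pop(key, None)


def read_user(user_id, cache, db):
    """Cache-aside read: try the cache, fall back to the DB, then populate the cache."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached
    row = db.get(user_id)  # db is a dict standing in for the primary DB
    if row is not None:
        cache.set(user_id, row)
    return row


def update_user(user_id, row, cache, db):
    """Write to the DB first, then invalidate the cached copy."""
    db[user_id] = row
    cache.delete(user_id)
```

Deleting the cached entry on write, rather than updating it, keeps the write path simple at the cost of one extra cache miss. The TTL bounds how long a stale entry can survive if an invalidation is missed.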
Step 4 – Deep Dive (10-15 min)
Pick 2-3 areas to go deep – let the interviewer guide:
| Deep Dive Area | Key Questions |
|---|---|
| Scaling reads | Caching strategy? Cache invalidation? Read replicas? |
| Scaling writes | Sharding? Write-ahead log? CQRS? |
| Fault tolerance | What fails? Circuit breaker? Retry with backoff? |
| Consistency | Which consistency model? Trade-offs? |
| Uniqueness | How is uniqueness enforced (e.g. short URL keys)? How are IDs generated? |
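
As one illustration of the fault-tolerance row, the sketch below shows retry with exponential backoff and full jitter. The function name `retry_with_backoff`, its parameters and defaults, and the choice of which exceptions count as transient are all assumptions for illustration, not a prescribed implementation.

```python
import random
import time


def retry_with_backoff(op, max_attempts=5, base_delay=0.1, max_delay=5.0,
                       retry_on=(ConnectionError, TimeoutError),
                       sleep=time.sleep):
    """Call op() until it succeeds or max_attempts is reached.

    The delay ceiling doubles after each failure (capped at max_delay), and
    the actual wait is drawn uniformly from [0, ceiling] so that many clients
    do not retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return op()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
```

Passing `sleep` as a parameter makes the helper easy to test without real waiting. Retries are only safe when the operation is idempotent, and pairing them with a circuit breaker helps avoid hammering a dependency that is already down.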
Common Mistakes to Avoid
- Jumping to design before clarifying requirements
- Over-engineering from the start – start simple, then scale
- Ignoring failure modes – always ask “what if X fails?”
- Forgetting to justify your choices – explain the why
- Aiming for a perfect design from day one – call out the trade-offs instead
This post is licensed under CC BY 4.0 by the author.