System Design & Infrastructure — Reading Order

A structured path through the 19 posts listed below, from algorithmic building blocks through distributed systems to production reliability. Built for backend engineers, platform engineers, and anyone preparing for system design interviews.

Posted Apr 24, 2026

This is a structured path through my system design posts. It moves from fundamental data structures and algorithms, through distributed systems primitives, to reliability patterns and production operations. Each section builds on the one before it.

1. Algorithmic Building Blocks

The primitives that show up everywhere in system design — caching layers, indexing, query optimization, streaming analytics. You don’t need to implement these from scratch, but you need to know when and why to reach for them.

Sorting Algorithms — O(n log n) vs O(n+k) vs external merge sort — choosing by data shape
Hashing Algorithms — integrity, identity, distribution, authentication
Bloom Filters — probabilistic membership testing at scale
Binary Search & Variations — O(log n) on any sorted or monotonic space
Dynamic Programming Patterns — resource allocation, query optimization, capacity planning
Graph Algorithms — BFS, Dijkstra, topological sort — modeling relationships and dependencies
Sliding Window & Two Pointer — streaming analytics, rate limiting, network protocols
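
To make one of these primitives concrete, here is a minimal Bloom filter sketch. It is not taken from the linked post; the class name, the default sizes, and the choice of salted SHA-256 as the hash family are all assumptions made for illustration. It shows the core trade-off: a lookup can return a false positive, but never a false negative.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter: false positives possible, false negatives not."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch readable; a real filter would pack bits.
        self.bits = bytearray(num_bits)

    def _positions(self, item: str):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))
```

A typical use is to put the filter in front of an expensive lookup (disk, network) and skip the lookup whenever might_contain returns False.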

2. Distributed Systems Primitives

How data moves, how nodes agree, how state gets distributed. These are the concepts that separate single-machine thinking from distributed-systems thinking.

Consistent Hashing — distributing data so adding/removing nodes doesn’t reshuffle everything
Consensus — Paxos & Raft — how distributed nodes agree despite failures
Kafka Deep Dive — partitioned log architecture for messaging at scale
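
The consistent hashing claim above, that adding or removing a node doesn't reshuffle everything, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation from the linked post: the HashRing class, the node names, and the figure of 100 virtual nodes per physical node are all hypothetical.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Any stable, well-distributed hash works; SHA-256 is an arbitrary choice here.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Illustrative consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes: int = 100) -> None:
        self.vnodes = vnodes
        self._ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each physical node owns several points on the ring to smooth the load.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def get_node(self, key: str) -> str:
        # Walk clockwise to the first virtual node at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._ring, (_hash(key), "")) % len(self._ring)
        return self._ring[idx][1]
```

With this structure, adding a fourth node to a three-node ring only moves the keys that the new node takes over; every other key keeps its previous owner. A plain hash(key) % num_nodes scheme would instead remap most keys.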

3. Reliability & Traffic Patterns

The patterns that keep systems alive under real-world conditions — traffic spikes, cascading failures, downstream outages.

Load Balancing Algorithms — least connections, consistent hashing, Maglev
Rate Limiting Algorithms — token bucket, sliding window — protecting services from overload
Circuit Breaker & Bulkhead — failing fast and isolating blast radius
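
As an illustration of the token bucket approach named above, here is a minimal single-process sketch. The TokenBucket class, its parameter names, and the injectable clock are assumptions for the example, not the code from the linked post; a production limiter would also need thread safety and, usually, shared state across instances.

```python
import time


class TokenBucket:
    """Illustrative token bucket: steady refill rate with a bounded burst."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic) -> None:
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self._clock = clock       # injectable so the behavior can be tested
        self._tokens = capacity   # start with a full bucket
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self._clock()
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False  # caller should reject or delay the request
```

The capacity controls how large a burst is tolerated, while the rate sets the long-run throughput. Tuning the two independently is the main reason to pick a token bucket over a fixed window counter.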

4. System Design in Practice

Putting it all together — frameworks for reasoning about large-scale systems.

System Design Framework — the 4-step method: requirements, capacity, design, deep dive

5. Production Operations

Designing systems is half the job. Running them is the other half. These posts bridge architecture and operations.

On-Call & Incident Management — designing on-call as a system, not a people problem
Engineering Excellence & Quality — systems and culture where quality is the default
Developer Experience & Productivity — the multiplier across every engineer
Platform Engineering & Self-Service — internal platforms as products
Cloud Cost Optimization — FinOps discipline for engineering leaders

Where to Go Next

If you’re building AI systems on top of this infrastructure, continue to the AI & Agents Roadmap. If you’re leading the teams that build and operate these systems, continue to the Engineering Leadership Roadmap.

This post is licensed under CC BY 4.0 by the author.