Dynamic Programming Patterns
DP solves optimization problems by caching subproblem results -- reducing exponential time complexity to polynomial. Essential for system design: resource allocation, query optimization, cost modeling, and capacity planning.
The DP Approach
DP applies when:
- Optimal substructure: The optimal solution to a problem contains optimal solutions to subproblems
- Overlapping subproblems: The same subproblem appears multiple times in the recursion tree
Two implementation styles:
- Top-down (Memoization): Recursively solve subproblems, cache results in a dictionary/map
- Bottom-up (Tabulation): Fill a table iteratively, starting with base cases
Both reduce exponential O(2ⁿ) recursion to polynomial time, typically O(n) or O(n²).
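As a minimal illustration of the two styles, here is Fibonacci (the standard warm-up, not one of the five patterns below) solved both ways; both cache the same subproblems:

```python
from functools import lru_cache

def fib_memo(n: int) -> int:
    """Top-down: recurse, cache each subproblem result."""
    @lru_cache(maxsize=None)
    def go(k: int) -> int:
        if k < 2:
            return k
        return go(k - 1) + go(k - 2)
    return go(n)

def fib_tab(n: int) -> int:
    """Bottom-up: fill a table starting from the base cases."""
    if n < 2:
        return n
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        dp[i] = dp[i - 1] + dp[i - 2]
    return dp[n]
```

Without the cache, `fib_memo` would recompute the same subproblems exponentially many times; with it, each of the n subproblems is solved once.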
The 5 Core Patterns
Pattern 1 — 0/1 Knapsack
Recurrence:
dp[i][w] = max(dp[i-1][w], val[i] + dp[i-1][w - wt[i]])
Choose a subset of items to maximize value within a weight capacity constraint. Each item is either taken or not (0/1).
| Complexity: O(n × W) time | O(n × W) space (where W = capacity) |
System design uses:
- Resource allocation (CPU/memory budgets within fixed capacity)
- Feature flag rollout limits (maximize feature coverage within risk threshold)
- Cost-optimal job scheduling (pack jobs to fit within billing period)
- Amazon warehouse bin packing (fit items into fixed-size containers)
Pattern 2 — Longest Common Subsequence (LCS)
Recurrence:
dp[i][j] = {
dp[i-1][j-1] + 1 if str1[i-1] == str2[j-1]
max(dp[i-1][j], dp[i][j-1]) otherwise
}
Find the longest sequence appearing in both strings in order (not necessarily contiguous).
| Complexity: O(n × m) time | O(n × m) space |
System design uses:
- Diff algorithms (Git, code review tools)
- DNA sequence alignment (bioinformatics)
- Plagiarism detection (document similarity)
- Version control merge strategies
Pattern 3 — Interval DP
Solve problems over ranges of indices: merge, split, partition, or decide how to decompose an interval.
Example recurrence (Optimal Binary Search Tree):
dp[i][j] = min(dp[i][k-1] + dp[k+1][j] + cost[i..j]) for all k in [i,j]
| Complexity: O(n³) time (three nested loops) | O(n²) space |
System design uses:
- Query optimizer join ordering (Google BigQuery, PostgreSQL): decide optimal order to join n tables — minimize intermediate result size
- Matrix chain multiplication: minimize scalar multiplications when multiplying k matrices
- Optimal BST construction: organize keys by access frequency
- Burst balloon problem: optimal order to collapse ranges
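A sketch of the interval recurrence applied to matrix chain multiplication, the canonical O(n³) example (the dimensions in the test are illustrative):

```python
def matrix_chain_cost(dims):
    """
    dims[i] x dims[i+1] is the shape of matrix i; there are n = len(dims) - 1
    matrices. dp[i][j] = minimum scalar multiplications to compute the
    product of matrices i..j. Time O(n^3), space O(n^2).
    """
    n = len(dims) - 1
    dp = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):           # interval length
        for i in range(n - length + 1):
            j = i + length - 1
            dp[i][j] = min(
                # Split the interval at k: solve both halves, then multiply
                dp[i][k] + dp[k + 1][j] + dims[i] * dims[k + 1] * dims[j + 1]
                for k in range(i, j)
            )
    return dp[0][n - 1]
```

The three nested loops (interval length, start index, split point) are exactly the O(n³) structure noted above; join ordering and optimal BST construction follow the same shape with a different cost term.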
Pattern 4 — Tree DP
Compute values bottom-up on tree structures, accumulating information from children to parents.
Example: Compute subtree sums, max paths, or optimal configurations.
| Complexity: O(n) time (visit each node once) | O(n) space (recursion stack + DP table) |
System design uses:
- Computing subtree aggregates in hierarchical data (org charts, file systems)
- Optimal binary search tree construction
- Distributed system topology optimization
- Hierarchical caching policies
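A minimal sketch of the pattern, computing subtree aggregates with a post-order pass (the `Node` class here is a hypothetical stand-in for an org chart or file-system node):

```python
class Node:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []
        self.subtree_sum = 0  # filled in by the DP pass

def compute_subtree_sums(node):
    """
    Post-order tree DP: a node's answer is built from its children's
    answers, so each node is visited exactly once (O(n) total).
    """
    node.subtree_sum = node.value + sum(
        compute_subtree_sums(child) for child in node.children
    )
    return node.subtree_sum
```

The same traversal skeleton works for max root-to-leaf paths or optimal configurations; only the combine step at each node changes.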
Pattern 5 — State Machine DP
Model transitions between discrete states; compute optimal sequences of transitions.
Example: State machine for request processing:
State transitions: Idle → Active → Cooldown → Idle
dp[i][state] = optimal value (e.g., maximum reward) achievable at step i while in the given state
| Complexity: O(n × S) time (n time steps, S states) | O(n × S) space |
System design uses:
- Rate limiting state machines (tracking requests per window)
- Connection lifecycle management (TCP states, socket lifecycles)
- Order state machines (checkout → payment → fulfilled → shipped)
- Request routing state transitions (queued → dispatched → executing → done)
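A sketch of the pattern over the Idle → Active → Cooldown cycle above; the transition graph and per-state rewards below are hypothetical, chosen only to make the DP concrete:

```python
def best_reward(steps, transitions, reward, start="Idle"):
    """
    dp[state] after t steps = maximum total reward of any valid transition
    sequence of length t ending in `state`. O(steps x edges) time.
    """
    NEG = float("-inf")
    dp = {s: NEG for s in transitions}
    dp[start] = 0
    for _ in range(steps):
        nxt = {s: NEG for s in transitions}
        for state, score in dp.items():
            if score == NEG:
                continue  # state unreachable so far
            for dest in transitions[state]:
                nxt[dest] = max(nxt[dest], score + reward[dest])
        dp = nxt
    return max(dp.values())

# Hypothetical cycle with a self-loop on Idle and made-up rewards
transitions = {"Idle": ["Idle", "Active"],
               "Active": ["Cooldown"],
               "Cooldown": ["Idle"]}
reward = {"Idle": 0, "Active": 5, "Cooldown": 1}
```

Each step considers every edge once, which is where the O(n × S) bound (for S states with bounded out-degree) comes from.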
Key Properties Table
| Pattern | Time | Space | Use Case |
|---|---|---|---|
| 0/1 Knapsack | O(n×W) | O(n×W) | Resource allocation, bin packing |
| LCS | O(n×m) | O(n×m) | Diff algorithms, sequence alignment |
| Interval DP | O(n³) | O(n²) | Query optimization, matrix chain |
| Tree DP | O(n) | O(n) | Hierarchical aggregation |
| State Machine DP | O(n×S) | O(n×S) | Rate limiting, request lifecycle |
When to Use / Avoid
✅ Use DP when:
- Problem can be decomposed into overlapping subproblems
- Optimal substructure exists (optimal solution builds on optimal subproblems)
- Constraints are small enough for tabulation (n ≤ 5000 typical)
- You need guaranteed optimality (not approximate/greedy)
❌ Avoid DP when:
- Subproblems don’t overlap (simple divide-and-conquer is better)
- Greedy always works (activity selection, Huffman coding)
- Problem size is too large for table storage (terabytes of state)
- Real-time latency is critical (DP requires full table computation)
DP vs Greedy
| Dimension | Dynamic Programming | Greedy |
|---|---|---|
| Approach | Considers all subproblems, memoizes results | Takes locally optimal choice at each step |
| Correctness | Always produces optimal solution | Sometimes optimal; needs exchange argument proof |
| Complexity | Higher time/space (table storage) | Lower time, O(1) or O(n) space |
| When correct | Overlapping subproblems + optimal substructure | Greedy choice property proven (matroid theory) |
| Example optimal | 0/1 Knapsack, LCS, Interval DP | Activity selection, Huffman, MST |
| Example fails | N/A (always optimal when its preconditions hold) | 0/1 Knapsack (greedy fails; the fractional variant works) |
How Real Systems Use This
Google BigQuery & PostgreSQL — Query Optimizer (Interval DP for Join Ordering)
When a SQL query joins n tables, the optimizer must decide the order in which to execute joins. Different orderings produce dramatically different intermediate result sizes. Google BigQuery’s query optimizer uses interval dynamic programming (the Selinger et al. 1979 algorithm) to explore the space of join orderings and select the cheapest. For a 6-table join there are 6! = 720 left-deep orderings alone; the DP approach computes dp[i][j] = minimum cost to join tables i through j, reusing overlapping subproblems instead of re-deriving them. PostgreSQL uses the same algorithm: for each pair of tables (i, j), compute the cost to join them; for triples, reuse the pair results. This replaces exhaustive enumeration with DP over shared subproblems. Real-world impact: a 20-table join that would take years to optimize exhaustively is planned in milliseconds using DP.
Amazon Warehouse Fulfillment — Bin Packing (0/1 Knapsack)
Amazon fulfillment centers process 500,000+ shipments daily. Each package contains items of varying dimensions and weights; Amazon optimizes packing into fixed-size boxes to minimize waste and shipping cost. The bin packing problem (variant of 0/1 Knapsack with multiple bins) is NP-hard, but for small n (20-50 items per package), DP-based solutions achieve 94% space utilization. Amazon’s systems implement next-k-fit heuristic with DP fallback for high-value orders: for 100,000+ packages daily, DP optimization reduces shipping costs by 12-18% and warehouse space by 25-35%. The DP solution: dp[i][w] = minimum items needed to fill weight w using first i items; this guides packing decisions in real-time.
Netflix Video Encoding — Bitrate Ladder Optimization (DP for Cost-Quality Trade-off)
Netflix streams to 250+ million subscribers with diverse devices and bandwidths. The system must encode each video at multiple bitrates (480p@2Mbps, 720p@5Mbps, 1080p@8Mbps, 4K@25Mbps) to serve all users. Netflix’s “Per-Title Encode Optimization” uses DP to select the optimal set of encoding tiers per title, minimizing encoding cost while maintaining perceived quality (measured by VMAF score). The DP formulation: dp[tier][quality_level] = minimum encoding cost to achieve quality_level using first tier encoders. Netflix’s Dynamic Optimizer reduces bitrate by 30% on average compared to static ladders. For a 2-hour film, encoding cost drops from $200 (fixed ladder) to $140 (DP-optimized), multiplied across 10,000+ titles = $600,000+ annual savings.
Speech Recognition Systems — Viterbi Algorithm (State Machine DP on Hidden Markov Models)
Speech-to-text systems (used in Alexa, Google Assistant, Siri) convert acoustic signals to text using Hidden Markov Models (HMMs). The Viterbi algorithm (1967) is a state machine DP that finds the most likely sequence of phonemes/words given observed acoustic features. The algorithm models states = possible words/phonemes; transitions = probabilities from word A to word B; observations = acoustic signal frames. DP recurrence: dp[t][state] = probability of the most likely path ending in that state at time t. Modern ASR systems evaluate 10,000+ states per frame across 100+ frames; Viterbi DP reduces this from exponential O(S^T) to polynomial O(T × S²). Real impact: Viterbi enables real-time speech recognition on mobile devices (< 100ms latency for live transcription).
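The Viterbi recurrence can be sketched directly; the toy HMM below is the textbook "Healthy/Fever" example, standing in for the much larger phoneme models used in production ASR:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """
    dp[t][s] = probability of the most likely state path ending in state s
    after observing obs[0..t]. Backpointers recover the path. O(T x S^2).
    """
    dp = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        dp.append({})
        back.append({})
        for s in states:
            # Best predecessor: maximize (prob of path to r) * P(r -> s)
            prev, p = max(
                ((r, dp[t - 1][r] * trans_p[r][s]) for r in states),
                key=lambda pair: pair[1],
            )
            dp[t][s] = p * emit_p[s][obs[t]]
            back[t][s] = prev
    # Follow backpointers from the best final state
    last = max(states, key=lambda s: dp[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

# Classic toy HMM (textbook example, not a production model)
states = ("Healthy", "Fever")
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
           "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit_p = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
          "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}
```

Production systems work in log-probabilities to avoid underflow and prune low-probability states (beam search), but the DP skeleton is the same.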
Ride-Sharing Platforms — Dynamic Pricing (State Machine DP for Demand-Supply Matching)
Uber/Lyft adjust prices dynamically based on demand/supply ratio, time of day, and surge events. The pricing engine uses state machine DP: states = demand levels (low/medium/high/surge); transitions = probability of moving from one state to another; rewards = revenue per ride in each state. DP recurrence: dp[t][demand_state] = maximum expected revenue from time t to end-of-day starting in demand_state. DP solves this backward in time: compute expected value of each state at T-1, T-2, …, 1. Real-time pricing in NYC handles 50,000 simultaneous requests; DP precomputes optimal pricing decisions per state, enabling 10ms response latency. During surge, the system immediately knows whether to accept/reject rides based on current state and DP precomputation.
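The backward-in-time computation described above is finite-horizon backward induction; a simplified sketch (the state/action model below is hypothetical, far smaller than a real pricing engine):

```python
def expected_revenue(horizon, actions):
    """
    dp[s] = maximum expected revenue over the remaining horizon starting in
    state s. actions[s] = list of (immediate_reward, {next_state: prob})
    choices available in s. Solved backward: values at T, then T-1, ..., 1.
    """
    states = list(actions)
    dp = {s: 0.0 for s in states}  # value at the horizon
    for _ in range(horizon):
        dp = {
            s: max(
                reward + sum(p * dp[s2] for s2, p in dist.items())
                for reward, dist in actions[s]
            )
            for s in states
        }
    return dp
```

Because the whole table is precomputed offline, the online decision per request is a single dictionary lookup, which is how a DP-backed pricer can answer in milliseconds.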
Stripe & Payment Systems — Cost-Optimal Transaction Routing (Knapsack for Fee Minimization)
Payment processors route transactions across multiple payment methods (credit cards, ACH, wallets, international gateways) to minimize total cost. Each method has different transaction fees (Visa: 2.9%+$0.30, ACH: $0.25/fixed, wire: $0.50). For a $1000 transaction, Stripe’s routing engine uses a knapsack variant to select which methods to split the payment across, minimizing total fees paid. DP recurrence: dp[i][amount] = minimum fee to route amount using first i payment methods. Real impact: large merchants process billions in volume; optimizing routing by 0.1% = millions in annual savings passed to merchants.
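A toy version of such a routing DP, splitting an amount into fixed-size chunks across methods with percentage-plus-fixed fees (the chunking granularity and the idea of reusing a method per chunk are simplifying assumptions, not Stripe's actual engine):

```python
def min_fee_routing(amount, methods, step=100):
    """
    dp[a] = minimum total fee to route a*step dollars, where each chunk
    routed through a method (pct_fee, fixed_fee) costs dollars*pct + fixed.
    Illustrative only; real routing has many more constraints.
    """
    INF = float("inf")
    dp = [INF] * (amount // step + 1)
    dp[0] = 0.0
    for a in range(1, len(dp)):
        for pct, fixed in methods:
            for chunks in range(1, a + 1):
                dollars = chunks * step
                fee = dollars * pct + fixed
                dp[a] = min(dp[a], dp[a - chunks] + fee)
    return dp[-1]
```

With the article's example fees (card at 2.9% + $0.30, ACH at a flat $0.25), the DP correctly routes a $1000 payment entirely over ACH.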
Implementation
0/1 Knapsack — Bottom-up Tabulation
def knapsack_01(weights, values, capacity):
    """
    Solve 0/1 Knapsack using dynamic programming.

    Args:
        weights: list of item weights
        values: list of item values
        capacity: maximum weight capacity

    Returns:
        maximum value achievable without exceeding capacity

    Time: O(n * capacity)
    Space: O(capacity) (1D rolling array)
    """
    n = len(weights)
    # dp[w] = maximum value achievable with weight limit w
    dp = [0] * (capacity + 1)
    # Iterate through each item
    for i in range(n):
        # Traverse capacity in reverse to avoid using the same item twice
        for w in range(capacity, weights[i] - 1, -1):
            # Either skip item i, or take it and add its value
            dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
    return dp[capacity]


def knapsack_01_with_items(weights, values, capacity):
    """
    Extended version that also returns which items were selected.

    Time: O(n * capacity)
    Space: O(n * capacity) (full table kept for backtracking)
    """
    n = len(weights)
    # Build full 2D table for backtracking
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]
    # Fill DP table
    for i in range(1, n + 1):
        for w in range(capacity + 1):
            # Don't take item i-1
            dp[i][w] = dp[i-1][w]
            # Take item i-1 if it fits
            if weights[i-1] <= w:
                dp[i][w] = max(dp[i][w],
                               dp[i-1][w - weights[i-1]] + values[i-1])
    # Backtrack to find which items were selected
    selected = []
    w = capacity
    for i in range(n, 0, -1):
        # If the value changed, item i-1 was selected
        if dp[i][w] != dp[i-1][w]:
            selected.append(i-1)
            w -= weights[i-1]
    return dp[n][capacity], selected[::-1]


# Example usage
if __name__ == "__main__":
    weights = [2, 3, 4, 5]
    values = [3, 4, 5, 6]
    capacity = 8

    max_value = knapsack_01(weights, values, capacity)
    print(f"Maximum value: {max_value}")  # Output: 10 (items at indices 1 and 3)

    max_value, items = knapsack_01_with_items(weights, values, capacity)
    print(f"Maximum value: {max_value}, Items selected: {items}")
Longest Common Subsequence (LCS)
def lcs_length(str1, str2):
    """
    Compute the length of the longest common subsequence.

    Args:
        str1, str2: input strings

    Returns:
        length of LCS

    Time: O(n * m)
    Space: O(n * m)
    """
    n, m = len(str1), len(str2)
    # dp[i][j] = LCS length of str1[0:i] and str2[0:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if str1[i-1] == str2[j-1]:
                # Characters match: extend the previous LCS
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                # No match: take the better of excluding either character
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    return dp[n][m]


def lcs_string(str1, str2):
    """
    Compute the longest common subsequence string itself.

    Returns:
        the actual LCS string
    """
    n, m = len(str1), len(str2)
    # Build DP table
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if str1[i-1] == str2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    # Backtrack to reconstruct the LCS
    lcs = []
    i, j = n, m
    while i > 0 and j > 0:
        if str1[i-1] == str2[j-1]:
            lcs.append(str1[i-1])
            i -= 1
            j -= 1
        elif dp[i-1][j] > dp[i][j-1]:
            i -= 1
        else:
            j -= 1
    return ''.join(reversed(lcs))


# Example usage
if __name__ == "__main__":
    str1 = "ABCDE"
    str2 = "ACE"

    length = lcs_length(str1, str2)
    lcs = lcs_string(str1, str2)
    print(f"LCS length: {length}")  # Output: 3
    print(f"LCS string: {lcs}")     # Output: ACE
References
- Dynamic Programming — Richard Bellman (1957) — Foundational book that introduced the principle of optimality and coined the term “dynamic programming.”
- Access Path Selection in a Relational Database Management System — Selinger et al. (1979) — Seminal paper on cost-based query optimization using DP for join ordering in System R; used by all modern query optimizers (PostgreSQL, BigQuery, Snowflake).
- Introduction to Algorithms (CLRS) — Chapters 14-15 — Comprehensive treatment of DP with 15+ classic problems and their solutions.
- MIT OpenCourseWare — Dynamic Programming (Lecture 19) — Clear walkthrough of substructure, overlapping subproblems, and recurrence relation construction.
- ByteByteGo — Dynamic Programming Patterns — Visual explanation of the 5 core patterns with real-world system examples.
- Per-Title Encode Optimization — Netflix TechBlog (2015) — Production case study of DP applied to video encoding bitrate selection.
- The Bin Packing Problem — OR-Tools Documentation — Google’s open-source solver with DP and approximation algorithms for warehouse optimization.
- Viterbi Algorithm — Wikipedia — Clear explanation with HMM context and pseudocode; used in all speech recognition systems.
- Wikipedia: Dynamic Programming — Quick reference with classic problem gallery and complexity comparison.