Skip List
A layered linked list with express lanes -- O(log n) average search/insert using randomization, without the complexity of balancing a tree.
A layered linked list with express lanes — O(log n) average search/insert using randomization, without the complexity of balancing a tree.
Structure
Search for 25: Level 3: HEAD → 20 → (50 too big, drop down) → Level 2: 20 → 30 (too big, drop) → Level 1: 20 → 25 ✅
How Levels Are Assigned
Each new element flips a coin — promotes to next level with 50% probability. Expected height = O(log n).
Key Properties
| Property | Average | Worst Case |
|---|---|---|
| Search | O(log n) | O(n) |
| Insert | O(log n) | O(n) |
| Delete | O(log n) | O(n) |
| Space | O(n log n) | O(n²) |
| Range query | O(log n + k) | — |
vs Balanced BST (Red-Black Tree)
| Skip List | Red-Black Tree | |
|---|---|---|
| Implementation | Simpler | Complex rotations |
| Concurrent access | Easier (lock-free variants) | Harder |
| Cache efficiency | 🔶 Pointer chasing | 🔶 Similar |
| Deterministic | No (probabilistic) | Yes |
When to Use ✅ / Avoid ❌
✅ In-memory sorted data with concurrent access ✅ Range queries on sorted keys ✅ When you want simpler code than a balanced BST ❌ Disk-based storage (poor cache locality — use B-Tree) ❌ Memory is critical (overhead of multiple levels)
Real Systems
| System | How Used |
|——–|———-|
| Redis | Sorted Sets (ZADD, ZRANGE) |
| LevelDB | MemTable implementation |
| CockroachDB | In-memory sorted storage |
| Lucene | Term index structures |
Implementation
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
import random
class SkipListNode:
"""
A single node in a skip list.
- key: the search key
- value: the associated value
- forward: list of pointers to next nodes at each level
"""
def __init__(self, key, value, level):
self.key = key
self.value = value
# forward[i] points to next node at level i
self.forward = [None] * (level + 1)
class SkipList:
"""
Skip list: a probabilistic data structure for O(log n) sorted searches.
"""
def __init__(self, max_level=16, p=0.5):
"""
Initialize skip list.
- max_level: maximum allowed level (16 ≈ 2^16 = 65k nodes)
- p: probability of promoting node to next level (typical: 0.5 or 0.25)
"""
self.max_level = max_level
self.p = p
# Sentinel head node: links start from here
self.head = SkipListNode(None, None, max_level)
self.level = 0 # Current height of skip list
def _random_level(self):
"""
Randomly generate a level for a new node.
Each node has probability p of being promoted to next level.
Expected level = O(log n).
"""
level = 0
while random.random() < self.p and level < self.max_level:
level += 1
return level
def search(self, key):
"""
Search for a key in the skip list.
Time: O(log n) average, O(n) worst case
Returns: value if found, None otherwise
"""
# Start from head at the topmost level
node = self.head
# Scan down through levels (from top to bottom)
for i in range(self.level, -1, -1):
# Scan right at level i until we find a key >= our target
while node.forward[i] is not None and node.forward[i].key < key:
node = node.forward[i]
# If we find exact match at any point, remember it
# Drop to level 0 and check if next node is our target
node = node.forward[0]
if node is not None and node.key == key:
return node.value
return None
def insert(self, key, value):
"""
Insert a key-value pair into the skip list.
Time: O(log n) average, O(n) worst case
"""
# Update array stores predecessor at each level (needed for linking)
update = [None] * (self.max_level + 1)
node = self.head
# Find predecessors at each level (same scan as search)
for i in range(self.level, -1, -1):
while node.forward[i] is not None and node.forward[i].key < key:
node = node.forward[i]
update[i] = node # Record predecessor at level i
# Move to potential position at level 0
node = node.forward[0]
# If key already exists, update value and return
if node is not None and node.key == key:
node.value = value
return
# Create new node with random level
new_level = self._random_level()
# If new level exceeds current list height, update links at new levels
if new_level > self.level:
for i in range(self.level + 1, new_level + 1):
update[i] = self.head # Predecessor at new levels is head
self.level = new_level # Update overall list height
# Create new node
new_node = SkipListNode(key, value, new_level)
# Wire the new node into the skip list at each level
for i in range(new_level + 1):
# new_node.forward[i] points to where update[i] was pointing
new_node.forward[i] = update[i].forward[i]
# update[i] now points to new_node
update[i].forward[i] = new_node
def delete(self, key):
"""
Delete a key from the skip list.
Time: O(log n) average, O(n) worst case
"""
update = [None] * (self.max_level + 1)
node = self.head
# Find predecessors at each level
for i in range(self.level, -1, -1):
while node.forward[i] is not None and node.forward[i].key < key:
node = node.forward[i]
update[i] = node
# Move to potential target node
node = node.forward[0]
# Key not found
if node is None or node.key != key:
return
# Splice out the node at each level
for i in range(self.level + 1):
# If update[i].forward[i] is the target, bypass it
if update[i].forward[i] is node:
update[i].forward[i] = node.forward[i]
# Lower the list height if top level(s) become empty
while self.level > 0 and self.head.forward[self.level] is None:
self.level -= 1
# Example usage:
if __name__ == "__main__":
skip_list = SkipList()
skip_list.insert(3, "three")
skip_list.insert(7, "seven")
skip_list.insert(12, "twelve")
skip_list.insert(17, "seventeen")
skip_list.insert(23, "twenty-three")
print(skip_list.search(17)) # "seventeen"
print(skip_list.search(19)) # None
skip_list.delete(7)
print(skip_list.search(7)) # None
Algorithm Flow
SEARCH for key=17 in a 3-level skip list:
INSERT key=19, random_level=2:
Redis Sorted Sets (ZADD/ZRANGE/ZRANGEBYSCORE)
Redis implements sorted sets as a dual structure: a skip list for ordered iteration and range queries plus a hash map for O(1) score lookup by member name. The skip list has max 64 levels and uses a per-node random level with p=0.25 (not 0.5) to reduce average level and memory overhead. ZRANGEBYSCORE is O(log N + M) where M = result count.
LevelDB / RocksDB MemTable
The in-memory write buffer before SSTable flush is a skip list. Chosen over red-black tree for two reasons: (1) simpler lock-free concurrent reads during flush, (2) comparable O(log n) performance with less implementation complexity. RocksDB supports pluggable MemTable implementations and also offers hash skip list for point lookups only.
Apache Lucene Posting Lists
Skip lists (with skip pointers every √n entries) are embedded in posting lists to accelerate AND queries. When intersecting two posting lists of size 1M and 100, skip pointers let the smaller list jump over large gaps in the bigger list instead of scanning every entry, turning O(n) into O(log n) hops.
CockroachDB MVCC Layer
CockroachDB’s Pebble storage engine (RocksDB fork) uses a skip list for its in-memory write buffer, storing (key, timestamp) tuples to support MVCC reads at any snapshot timestamp. This enables consistent read isolation without locking.
References
- 📄 Pugh 1990 — Skip Lists: A Probabilistic Alternative to Balanced Trees — the original paper by William Pugh
- 🎥 MIT OCW 6.006 — Skip Lists Lecture — rigorous probabilistic analysis
- 🎥 ByteByteGo — Skip List Explained — visual walkthrough
- 📖 Wikipedia: Skip List — variants, complexity proofs
- 🔗 Redis Skip List Source Code — see real implementation with p=0.25 and 64 levels