Post

Skip List

A layered linked list with express lanes -- O(log n) average search/insert using randomization, without the complexity of balancing a tree.

Skip List

A layered linked list with express lanes — O(log n) average search/insert using randomization, without the complexity of balancing a tree.

Structure

Search for 25: Level 3: HEAD → 20 → (50 too big, drop down) → Level 2: 20 → 30 (too big, drop) → Level 1: 20 → 25 ✅

How Levels Are Assigned

Each new element flips a coin — promotes to next level with 50% probability. Expected height = O(log n).

Key Properties

Property Average Worst Case
Search O(log n) O(n)
Insert O(log n) O(n)
Delete O(log n) O(n)
Space O(n log n) O(n²)
Range query O(log n + k)

vs Balanced BST (Red-Black Tree)

  Skip List Red-Black Tree
Implementation Simpler Complex rotations
Concurrent access Easier (lock-free variants) Harder
Cache efficiency 🔶 Pointer chasing 🔶 Similar
Deterministic No (probabilistic) Yes

When to Use ✅ / Avoid ❌

✅ In-memory sorted data with concurrent access ✅ Range queries on sorted keys ✅ When you want simpler code than a balanced BST ❌ Disk-based storage (poor cache locality — use B-Tree) ❌ Memory is critical (overhead of multiple levels)

Real Systems

| System | How Used | |——–|———-| | Redis | Sorted Sets (ZADD, ZRANGE) | | LevelDB | MemTable implementation | | CockroachDB | In-memory sorted storage | | Lucene | Term index structures |

Implementation

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
import random

class SkipListNode:
    """
    A single node in a skip list.
    - key: the search key
    - value: the associated value
    - forward: list of pointers to next nodes at each level
    """
    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        # forward[i] points to next node at level i
        self.forward = [None] * (level + 1)

class SkipList:
    """
    Skip list: a probabilistic data structure for O(log n) sorted searches.
    """

    def __init__(self, max_level=16, p=0.5):
        """
        Initialize skip list.
        - max_level: maximum allowed level (16 ≈ 2^16 = 65k nodes)
        - p: probability of promoting node to next level (typical: 0.5 or 0.25)
        """
        self.max_level = max_level
        self.p = p
        # Sentinel head node: links start from here
        self.head = SkipListNode(None, None, max_level)
        self.level = 0  # Current height of skip list

    def _random_level(self):
        """
        Randomly generate a level for a new node.
        Each node has probability p of being promoted to next level.
        Expected level = O(log n).
        """
        level = 0
        while random.random() < self.p and level < self.max_level:
            level += 1
        return level

    def search(self, key):
        """
        Search for a key in the skip list.
        Time: O(log n) average, O(n) worst case
        Returns: value if found, None otherwise
        """
        # Start from head at the topmost level
        node = self.head

        # Scan down through levels (from top to bottom)
        for i in range(self.level, -1, -1):
            # Scan right at level i until we find a key >= our target
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            # If we find exact match at any point, remember it

        # Drop to level 0 and check if next node is our target
        node = node.forward[0]
        if node is not None and node.key == key:
            return node.value
        return None

    def insert(self, key, value):
        """
        Insert a key-value pair into the skip list.
        Time: O(log n) average, O(n) worst case
        """
        # Update array stores predecessor at each level (needed for linking)
        update = [None] * (self.max_level + 1)
        node = self.head

        # Find predecessors at each level (same scan as search)
        for i in range(self.level, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node  # Record predecessor at level i

        # Move to potential position at level 0
        node = node.forward[0]

        # If key already exists, update value and return
        if node is not None and node.key == key:
            node.value = value
            return

        # Create new node with random level
        new_level = self._random_level()

        # If new level exceeds current list height, update links at new levels
        if new_level > self.level:
            for i in range(self.level + 1, new_level + 1):
                update[i] = self.head  # Predecessor at new levels is head
            self.level = new_level  # Update overall list height

        # Create new node
        new_node = SkipListNode(key, value, new_level)

        # Wire the new node into the skip list at each level
        for i in range(new_level + 1):
            # new_node.forward[i] points to where update[i] was pointing
            new_node.forward[i] = update[i].forward[i]
            # update[i] now points to new_node
            update[i].forward[i] = new_node

    def delete(self, key):
        """
        Delete a key from the skip list.
        Time: O(log n) average, O(n) worst case
        """
        update = [None] * (self.max_level + 1)
        node = self.head

        # Find predecessors at each level
        for i in range(self.level, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node

        # Move to potential target node
        node = node.forward[0]

        # Key not found
        if node is None or node.key != key:
            return

        # Splice out the node at each level
        for i in range(self.level + 1):
            # If update[i].forward[i] is the target, bypass it
            if update[i].forward[i] is node:
                update[i].forward[i] = node.forward[i]

        # Lower the list height if top level(s) become empty
        while self.level > 0 and self.head.forward[self.level] is None:
            self.level -= 1

# Example usage:
if __name__ == "__main__":
    skip_list = SkipList()
    skip_list.insert(3, "three")
    skip_list.insert(7, "seven")
    skip_list.insert(12, "twelve")
    skip_list.insert(17, "seventeen")
    skip_list.insert(23, "twenty-three")

    print(skip_list.search(17))  # "seventeen"
    print(skip_list.search(19))  # None
    skip_list.delete(7)
    print(skip_list.search(7))   # None

Algorithm Flow

SEARCH for key=17 in a 3-level skip list:

INSERT key=19, random_level=2:

Redis Sorted Sets (ZADD/ZRANGE/ZRANGEBYSCORE)

Redis implements sorted sets as a dual structure: a skip list for ordered iteration and range queries plus a hash map for O(1) score lookup by member name. The skip list has max 64 levels and uses a per-node random level with p=0.25 (not 0.5) to reduce average level and memory overhead. ZRANGEBYSCORE is O(log N + M) where M = result count.

LevelDB / RocksDB MemTable

The in-memory write buffer before SSTable flush is a skip list. Chosen over red-black tree for two reasons: (1) simpler lock-free concurrent reads during flush, (2) comparable O(log n) performance with less implementation complexity. RocksDB supports pluggable MemTable implementations and also offers hash skip list for point lookups only.

Apache Lucene Posting Lists

Skip lists (with skip pointers every √n entries) are embedded in posting lists to accelerate AND queries. When intersecting two posting lists of size 1M and 100, skip pointers let the smaller list jump over large gaps in the bigger list instead of scanning every entry, turning O(n) into O(log n) hops.

CockroachDB MVCC Layer

CockroachDB’s Pebble storage engine (RocksDB fork) uses a skip list for its in-memory write buffer, storing (key, timestamp) tuples to support MVCC reads at any snapshot timestamp. This enables consistent read isolation without locking.

References

This post is licensed under CC BY 4.0 by the author.