Skip List

A layered linked list with express lanes -- O(log n) average search/insert using randomization, without the complexity of balancing a tree.

Posted Dec 1, 2023

7 min read

Skip List

A layered linked list with express lanes — O(log n) average search/insert using randomization, without the complexity of balancing a tree.

Structure

Search for 25: Level 3: HEAD → 20 → (50 too big, drop down) → Level 2: 20 → 30 (too big, drop) → Level 1: 20 → 25 ✅

How Levels Are Assigned

Each new element flips a coin — promotes to next level with 50% probability. Expected height = O(log n).

Key Properties

Property	Average	Worst Case
Search	O(log n)	O(n)
Insert	O(log n)	O(n)
Delete	O(log n)	O(n)
Space	O(n log n)	O(n²)
Range query	O(log n + k)	—

vs Balanced BST (Red-Black Tree)

	Skip List	Red-Black Tree
Implementation	Simpler	Complex rotations
Concurrent access	Easier (lock-free variants)	Harder
Cache efficiency	🔶 Pointer chasing	🔶 Similar
Deterministic	No (probabilistic)	Yes

When to Use ✅ / Avoid ❌

✅ In-memory sorted data with concurrent access ✅ Range queries on sorted keys ✅ When you want simpler code than a balanced BST ❌ Disk-based storage (poor cache locality — use B-Tree) ❌ Memory is critical (overhead of multiple levels)

Real Systems

Implementation

        
      
import random

class SkipListNode:
    """
    A single node in a skip list.
    - key: the search key
    - value: the associated value
    - forward: list of pointers to next nodes at each level
    """
    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        # forward[i] points to next node at level i
        self.forward = [None] * (level + 1)

class SkipList:
    """
    Skip list: a probabilistic data structure for O(log n) sorted searches.
    """

    def __init__(self, max_level=16, p=0.5):
        """
        Initialize skip list.
        - max_level: maximum allowed level (16 ≈ 2^16 = 65k nodes)
        - p: probability of promoting node to next level (typical: 0.5 or 0.25)
        """
        self.max_level = max_level
        self.p = p
        # Sentinel head node: links start from here
        self.head = SkipListNode(None, None, max_level)
        self.level = 0  # Current height of skip list

    def _random_level(self):
        """
        Randomly generate a level for a new node.
        Each node has probability p of being promoted to next level.
        Expected level = O(log n).
        """
        level = 0
        while random.random() < self.p and level < self.max_level:
            level += 1
        return level

    def search(self, key):
        """
        Search for a key in the skip list.
        Time: O(log n) average, O(n) worst case
        Returns: value if found, None otherwise
        """
        # Start from head at the topmost level
        node = self.head

        # Scan down through levels (from top to bottom)
        for i in range(self.level, -1, -1):
            # Scan right at level i until we find a key >= our target
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            # If we find exact match at any point, remember it

        # Drop to level 0 and check if next node is our target
        node = node.forward[0]
        if node is not None and node.key == key:
            return node.value
        return None

    def insert(self, key, value):
        """
        Insert a key-value pair into the skip list.
        Time: O(log n) average, O(n) worst case
        """
        # Update array stores predecessor at each level (needed for linking)
        update = [None] * (self.max_level + 1)
        node = self.head

        # Find predecessors at each level (same scan as search)
        for i in range(self.level, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node  # Record predecessor at level i

        # Move to potential position at level 0
        node = node.forward[0]

        # If key already exists, update value and return
        if node is not None and node.key == key:
            node.value = value
            return

        # Create new node with random level
        new_level = self._random_level()

        # If new level exceeds current list height, update links at new levels
        if new_level > self.level:
            for i in range(self.level + 1, new_level + 1):
                update[i] = self.head  # Predecessor at new levels is head
            self.level = new_level  # Update overall list height

        # Create new node
        new_node = SkipListNode(key, value, new_level)

        # Wire the new node into the skip list at each level
        for i in range(new_level + 1):
            # new_node.forward[i] points to where update[i] was pointing
            new_node.forward[i] = update[i].forward[i]
            # update[i] now points to new_node
            update[i].forward[i] = new_node

    def delete(self, key):
        """
        Delete a key from the skip list.
        Time: O(log n) average, O(n) worst case
        """
        update = [None] * (self.max_level + 1)
        node = self.head

        # Find predecessors at each level
        for i in range(self.level, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node

        # Move to potential target node
        node = node.forward[0]

        # Key not found
        if node is None or node.key != key:
            return

        # Splice out the node at each level
        for i in range(self.level + 1):
            # If update[i].forward[i] is the target, bypass it
            if update[i].forward[i] is node:
                update[i].forward[i] = node.forward[i]

        # Lower the list height if top level(s) become empty
        while self.level > 0 and self.head.forward[self.level] is None:
            self.level -= 1

# Example usage:
if __name__ == "__main__":
    skip_list = SkipList()
    skip_list.insert(3, "three")
    skip_list.insert(7, "seven")
    skip_list.insert(12, "twelve")
    skip_list.insert(17, "seventeen")
    skip_list.insert(23, "twenty-three")

    print(skip_list.search(17))  # "seventeen"
    print(skip_list.search(19))  # None
    skip_list.delete(7)
    print(skip_list.search(7))   # None

Algorithm Flow

SEARCH for key=17 in a 3-level skip list:

INSERT key=19, random_level=2:

Redis Sorted Sets (ZADD/ZRANGE/ZRANGEBYSCORE)

Redis implements sorted sets as a dual structure: a skip list for ordered iteration and range queries plus a hash map for O(1) score lookup by member name. The skip list has max 64 levels and uses a per-node random level with p=0.25 (not 0.5) to reduce average level and memory overhead. ZRANGEBYSCORE is O(log N + M) where M = result count.

LevelDB / RocksDB MemTable

The in-memory write buffer before SSTable flush is a skip list. Chosen over red-black tree for two reasons: (1) simpler lock-free concurrent reads during flush, (2) comparable O(log n) performance with less implementation complexity. RocksDB supports pluggable MemTable implementations and also offers hash skip list for point lookups only.

Apache Lucene Posting Lists

Skip lists (with skip pointers every √n entries) are embedded in posting lists to accelerate AND queries. When intersecting two posting lists of size 1M and 100, skip pointers let the smaller list jump over large gaps in the bigger list instead of scanning every entry, turning O(n) into O(log n) hops.

CockroachDB MVCC Layer

CockroachDB’s Pebble storage engine (RocksDB fork) uses a skip list for its in-memory write buffer, storing (key, timestamp) tuples to support MVCC reads at any snapshot timestamp. This enables consistent read isolation without locking.

References

📄 Pugh 1990 — Skip Lists: A Probabilistic Alternative to Balanced Trees — the original paper by William Pugh
🎥 MIT OCW 6.006 — Skip Lists Lecture — rigorous probabilistic analysis
🎥 ByteByteGo — Skip List Explained — visual walkthrough
📖 Wikipedia: Skip List — variants, complexity proofs
🔗 Redis Skip List Source Code — see real implementation with p=0.25 and 64 levels

Software Architecture, Data Structures

algorithms

This post is licensed under CC BY 4.0 by the author.