Hash Table Collision Resolution: Chaining vs Open Addressing

Understand how hash tables handle collisions with separate chaining and open addressing. Compare linear probing, quadratic probing, and double hashing strategies with examples.

Hash Table

Detailed Explanation

Hash Table Collision Resolution

A hash collision occurs when two different keys produce the same hash value (map to the same bucket). Since the number of possible keys is typically much larger than the number of buckets, collisions are inevitable. How they are handled determines the hash table's performance.

Separate Chaining

Each bucket stores a linked list (or other collection) of all entries that hash to that bucket:

Bucket 0: -> ("apple", 5) -> ("grape", 8)
Bucket 1: -> ("banana", 3)
Bucket 2: (empty)
Bucket 3: -> ("cherry", 7) -> ("date", 2) -> ("fig", 9)

Advantages:

  • Simple to implement
  • Graceful degradation: performance degrades linearly with load factor
  • Deletion is straightforward (just remove from the list)
  • Can store more elements than the number of buckets

Disadvantages:

  • Extra memory for pointers
  • Poor cache performance (pointer chasing through linked lists)

Open Addressing

All entries are stored directly in the bucket array. When a collision occurs, the algorithm probes for the next available slot.

Linear probing: Check bucket h, then h+1, h+2, ...

Insert "apple" -> hash=3 -> bucket 3 (empty, placed)
Insert "grape" -> hash=3 -> bucket 3 (occupied)
                         -> bucket 4 (empty, placed)

Quadratic probing: Check bucket h, then h+1, h+4, h+9, ...

This reduces primary clustering where consecutive occupied slots form long runs.

Double hashing: Use a second hash function to determine the probe step:

probe(i) = (h1(key) + i * h2(key)) % size

This provides the best distribution but is more complex to implement.

Performance Comparison

Metric Chaining Linear Probing Double Hashing
Average search (low load) O(1) O(1) O(1)
Average search (high load) O(1 + n/m) Degrades faster Better than linear
Worst case O(n) O(n) O(n)
Cache performance Poor Excellent Good
Deletion Easy Requires tombstones Requires tombstones

Tombstone Deletion

In open addressing, you cannot simply empty a slot on deletion — it would break the probe sequence for other keys. Instead, deleted slots are marked as tombstones: treated as occupied during search but available for insertion.

Use Case

Every hash table implementation must choose a collision resolution strategy. Language standard libraries vary: Java's HashMap uses chaining (with tree upgrade for long chains), Python's dict uses open addressing with pseudo-random probing, and Go's map uses a variant of chaining with buckets. Understanding these strategies is crucial for system design and choosing the right data structure.

Try It — Data Structure Visualizer

Open full tool