Hash Table Fundamentals: How Hashing Works

Learn how hash tables map keys to values using hash functions. Understand hash function properties, bucket arrays, and why hash tables provide O(1) average-case operations.

Hash Table

Detailed Explanation

Hash Table Fundamentals

A hash table (or hash map) is a data structure that maps keys to values using a hash function. It provides O(1) average-case time for insert, delete, and lookup operations.

How It Works

  1. A hash function converts the key into an integer (hash code).
  2. The hash code is mapped to a bucket index: index = hashCode % numBuckets.
  3. The key-value pair is stored in that bucket.
  4. On lookup, the same process is repeated to find the bucket, then the key is compared to find the exact entry.
hash("apple") = 2048392 % 8 = 0  -> bucket 0
hash("banana") = 9571032 % 8 = 0 -> bucket 0 (collision!)
hash("cherry") = 3847201 % 8 = 1 -> bucket 1

Hash Function Properties

A good hash function should:

  • Deterministic: Same key always produces the same hash.
  • Uniform distribution: Keys are spread evenly across buckets.
  • Fast: Computing the hash should be O(1) for fixed-size keys.
  • Avalanche effect: Small changes in the key produce very different hashes.

Common Hash Functions

Type Used For Example
Division General purpose h(k) = k % m
Multiplication General purpose h(k) = floor(m * (k * A % 1))
MurmurHash Hash tables Fast non-cryptographic hash
FNV-1a String hashing Simple and effective
SipHash Hash DoS protection Used in Python, Rust

String Hashing

For string keys, a common approach is polynomial hashing:

hash("abc") = a * 31^2 + b * 31^1 + c * 31^0
            = 97 * 961 + 98 * 31 + 99 * 1
            = 93217 + 3038 + 99
            = 96354

Java's String.hashCode() uses exactly this formula with base 31.

Hash Table vs Other Structures

Operation Hash Table Sorted Array BST
Search O(1) avg O(log n) O(log n)
Insert O(1) avg O(n) O(log n)
Delete O(1) avg O(n) O(log n)
Ordered iteration O(n log n) O(n) O(n)
Min/Max O(n) O(1) O(log n)

Worst Case

In the worst case (all keys hash to the same bucket), operations degrade to O(n). This is why hash function quality and collision resolution strategy matter. Modern hash tables use randomized hash seeds to prevent adversarial inputs (hash DoS attacks).

Use Case

Hash tables are the most widely used data structure in software engineering. They implement dictionaries, sets, caches, database indexes, and symbol tables in compilers. Every mainstream programming language provides a hash table implementation: Python dict, Java HashMap, JavaScript Map/Object, Go map, Rust HashMap, C++ unordered_map. Understanding how they work is essential for every developer.

Try It — Data Structure Visualizer

Open full tool