UUIDs as Database Primary Keys
Best practices for using UUIDs as database primary keys: storage types, indexing strategies, version selection, and performance optimization techniques.
Detailed Explanation
Using UUIDs as primary keys is increasingly common in modern applications, but doing it well requires understanding the storage, indexing, and performance implications across different database engines.
Storage options by database:
| Database | Native Type | Size | Notes |
|---|---|---|---|
| PostgreSQL | uuid |
16 bytes | Native binary storage, fully supported |
| MySQL 8+ | BINARY(16) |
16 bytes | No native UUID type; use binary |
| MySQL | CHAR(36) |
36 bytes | Common but wasteful; avoid if possible |
| SQLite | BLOB(16) |
16 bytes | Or TEXT(36) for readability |
| MongoDB | Embedded in BSON | 16 bytes | Use Binary subtype 4 |
The golden rule: store as binary, display as string. Storing UUIDs as CHAR(36) wastes 20 bytes per row per UUID column compared to binary storage. On a table with 100 million rows and 3 UUID columns, this wastes approximately 5.6 GB.
Version selection for primary keys:
- UUID v7 is the best choice for new applications. Its time-ordered nature provides B-tree-friendly sequential inserts while maintaining distributed uniqueness.
- UUID v4 works but causes random index scatter. Acceptable for tables under ~10 million rows or when insert throughput is not critical.
- UUID v1 is time-ordered but leaks MAC addresses. Prefer v7 instead.
PostgreSQL-specific optimization:
-- Use the native uuid type (stored as 16 bytes)
CREATE TABLE users (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
email TEXT NOT NULL
);
-- For UUID v7, use a custom function or extension
CREATE EXTENSION IF NOT EXISTS pgcrypto;
MySQL optimization with ordered UUIDs:
-- Store as BINARY(16), convert for display
CREATE TABLE users (
id BINARY(16) PRIMARY KEY,
email VARCHAR(255) NOT NULL
);
-- MySQL 8.0+ has UUID_TO_BIN with swap flag for v1 time reordering
INSERT INTO users (id, email) VALUES (UUID_TO_BIN(UUID(), true), 'user@example.com');
Indexing strategies:
- Clustered index on UUID v7: Safe and performant because inserts are approximately sequential.
- Clustered index on UUID v4: Avoid. Random inserts cause page splits and fragmentation.
- Secondary UUID index + auto-increment clustered: Best of both worlds for MySQL/InnoDB where the clustered index determines physical row order.
Join performance: UUID joins are slower than integer joins because comparing 16 bytes is slower than comparing 4 or 8 bytes. For tables with heavy join workloads, consider the hybrid approach: integer PK for internal joins, UUID for external references.
Use Case
UUIDs as primary keys are essential in multi-region database deployments where each region writes independently and data must merge without conflicts during replication.