Seed Data with Unique Constraint Columns

Learn how the seed generator handles UNIQUE columns like email and username. Understand collision probability and strategies for large datasets.

Advanced

Detailed Explanation

Unique Constraint Considerations

Columns with UNIQUE constraints require every value to be distinct. The seed generator draws values from large pools, which makes collisions unlikely for most column types, but it does not enforce uniqueness, so duplicates remain possible.

How Unique Columns Are Generated

The generator does not have a special "unique mode." Instead, it relies on the large combinatorial space of its data pools:

Column Type | Combinatorial Space
Email       | 40 first names × 40 last names × 99 numbers × 10 domains = 1,584,000
Username    | 40 names × 999 numbers = 39,960
UUID        | 16^32 ≈ 3.4 × 10^38
Phone       | ~800 area codes × 900 mid × 9000 end > 6 billion

Collision Probability

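Treating each value as a uniform, independent draw from a space of N combinations, the chance of at least one duplicate among n rows is approximately 1 − exp(−n(n − 1) / 2N), the birthday-problem estimate. Taking the factor counts above at face value (N = 40 × 40 × 99 × 10 = 1,584,000 for emails and N = 39,960 for usernames), collisions are negligible only for small seeds or large spaces:

  • 10 rows: about 0.003% for emails, about 0.1% for usernames
  • 100 rows: about 0.3% for emails, about 12% for usernames
  • 1,000 rows: about 27% for emails; a username collision is all but certain

Phone and UUID columns stay far below 0.01% at all of these sizes.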
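
Estimates of this kind can be checked with the birthday-problem approximation. The sketch below is not the generator's own code: it assumes values are drawn uniformly and independently, and it takes the pool sizes from the factor counts in the table above. The names collision_probability and SPACES are illustrative only.

```python
import math


def collision_probability(rows: int, space: int) -> float:
    """Approximate chance that at least two of `rows` uniform draws
    from `space` equally likely values are identical."""
    if rows > space:
        return 1.0  # pigeonhole: more rows than distinct values
    # Birthday approximation: 1 - exp(-n(n-1) / 2N).
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-rows * (rows - 1) / (2 * space))


# Pool sizes computed from the factor counts listed in the table (assumed accurate).
SPACES = {
    "email": 40 * 40 * 99 * 10,
    "username": 40 * 999,
    "phone": 800 * 900 * 9000,
}

for column, space in SPACES.items():
    for rows in (10, 100, 500, 1000):
        p = collision_probability(rows, space)
        print(f"{column:<8} {rows:>5} rows: {p:.4%}")
```

Because the probability grows with the square of the row count, the column with the smallest space (usernames here) is the one to check first when scaling up.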

What If Collisions Occur?

If you run the generated INSERT statements and encounter a unique constraint violation:

  1. Regenerate: Click Regenerate to get a new random seed, which produces entirely different data
  2. Reduce rows: Lower the row count to decrease collision probability
  3. Drop the constraint temporarily: Remove the UNIQUE constraint from the CREATE TABLE input, generate the data, then add the constraint back

UUID Columns

For UUID-type columns (or columns named uuid), the generator produces RFC 4122-format UUIDs. The space is so vast that the chance of a collision is negligible at any practical row count.
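
For reference, the sketch below shows what RFC 4122-format values look like, using Python's standard uuid module. It assumes random (version 4) UUIDs; which UUID version the generator emits is not stated here, and make_uuids is a hypothetical helper name.

```python
import re
import uuid

# Canonical textual form: 8-4-4-4-12 lowercase hexadecimal digits.
UUID_PATTERN = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def make_uuids(count: int) -> list[str]:
    """Return `count` random version 4 UUIDs as strings."""
    return [str(uuid.uuid4()) for _ in range(count)]


values = make_uuids(1000)
all_well_formed = all(UUID_PATTERN.match(v) for v in values)
all_distinct = len(set(values)) == len(values)
print(values[0])
print(all_well_formed, all_distinct)
```

A version 4 UUID fixes 6 of its 128 bits for the version and variant fields, so its random space is 2^122 rather than the full 16^32; that is still far beyond any realistic seed size.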

Practical Recommendation

For most development and testing scenarios with small seeds, unique constraint collisions on email, phone, and UUID columns are unlikely. Columns drawn from a small space, such as usernames (39,960 combinations), short codes, or 2-character abbreviations, can collide at a few hundred rows or fewer. For those columns, consider using the JSON output and post-processing it to remove duplicates.
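
One possible way to do that post-processing is sketched below. It assumes the JSON output is an array of row objects keyed by column name, which is an assumption about the tool's format; the sample rows and the drop_duplicates helper are made up for illustration.

```python
import json


def drop_duplicates(rows: list[dict], unique_columns: list[str]) -> list[dict]:
    """Keep the first row for each value of every unique column.

    None values are never treated as duplicates, mirroring how SQL
    UNIQUE constraints usually allow multiple NULLs.
    """
    seen = {col: set() for col in unique_columns}
    kept = []
    for row in rows:
        clashes = any(
            row.get(col) is not None and row.get(col) in seen[col]
            for col in unique_columns
        )
        if clashes:
            continue  # skip rows that repeat an earlier unique value
        for col in unique_columns:
            if row.get(col) is not None:
                seen[col].add(row[col])
        kept.append(row)
    return kept


# Hypothetical sample shaped like an array of generated rows.
sample = json.loads("""
[
  {"id": 1, "email": "ana.lee42@example.com", "username": "ana17"},
  {"id": 2, "email": "bo.kim7@example.com",   "username": "ana17"},
  {"id": 3, "email": "ana.lee42@example.com", "username": "cy301"},
  {"id": 4, "email": "di.roy88@example.com",  "username": "di55"}
]
""")

unique_rows = drop_duplicates(sample, ["email", "username"])
print(json.dumps(unique_rows, indent=2))
```

Dropping duplicates shrinks the dataset, so generate somewhat more rows than needed if an exact row count matters.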

Use Case

Your users table has UNIQUE constraints on both email and username. Before running the generated seed data, you want to understand how likely collisions are and what to do if they occur, especially when generating 500+ rows for load testing.
