Regex for URL Slug Validation — Lowercase, Hyphenated
Regex for validating SEO-friendly URL slugs: lowercase letters, digits, and hyphens with no leading or trailing hyphens. Also covers length limits and Unicode considerations.
Detailed Explanation
A URL slug is the human-readable portion of a URL path, typically derived from a title. SEO and routing best practices recommend lowercase ASCII letters, digits, and single hyphens between words.
Standard Pattern
^[a-z0-9]+(?:-[a-z0-9]+)*$
This pattern enforces:
- One or more lowercase letters or digits to start
- Zero or more groups of a hyphen (-) followed by alphanumerics
- No leading, trailing, or consecutive hyphens
Tested Examples
| Input | Valid? |
|---|---|
| hello-world | yes |
| my-blog-post-2024 | yes |
| a | yes |
| Hello-World | no (uppercase) |
| -hello | no (leading hyphen) |
| hello--world | no (double hyphen) |
| hello_world | no (underscore) |
| hello world | no (space) |
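The table can be reproduced with a few lines of JavaScript. This is a minimal sketch: the names `SLUG_PATTERN` and `isValidSlug` are illustrative and do not come from any particular library.

```javascript
// Illustrative names; the regex itself is the standard pattern shown earlier.
const SLUG_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

function isValidSlug(input) {
  // Reject non-strings up front so RegExp.prototype.test does not coerce them.
  return typeof input === "string" && SLUG_PATTERN.test(input);
}

const samples = [
  "hello-world",
  "my-blog-post-2024",
  "a",
  "Hello-World",
  "-hello",
  "hello--world",
  "hello_world",
  "hello world",
];

// Print each sample next to the verdict from the pattern.
for (const sample of samples) {
  console.log(JSON.stringify(sample), isValidSlug(sample) ? "valid" : "invalid");
}
```

The type check matters because `test` converts its argument to a string, so a number such as `42` would otherwise pass as the slug "42".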
With Length Limits
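Many CMSs cap slugs at roughly 60–80 characters for SEO and URL display. One approach is to limit the number of hyphen-separated words (here at most 16), although this does not bound the character count:
^[a-z0-9]+(?:-[a-z0-9]+){0,15}$
To bound the total length directly (here 3–80 characters), add a lookahead at the start:
^(?=.{3,80}$)[a-z0-9]+(?:-[a-z0-9]+)*$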
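As a rough sketch, the length-bounded pattern can be used directly, or the length check can be moved into code so the caller learns which rule failed. The names `BOUNDED_SLUG` and `checkSlug`, the returned messages, and the default bounds of 3 and 80 are assumptions made for this example.

```javascript
// Lookahead version: total length 3-80, then the usual slug shape.
const BOUNDED_SLUG = /^(?=.{3,80}$)[a-z0-9]+(?:-[a-z0-9]+)*$/;

// Alternative: keep the regex simple and check length separately,
// which makes it possible to report a specific reason for rejection.
function checkSlug(input, min = 3, max = 80) {
  if (input.length < min) return "too short";
  if (input.length > max) return "too long";
  if (!/^[a-z0-9]+(?:-[a-z0-9]+)*$/.test(input)) {
    return "invalid characters or hyphen placement";
  }
  return "ok";
}

console.log(BOUNDED_SLUG.test("ab"));          // false: below the 3-character minimum
console.log(BOUNDED_SLUG.test("abc"));         // true
console.log(checkSlug("a".repeat(81)));        // "too long"
console.log(checkSlug("hello--world"));        // "invalid characters or hyphen placement"
```

The two versions accept the same strings. The split version is usually easier to surface in a form's error message.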
Generating Slugs from Titles
Validation is one half; generation is the other. A typical pipeline:
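const slug = title
  .toLowerCase()
  .normalize("NFD").replace(/[\u0300-\u036f]/g, "") // decompose, then strip combining accents
  .replace(/[^a-z0-9]+/g, "-")                      // replace every other run of characters with one hyphen
  .replace(/^-+|-+$/g, "");                         // trim leading and trailing hyphens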
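Wrapped in a function, the pipeline can be checked against the validation pattern. The sketch below assumes a hypothetical helper named `slugify`. Titles with no ASCII letters or digits come out as an empty string, so a caller would need its own fallback.

```javascript
// Hypothetical helper that wraps the generation pipeline in one function.
function slugify(title) {
  return title
    .toLowerCase()
    .normalize("NFD")                  // split accented letters into base + combining mark
    .replace(/[\u0300-\u036f]/g, "")   // drop the combining marks
    .replace(/[^a-z0-9]+/g, "-")       // any other run becomes a single hyphen
    .replace(/^-+|-+$/g, "");          // trim hyphens at both ends
}

const SLUG_SHAPE = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

console.log(slugify("Crème Brûlée: A Quick Recipe!")); // "creme-brulee-a-quick-recipe"
console.log(slugify("  Hello,   World  "));            // "hello-world"
console.log(slugify("!!!"));                           // "" (nothing usable left)

// Any non-empty result of slugify matches the validation pattern.
console.log(SLUG_SHAPE.test(slugify("My Blog Post 2024"))); // true
```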
Unicode Slugs
Some platforms allow Unicode (e.g. Wikipedia). To accept letters in any script:
^[\p{L}\p{N}]+(?:-[\p{L}\p{N}]+)*$
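(In JavaScript, \p{...} property escapes require the u flag. Note that \p{L} also matches uppercase letters, so this pattern no longer enforces lowercase.)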
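A short JavaScript sketch of the Unicode pattern follows; the constant name `UNICODE_SLUG` is illustrative. One detail to keep in mind is normalization: combining marks belong to category M rather than L, so a decomposed (NFD) string fails where its composed (NFC) form passes.

```javascript
// The u flag enables \p{...} Unicode property escapes.
const UNICODE_SLUG = /^[\p{L}\p{N}]+(?:-[\p{L}\p{N}]+)*$/u;

console.log(UNICODE_SLUG.test("日本語-記事"));                     // true
console.log(UNICODE_SLUG.test("café-münchen".normalize("NFC"))); // true: é and ü are single letters
console.log(UNICODE_SLUG.test("café".normalize("NFD")));         // false: U+0301 is a combining mark
console.log(UNICODE_SLUG.test("hello--world"));                  // false: consecutive hyphens
```

Normalizing input to NFC before validating is one way to keep results consistent across input sources.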
Why It Matters
Strict slug validation prevents broken canonical URLs, duplicate routes from case-only differences, and accidental whitespace that breaks redirect rules.
Use Case
Validating user-edited slugs in a CMS interface, generating safe URL fragments for blog posts and product pages, or migrating legacy URLs to a hyphenated SEO-friendly format.