Regex Capturing Groups, Named Groups, and Backreferences
Learn regex capturing groups (abc), non-capturing groups (?:abc), named groups (?<name>abc), and backreferences \1. Extract and reuse matched substrings.
Detailed Explanation
Capturing Groups and Backreferences
Groups are essential for extracting parts of a match and for applying quantifiers to multi-character sequences. Understanding the different group types unlocks powerful pattern matching capabilities.
Capturing Groups (...)
Parentheses create capturing groups. Each group captures the text it matched. Groups are numbered from 1, left to right by the position of their opening parenthesis; index 0 refers to the entire match.
Pattern: (\d{3})-(\d{4})
Input: 555-1234
Group 1: 555
Group 2: 1234
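As a rough illustration, the pattern above could be applied in JavaScript as follows. The variable names are chosen for this sketch only and are not part of the original example.

```javascript
// Match the phone fragment and capture both halves.
const phonePattern = /(\d{3})-(\d{4})/;
const phoneMatch = "555-1234".match(phonePattern);

// Index 0 holds the whole match; capturing groups start at index 1.
const wholeMatch = phoneMatch[0]; // "555-1234"
const prefix = phoneMatch[1];     // "555"
const lineNumber = phoneMatch[2]; // "1234"

console.log(wholeMatch, prefix, lineNumber);
```

Note that match() returns null when the pattern does not match, so real code would normally check the result before indexing into it.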
Non-Capturing Groups (?:...)
When you need grouping for quantifiers or alternation but do not need the matched text, use (?:...). The engine does not store a capture for the group, which can save a little work and keeps your group numbering clean.
Pattern: (?:ab)+c
Matches: abc, ababc, abababc
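The sketch below checks the non-capturing pattern in JavaScript and shows the effect on group numbering. The ^ and $ anchors and the "id-42" sample string are additions made for this illustration, not part of the original example.

```javascript
// (?:ab)+ groups "ab" for the quantifier without storing a capture.
// Anchors are added here so the whole string must match.
const nonCapturing = /^(?:ab)+c$/;
const allAccepted = ["abc", "ababc", "abababc"].every((s) => nonCapturing.test(s));
const acAccepted = nonCapturing.test("ac"); // false: at least one "ab" is required

// Group numbering: with a capturing prefix the digits are group 2,
// with a non-capturing prefix they become group 1.
const withCapture = "id-42".match(/(id|ref)-(\d+)/);
const withoutCapture = "id-42".match(/(?:id|ref)-(\d+)/);

console.log(allAccepted, acAccepted, withCapture[2], withoutCapture[1]);
```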
Named Capturing Groups (?<name>...)
Named groups make patterns self-documenting. In JavaScript, access them through the groups property of the match object, for example match.groups.year.
Pattern: (?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})
Input: 2024-01-15
groups.year: 2024
groups.month: 01
groups.day: 15
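A minimal JavaScript sketch of the date example might look like this. The variable names are illustrative; captured values are always strings, so convert them if you need numbers.

```javascript
const datePattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const dateMatch = "2024-01-15".match(datePattern);

// Named captures are exposed on the groups property of the match.
const { year, month, day } = dateMatch.groups;

// Named groups are still numbered, so index access keeps working.
const yearByIndex = dateMatch[1];

console.log(year, month, day, yearByIndex);
```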
Backreferences
Backreferences match the same text that was captured by a previous group. \1 refers to group 1, \2 to group 2, etc.
Pattern: (\w+) \1
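Matches: "the the", "go go" (repeated words)
Note: without word boundaries the pattern also matches inside longer words, such as "is is" in "this is". Use \b(\w+) \1\b to match whole words only.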
Named backreferences use \k<name>:
Pattern: (?<word>\w+) \k<word>
Matches: the same repeated words as the numbered version
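The following JavaScript sketch finds repeated words with both numbered and named backreferences. The sample sentence and the \b word boundaries are additions for this illustration; the boundaries keep the pattern from matching inside longer words.

```javascript
const sentence = "I saw the the cat go go home";

// Numbered backreference: \1 must repeat whatever group 1 captured.
const repeatedNumbered = /\b(\w+) \1\b/g;
const foundNumbered = [...sentence.matchAll(repeatedNumbered)].map((m) => m[1]);

// Named backreference: \k<word> refers to the group named "word".
const repeatedNamed = /\b(?<word>\w+) \k<word>\b/g;
const foundNamed = [...sentence.matchAll(repeatedNamed)].map((m) => m.groups.word);

// Without boundaries, "this is" yields a false positive ("is is").
const unboundedHit = /(\w+) \1/.test("this is");
const boundedHit = /\b(\w+) \1\b/.test("this is");

console.log(foundNumbered, foundNamed, unboundedHit, boundedHit);
```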
Alternation Within Groups
The pipe | inside a group creates alternation: (cat|dog) matches either "cat" or "dog". Use a non-capturing group when you do not need the captured value: (?:cat|dog)s? matches "cat", "cats", "dog", or "dogs".
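Alternation inside groups could be tried out in JavaScript roughly like this. The sample sentence and the \b word boundaries are assumptions added for the sketch.

```javascript
// Non-capturing alternation with an optional plural "s".
// \b keeps "cat" from matching at the start of "cattle".
const petPattern = /\b(?:cat|dog)s?\b/g;
const pets = "cats and a dog chased the cattle".match(petPattern);

// Capturing alternation: group 1 records which alternative matched.
const whichPet = "two dogs".match(/(cat|dog)s?/)[1];

console.log(pets, whichPet);
```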
Use Case
You are parsing structured text like dates, phone numbers, or log entries where you need to extract specific components. Backreferences are useful for finding duplicated words or matching paired delimiters.
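To make the use case concrete, here is a hedged JavaScript sketch of two such tasks. The log line format (date, level, message separated by spaces) and all sample strings are hypothetical and chosen only for illustration.

```javascript
// Paired delimiters: (["']) captures the opening quote and \1 requires
// the same character to close it, so an apostrophe inside double quotes
// does not end the string.
const quotedPattern = /(["'])(.*?)\1/g;
const source = `say "it's fine" or 'ok'`;
const quotedStrings = [...source.matchAll(quotedPattern)].map((m) => m[2]);

// Log entry parsing with named groups (hypothetical log format).
const logPattern = /^(?<date>\d{4}-\d{2}-\d{2}) (?<level>[A-Z]+) (?<message>.*)$/;
const logMatch = "2024-01-15 ERROR Disk full".match(logPattern);
const { date, level, message } = logMatch.groups;

console.log(quotedStrings, date, level, message);
```

The lazy quantifier .*? keeps each quoted match as short as possible, so two separate quoted strings on one line are not merged into a single match.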