Regex Quantifiers — Greedy vs Lazy Matching Explained
Master regex quantifiers: *, +, ?, {n}, {n,m}. Understand the difference between greedy and lazy (non-greedy) matching with practical examples.
Detailed Explanation
Regex Quantifiers: Greedy vs Lazy
Quantifiers control how many times a preceding element is matched. Understanding the difference between greedy and lazy modes is crucial for writing correct patterns.
Standard Quantifiers
| Quantifier | Meaning |
|---|---|
| * | Zero or more |
| + | One or more |
| ? | Zero or one (optional) |
| {n} | Exactly n times |
| {n,} | n or more times |
| {n,m} | Between n and m times |
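As a quick check of the table above, here is a minimal Python sketch using the standard re module. The patterns and sample strings are illustrative assumptions, not part of the original reference; re.fullmatch is used so that each string must match the pattern from start to end.

```python
import re

# Illustrative cases (assumed for this sketch): each tuple holds a pattern,
# strings it should fully match, and strings it should reject.
CASES = [
    (r"ab*",     ["a", "ab", "abbb"], ["b"]),            # zero or more b
    (r"ab+",     ["ab", "abbb"],      ["a"]),            # one or more b
    (r"ab?",     ["a", "ab"],         ["abb"]),          # zero or one b
    (r"ab{2}",   ["abb"],             ["ab", "abbb"]),   # exactly two b
    (r"ab{2,}",  ["abb", "abbbb"],    ["ab"]),           # two or more b
    (r"ab{2,3}", ["abb", "abbb"],     ["ab", "abbbb"]),  # two or three b
]


def fully_matches(pattern: str, text: str) -> bool:
    """Return True when the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None


for pattern, accepted, rejected in CASES:
    for text in accepted + rejected:
        print(f"{pattern!r} vs {text!r}: {fully_matches(pattern, text)}")
```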
Greedy Mode (Default)
By default, quantifiers are greedy: they match as many characters as possible while still allowing the overall pattern to succeed. For example, given the string <b>bold</b>, the pattern <.+> matches the entire string <b>bold</b>: .+ first consumes everything up to the end of the string, then backtracks one character at a time until the final > can match, which happens at the last > in the input.
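The greedy behavior described above can be reproduced with a short Python sketch. It assumes Python's standard re module; other engines should behave the same way for this pattern, though that is not verified here.

```python
import re

text = "<b>bold</b>"

# Greedy .+ runs to the end of the string, then gives characters back
# until the closing > can match, so the match spans both tags.
greedy_match = re.search(r"<.+>", text)
print(greedy_match.group())       # <b>bold</b>

# findall therefore reports a single match rather than two tags.
greedy_all = re.findall(r"<.+>", text)
print(greedy_all)                 # ['<b>bold</b>']
```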
Lazy Mode (Non-Greedy)
Adding ? after a quantifier makes it lazy: it matches as few characters as possible, taking one more character only when the rest of the pattern fails to match. The pattern <.+?> applied to <b>bold</b> matches <b> first, then </b> as a separate second match.
| Greedy | Lazy | Behavior |
|---|---|---|
| * | *? | Zero or more, prefer fewer |
| + | +? | One or more, prefer fewer |
| ? | ?? | Zero or one, prefer zero |
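To see the lazy forms side by side with their greedy counterparts, here is a small Python sketch. The sample string aaa and the variable names are assumptions made for illustration.

```python
import re

text = "<b>bold</b>"

# Lazy .+? stops at the first > it can reach, so each tag is its own match.
lazy_tags = re.findall(r"<.+?>", text)
print(lazy_tags)  # ['<b>', '</b>']

# Compare each greedy quantifier with its lazy form on the same input.
sample = "aaa"
results = []
for greedy, lazy in [(r"a*", r"a*?"), (r"a+", r"a+?"), (r"a?", r"a??")]:
    greedy_text = re.match(greedy, sample).group()
    lazy_text = re.match(lazy, sample).group()
    results.append((greedy_text, lazy_text))
    print(f"{greedy} -> {greedy_text!r}   {lazy} -> {lazy_text!r}")
```

Note that *? and ?? can match the empty string, since zero repetitions is the fewest allowed.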
When to Use Each
- Greedy is usually fine for simple patterns where there is no ambiguity
- Lazy is useful when you need the shortest match between paired delimiters (HTML tags, quotes, brackets)
- Consider using negated character classes as an alternative: <[^>]+> is often clearer and faster than <.+?>
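The negated character class alternative can be compared with the lazy pattern directly. This is a minimal sketch in Python; the html sample string is an assumption for illustration, and the two patterns are only shown to agree on this input, not on every possible input.

```python
import re

html = "<p>Some <b>bold</b> text</p>"

# Lazy version: expand one character at a time until > is reached.
lazy_tags = re.findall(r"<.+?>", html)

# Negated class version: match anything except >, then the closing >.
negated_tags = re.findall(r"<[^>]+>", html)

print(lazy_tags)     # ['<p>', '<b>', '</b>', '</p>']
print(negated_tags)  # ['<p>', '<b>', '</b>', '</p>']
```

The negated class states the stopping condition explicitly (a tag body cannot contain >), which is part of why it tends to read more clearly.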
Performance Consideration
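A lazy quantifier retries the rest of the pattern after every character it consumes, which can become slow on long strings, especially when the closing delimiter is far away or missing. When possible, prefer negated character classes or, where the regex engine supports it, atomic grouping for better performance.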
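One way to check the performance claim on your own data is to time both patterns. The sketch below uses Python's timeit with a synthetic input; the input shape, sizes, and repeat count are assumptions, and the measured numbers will vary by engine, version, and machine, so no expected timings are shown.

```python
import re
import timeit

LAZY = re.compile(r"<.+?>")
NEGATED = re.compile(r"<[^>]+>")

# Synthetic input (assumed for this sketch): 200 tags with long bodies.
text = "".join(f"<tag{i} " + "x" * 200 + f">text{i}" for i in range(200))

# Both patterns should find the same tags on this input.
lazy_hits = LAZY.findall(text)
negated_hits = NEGATED.findall(text)

# Time repeated scans of the same text with each pattern.
lazy_time = timeit.timeit(lambda: LAZY.findall(text), number=50)
negated_time = timeit.timeit(lambda: NEGATED.findall(text), number=50)

print(f"lazy:    {lazy_time:.4f}s for 50 scans")
print(f"negated: {negated_time:.4f}s for 50 scans")
```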
Use Case
You are extracting content between HTML tags, matching quoted strings, or parsing any text where you need to find the shortest possible match between delimiters rather than the longest.
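For the quoted-string case, a minimal Python sketch might look like the following. The line value is a made-up example, and the patterns assume quotes are never escaped inside the strings.

```python
import re

# Hypothetical input with two quoted values on one line.
line = 'name="Ada" role="engineer"'

# Greedy: .+ runs from the first quote to the last quote on the line.
greedy_values = re.findall(r'"(.+)"', line)
print(greedy_values)   # ['Ada" role="engineer']

# Lazy: .+? stops at the nearest closing quote.
lazy_values = re.findall(r'"(.+?)"', line)
print(lazy_values)     # ['Ada', 'engineer']

# Negated class: a value is any run of characters that are not quotes.
negated_values = re.findall(r'"([^"]+)"', line)
print(negated_values)  # ['Ada', 'engineer']
```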