Regex to Extract Quoted Strings — Single, Double, and Escaped
Regex to extract quoted strings from text, handling single quotes, double quotes, escaped quotes, and backtick template literals. Useful for log parsing and code analysis.
Detailed Explanation
Extracting Quoted Strings
Pulling out quoted text comes up in log parsing (request: "GET /api"), code search, and configuration analysis. The trick is handling escaped quotes inside the string.
Simple Double-Quoted Strings (no escapes)
"([^"]*)"
'msg: "hello world"'.match(/"([^"]*)"/) returns a match array whose element at index 1 (capture group 1) is hello world.
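A single .match() call without the g flag returns only the first quoted span. To collect every span in a line, a minimal sketch using matchAll could look like this; the function name extractSimpleQuoted and the sample log line are made up for illustration.

```javascript
// Collect the contents of every double-quoted span (no escape handling).
function extractSimpleQuoted(text) {
  const pattern = /"([^"]*)"/g; // matchAll requires the g flag
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

console.log(extractSimpleQuoted('request: "GET /api" status: "200"'));
// → [ 'GET /api', '200' ]
```

An input with no quoted text yields an empty array rather than null, which keeps calling code simpler than with .match().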
With Escaped Quotes
To allow \" inside the string:
"((?:[^"\\]|\\.)*)"
Each step consumes either one character that is neither " nor \, or a backslash together with the character after it (\\, \", etc.), so an escaped quote can never close the string.
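The escape-aware pattern can be tried as follows. This is a sketch rather than a full string-literal parser: extractEscapedQuoted and stripEscapes are hypothetical helper names, and stripEscapes only removes the escaping backslash without interpreting sequences such as \n.

```javascript
// Escape-aware extraction: a backslash always consumes the character after it,
// so \" stays inside the string instead of closing it.
function extractEscapedQuoted(text) {
  const pattern = /"((?:[^"\\]|\\.)*)"/g;
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

// Hypothetical helper: drops the escaping backslash only.
// It does not interpret sequences such as \n or \t.
function stripEscapes(raw) {
  return raw.replace(/\\(.)/g, '$1');
}

// String.raw keeps backslashes literal; the text is: msg "she said \"ok\""
const sample = String.raw`msg "she said \"ok\""`;
const [first] = extractEscapedQuoted(sample);
console.log(first);               // she said \"ok\"
console.log(stripEscapes(first)); // she said "ok"
```

The capture group returns the text exactly as written, backslashes included, so unescaping is a separate step.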
Single or Double Quoted
(["'])((?:(?!\1)[^\\]|\\.)*)\1
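The backreference \1 ensures the closing quote matches the opening one, and the lookahead (?!\1) stops the body at that quote while leaving the other quote type legal inside the string. The quote character is captured in group 1, so the content is in group 2.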
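Because the content moves to group 2 in this pattern, it is easy to read the wrong group. A short sketch, with a hypothetical extractAnyQuoted helper and a made-up input, shows both groups:

```javascript
// Match single- or double-quoted strings; group 1 is the quote character,
// group 2 is the content between the quotes.
function extractAnyQuoted(text) {
  const pattern = /(["'])((?:(?!\1)[^\\]|\\.)*)\1/g;
  return Array.from(text.matchAll(pattern), (m) => ({ quote: m[1], value: m[2] }));
}

const found = extractAnyQuoted(`name='Alice' note="it's fine"`);
for (const { quote, value } of found) {
  console.log(`${quote} -> ${value}`);
}
// ' -> Alice
// " -> it's fine
```

The apostrophe in the second string does not end the match, because the lookahead only rejects the quote character that opened it.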
Backtick Template Literals
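`((?:[^`\\]|\\.)*)`
This is the double-quoted pattern with the delimiter swapped. It does not understand ${...} interpolation, so a template literal whose expression contains another backtick string will be cut short.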
Tested Examples
| Input (JavaScript string literal) | Pattern | Captured |
|---|---|---|
| 'say "hi"' | simple double | hi |
| 'msg "she said \\"ok\\""' | escaped double | she said \"ok\" |
| "name='Alice'" | single or double | Alice |
| "\"hello\"" | escaped double | hello |
Multi-line Strings
For triple-quoted (Python) or heredoc strings, a single regex becomes fragile: the delimiters are multi-character, and heredoc terminators are chosen per string. Use a real lexer.
Performance
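The pattern "([^"]*)" is fast and safe. The escaped-quote variant uses the alternation (?:[^"\\]|\\.)*, whose two branches are mutually exclusive (a backslash can only be consumed by \\.), so it avoids catastrophic backtracking. Avoid the naive ".*?": it ends at the first " even when that quote is escaped, . does not match newlines by default, and inside a larger pattern the lazy quantifier can expand past a closing quote to make the rest of the pattern match.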
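Strictly speaking, the unrolled-loop spelling of the escaped-quote pattern is "[^"\\]*(?:\\.[^"\\]*)*", which consumes runs of ordinary characters in one step instead of re-entering the alternation for each character. The sketch below checks that both spellings capture the same text on a few made-up samples; it is a consistency check, not a benchmark, and the helper name captures is an assumption.

```javascript
// Two spellings of the escape-aware double-quote pattern.
const alternation = /"((?:[^"\\]|\\.)*)"/g;
const unrolled = /"([^"\\]*(?:\\.[^"\\]*)*)"/g;

// Return the group-1 capture of every match.
function captures(pattern, text) {
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

// String.raw keeps the backslashes in the samples literal.
const samples = [
  String.raw`say "hi"`,
  String.raw`msg "she said \"ok\""`,
  String.raw`path "C:\\temp\\" and "next"`,
  String.raw`unterminated "abc`,
];

for (const s of samples) {
  const a = captures(alternation, s);
  const b = captures(unrolled, s);
  // Prints true for each sample when both spellings agree.
  console.log(JSON.stringify(a) === JSON.stringify(b), a);
}
```

Which spelling is faster depends on the regex engine and the input, so measure before switching for performance reasons.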
Practical Recommendation
For configuration files, prefer a proper parser (TOML, YAML, JSON). For ad-hoc log scraping, the escape-aware pattern above is the safe default.
Use Case
Extracting field values from log lines (`user="alice"`), pulling string literals out of source code for translation review, or scraping configuration from documentation snippets.
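For the log-line case, the escape-aware pattern can be combined with a key prefix. The following is a minimal sketch that assumes keys are made of word characters and values are always double-quoted; parseQuotedFields and the sample line are hypothetical.

```javascript
// Parse key="value" pairs from one log line into a plain object.
// Unquoted fields (for example ts=1700000000) are skipped.
function parseQuotedFields(line) {
  const pattern = /(\w+)="((?:[^"\\]|\\.)*)"/g;
  const fields = {};
  for (const [, key, value] of line.matchAll(pattern)) {
    fields[key] = value; // value keeps its backslashes as written
  }
  return fields;
}

const line = String.raw`ts=1700000000 user="alice" action="said \"hi\"" path="/api/v1"`;
const fields = parseQuotedFields(line);
console.log(fields.user);   // alice
console.log(fields.action); // said \"hi\"
console.log(fields.path);   // /api/v1
```

If a key repeats within a line, the last occurrence wins in this sketch; collect values into arrays instead if duplicates matter.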