Regex to Match Data URIs
Match data URI scheme strings with an optional MIME type, parameters, and base64 or percent-encoded data. The pattern is unanchored, so it finds data URIs inside larger text rather than validating a whole string.
Regular Expression
/data:([a-zA-Z]+\/[a-zA-Z0-9.+-]+)?(?:;([a-zA-Z0-9=]+))*(?:;base64)?,([^\s]*)/g
Token Breakdown
| Token | Description |
|---|---|
| d | Matches the literal character 'd' |
| a | Matches the literal character 'a' |
| t | Matches the literal character 't' |
| a | Matches the literal character 'a' |
| : | Matches the literal character ':' |
| ( | Start of capturing group |
| [a-zA-Z] | Character class — matches any one of: a-zA-Z |
| + | Matches the preceding element one or more times (greedy) |
| \/ | Matches a literal forward slash |
| [a-zA-Z0-9.+-] | Character class — matches any one of: a-zA-Z0-9.+- |
| + | Matches the preceding element one or more times (greedy) |
| ) | End of group |
| ? | Makes the preceding element optional (zero or one times) |
| (?: | Start of non-capturing group |
| ; | Matches the literal character ';' |
| ( | Start of capturing group |
| [a-zA-Z0-9=] | Character class — matches any one of: a-zA-Z0-9= |
| + | Matches the preceding element one or more times (greedy) |
| ) | End of group |
| ) | End of group |
| * | Matches the preceding element zero or more times (greedy) |
| (?: | Start of non-capturing group |
| ; | Matches the literal character ';' |
| b | Matches the literal character 'b' |
| a | Matches the literal character 'a' |
| s | Matches the literal character 's' |
| e | Matches the literal character 'e' |
| 6 | Matches the literal character '6' |
| 4 | Matches the literal character '4' |
| ) | End of group |
| ? | Makes the preceding element optional (zero or one times) |
| , | Matches the literal character ',' |
| ( | Start of capturing group |
| [^\s] | Negated character class: matches any non-whitespace character (same as \S) |
| * | Matches the preceding element zero or more times (greedy) |
| ) | End of group |
Detailed Explanation
This regex matches data URIs as defined in RFC 2397. Here is the token-by-token breakdown:
data: matches the literal scheme prefix that identifies a data URI.
([a-zA-Z]+\/[a-zA-Z0-9.+-]+)? is optional capturing group 1, the MIME type. It consists of a type (letters only), a forward slash, and a subtype (letters, digits, dots, plus signs, hyphens). Examples include text/plain, image/png, application/json, and application/vnd.ms-excel. When the MIME type is omitted, RFC 2397 assumes text/plain;charset=US-ASCII.
(?:;([a-zA-Z0-9=]+))* matches zero or more parameters, each preceded by a semicolon. Capturing group 2 holds the whole name=value text of the last parameter matched, not just the value, because a repeated group keeps only its final iteration. The character class allows letters, digits, and =, so charset=utf8 matches, but a value containing a hyphen such as charset=UTF-8 does not, and the URI is then not matched at all.
(?:;base64)? is an optional non-capturing group for the literal ;base64 flag, which indicates that the data portion is Base64-encoded rather than percent-encoded. In practice the preceding parameter group is greedy and its character class also accepts base64, so that group consumes the flag first and group 2 ends up holding base64.
The comma (,) matches the literal separator between the metadata and the data content.
([^\s]*) is capturing group 3, the data content: zero or more non-whitespace characters ([^\s] is equivalent to \S). For base64 data URIs this is the Base64-encoded content; for plain data URIs it is percent-encoded text. Because only whitespace ends the match, a data URI embedded in HTML or CSS also picks up trailing delimiters such as a closing quote or parenthesis.
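To see what each capture group actually holds, the breakdown above can be checked with a short JavaScript sketch. The helper name inspectDataUri and the sample inputs are assumptions made for illustration; only the pattern itself comes from this page.

```javascript
// The pattern from this page, unchanged.
const DATA_URI = /data:([a-zA-Z]+\/[a-zA-Z0-9.+-]+)?(?:;([a-zA-Z0-9=]+))*(?:;base64)?,([^\s]*)/g;

// Hypothetical helper: returns the capture groups of the first match, or null.
function inspectDataUri(input) {
  DATA_URI.lastIndex = 0; // a regex with the g flag is stateful, so reset before exec
  const m = DATA_URI.exec(input);
  if (m === null) return null;
  return { mimeType: m[1], lastParam: m[2], data: m[3] };
}

// The greedy parameter group consumes ";base64", so it shows up in group 2.
console.log(inspectDataUri("data:image/png;base64,iVBOR"));
// { mimeType: 'image/png', lastParam: 'base64', data: 'iVBOR' }

// Group 2 holds the full name=value text of the last parameter.
console.log(inspectDataUri("data:text/plain;charset=utf8,Hi"));
// { mimeType: 'text/plain', lastParam: 'charset=utf8', data: 'Hi' }

// With no MIME type and no parameters, groups 1 and 2 are undefined.
console.log(inspectDataUri("data:,Hello%20World"));

// A hyphen in the parameter value is outside the character class: no match.
console.log(inspectDataUri("data:text/plain;charset=UTF-8,Hi"));
// null
```

If hyphenated parameter values such as charset=UTF-8 need to match, one option would be to add a hyphen to the parameter character class; that is a change to the pattern, not something this page's regex does.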
The g flag enables global matching. Data URIs are used to embed small files directly in HTML, CSS, or JavaScript without requiring separate HTTP requests. This pattern is useful for detecting embedded resources, content security policy analysis, and resource extraction.
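As a rough illustration of resource extraction, the sketch below walks every match in a piece of text and decodes its payload. The helper decodeMatch and the sample string are hypothetical, and it assumes an environment where atob is available globally (browsers and recent Node.js versions). Whether a match is Base64 is read from the header text before the comma, since the ;base64 flag is not exposed in a dedicated capture group.

```javascript
// Same pattern as on this page; the g flag lets matchAll visit every occurrence.
const DATA_URI = /data:([a-zA-Z]+\/[a-zA-Z0-9.+-]+)?(?:;([a-zA-Z0-9=]+))*(?:;base64)?,([^\s]*)/g;

// Hypothetical helper: turn one match into { mimeType, isBase64, text }.
function decodeMatch(m) {
  // Everything before the first comma is metadata; the pattern allows no commas there.
  const header = m[0].slice(0, m[0].indexOf(","));
  const isBase64 = header.endsWith(";base64");
  // atob and decodeURIComponent both throw on malformed input; a real tool would catch that.
  const text = isBase64 ? atob(m[3]) : decodeURIComponent(m[3]);
  // Fall back to text/plain when group 1 is absent.
  return { mimeType: m[1] ?? "text/plain", isBase64, text };
}

// Whitespace-delimited sample text (an assumption; quoted HTML attributes would
// leave the closing quote inside group 3).
const sample = "first data:text/plain;base64,SGVsbG8= then data:,Hello%20World and https://example.com";

const decoded = [...sample.matchAll(DATA_URI)].map(decodeMatch);
console.log(decoded);
```

For binary payloads such as images, atob returns a binary string rather than readable text, so a byte-oriented decoder would be the better fit there.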
Example Test Strings
| Input | Expected |
|---|---|
| data:text/plain;base64,SGVsbG8= | Match |
| data:image/png;base64,iVBOR | Match |
| data:,Hello%20World | Match |
| https://example.com | No Match |
| data:application/json,{} | Match |
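The table above can be reproduced with a small script. This is a minimal sketch: the cases array simply restates the table rows, and the results variable is a name chosen for illustration.

```javascript
// The pattern from this page, unchanged.
const DATA_URI = /data:([a-zA-Z]+\/[a-zA-Z0-9.+-]+)?(?:;([a-zA-Z0-9=]+))*(?:;base64)?,([^\s]*)/g;

// Each entry pairs an input from the table with whether a match is expected.
const cases = [
  ["data:text/plain;base64,SGVsbG8=", true],
  ["data:image/png;base64,iVBOR", true],
  ["data:,Hello%20World", true],
  ["https://example.com", false],
  ["data:application/json,{}", true],
];

const results = cases.map(([input, expected]) => {
  // With a g-flag regex, String.prototype.match returns all matches or null.
  const matched = input.match(DATA_URI) !== null;
  return { input, expected, matched };
});

for (const { input, expected, matched } of results) {
  const verdict = matched ? "Match" : "No Match";
  const note = matched === expected ? "as expected" : "unexpected";
  console.log(`${input} -> ${verdict} (${note})`);
}
```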
Related Regex Patterns
Regex to Match Base64 Encoded Strings
/^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$/m
Regex to Match Content-Type Headers
/^[a-zA-Z]+\/[a-zA-Z0-9.+-]+(?:\s*;\s*[a-zA-Z0-9-]+=(?:"[^"]*"|[^;,\s]+))*$/
Regex to Match URLs
/https?:\/\/[\w.-]+(?:\.[a-zA-Z]{2,})(?:\/[\w./?#&=%-]*)*/gi
Regex to Match Image File Names
/^[\w.-]+\.(?:jpg|jpeg|png|gif|bmp|svg|webp|ico|tiff?)$/i