Regex to Match Data URIs

Match data URI strings with an optional MIME type, optional parameters, and either Base64 or percent-encoded data. The pattern checks the basic shape of the data: scheme rather than fully validating it.

Regular Expression

/data:([a-zA-Z]+\/[a-zA-Z0-9.+-]+)?(?:;([a-zA-Z0-9=]+))*(?:;base64)?,([^\s]*)/g

Token Breakdown

Token             Description
d                 Matches the literal character 'd'
a                 Matches the literal character 'a'
t                 Matches the literal character 't'
a                 Matches the literal character 'a'
:                 Matches the literal character ':'
(                 Start of capturing group
[a-zA-Z]          Character class — matches any one of: a-zA-Z
+                 Matches the preceding element one or more times (greedy)
\/                Matches a literal forward slash
[a-zA-Z0-9.+-]    Character class — matches any one of: a-zA-Z0-9.+-
+                 Matches the preceding element one or more times (greedy)
)                 End of group
?                 Makes the preceding element optional (zero or one times)
(?:               Start of non-capturing group
;                 Matches the literal character ';'
(                 Start of capturing group
[a-zA-Z0-9=]      Character class — matches any one of: a-zA-Z0-9=
+                 Matches the preceding element one or more times (greedy)
)                 End of group
)                 End of group
*                 Matches the preceding element zero or more times (greedy)
(?:               Start of non-capturing group
;                 Matches the literal character ';'
b                 Matches the literal character 'b'
a                 Matches the literal character 'a'
s                 Matches the literal character 's'
e                 Matches the literal character 'e'
6                 Matches the literal character '6'
4                 Matches the literal character '4'
)                 End of group
?                 Makes the preceding element optional (zero or one times)
,                 Matches the literal character ','
(                 Start of capturing group
[^\s]             Negated character class — matches any character NOT in \s
*                 Matches the preceding element zero or more times (greedy)
)                 End of group

Detailed Explanation

This regex matches strings shaped like the data URIs defined in RFC 2397; it approximates the grammar rather than validating it in full. Here is the token-by-token breakdown:

data: — Matches the literal data: scheme prefix that identifies a data URI.

([a-zA-Z]+\/[a-zA-Z0-9.+-]+)? — Optional capturing group 1 matches the MIME type: a type (letters only), a forward slash, and a subtype (letters, digits, dots, plus signs, hyphens). Examples include text/plain, image/png, application/json, and application/vnd.ms-excel. The entire MIME type is optional; when it is omitted, RFC 2397 specifies a default of text/plain;charset=US-ASCII, and group 1 is undefined.
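
As a minimal sketch of how group 1 might be used, the helper below returns the captured MIME type and falls back to text/plain when the group is undefined. The name `mimeTypeOf` is hypothetical and not part of this page, and the pattern is used without the g flag (an assumption, so that each call is independent).

```javascript
// Same pattern as above, without the g flag (one match per call).
const DATA_URI = /data:([a-zA-Z]+\/[a-zA-Z0-9.+-]+)?(?:;([a-zA-Z0-9=]+))*(?:;base64)?,([^\s]*)/;

// Hypothetical helper: returns the MIME type of a data URI,
// "text/plain" when none is given, or null when nothing matches.
function mimeTypeOf(uri) {
  const m = DATA_URI.exec(uri);
  if (m === null) return null;
  return m[1] !== undefined ? m[1] : "text/plain";
}

console.log(mimeTypeOf("data:application/vnd.ms-excel;base64,AAAA")); // application/vnd.ms-excel
console.log(mimeTypeOf("data:,Hello%20World"));                       // text/plain
console.log(mimeTypeOf("https://example.com"));                       // null
```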

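(?:;([a-zA-Z0-9=]+))* — Zero or more repetitions of a non-capturing group that matches a semicolon followed by a MIME parameter such as charset=utf8. Capturing group 2 captures the whole parameter (name, equals sign, and value), and because the group repeats, only the last parameter is retained. The character class does not include a hyphen, so a parameter such as charset=UTF-8 is not matched by this pattern.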

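(?:;base64)? — An optional non-capturing group matching the literal ;base64 flag. When present, the flag indicates that the data portion is Base64-encoded rather than percent-encoded. Because the preceding parameter group is greedy and its character class also accepts the text base64, that group normally consumes the flag, which is why group 2 reports base64 for Base64 URIs.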
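
The interaction between the parameter group and the ;base64 group can be checked with a short sketch. It assumes a JavaScript engine and the pattern without the g flag; the sample URIs are made up for illustration.

```javascript
const DATA_URI = /data:([a-zA-Z]+\/[a-zA-Z0-9.+-]+)?(?:;([a-zA-Z0-9=]+))*(?:;base64)?,([^\s]*)/;

// The repeated group consumes ";charset=utf8" and then ";base64";
// group 2 keeps only the last repetition.
const withFlag = DATA_URI.exec("data:text/plain;charset=utf8;base64,SGk=");
console.log(withFlag[2]); // base64
console.log(withFlag[3]); // SGk=

// Without a base64 flag, group 2 holds the last parameter instead.
const withoutFlag = DATA_URI.exec("data:text/plain;charset=utf8,hi");
console.log(withoutFlag[2]); // charset=utf8

// A hyphen is outside [a-zA-Z0-9=], so this URI is not matched at all.
const hyphenated = DATA_URI.exec("data:text/plain;charset=UTF-8,hi");
console.log(hyphenated); // null
```

If hyphenated parameter values need to match, one option is to add a hyphen to the parameter character class; that is a change to the pattern shown on this page, not something it already does.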

, — Matches the literal comma that separates the metadata from the actual data content.

([^\s]*) — Capturing group 3 matches the data content: zero or more non-whitespace characters. For base64 data URIs, this contains the Base64-encoded content. For plain data URIs, this contains percent-encoded text.
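
To illustrate the two encodings of group 3, here is a minimal decoding sketch. The helper name `decodeDataUri` is hypothetical, the payload is assumed to be UTF-8 text, and the code relies on the `atob` and `TextDecoder` globals found in browsers and recent Node.js versions.

```javascript
const DATA_URI = /data:([a-zA-Z]+\/[a-zA-Z0-9.+-]+)?(?:;([a-zA-Z0-9=]+))*(?:;base64)?,([^\s]*)/;

// Hypothetical helper: decodes the payload of a data URI to a string.
// Returns null when the input does not match the pattern.
function decodeDataUri(uri) {
  const m = DATA_URI.exec(uri);
  if (m === null) return null;
  const payload = m[3];
  // The header is everything before the first comma; the header parts of
  // the pattern cannot match a comma, so this comma is the separator.
  const header = m[0].slice(0, m[0].indexOf(","));
  if (header.endsWith(";base64")) {
    // atob yields one character per byte; decode those bytes as UTF-8.
    const bytes = Uint8Array.from(atob(payload), (ch) => ch.charCodeAt(0));
    return new TextDecoder().decode(bytes);
  }
  // Plain data URIs carry percent-encoded text.
  return decodeURIComponent(payload);
}

console.log(decodeDataUri("data:text/plain;base64,SGVsbG8=")); // Hello
console.log(decodeDataUri("data:,Hello%20World"));             // Hello World
```

Both `atob` and `decodeURIComponent` throw on malformed input, so a caller handling untrusted strings would likely wrap the call in try/catch.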

The g flag enables global matching, so every data URI in the input is found rather than only the first. Data URIs embed small files directly in HTML, CSS, or JavaScript without separate HTTP requests, which makes this pattern useful for detecting embedded resources, analyzing content security policies, and extracting resources.
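
A sketch of global matching with `String.prototype.matchAll`, which requires the g flag. The variable names are illustrative, and the input is a space-separated sample string assumed for this example.

```javascript
const DATA_URI_G = /data:([a-zA-Z]+\/[a-zA-Z0-9.+-]+)?(?:;([a-zA-Z0-9=]+))*(?:;base64)?,([^\s]*)/g;

const text =
  "data:text/plain;base64,SGVsbG8= data:image/png;base64,iVBOR " +
  "data:,Hello%20World https://example.com data:application/json,{}";

// matchAll yields one match object per data URI found in the text.
const found = [...text.matchAll(DATA_URI_G)].map((m) => ({
  index: m.index,
  uri: m[0],
}));

console.log(found.length);               // 4
console.log(found.map((f) => f.index));  // [ 0, 32, 60, 100 ]
```

Because group 3 accepts any non-whitespace character, a URI embedded as url(data:...) or inside quotes will also capture the closing parenthesis or quote, so extracted matches may need trimming.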

Example Test Strings

Input                              Expected
data:text/plain;base64,SGVsbG8=    Match
data:image/png;base64,iVBOR        Match
data:,Hello%20World                Match
https://example.com                No Match
data:application/json,{}           Match
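
The table above can be checked with a few lines of JavaScript. This is a sketch: the `cases` array simply copies the table, and the pattern is used without the g flag because `test()` on a global regex advances `lastIndex` between calls, which can make repeated checks on one regex object give surprising results.

```javascript
// No g flag, so test() keeps no state between calls.
const DATA_URI = /data:([a-zA-Z]+\/[a-zA-Z0-9.+-]+)?(?:;([a-zA-Z0-9=]+))*(?:;base64)?,([^\s]*)/;

// [input, expected to match?]
const cases = [
  ["data:text/plain;base64,SGVsbG8=", true],
  ["data:image/png;base64,iVBOR", true],
  ["data:,Hello%20World", true],
  ["https://example.com", false],
  ["data:application/json,{}", true],
];

for (const [input, expected] of cases) {
  const actual = DATA_URI.test(input);
  console.log(`${actual === expected ? "ok  " : "FAIL"} ${input}`);
}
```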

Match Highlighting (4 matches)

data:text/plain;base64,SGVsbG8= data:image/png;base64,iVBOR data:,Hello%20World https://example.com data:application/json,{}

Matches & Capture Groups

#1  data:text/plain;base64,SGVsbG8=  (index 0)
    Group 1: text/plain
    Group 2: base64
    Group 3: SGVsbG8=

#2  data:image/png;base64,iVBOR  (index 32)
    Group 1: image/png
    Group 2: base64
    Group 3: iVBOR

#3  data:,Hello%20World  (index 60)
    Group 1: undefined
    Group 2: undefined
    Group 3: Hello%20World

#4  data:application/json,{}  (index 100)
    Group 1: application/json
    Group 2: undefined
    Group 3: {}

Pattern: 76 chars, Flags: g, Matches: 4
