Regex to Match DOI Identifiers

Validate Digital Object Identifier (DOI) strings: the 10. directory indicator, a registrant code, a forward slash, and a suffix. Matches identifiers such as those assigned to scholarly articles.

Regular Expression

/^10\.\d{4,9}\/[^\s]+$/

Token Breakdown

Token     Description
^         Anchors at the start of the string (or line in multiline mode)
1         Matches the literal character '1'
0         Matches the literal character '0'
\.        Matches a literal dot
\d        Matches any digit (0-9)
{4,9}     Matches the preceding element between 4 and 9 times
/         Matches the literal character '/' (written \/ inside a /.../ regex literal)
[^\s]     Negated character class: matches any character that is not whitespace
+         Matches the preceding element one or more times (greedy)
$         Anchors at the end of the string (or line in multiline mode)

Detailed Explanation

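This regex checks that a string has the syntactic form of a Digital Object Identifier (DOI) as standardized by the International DOI Foundation. It does not confirm that the DOI is registered or that it resolves. Here is the token-by-token breakdown: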

^ - Anchors the match at the start of the string.

10 - Matches the literal number 10, which is the DOI directory indicator. All DOIs begin with 10. followed by a registrant code.

\. - Matches a literal dot separating the directory indicator from the registrant code. The dot is escaped because an unescaped dot is a regex metacharacter that matches any character.

\d{4,9} - Matches 4 to 9 digits for the registrant code, which identifies the organization that registered the DOI. Together with the directory indicator it forms the DOI prefix (for example, 10.1038). Well-known registrant codes include 1000 (the International DOI Foundation's own code, often seen in examples), 1002 for Wiley, 1038 for Nature, and 1109 for IEEE.

/ - Matches the literal forward slash that separates the prefix from the suffix. Inside a /.../ regex literal it is written as \/ so that it does not end the literal.

[^\s]+ - Matches one or more non-whitespace characters for the DOI suffix (equivalent to \S+). The suffix is assigned by the registrant and can contain letters, digits, dots, hyphens, underscores, and other characters. It uniquely identifies the content item within the registrant's namespace.

$ - Anchors the match at the end of the string.

No flags are used since this validates a single DOI string.
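
The breakdown above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name split_doi and the group names registrant and suffix are assumptions added here so the two parts can be read back separately. The sketch uses re.fullmatch instead of ^ and $, because in Python $ also matches just before a trailing newline.

```python
import re

# Same pattern as above, minus the anchors, with named groups added
# (an illustrative choice, not part of the original pattern).
DOI_PATTERN = re.compile(r"10\.(?P<registrant>\d{4,9})/(?P<suffix>[^\s]+)")


def split_doi(text):
    """Return (registrant_code, suffix) if text has the DOI shape, else None."""
    # fullmatch requires the whole string to match, so a trailing
    # newline or any other leftover character causes a rejection.
    match = DOI_PATTERN.fullmatch(text)
    if match is None:
        return None
    return match.group("registrant"), match.group("suffix")


print(split_doi("10.1038/nature12373"))  # ('1038', 'nature12373')
print(split_doi("10.12/short"))          # None
```

Returning None for a non-match keeps the call site simple: a truthiness check is enough to tell a well-formed DOI from anything else.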

DOIs are persistent identifiers used to uniquely identify academic papers, datasets, and other scholarly objects. Examples include: 10.1000/xyz123, 10.1038/nature12373, and 10.1109/5.771073. This pattern is useful in academic citation systems, research databases, and library catalogs.

Example Test Strings

Input                  Expected
10.1000/xyz123         Match
10.1038/nature12373    Match
10.1109/5.771073       Match
11.1000/xyz            No Match
10.12/short            No Match
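
The test strings above can also be checked mechanically. The sketch below is one way to do that in Python; the names DOI_RE, CASES, and run_cases are hypothetical and exist only for this example.

```python
import re

# The pattern from this page, written as a Python raw string
# (no /.../ delimiters, so the slash needs no escaping).
DOI_RE = re.compile(r"^10\.\d{4,9}/[^\s]+$")

# (input, should_match) pairs copied from the table of test strings.
CASES = [
    ("10.1000/xyz123", True),
    ("10.1038/nature12373", True),
    ("10.1109/5.771073", True),
    ("11.1000/xyz", False),
    ("10.12/short", False),
]


def run_cases(cases):
    """Return the inputs whose match result differs from the expected one."""
    failures = []
    for text, expected in cases:
        matched = DOI_RE.match(text) is not None
        if matched != expected:
            failures.append(text)
    return failures


print(run_cases(CASES))  # []
```

An empty list means every row behaved as the table says.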
