Regex to Match XML Declarations

Match XML declaration headers such as <?xml version="1.0" encoding="UTF-8"?> with this regex pattern. It detects the outer shape of the declaration; it does not validate the version, encoding, or standalone values inside it.

Regular Expression

/<\?xml\s+[^?]*\?>/g

Token Breakdown

Token    Description
<        Matches the literal character '<'
\?       Matches a literal question mark
x        Matches the literal character 'x'
m        Matches the literal character 'm'
l        Matches the literal character 'l'
\s       Matches any whitespace character (space, tab, newline)
+        Matches the preceding element one or more times (greedy)
[^?]     Negated character class: matches any character NOT in ?
*        Matches the preceding element zero or more times (greedy)
\?       Matches a literal question mark
>        Matches the literal character '>'

Detailed Explanation

This regex matches XML declarations, the optional first item of the XML prolog at the beginning of an XML document. Here is the token-by-token breakdown:

<\? - Matches the literal opening delimiter <?. The question mark is escaped with a backslash because it is a regex metacharacter (a quantifier meaning zero or one). In XML, the declaration and processing instructions start with <? to distinguish them from regular elements.

xml - Matches the literal string xml, which rules out other targets such as <?php. On its own it would still accept <?xml-stylesheet; the \s+ that follows is what rejects it. Matching is case-sensitive, so <?XML is not matched.

\s+ - Matches one or more whitespace characters (spaces, tabs, or newlines). At least one is required between xml and the first attribute, which separates the xml identifier from the version attribute.

[^?]* - Matches zero or more characters that are not a question mark. This consumes the declaration attributes such as version="1.0", encoding="UTF-8", and standalone="yes". Excluding the question mark prevents the match from extending past the closing delimiter; it also means that a question mark anywhere inside the declaration body makes the match fail.

\?> - Matches the literal closing delimiter ?>. The question mark is escaped, followed by the closing angle bracket.
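The breakdown above can be checked piece by piece. The following JavaScript is a minimal sketch; the sample strings (including the `note` attribute and the `style.css` file name) are made up for illustration and are not part of the original test set.

```javascript
// Same pattern as above, minus the g flag, so test() keeps no state between calls.
const xmlDecl = /<\?xml\s+[^?]*\?>/;

// \s+ demands whitespace right after "xml", so xml-stylesheet is rejected.
const stylesheet = xmlDecl.test('<?xml-stylesheet href="style.css"?>'); // false

// The literal "xml" rejects other targets such as php.
const php = xmlDecl.test("<?php echo 'hello'; ?>"); // false

// [^?]* cannot cross a question mark, so a "?" inside the body blocks the match.
const questionInside = xmlDecl.test('<?xml version="1.0" note="why?"?>'); // false

// \s also matches newlines, so a declaration split across lines still matches.
const multiline = xmlDecl.test('<?xml\n  version="1.0"?>'); // true

console.log({ stylesheet, php, questionInside, multiline });
```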

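The g flag enables global matching, though a well-formed document contains at most one XML declaration, at the very beginning. This pattern is useful for XML parsing, document type detection, encoding identification, and preprocessing XML files. Note that the pattern is unanchored and does not inspect the attributes: it finds a declaration-shaped string anywhere in the input. To check that a document actually starts with a declaration, anchor the pattern with ^ and validate the attributes separately.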
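For encoding identification, one possible approach is an anchored variant with a capture group. This is a sketch under stated assumptions: the `^` anchor, the capture group, the attribute sub-pattern, and the helper name `readXmlDeclaration` are all illustrative additions, not part of the pattern documented here.

```javascript
// Anchored variant (an assumption, not the pattern above): ^ ties the match to
// offset 0, and the capture group exposes the declaration body.
const leadingDecl = /^<\?xml\s+([^?]*)\?>/;

// Hypothetical helper: returns the attributes of a leading declaration, or null.
function readXmlDeclaration(text) {
  const match = leadingDecl.exec(text);
  if (match === null) return null;

  const attributes = {};
  // Collect name="value" or name='value' pairs from the captured body.
  const pairs = match[1].matchAll(/([A-Za-z]+)\s*=\s*(["'])(.*?)\2/g);
  for (const [, name, , value] of pairs) {
    attributes[name] = value;
  }
  return attributes;
}

console.log(readXmlDeclaration('<?xml version="1.0" encoding="UTF-8"?><root/>'));
// { version: '1.0', encoding: 'UTF-8' }
console.log(readXmlDeclaration('<root><?xml version="1.0"?></root>'));
// null
```

The helper only reads what is there; it does not check that version is present or that the encoding name is a real one.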

Example Test Strings

Input                                     Expected
<?xml version="1.0"?>                     Match
<?xml version="1.0" encoding="UTF-8"?>    Match
<xml>not a declaration</xml>              No Match
<?php echo 'hello'; ?>                    No Match
<?xml version="1.1" standalone="yes"?>    Match
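The test strings above can be run as a small JavaScript check. This sketch drops the g flag on purpose: a global regex remembers lastIndex between test() calls, which makes repeated checks against different strings unreliable. The names `declPattern`, `cases`, and `results` are illustrative.

```javascript
// Non-global copy of the pattern, safe to reuse with test().
const declPattern = /<\?xml\s+[^?]*\?>/;

// Each entry pairs an input string with whether a match is expected.
const cases = [
  ['<?xml version="1.0"?>', true],
  ['<?xml version="1.0" encoding="UTF-8"?>', true],
  ['<xml>not a declaration</xml>', false],
  ["<?php echo 'hello'; ?>", false],
  ['<?xml version="1.1" standalone="yes"?>', true],
];

const results = cases.map(([input, expected]) => ({
  input,
  expected,
  actual: declPattern.test(input),
}));

// Print one line per case, flagging any disagreement with the table.
for (const { input, expected, actual } of results) {
  console.log(`${actual === expected ? 'ok  ' : 'FAIL'} ${input}`);
}
```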


Match Highlighting (3 matches)

<?xml version="1.0"?> <?xml version="1.0" encoding="UTF-8"?> <xml>not a declaration</xml> <?php echo 'hello'; ?> <?xml version="1.1" standalone="yes"?>

Matches & Capture Groups

#1    <?xml version="1.0"?>                     index 0
#2    <?xml version="1.0" encoding="UTF-8"?>    index 22
#3    <?xml version="1.1" standalone="yes"?>    index 113

Pattern: 17 chars    Flags: g    Matches: 3
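The match list above can be reproduced in JavaScript with String.prototype.matchAll, which requires the g flag. A minimal sketch, assuming the five test strings are joined with single spaces as in the highlighted input; the variable names are illustrative.

```javascript
// Rebuild the highlighted input: the five test strings separated by single spaces.
const input = [
  '<?xml version="1.0"?>',
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<xml>not a declaration</xml>',
  "<?php echo 'hello'; ?>",
  '<?xml version="1.1" standalone="yes"?>',
].join(' ');

// matchAll yields one result per match, each carrying its start index.
const found = [...input.matchAll(/<\?xml\s+[^?]*\?>/g)].map((m) => ({
  text: m[0],
  index: m.index,
}));

console.log(found.map(({ text, index }) => `${index}: ${text}`).join('\n'));
// 0: <?xml version="1.0"?>
// 22: <?xml version="1.0" encoding="UTF-8"?>
// 113: <?xml version="1.1" standalone="yes"?>
```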
