XML Entity References — Predefined, Numeric, and Custom Entities
Master XML entity references: the 5 predefined entities, numeric character references, custom entity declarations in DTDs, and how entities are resolved during XML parsing and formatting.
Detailed Explanation
XML Entity References
XML entities are placeholders that are replaced with their defined values during parsing. They are essential for including special characters, reusable text fragments, and external content in XML documents.
The 5 Predefined Entities
XML defines exactly five built-in entities that are always available:
| Entity | Character | Description |
|---|---|---|
| &lt; | < | Less-than sign |
| &gt; | > | Greater-than sign |
| &amp; | & | Ampersand |
| &quot; | " | Double quote |
| &apos; | ' | Apostrophe |
These entities must be used when the literal character would be misinterpreted as XML markup. For example, < in text content must be written as &lt;.
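As a minimal sketch of escaping in practice, the snippet below uses Python's standard `xml.sax.saxutils` helpers. The sample strings are made up for illustration; note that `escape()` only covers &, < and > unless extra mappings are passed in.

```python
from xml.sax.saxutils import escape, quoteattr, unescape

# Illustrative input containing all five special characters.
raw = 'if a < b && name == "O\'Reilly"'

# escape() replaces &, < and > by default.
text_safe = escape(raw)

# Quotes are escaped only when extra mappings are supplied.
full_safe = escape(raw, {'"': "&quot;", "'": "&apos;"})

# quoteattr() escapes the value and adds suitable surrounding quotes,
# which is convenient when building attribute values by hand.
attr = quoteattr('Tom & "Jerry"')

print(text_safe)
print(full_safe)
print(attr)

# unescape() reverses the default replacements.
print(unescape(text_safe) == raw)
```

In text content only < and & strictly need escaping; the quote entities matter mainly inside attribute values delimited by the same quote character.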
Numeric Character References
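Any Unicode character that XML permits can be included using numeric references (a few code points, such as U+0000, are not allowed even in this form):
- Decimal: &#169; produces © (copyright symbol)
- Hexadecimal: &#xA9; produces © (same character)
- Emoji: &#x1F600; produces a grinning face emoji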
Numeric references are useful for characters not available on the keyboard or not supported by the document's encoding.
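The following sketch, using only Python's standard library, shows numeric references in both directions: a parser decoding them, and a string being encoded to ASCII with numeric references standing in for anything outside that encoding. The sample text is illustrative.

```python
import xml.etree.ElementTree as ET

# Decimal and hexadecimal references resolve to the same characters.
root = ET.fromstring("<t>&#169; &#xA9; &#x1F600;</t>")
decoded = root.text
print(decoded)

# Going the other way: the "xmlcharrefreplace" error handler writes any
# character the target encoding cannot represent as a decimal reference.
ascii_bytes = "© 2024 café".encode("ascii", "xmlcharrefreplace")
print(ascii_bytes)
```

This is the usual way to keep a document pure ASCII while still carrying arbitrary Unicode text.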
Custom Entities (DTD)
You can define custom entities in a Document Type Definition:
<!DOCTYPE document [
<!ENTITY company "Acme Corporation">
<!ENTITY copyright "© 2024 Acme Corporation. All rights reserved.">
]>
<document>
<header>&company; - Confidential</header>
<footer>&copyright;</footer>
</document>
Custom entities act like text macros — each &company; reference is expanded to the full text during parsing.
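To see the expansion happen, the sketch below parses a document like the one above with Python's `xml.etree.ElementTree`, which (via expat) expands entities declared in the internal DTD subset. The copyright sign is written as &#169; here purely to keep the source ASCII; other parsers may need explicit configuration to behave the same way.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE document [
  <!ENTITY company "Acme Corporation">
  <!ENTITY copyright "&#169; 2024 Acme Corporation. All rights reserved.">
]>
<document>
  <header>&company; - Confidential</header>
  <footer>&copyright;</footer>
</document>"""

root = ET.fromstring(DOC)

# By the time the tree is built, the references are gone and only the
# replacement text remains.
header = root.findtext("header")
footer = root.findtext("footer")

print(header)
print(footer)
```

Because expansion happens inside the parser, application code normally never sees the `&company;` reference itself.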
External Entities
External entities reference content from external files or URLs:
<!ENTITY chapter1 SYSTEM "chapter1.xml">
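Security warning: External entities can be exploited in XXE (XML External Entity) attacks, for example to read local files or to trigger requests to internal services. Many modern parsers disable external entity resolution by default, but not all do, so check your parser's settings and turn it off explicitly when handling untrusted input.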
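One way to make the setting explicit, sketched here with Python's standard `xml.sax` module, is to switch off the external general and parameter entity features before parsing. The `TextCollector` class, the `parse_without_external_entities` helper, and the temporary "secret" file are hypothetical names used only for this demonstration; other languages and parsers expose different switches.

```python
import io
import pathlib
import tempfile
import xml.sax
from xml.sax.handler import (
    ContentHandler,
    feature_external_ges,
    feature_external_pes,
)


class TextCollector(ContentHandler):
    """Accumulates all character data reported by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_without_external_entities(xml_bytes):
    parser = xml.sax.make_parser()
    # Explicitly refuse external general and parameter entities.
    parser.setFeature(feature_external_ges, False)
    parser.setFeature(feature_external_pes, False)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)


# Stand-in for a sensitive local file an attacker might try to read.
with tempfile.TemporaryDirectory() as tmp:
    secret = pathlib.Path(tmp) / "secret.txt"
    secret.write_text("TOP-SECRET")
    doc = (
        f'<!DOCTYPE d [<!ENTITY xxe SYSTEM "{secret.as_uri()}">]>'
        "<d>before-&xxe;-after</d>"
    ).encode("utf-8")
    extracted = parse_without_external_entities(doc)

print(repr(extracted))
```

With the features disabled, the file contents should not appear in the extracted text. For untrusted input, rejecting any document that carries a DOCTYPE at all is a stricter alternative.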
Parameter Entities
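Parameter entities are declared with % and can be used only within DTDs, where they allow DTD fragments to be reused. A reference placed inside another declaration, as in the ATTLIST below, is permitted only in the external DTD subset, not in the internal subset: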
<!ENTITY % common-attrs "id ID #IMPLIED class CDATA #IMPLIED">
<!ELEMENT item (#PCDATA)>
<!ATTLIST item %common-attrs;>
Formatting Considerations
When formatting XML, entity references should be preserved as-is. The formatter should not expand entities (replacing &lt; with < would break the document) or introduce new entities where literal characters are valid.
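The sketch below illustrates why this matters for a formatter built naively on parse-and-reserialize. It round-trips a small, made-up document through Python's `xml.etree.ElementTree`: the predefined entity survives because the serializer re-escapes it, but the custom entity and the numeric reference are replaced by their expanded text and the DOCTYPE is dropped.

```python
import xml.etree.ElementTree as ET

SOURCE = (
    '<!DOCTYPE d [<!ENTITY company "Acme">]>'
    "<d>&company; &lt; &#169;</d>"
)

# Parse, then serialize straight back to a string.
round_tripped = ET.tostring(ET.fromstring(SOURCE), encoding="unicode")

print(round_tripped)

# The custom reference no longer exists in the output.
print("&company;" in round_tripped)
```

A formatter that needs to preserve references therefore has to work at the token level, or use a parser that can be told to report entity references instead of expanding them.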
Use Case
Understanding XML entities is crucial for developers handling XML data that contains special characters (internationalized content, mathematical formulas, legal symbols), building XML documents programmatically, securing XML parsers against XXE attacks, and working with DTD-validated XML in publishing and document management systems.