Handle Mixed Content XML in JSON Schema
Learn how elements containing both text content and child elements or attributes are represented using the configurable text content key (#text or _text).
Detailed Explanation
Mixed Content: Text + Elements + Attributes
In XML, an element can contain text interleaved with child elements — this is called mixed content. The converter handles this by creating a dedicated property for the text content alongside the child element and attribute properties.
Example XML
```xml
<paragraph class="intro" id="p1">
This is <bold>important</bold> text with <link href="/docs">a link</link>.
</paragraph>
```
Generated Schema with #text key
```json
{
  "type": "object",
  "properties": {
    "paragraph": {
      "type": "object",
      "properties": {
        "@class": { "type": "string" },
        "@id": { "type": "string" },
        "#text": { "type": "string" },
        "bold": { "type": "string" },
        "link": {
          "type": "object",
          "properties": {
            "@href": { "type": "string" },
            "#text": { "type": "string" }
          }
        }
      }
    }
  }
}
```
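To make the mapping concrete, here is a minimal Python sketch of how such a schema could be inferred from the example XML. It is not the converter's actual implementation: the function names `infer_schema` and `schema_for_document` are hypothetical, every value is typed as a string, and repeated sibling elements (which would normally need an array) are ignored.

```python
import json
import xml.etree.ElementTree as ET


def infer_schema(elem, text_key="#text"):
    """Return a JSON Schema fragment for one XML element (illustrative only)."""
    children = list(elem)
    # ElementTree stores mixed-content text in two places: elem.text (before
    # the first child) and child.tail (after each child's closing tag).
    pieces = [elem.text or ""] + [child.tail or "" for child in children]
    has_text = any(piece.strip() for piece in pieces)

    # No attributes and no children: map the element straight to a scalar.
    if not elem.attrib and not children:
        return {"type": "string"}

    props = {}
    for name in elem.attrib:
        props["@" + name] = {"type": "string"}
    if has_text:
        props[text_key] = {"type": "string"}
    for child in children:
        # Assumption: each child tag occurs once; a repeated tag would
        # simply overwrite the earlier entry in this sketch.
        props[child.tag] = infer_schema(child, text_key)
    return {"type": "object", "properties": props}


def schema_for_document(xml_source, text_key="#text"):
    """Wrap the root element's schema in a top-level object."""
    root = ET.fromstring(xml_source)
    return {"type": "object", "properties": {root.tag: infer_schema(root, text_key)}}


xml_source = """<paragraph class="intro" id="p1">
This is <bold>important</bold> text with <link href="/docs">a link</link>.
</paragraph>"""

schema = schema_for_document(xml_source)
print(json.dumps(schema, indent=2))
```

Under these assumptions the printed schema has the same structure as the generated schema shown above: `bold` stays a plain string, while `paragraph` and `link` become objects with a `#text` property.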
The Text Content Key
When an element has both text and child elements (or text and attributes), the text content is placed under a special property key. You can choose between:
- #text: the more common convention, used by xmltodict, fast-xml-parser, and many other XML-to-JSON libraries
- _text: an alternative for systems that do not support # in property names
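The effect of the key choice is easiest to see on converted instance data. The sketch below uses a hypothetical helper, `element_to_data`, with a `text_key` parameter standing in for the converter's setting. It assumes that interleaved text fragments are joined and whitespace-normalized, which the real converter may handle differently.

```python
import xml.etree.ElementTree as ET


def element_to_data(elem, text_key="#text"):
    """Convert an element to plain Python data (hypothetical helper)."""
    children = list(elem)
    pieces = [elem.text or ""] + [child.tail or "" for child in children]
    # Assumption: text fragments are joined and runs of whitespace collapsed.
    text = " ".join(" ".join(pieces).split())

    # Text-only element: return the scalar value directly.
    if not elem.attrib and not children:
        return text

    data = {"@" + name: value for name, value in elem.attrib.items()}
    if text:
        data[text_key] = text
    for child in children:
        data[child.tag] = element_to_data(child, text_key)
    return data


link = ET.fromstring('<link href="/docs">a link</link>')

with_hash = element_to_data(link)                           # default key
with_underscore = element_to_data(link, text_key="_text")   # alternative key

print(with_hash)        # {'@href': '/docs', '#text': 'a link'}
print(with_underscore)  # {'@href': '/docs', '_text': 'a link'}
```

Only the property name changes between the two results; the structure and the values are identical, so switching keys is safe as long as the schema and the data use the same one.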
When Is #text Used?
The #text property appears only when an element's text sits alongside child elements or attributes. If an element contains only text (no children, no attributes), it maps directly to a scalar type and needs no separate text property.
| Element Has | Schema Representation |
|---|---|
| Text only | Scalar type (string, number, etc.) |
| Text + attributes | Object with #text + attribute properties |
| Text + children | Object with #text + child properties |
| Children only | Object with child properties |
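The four rows of the table can be read as one small decision rule. The following Python sketch expresses it with a hypothetical function, `representation`, that reports which row applies to a parsed element. It assumes whitespace-only text does not count as text content.

```python
import xml.etree.ElementTree as ET


def representation(elem):
    """Name the schema representation for an element (illustrative only)."""
    children = list(elem)
    pieces = [elem.text or ""] + [child.tail or "" for child in children]
    has_text = any(piece.strip() for piece in pieces)  # ignore pure whitespace
    has_attrs = bool(elem.attrib)
    has_children = bool(children)

    # Row 1: nothing but text, so no wrapper object is needed.
    if not has_attrs and not has_children:
        return "scalar"

    # Remaining rows: an object whose properties depend on what is present.
    parts = []
    if has_text:
        parts.append("#text")
    if has_attrs:
        parts.append("attributes")
    if has_children:
        parts.append("children")
    return "object with " + " + ".join(parts)


samples = [
    "<title>Hello</title>",
    '<price currency="USD">9.99</price>',
    "<p>Hi <b>there</b></p>",
    "<list><item>a</item></list>",
]
results = [representation(ET.fromstring(xml)) for xml in samples]
for xml, result in zip(samples, results):
    print(f"{xml} -> {result}")
```

The four samples correspond to the four table rows in order.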
Use Case
This applies when processing markup-heavy XML such as XHTML content, DocBook documents, or any other XML format where text is mixed with inline elements. Understanding how mixed content is handled is essential for modeling these formats accurately in JSON Schema.