Detect Arrays from Repeated XML Elements
Learn how repeated sibling elements with the same tag name are automatically detected and converted to JSON Schema array types with merged item schemas.
Detailed Explanation
Repeated Elements Become Arrays
One of the key challenges in XML-to-JSON conversion is detecting arrays. Unlike JSON, XML does not have an explicit array type — arrays are implied by multiple sibling elements sharing the same tag name.
Example XML
<classroom>
<student>
<name>Alice</name>
<grade>92</grade>
</student>
<student>
<name>Bob</name>
<grade>85</grade>
</student>
<student>
<name>Charlie</name>
<grade>78</grade>
</student>
</classroom>
Generated Schema
{
"type": "object",
"properties": {
"classroom": {
"type": "object",
"properties": {
"student": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": { "type": "string" },
"grade": { "type": "integer" }
}
}
}
}
}
}
}
How Array Detection Works
The converter counts how many times each tag name appears as a direct child of a parent element. When a tag appears more than once, it is treated as an array. The schemas from all occurrences are merged to produce a single, unified item schema.
Schema Merging
When array items have different structures across occurrences, the converter merges them:
<logs>
<entry level="info"><message>Started</message></entry>
<entry level="error"><message>Failed</message><code>500</code></entry>
</logs>
The merged item schema includes both message (required, appears in all) and code (optional, appears in some), with required detection reflecting which properties appear universally.
Array Threshold
The array detection threshold controls the minimum ratio of occurrences that triggers array inference. The default threshold means any tag that appears more than once is treated as an array. This is the correct behavior for most XML documents.
Use Case
When working with XML data that contains collections — such as list responses from APIs, log files, CSV-like data in XML format, or any document with repeating record elements. Understanding array detection is crucial for generating accurate schemas.