Infer JSON Schema from an RSS Feed
Generate a JSON Schema from RSS 2.0 feed XML with channel metadata, repeating item elements, and mixed attribute and element content.
Detailed Explanation
RSS Feed Schema Inference
RSS (Really Simple Syndication) is one of the most common XML formats on the web. Converting RSS to JSON Schema is useful for building feed readers, aggregators, and content pipelines.
Example RSS 2.0 Feed
<rss version="2.0">
  <channel>
    <title>Tech Blog</title>
    <link>https://example.com</link>
    <description>Latest tech articles</description>
    <language>en-us</language>
    <lastBuildDate>Mon, 01 Jan 2024 00:00:00 GMT</lastBuildDate>
    <item>
      <title>Getting Started with TypeScript</title>
      <link>https://example.com/typescript</link>
      <description>A beginner guide to TypeScript</description>
      <pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>
      <guid>https://example.com/typescript</guid>
      <category>Programming</category>
    </item>
    <item>
      <title>Understanding JSON Schema</title>
      <link>https://example.com/json-schema</link>
      <description>Deep dive into JSON Schema</description>
      <pubDate>Tue, 02 Jan 2024 00:00:00 GMT</pubDate>
      <guid>https://example.com/json-schema</guid>
      <category>Data</category>
    </item>
  </channel>
</rss>
Key Schema Features
- Root attribute: The version attribute on <rss> becomes @version
- Channel object: Contains metadata fields and the items array
- Items array: Multiple <item> elements are detected as an array
- Merged item schema: All item fields are combined into a single item type
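- All strings: RSS dates and URLs are inferred as plain strings. JSON Schema has no separate date or URL types; such constraints are expressed with the format keyword, which inference does not add.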
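The features listed above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function names (element_to_value, infer_schema, merge_schemas), the "#text" key for text next to attributes, and the choice to describe the contents of <rss> directly are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

FEED = """<rss version="2.0">
  <channel>
    <title>Tech Blog</title>
    <link>https://example.com</link>
    <item>
      <title>Getting Started with TypeScript</title>
      <category>Programming</category>
    </item>
    <item>
      <title>Understanding JSON Schema</title>
      <guid>https://example.com/json-schema</guid>
    </item>
  </channel>
</rss>"""


def element_to_value(elem):
    """Convert an element to a string (plain leaf) or a dict (anything else)."""
    children = list(elem)
    if not children and not elem.attrib:
        return (elem.text or "").strip()
    # Attributes get an "@" prefix, e.g. version -> @version.
    obj = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in children:
        value = element_to_value(child)
        if child.tag in obj:
            # Repeated tag: promote the existing value to a list.
            if not isinstance(obj[child.tag], list):
                obj[child.tag] = [obj[child.tag]]
            obj[child.tag].append(value)
        else:
            obj[child.tag] = value
    text = (elem.text or "").strip()
    if text and not children:
        obj["#text"] = text  # assumed key for text alongside attributes
    return obj


def merge_schemas(a, b):
    """Union the properties of two object schemas; otherwise keep the first."""
    if not a:
        return b
    if a.get("type") == "object" and b.get("type") == "object":
        props = dict(a["properties"])
        for key, sub in b["properties"].items():
            props[key] = merge_schemas(props.get(key, {}), sub)
        return {"type": "object", "properties": props}
    # Simplification: on conflicting types the first schema wins.
    return a


def infer_schema(value):
    """Infer a schema from a converted value; every leaf is a string."""
    if isinstance(value, dict):
        return {"type": "object",
                "properties": {k: infer_schema(v) for k, v in value.items()}}
    if isinstance(value, list):
        merged = {}
        for entry in value:
            merged = merge_schemas(merged, infer_schema(entry))
        return {"type": "array", "items": merged}
    return {"type": "string"}


root = ET.fromstring(FEED)
schema = infer_schema(element_to_value(root))
```

Here the two items carry different fields (category on one, guid on the other), and the merged item schema lists title, category, and guid together. A fuller tool might emit anyOf for conflicting types instead of keeping the first one seen.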
Channel vs Item Structure
The schema separates channel-level metadata (title, link, description, language) from item-level properties. The distinction matters for feed processing applications because channel fields occur once per feed, while item fields repeat for every entry.
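As a rough illustration of how a consumer might use that separation, the sketch below splits a converted channel object into its metadata and its list of items. The function name split_channel and the "item" key are assumptions for the example. It also normalizes the case where a feed with a single <item> was converted to one object instead of an array, which can happen when array detection relies on seeing repeated elements.

```python
def split_channel(channel):
    """Return (metadata, items) from a converted channel object."""
    items = channel.get("item", [])
    if isinstance(items, dict):
        # A feed with one <item> may have been converted to a single object.
        items = [items]
    metadata = {key: value for key, value in channel.items() if key != "item"}
    return metadata, items


channel = {
    "title": "Tech Blog",
    "link": "https://example.com",
    "language": "en-us",
    "item": {
        "title": "Getting Started with TypeScript",
        "link": "https://example.com/typescript",
    },
}

metadata, items = split_channel(channel)
```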
Extending the Schema
The generated schema provides a solid foundation. For production use, you might want to:
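- Add format: "uri" to link and guid properties (guid only when it is a permalink URL)
- Add format: "date-time" to pubDate and lastBuildDate, after converting the RFC 822 dates that RSS uses to RFC 3339, which is what date-time expects
- Add enum values for known categories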
- Mark certain fields as required (RSS 2.0 requires title, link, and description on the channel; an item needs at least a title or a description)
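These extensions could be applied programmatically along the following lines. This is a sketch under stated assumptions: extend_item_schema and rfc822_to_rfc3339 are hypothetical helper names, and the item schema is assumed to be a plain object schema with a properties map. Because RSS dates follow RFC 822 while date-time expects RFC 3339, the date values themselves need converting before the stricter schema can pass. Since an RSS 2.0 item must carry at least a title or a description, the sketch uses anyOf rather than a flat required list.

```python
import copy
from email.utils import parsedate_to_datetime


def rfc822_to_rfc3339(value):
    """Convert an RSS date such as 'Mon, 01 Jan 2024 00:00:00 GMT'."""
    return parsedate_to_datetime(value).isoformat()


def extend_item_schema(item_schema, known_categories=None):
    """Return a copy of an inferred item schema with stricter constraints."""
    extended = copy.deepcopy(item_schema)  # leave the inferred schema untouched
    props = extended.get("properties", {})
    for name in ("link", "guid"):
        if name in props:
            props[name]["format"] = "uri"
    if "pubDate" in props:
        props["pubDate"]["format"] = "date-time"
    if known_categories and "category" in props:
        props["category"]["enum"] = sorted(known_categories)
    # RSS 2.0: an item needs at least one of title or description.
    extended["anyOf"] = [{"required": ["title"]}, {"required": ["description"]}]
    return extended


inferred = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "link": {"type": "string"},
        "pubDate": {"type": "string"},
        "category": {"type": "string"},
    },
}

extended = extend_item_schema(inferred, known_categories={"Programming", "Data"})
converted = rfc822_to_rfc3339("Mon, 01 Jan 2024 00:00:00 GMT")
```

Many validators treat format as an annotation unless format checking is explicitly enabled, so it is worth checking how the chosen validator handles it.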
Atom Feeds
The same approach works for Atom feeds, though the structure differs. Atom uses attributes more heavily (e.g., <link rel="alternate" href="..."/>) and has different element names.
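To make the difference concrete, the sketch below converts a single Atom entry using the same "@" attribute convention. The helper names local_name and entry_to_dict are assumptions for the example, and it handles only flat entries. Note that Atom elements live in the http://www.w3.org/2005/Atom namespace, which ElementTree exposes as a prefix on each tag.

```python
import xml.etree.ElementTree as ET

ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Understanding JSON Schema</title>
  <link rel="alternate" href="https://example.com/json-schema"/>
  <updated>2024-01-02T00:00:00Z</updated>
</entry>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def entry_to_dict(entry):
    """Convert a flat Atom entry; elements with attributes become objects."""
    result = {}
    for child in entry:
        if child.attrib:
            value = {f"@{name}": val for name, val in child.attrib.items()}
        else:
            value = (child.text or "").strip()
        result[local_name(child.tag)] = value
    return result


entry = entry_to_dict(ET.fromstring(ENTRY))
```

In this sketch link comes out as an object with @rel and @href, so an inferred schema would type it as an object, not as a string as in RSS. Atom timestamps such as updated already use RFC 3339, so they fit format: "date-time" without conversion.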
Use Case
When building RSS feed readers, aggregators, or content syndication systems that need to validate feed data after converting from XML to JSON. The schema serves as documentation and validation for the expected feed structure.