Convert WordPress HTML Content to Clean Markdown

Convert WordPress-generated HTML (with wp-block classes, Gutenberg blocks, shortcodes, and embedded media) to clean Markdown. A practical guide for WordPress migration.

Detailed Explanation

WordPress HTML to Markdown

WordPress generates HTML with recognizable patterns: Gutenberg block comment delimiters, wp-block-* classes, inline styles, shortcodes, and embedded media. Producing clean Markdown depends on recognizing each pattern and deciding whether to convert it, strip it, or preserve it.

Gutenberg Block Wrappers

<!-- wp:paragraph -->
<p>This is a paragraph block.</p>
<!-- /wp:paragraph -->

<!-- wp:heading {"level":2} -->
<h2 class="wp-block-heading">Getting Started</h2>
<!-- /wp:heading -->

<!-- wp:list -->
<ul class="wp-block-list">
  <li>First item</li>
  <li>Second item</li>
</ul>
<!-- /wp:list -->

Converts to:

This is a paragraph block.

## Getting Started

- First item
- Second item

The Gutenberg block comments (<!-- wp:... -->) and wp-block-* classes are stripped during conversion.
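The stripping step can be sketched in a few lines of Python. This is a minimal illustration rather than a complete converter: the function name strip_gutenberg_markup is hypothetical, the regular expressions assume double-quoted class attributes, and a real HTML parser would be more robust on messy input.

```python
import re

# Matches block delimiters such as <!-- wp:heading {"level":2} -->,
# <!-- /wp:heading -->, and self-closing <!-- wp:more /-->.
BLOCK_COMMENT = re.compile(r"<!--\s*/?wp:.*?-->", re.DOTALL)

# Matches a double-quoted class attribute so its tokens can be filtered.
CLASS_ATTR = re.compile(r'\sclass="([^"]*)"')


def strip_gutenberg_markup(html: str) -> str:
    """Remove block comments and wp-block-* classes, leaving plain HTML."""
    html = BLOCK_COMMENT.sub("", html)

    def filter_classes(match: re.Match) -> str:
        kept = [c for c in match.group(1).split() if not c.startswith("wp-block-")]
        # Drop the attribute entirely when no classes remain.
        return ' class="' + " ".join(kept) + '"' if kept else ""

    html = CLASS_ATTR.sub(filter_classes, html)
    # Collapse the blank lines left behind by removed comments.
    return re.sub(r"\n{3,}", "\n\n", html).strip()
```

Applied to the heading block above, this returns <h2>Getting Started</h2>, which a standard HTML-to-Markdown converter can then turn into ## Getting Started.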

WordPress Image Blocks

<!-- wp:image {"id":123,"sizeSlug":"large"} -->
<figure class="wp-block-image size-large">
  <img src="https://example.com/photo.jpg" alt="A scenic view" class="wp-image-123">
  <figcaption class="wp-element-caption">Photo by John Doe</figcaption>
</figure>
<!-- /wp:image -->

Converts to:

![A scenic view](https://example.com/photo.jpg)

Photo by John Doe

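The <figure> wrapper is dropped, the <img> becomes a Markdown image, and the <figcaption> text is emitted as a plain paragraph below it, because Markdown has no native caption syntax.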
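One way to perform this extraction is with Python's standard html.parser module. The sketch below assumes a single image per figure and a text-only caption; the names FigureParser and figure_to_markdown are illustrative, not part of any library.

```python
from html.parser import HTMLParser


class FigureParser(HTMLParser):
    """Collect the first <img> and the <figcaption> text inside a figure."""

    def __init__(self) -> None:
        super().__init__()
        self.src = ""
        self.alt = ""
        self.caption_parts = []
        self._in_caption = False

    def handle_starttag(self, tag, attrs):
        if tag == "img" and not self.src:
            attr = dict(attrs)
            self.src = attr.get("src") or ""
            self.alt = attr.get("alt") or ""
        elif tag == "figcaption":
            self._in_caption = True

    def handle_endtag(self, tag):
        if tag == "figcaption":
            self._in_caption = False

    def handle_data(self, data):
        # Only text inside <figcaption> is kept; other text is ignored.
        if self._in_caption:
            self.caption_parts.append(data)


def figure_to_markdown(html: str) -> str:
    """Return a Markdown image, followed by the caption as a paragraph."""
    parser = FigureParser()
    parser.feed(html)
    parser.close()
    image = f"![{parser.alt}]({parser.src})"
    caption = "".join(parser.caption_parts).strip()
    return f"{image}\n\n{caption}" if caption else image
```

Inline markup inside the caption (links, emphasis) is flattened to text here; a fuller converter would run the caption through the same HTML-to-Markdown rules as body content.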

Shortcodes

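WordPress shortcodes like [gallery], [embed], and [contact-form] have no Markdown equivalent. They appear in raw post content (for example, a database or XML export); on rendered pages WordPress has already expanded them to HTML. In raw content they are typically: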

  • Preserved as-is if the target platform supports them
  • Stripped with their content removed
  • Replaced with a placeholder comment
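The three strategies above can be sketched as a single function. This is a simplified illustration: handle_shortcodes and its mode names are hypothetical, the caller must supply the shortcode names to look for (so that ordinary bracketed text is left alone), and nested shortcodes are not handled.

```python
import re


def build_shortcode_pattern(names):
    """Build a regex for the given shortcode names (assumed non-nested)."""
    # Longest names first so "gallery-grid" is tried before "gallery".
    alternatives = "|".join(
        re.escape(n) for n in sorted(names, key=len, reverse=True)
    )
    return re.compile(
        r"\[(?P<name>" + alternatives + r")(?![\w-])(?P<attrs>[^\]]*)\]"
        # Optional enclosed content followed by the matching closing tag.
        r"(?:(?P<body>.*?)\[/(?P=name)\])?",
        re.DOTALL,
    )


def handle_shortcodes(text: str, names, mode: str = "placeholder") -> str:
    """Apply one strategy: 'preserve', 'strip', or 'placeholder'."""
    if mode == "preserve":
        return text
    pattern = build_shortcode_pattern(names)
    if mode == "strip":
        return pattern.sub("", text)
    if mode == "placeholder":
        return pattern.sub(
            lambda m: f"<!-- shortcode: {m.group('name')} -->", text
        )
    raise ValueError(f"unknown mode: {mode}")
```

The placeholder mode is often the safest default for a migration, since it leaves a searchable marker where each shortcode used to be.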

WordPress Embeds

<!-- wp:embed {"url":"https://youtube.com/watch?v=abc","type":"video"} -->
<figure class="wp-block-embed is-type-video">
  <div class="wp-block-embed__wrapper">
    https://youtube.com/watch?v=abc
  </div>
</figure>
<!-- /wp:embed -->

Converts to a plain URL or a Markdown link:

[https://youtube.com/watch?v=abc](https://youtube.com/watch?v=abc)
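Because the block comment carries the embed URL as JSON, the link can be built from the comment itself rather than from the wrapper markup. The following sketch assumes the block has a JSON attribute object with a "url" key, as in the example above; embeds_to_links is a hypothetical name.

```python
import json
import re

# Captures the JSON attributes of an embed block and spans to its closing comment.
EMBED_BLOCK = re.compile(
    r"<!--\s*wp:embed\s+(?P<attrs>\{.*?\})\s*-->.*?<!--\s*/wp:embed\s*-->",
    re.DOTALL,
)


def embeds_to_links(html: str) -> str:
    """Replace each embed block with a Markdown link to its URL."""

    def replace(match: re.Match) -> str:
        try:
            url = json.loads(match.group("attrs")).get("url", "")
        except json.JSONDecodeError:
            # Leave the block untouched if its attributes cannot be parsed.
            return match.group(0)
        return f"[{url}]({url})" if url else match.group(0)

    return EMBED_BLOCK.sub(replace, html)
```

Blocks whose attributes are missing or malformed are left in place so they can be reviewed by hand instead of being silently dropped.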

Classic Editor Content

Older WordPress content from the Classic Editor uses simpler HTML (paragraphs, headings, images) without Gutenberg wrappers. This is more straightforward to convert using standard HTML-to-Markdown rules.
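A migration script usually needs to tell the two kinds of content apart before choosing a conversion path. A simple substring test for the block comment prefix is one way to do that; the function name is_block_content is an assumption made for this sketch.

```python
def is_block_content(post_content: str) -> bool:
    """Return True if the content contains Gutenberg block delimiters."""
    # Every serialized block starts with a comment of the form <!-- wp:name ... -->.
    return "<!-- wp:" in post_content
```

Content that fails this check can go straight to a standard HTML-to-Markdown converter, while block content first passes through the Gutenberg-specific cleanup steps.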

Use Case

WordPress powers over 40% of all websites, which makes WordPress-to-Markdown conversion one of the most common migration scenarios. Teams move from WordPress to static site generators (Hugo, Jekyll, Astro) for performance and developer experience, and those migrations depend on reliable HTML-to-Markdown conversion.
