Regex to Extract Markdown Links — [text](url) Pattern
Regex to extract Markdown inline links [text](url) and reference links. Captures link text, URL, and optional title separately for analysis or link auditing.
Detailed Explanation
Extracting Markdown Links
Markdown’s inline link syntax is [text](url "optional title"). Reference links use [text][ref] with a definition elsewhere. Both can be extracted with regex when content is well-formed.
Inline Links
\[(?<text>[^\]]+)\]\((?<url>[^\s\)]+)(?:\s+"(?<title>[^"]*)")?\)
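Captures three named groups: text, url, and the optional title. The url group stops at the first whitespace or closing parenthesis, so a URL containing a space does not match, and a URL containing ")" is captured only up to that character.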
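As a rough illustration, the inline pattern can be run in JavaScript as sketched below, assuming an engine with named capture groups (ES2018 or later). The helper name extractInlineLinks and the sample string are hypothetical and not part of the original pattern.

```javascript
// Inline-link pattern from above, with the g flag so matchAll can iterate.
const INLINE_LINK =
  /\[(?<text>[^\]]+)\]\((?<url>[^\s\)]+)(?:\s+"(?<title>[^"]*)")?\)/g;

// Hypothetical helper: returns one { text, url, title } object per match.
function extractInlineLinks(md) {
  return [...md.matchAll(INLINE_LINK)].map((m) => ({
    text: m.groups.text,
    url: m.groups.url,
    // The title group is undefined when no title was present.
    title: m.groups.title ?? null,
  }));
}

// Illustrative input only.
const sample =
  'See [Docs](https://x.io "Documentation") and [link](/relative/path).';
console.log(extractInlineLinks(sample));
```

Returning plain objects rather than raw match arrays keeps later steps, such as filtering by URL, independent of the group order in the pattern.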
Tested Examples
| Input | text | url | title |
|---|---|---|---|
| [Google](https://google.com) | Google | https://google.com | (none) |
| [Docs](https://x.io "Documentation") | Docs | https://x.io | Documentation |
| [link](/relative/path) | link | /relative/path | (none) |
| [empty]() | no match: the url group requires at least one character | (none) | (none) |
Reference Links
\[(?<text>[^\]]+)\]\[(?<ref>[^\]]+)\]
Matches [click][ref-id], capturing click as text and ref-id as ref.
Reference Definitions
^\s*\[(?<id>[^\]]+)\]:\s*(?<url>\S+)(?:\s+"(?<title>[^"]*)")?\s*$
(Use the m flag so that ^ and $ match at the start and end of each line in a multi-line document.)
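The two reference patterns are usually combined: collect the definitions first, then look up each reference link. The sketch below shows one way to do that. The helper name resolveReferenceLinks and the sample document are hypothetical, and lower-casing the label is a simplification of Markdown's case-insensitive label matching.

```javascript
const REF_LINK = /\[(?<text>[^\]]+)\]\[(?<ref>[^\]]+)\]/g;
const REF_DEF =
  /^\s*\[(?<id>[^\]]+)\]:\s*(?<url>\S+)(?:\s+"(?<title>[^"]*)")?\s*$/gm;

// Hypothetical helper: pairs each [text][ref] with its definition, if any.
function resolveReferenceLinks(md) {
  // Pass 1: index definitions by lower-cased label (simplified matching).
  const defs = new Map();
  for (const m of md.matchAll(REF_DEF)) {
    defs.set(m.groups.id.toLowerCase(), {
      url: m.groups.url,
      title: m.groups.title ?? null,
    });
  }
  // Pass 2: resolve each reference link; url stays null when undefined.
  return [...md.matchAll(REF_LINK)].map((m) => {
    const def = defs.get(m.groups.ref.toLowerCase());
    return {
      text: m.groups.text,
      ref: m.groups.ref,
      url: def ? def.url : null,
      title: def ? def.title : null,
    };
  });
}

// Illustrative input only.
const refDoc = [
  "Read the [guide][docs] or [click][ref-id] or [lost][missing].",
  "",
  '[docs]: https://x.io "Documentation"',
  "[ref-id]: /relative/path",
].join("\n");
console.log(resolveReferenceLinks(refDoc));
```

Entries whose url is null point at a label with no definition, which is often exactly what a link audit is looking for.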
Image Syntax
Markdown images are like links with a leading !:
!\[(?<alt>[^\]]*)\]\((?<url>[^\s\)]+)(?:\s+"(?<title>[^"]*)")?\)
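Because an image is a link with a leading !, the inline link pattern also matches the [alt](url) part of every image. One possible way to keep the two apart is a negative lookbehind, sketched below under the assumption that the engine supports lookbehind assertions. The constant names and the sample string are illustrative.

```javascript
// Image pattern from above, with the g flag.
const IMAGE =
  /!\[(?<alt>[^\]]*)\]\((?<url>[^\s\)]+)(?:\s+"(?<title>[^"]*)")?\)/g;

// Inline-link pattern with (?<!!) added so a "[" preceded by "!" is skipped.
const LINK_NOT_IMAGE =
  /(?<!!)\[(?<text>[^\]]+)\]\((?<url>[^\s\)]+)(?:\s+"(?<title>[^"]*)")?\)/g;

// Illustrative input only.
const page =
  'A ![logo](/img/logo.png "Site logo") next to [Home](/index.md) and ![](/img/spacer.gif).';

const imageUrls = [...page.matchAll(IMAGE)].map((m) => m.groups.url);
const linkUrls = [...page.matchAll(LINK_NOT_IMAGE)].map((m) => m.groups.url);

console.log(imageUrls);
console.log(linkUrls);
```

This sketch does not handle an image wrapped in a link, such as [![alt](img)](url); nested brackets are a case where a parser is the safer choice.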
Avoid Matching Code Blocks
To skip Markdown links inside fenced code blocks, pre-process the document:
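// Remove fenced code blocks first so links inside them are ignored.
const stripped = md.replace(/```[\s\S]*?```/g, "");
// Each match holds the link text in [1] and everything inside the parentheses in [2].
const links = [...stripped.matchAll(/\[([^\]]+)\]\(([^\)]+)\)/g)];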
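The same pre-processing idea can be extended to tilde fences and single-backtick inline code spans, which the snippet above leaves in place. The following is a sketch only: the helper name stripCode and the sample document are hypothetical, and indented code blocks and multi-backtick spans are deliberately not handled.

```javascript
// Hypothetical helper: remove code regions before extracting links.
function stripCode(md) {
  return md
    .replace(/```[\s\S]*?```/g, "") // fenced blocks using backticks
    .replace(/~~~[\s\S]*?~~~/g, "") // fenced blocks using tildes
    .replace(/`[^`\n]*`/g, ""); // single-backtick inline code spans
}

// Build the fence marker indirectly so this example stays easy to embed.
const fence = "`".repeat(3);

// Illustrative input only.
const mdDoc = [
  "Real link: [Docs](https://x.io).",
  fence,
  "[fake](https://in-code-block.example)",
  fence,
  "Inline `[also fake](https://inline.example)` code.",
].join("\n");

const foundUrls = [
  ...stripCode(mdDoc).matchAll(/\[([^\]]+)\]\(([^\)]+)\)/g),
].map((m) => m[2]);

console.log(foundUrls);
```

Stripping changes character offsets, so if match positions need to map back to the original file, replacing code regions with same-length whitespace may be a better fit.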
Practical Note
For full CommonMark compliance, use a parser (marked, micromark, markdown-it). Regex works well for link audits, sitemap generation from Markdown docs, and bulk URL replacement.
Use Case
Auditing internal links in a Markdown documentation site, generating a list of outbound URLs from blog posts for SEO review, or rewriting relative paths during a docs migration.
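For the path-rewriting case, a replace callback can rebuild each link while leaving the text and title untouched. The sketch below prefixes root-relative URLs (those starting with a single /) with a new base path. The helper name rewriteRootRelativeLinks, the /v2 prefix, and the extra rest group that preserves the title are all assumptions made for illustration.

```javascript
// Inline-link pattern with the optional title kept verbatim in "rest".
const REWRITE_LINK =
  /\[(?<text>[^\]]+)\]\((?<url>[^\s\)]+)(?<rest>(?:\s+"[^"]*")?)\)/g;

// Hypothetical helper: prepend prefix to URLs that start with a single "/".
function rewriteRootRelativeLinks(md, prefix) {
  return md.replace(REWRITE_LINK, (...args) => {
    const match = args[0];
    // With named groups, the groups object is the last callback argument.
    const { text, url, rest } = args[args.length - 1];
    const isRootRelative = url.startsWith("/") && !url.startsWith("//");
    return isRootRelative ? `[${text}](${prefix}${url}${rest})` : match;
  });
}

// Illustrative input only.
const before =
  'See [link](/relative/path) and [Docs](https://x.io "Documentation").';
console.log(rewriteRootRelativeLinks(before, "/v2"));
```

Because the pattern also matches the bracketed part of an image, image paths are rewritten too, which is usually what a migration wants; absolute and protocol-relative URLs are returned unchanged.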