Regex for URL Matching — HTTP, HTTPS, and URI Patterns

Regex patterns for matching URLs including HTTP, HTTPS, and general URI formats. Covers protocol, domain, path, query parameters, and fragment matching.

Detailed Explanation

URL Matching with Regex

Matching URLs is a frequent requirement in text processing, link extraction, and input validation. Here are patterns from simple to comprehensive.

Simple HTTP/HTTPS Pattern

https?://[\w.-]+(?:/[\w./?%&=-]*)?

This covers basic URLs with optional paths and query strings:

  • https://example.com
  • http://sub.domain.com/path/to/page
  • https://api.example.com/v1/users?page=1&limit=10

Token Breakdown

Token                 Purpose
https?                "http" or "https"
://                   Protocol separator
[\w.-]+               Host name (letters, digits, underscores, dots, hyphens)
(?:/[\w./?%&=-]*)?    Optional path and query string
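As a minimal sketch of link extraction with the simple pattern, the snippet below scans a string for every match. The helper name extractUrls and the sample text are assumptions made for illustration; the pattern itself is the one shown above, passed through String.raw so it can be copied verbatim.

```javascript
// The simple HTTP/HTTPS pattern, unchanged; the "g" flag returns every match.
const SIMPLE_URL = new RegExp(String.raw`https?://[\w.-]+(?:/[\w./?%&=-]*)?`, "g");

// extractUrls is a hypothetical helper name used only in this sketch.
function extractUrls(text) {
  // String.prototype.match returns null when nothing matches, so fall back to [].
  return text.match(SIMPLE_URL) ?? [];
}

const sample =
  "Docs at https://example.com and API at https://api.example.com/v1/users?page=1&limit=10 today";

console.log(extractUrls(sample));
// → [ 'https://example.com', 'https://api.example.com/v1/users?page=1&limit=10' ]
```

Because the host class [\w.-] includes the dot, a URL that ends a sentence (as in "See https://example.com.") is captured together with the trailing period, so extracted links may need trimming.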

More Comprehensive Pattern

For URLs that may also contain a port, user credentials, or a fragment:

https?://(?:[\w-]+(?::[\w-]+)?@)?[\w.-]+(?::\d{1,5})?(?:/[\w./?%&=#-]*)?

This additionally matches:

  • Port numbers: https://localhost:3000/api
  • Credentials (userinfo): https://user:pass@example.com
  • Fragment identifiers: https://example.com/page#section
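For form validation, one possible approach is to anchor the comprehensive pattern with ^ and $ so the whole input has to match. The names FULL_URL_SOURCE, FULL_URL, and looksLikeUrl are assumptions made for this sketch, not part of any library.

```javascript
// The comprehensive pattern from above, copied verbatim.
const FULL_URL_SOURCE = String.raw`https?://(?:[\w-]+(?::[\w-]+)?@)?[\w.-]+(?::\d{1,5})?(?:/[\w./?%&=#-]*)?`;

// Anchoring makes the regex test the entire string instead of searching inside it.
const FULL_URL = new RegExp(`^(?:${FULL_URL_SOURCE})$`);

// looksLikeUrl is a hypothetical helper name used only in this sketch.
function looksLikeUrl(input) {
  return FULL_URL.test(input);
}

console.log(looksLikeUrl("https://localhost:3000/api"));       // true
console.log(looksLikeUrl("https://user:pass@example.com"));    // true
console.log(looksLikeUrl("https://example.com/page#section")); // true
console.log(looksLikeUrl("ftp://example.com"));                // false
console.log(looksLikeUrl("https://example.com:99999"));        // true
```

The last call shows a limit of the pattern: \d{1,5} accepts any number of up to five digits, including values above the maximum port 65535. The path class also omits characters such as ~, so a URL like https://example.com/~user is rejected.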

Extracting URL Components

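Named capture groups pull out the individual parts. The (?<name>...) syntax shown here works in JavaScript; Python's re module writes named groups as (?P<name>...) instead: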

(?<protocol>https?)://(?<domain>[\w.-]+)(?::(?<port>\d+))?(?<path>/[^?#]*)?(?:\?(?<query>[^#]*))?(?:#(?<fragment>.*))?
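The following sketch shows how the named groups might be read in JavaScript. The names URL_PARTS and parseUrlParts, and the sample URL, are assumptions for illustration; the pattern is the one above, copied verbatim.

```javascript
// The component-extraction pattern, unchanged.
const URL_PARTS = new RegExp(
  String.raw`(?<protocol>https?)://(?<domain>[\w.-]+)(?::(?<port>\d+))?(?<path>/[^?#]*)?(?:\?(?<query>[^#]*))?(?:#(?<fragment>.*))?`
);

// parseUrlParts is a hypothetical helper name used only in this sketch.
function parseUrlParts(input) {
  const match = URL_PARTS.exec(input);
  // match.groups holds one property per named group; groups that did not
  // participate in the match are undefined.
  return match ? { ...match.groups } : null;
}

const parts = parseUrlParts("https://example.com:8080/docs/intro?lang=en&page=2#install");
console.log(parts.protocol); // "https"
console.log(parts.domain);   // "example.com"
console.log(parts.port);     // "8080"
console.log(parts.path);     // "/docs/intro"
console.log(parts.query);    // "lang=en&page=2"
console.log(parts.fragment); // "install"
```

The pattern is unanchored, so it finds the first URL anywhere in the input, and the fragment group (.*) runs to the end of the line. Both behaviors suit single-URL strings better than free text.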

Important Caveats

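  • No practical regex validates every URL permitted by RFC 3986; the patterns here cover common cases only
  • For reliable parsing in JavaScript, prefer the URL constructor (new URL(str)), which throws on input it cannot parse
  • These patterns may match invalid or nonexistent domains; only DNS resolution confirms that a host actually exists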
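As a sketch of the URL-constructor approach mentioned in the caveats, the helper below accepts only strings that parse and use the http or https scheme. The name isHttpUrl is an assumption for this example; new URL and its protocol, hostname, port, pathname, and hash properties are the standard WHATWG URL API available in browsers and Node.js.

```javascript
// isHttpUrl is a hypothetical helper name used only in this sketch.
function isHttpUrl(str) {
  try {
    const url = new URL(str); // throws a TypeError if str cannot be parsed
    return url.protocol === "http:" || url.protocol === "https:";
  } catch {
    return false;
  }
}

console.log(isHttpUrl("https://example.com/path?x=1#top")); // true
console.log(isHttpUrl("ftp://example.com"));                // false (parses, but wrong scheme)
console.log(isHttpUrl("not a url"));                        // false (does not parse)

// The parsed object also exposes the individual components.
const parsed = new URL("https://localhost:3000/api#section");
console.log(parsed.hostname, parsed.port, parsed.pathname, parsed.hash);
// → localhost 3000 /api #section
```

A reasonable division of labor is to use a regex to find candidate links in free text and then pass each candidate through a check like this one before trusting it.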

Use Case

You are building a tool that extracts links from plain text, validates user-submitted URLs in a form, or processes log files to find all referenced endpoints.
