Regex for Log File Parsing — Apache, Nginx, and Application Logs

Regex patterns for parsing common log formats including Apache Combined Log, Nginx access logs, and application log files. Extract timestamps, IPs, methods, and status codes.

Detailed Explanation

Log File Parsing with Regex

Server and application logs follow semi-structured formats that regex excels at parsing. Named capture groups make field extraction clean and maintainable.

Apache Combined Log Format

(?<ip>[\d.]+) - (?<user>\S+) \[(?<time>[^\]]+)\] "(?<method>\w+) (?<path>\S+) (?<proto>[^"]+)" (?<status>\d{3}) (?<size>\d+|-) "(?<referrer>[^"]*)" "(?<agent>[^"]*)"

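Example log line (the ip group matches IPv4 addresses only; widen it to \S+ if clients may connect over IPv6 or the server logs hostnames):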

192.168.1.1 - admin [15/Jan/2024:10:30:00 +0000] "GET /api/users HTTP/1.1" 200 1234 "https://example.com" "Mozilla/5.0"
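
A minimal Python sketch of applying the pattern to the example line. Python's re module writes named groups as (?P<name>...) rather than (?<name>...), so the pattern is rewritten in that syntax; the helper name parse_apache_line is hypothetical, and converting the time field to a datetime is an added convenience rather than part of the pattern.

```python
import re
from datetime import datetime

# Apache Combined Log pattern from above, with named groups rewritten
# in Python's (?P<name>...) syntax.
APACHE_COMBINED = re.compile(
    r'(?P<ip>[\d.]+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\w+) (?P<path>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_apache_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = APACHE_COMBINED.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Timestamps look like 15/Jan/2024:10:30:00 +0000
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    return fields


line = (
    '192.168.1.1 - admin [15/Jan/2024:10:30:00 +0000] '
    '"GET /api/users HTTP/1.1" 200 1234 "https://example.com" "Mozilla/5.0"'
)
fields = parse_apache_line(line)
print(fields["ip"], fields["method"], fields["path"], fields["status"])
# prints: 192.168.1.1 GET /api/users 200
```

Returning None for non-matching lines lets the caller decide whether to skip or report them, which matters because real logs often contain lines that deviate from the expected format.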

Nginx Access Log

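Nginx's default "combined" log format has the same layout as Apache's. This variant captures the whole request line as a single field, so it still matches when the request is empty or malformed: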

(?<ip>[\d.]+) - (?<user>\S+) \[(?<time>[^\]]+)\] "(?<request>[^"]*)" (?<status>\d{3}) (?<bytes>\d+) "(?<referrer>[^"]*)" "(?<agent>[^"]*)"
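
One way this pattern might be used in Python, sketched under a few assumptions: the named groups are rewritten in Python's (?P<name>...) syntax, the helper parse_nginx_line and both sample lines are made up for illustration, and splitting the request field into method, path, and protocol is an extra step added here.

```python
import re

# Nginx access log pattern from above, in Python's named-group syntax.
NGINX_ACCESS = re.compile(
    r'(?P<ip>[\d.]+) - (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_nginx_line(line):
    """Parse one access log line; return None if it does not match."""
    match = NGINX_ACCESS.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # A well-formed request has exactly three space-separated parts.
    parts = fields["request"].split(" ")
    if len(parts) == 3:
        fields["method"], fields["path"], fields["proto"] = parts
    else:
        fields["method"] = fields["path"] = fields["proto"] = None
    fields["status"] = int(fields["status"])
    fields["bytes"] = int(fields["bytes"])
    return fields


# Hypothetical sample lines: one normal request, one with an empty request.
ok = parse_nginx_line(
    '203.0.113.7 - - [15/Jan/2024:10:30:01 +0000] '
    '"POST /login HTTP/1.1" 302 0 "-" "curl/8.4.0"'
)
bad = parse_nginx_line(
    '203.0.113.7 - - [15/Jan/2024:10:30:02 +0000] "" 400 0 "-" "-"'
)
print(ok["method"], ok["path"], ok["status"])
print(bad["method"], bad["status"])
```

Splitting the request after matching, rather than inside the regex, means a malformed request still yields a parsed record with its status and IP intact.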

Application Log Pattern

Many applications write each entry as a bracketed timestamp, a level, a logger name, and a message:

\[(?<timestamp>[^\]]+)\] (?<level>DEBUG|INFO|WARN|ERROR|FATAL) (?<logger>[\w.]+) - (?<message>.+)

Matches:

[2024-01-15 10:30:00.123] ERROR com.example.App - Connection timeout after 30s
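
As a rough illustration, the pattern can drive a small level counter in Python. The named groups use Python's (?P<name>...) syntax; the function count_levels and every sample line except the ERROR line above are hypothetical.

```python
import re
from collections import Counter

# Application log pattern from above, in Python's named-group syntax.
APP_LOG = re.compile(
    r'\[(?P<timestamp>[^\]]+)\] (?P<level>DEBUG|INFO|WARN|ERROR|FATAL) '
    r'(?P<logger>[\w.]+) - (?P<message>.+)'
)


def count_levels(lines):
    """Count entries per log level, ignoring lines that do not match."""
    counts = Counter()
    for line in lines:
        match = APP_LOG.match(line)
        if match:
            counts[match.group("level")] += 1
    return counts


lines = [
    "[2024-01-15 10:30:00.123] ERROR com.example.App - Connection timeout after 30s",
    "    at com.example.App.connect(App.java:42)",  # continuation line, skipped
    "[2024-01-15 10:30:01.004] INFO com.example.App - Retrying connection",
]
counts = count_levels(lines)
print(counts["ERROR"], counts["INFO"])
# prints: 1 1

entry = APP_LOG.match(lines[0]).groupdict()
print(entry["logger"], "|", entry["message"])
# prints: com.example.App | Connection timeout after 30s
```

Continuation lines such as stack traces do not start with a bracketed timestamp, so they fail to match and are skipped here; a fuller parser would probably attach them to the preceding entry.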

Filtering by Status Code

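Find all 5xx errors. The leading quote anchors the match to the end of the request field, so a three-digit response size is not mistaken for a status code:

" 5\d{2}\s

Find all non-200 responses:

" (?!200\s)\d{3}\s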

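A short Python sketch of status filtering over access log lines. It assumes the status code directly follows the closing quote of the request field, as in the combined format, and anchors both patterns on that quote; the helper filter_lines and the second and third sample lines are invented for the example.

```python
import re

# Status code = three digits right after the quote that closes the request.
SERVER_ERROR = re.compile(r'" 5\d{2}\s')
NON_200 = re.compile(r'" (?!200\s)\d{3}\s')


def filter_lines(lines, pattern):
    """Keep only the lines in which the pattern occurs somewhere."""
    return [line for line in lines if pattern.search(line)]


lines = [
    '192.168.1.1 - admin [15/Jan/2024:10:30:00 +0000] '
    '"GET /api/users HTTP/1.1" 200 1234 "https://example.com" "Mozilla/5.0"',
    # The response size 512 here must not be mistaken for a 5xx status.
    '192.168.1.2 - - [15/Jan/2024:10:30:05 +0000] '
    '"GET /missing HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
    '192.168.1.3 - - [15/Jan/2024:10:30:09 +0000] '
    '"POST /api/orders HTTP/1.1" 503 0 "-" "Mozilla/5.0"',
]

errors = filter_lines(lines, SERVER_ERROR)
non_200 = filter_lines(lines, NON_200)
print(len(errors), len(non_200))
# prints: 1 2
```

When lines are already being parsed into fields, comparing the extracted status group as an integer is usually the more robust option; the search-only approach is mainly useful for quick filtering.
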
Tips for Log Parsing

  • Use non-greedy quantifiers or negated character classes (such as [^"]* for a quoted field) so that a match stops at the first closing delimiter
  • Named groups make the extracted data self-documenting
  • For production log analysis, dedicated tools (ELK stack, Loki) are more appropriate than ad hoc regex scripts
  • Test patterns against real log samples, as formats often have subtle variations
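
The first tip can be seen in a small Python comparison. This is only an illustration on a fragment of the example log line; the variable names are arbitrary.

```python
import re

line = '"GET /api/users HTTP/1.1" 200 1234 "https://example.com" "Mozilla/5.0"'

# Greedy: .* runs to the LAST quote on the line, swallowing later fields.
greedy = re.search(r'"(.*)"', line).group(1)

# Non-greedy: .*? stops at the first closing quote.
lazy = re.search(r'"(.*?)"', line).group(1)

# Negated class: [^"]* cannot cross a quote at all.
negated = re.search(r'"([^"]*)"', line).group(1)

print(greedy)
print(lazy)
print(negated)
```

The non-greedy and negated-class versions both capture only the request here. The negated class is often preferred inside longer patterns because it can never cross a delimiter, whereas a non-greedy quantifier may still expand past one if the rest of the pattern needs it to.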

Use Case

You are analyzing server access logs to find error patterns, building a simple log viewer that needs to parse and colorize log levels, or extracting specific request information from web server logs for debugging.
