HTTP 429 vs 503 — Too Many Requests vs Service Unavailable Comparison

HTTP 429 vs 503: when to return Too Many Requests for per-client rate limits and when to return Service Unavailable for overall capacity problems, with Retry-After header guidance.

4xx

429

Too Many Requests

5xx

503

Service Unavailable

Quick Cheat Sheet

Aspect | 429 Too Many Requests | 503 Service Unavailable
Class | 4xx (client error) | 5xx (server error)
Whose fault is it? | The caller is over their quota | The server is overloaded or in maintenance
Targeting | Per-client / per-key | Global or per-region
Retry-After header | Recommended | Recommended
Affects status pages? | No, business as usual | Yes, this is an incident

The Difference in One Sentence

429 says: "You are sending too much." 503 says: "We can't serve anyone right now."

When to Return 429

429 (RFC 6585 § 4; it is not redefined in RFC 9110) is the correct response for any per-client rate limit: token bucket exceeded, daily quota hit, concurrent-request cap reached. The caller can typically resolve this by slowing down or waiting.

Always include a Retry-After header (either a non-negative integer number of seconds or an HTTP-date) and ideally rate-limit headers such as RateLimit-Reset or X-RateLimit-Remaining so well-behaved clients can self-regulate.
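
As a rough illustration of the token-bucket case, here is a minimal, framework-agnostic Python sketch. The names `TokenBucket`, `check_rate_limit`, and `buckets`, along with the capacity of 5 and the refill rate of 1 token per second, are assumptions made for the example and do not come from any particular library.

```python
import math
import time
from typing import Dict, Optional, Tuple


class TokenBucket:
    """Hypothetical per-client bucket: `capacity` tokens, refilled continuously."""

    def __init__(self, capacity: int, refill_per_second: float, now: Optional[float] = None):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.updated_at = time.monotonic() if now is None else now

    def take(self, now: Optional[float] = None) -> float:
        """Consume one token. Return 0.0 if allowed, else seconds until the next token."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 0.0
        return (1.0 - self.tokens) / self.refill_per_second


# One bucket per API key, so one noisy client never affects another.
buckets: Dict[str, TokenBucket] = {}


def check_rate_limit(api_key: str, now: Optional[float] = None) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a single request from `api_key`."""
    bucket = buckets.get(api_key)
    if bucket is None:
        bucket = buckets[api_key] = TokenBucket(capacity=5, refill_per_second=1.0, now=now)
    wait = bucket.take(now)
    if wait <= 0.0:
        return 200, {"X-RateLimit-Remaining": str(int(bucket.tokens))}
    # Retry-After in its delay form is a whole number of seconds, so round up.
    return 429, {"Retry-After": str(math.ceil(wait)), "X-RateLimit-Remaining": "0"}
```

The `now` parameter exists only so the clock can be injected in tests; in a real service the default monotonic clock would be used.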

When to Return 503

503 (RFC 9110 § 15.6.4) is for global unavailability — the service itself is unhealthy, not the caller's fault:

  • Database is down for failover
  • Load balancer can't find a healthy backend
  • Maintenance window in progress
  • Auto-scaling can't keep up with a traffic spike

Like 429, 503 should include Retry-After when known (e.g., a maintenance window expected to last 10 minutes).
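
The two Retry-After forms can be produced with the Python standard library alone. The sketch below assumes a hypothetical helper, `service_unavailable`, that returns a status and header dictionary; how those are attached to a real response depends on your framework.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Dict, Optional, Tuple


def service_unavailable(
    retry_in: Optional[timedelta] = None,
    back_at: Optional[datetime] = None,
) -> Tuple[int, Dict[str, str]]:
    """Build a 503 with Retry-After when the recovery time is known.

    `back_at` must be a timezone-aware datetime; it is sent as an HTTP-date.
    `retry_in` is sent as a whole number of seconds.
    With neither argument, Retry-After is omitted rather than guessed.
    """
    headers: Dict[str, str] = {}
    if back_at is not None:
        # HTTP-dates are always expressed in GMT.
        headers["Retry-After"] = format_datetime(back_at.astimezone(timezone.utc), usegmt=True)
    elif retry_in is not None:
        headers["Retry-After"] = str(max(0, int(retry_in.total_seconds())))
    return 503, headers
```

For the 10-minute maintenance window mentioned above, `service_unavailable(retry_in=timedelta(minutes=10))` yields `Retry-After: 600`.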

Why the Distinction Matters

  • For SLAs and uptime monitoring: 429 typically doesn't count as downtime; 503 does. Returning 503 for routine rate limiting inflates your incident metrics, and returning 429 during a real outage hides it.
  • For client retry logic: smart clients back off differently — exponential per-client for 429, but they may try a different region or shed entirely on 503.
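  • For CDNs: Cloudflare returns 429 by default for its rate limiting rules. A 503 seen through Cloudflare usually means the origin itself returned 503 or a Cloudflare data center had a capacity problem; an unreachable origin is normally reported with Cloudflare-specific 52x codes (such as 521 or 523) instead.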
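
The client-side difference in retry logic might look something like the following sketch. The function names (`parse_retry_after`, `plan_retry`), the `has_fallback_region` flag, and the backoff constants are illustrative assumptions; real clients will have their own policies.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional, Tuple


def parse_retry_after(value: Optional[str], now: datetime) -> Optional[float]:
    """Parse Retry-After as delay-seconds or an HTTP-date; None if absent or invalid."""
    if not value:
        return None
    value = value.strip()
    if value.isascii() and value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    # A date in the past means "retry now".
    return max(0.0, (when - now).total_seconds())


def plan_retry(
    status: int,
    retry_after: Optional[str],
    attempt: int,
    now: datetime,
    has_fallback_region: bool = False,
    base: float = 1.0,
    cap: float = 60.0,
) -> Tuple[str, float]:
    """Return (action, delay_seconds) where action is 'wait', 'failover', or 'no_retry'."""
    if status not in (429, 503):
        return ("no_retry", 0.0)
    # 503 is about the service, not this client, so another region may help.
    if status == 503 and has_fallback_region:
        return ("failover", 0.0)
    hinted = parse_retry_after(retry_after, now)
    if hinted is not None:
        return ("wait", hinted)
    # No hint from the server: exponential backoff with full jitter.
    return ("wait", random.uniform(0.0, min(cap, base * (2 ** attempt))))
```

Honoring the server's hint before falling back to jittered backoff is one reasonable ordering; a stricter client might also cap the hinted delay.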

Common Mistakes

  • Returning 503 for per-IP rate limits — incorrect; that's 429.
  • Returning 429 for global capacity issues — incorrect; that's 503.
  • Returning 429 without Retry-After — clients then guess, often hammering you faster.

Real-World Examples

  • Twitter/X API uses 429 with x-rate-limit-reset (epoch seconds).
  • GitHub API returns 403 or 429 when a primary or secondary rate limit is exceeded, and 503 during major incidents.
  • AWS API Gateway returns 429 for throttling and 503 when the integration is unhealthy.

Real-World Use Case

In an Express middleware using express-rate-limit, exceeding the per-IP cap should return 429 with Retry-After: 60. When your Postgres primary fails over and the connection pool is empty, return 503 with Retry-After: 30. The 5xx status is what lets alerting (for example PagerDuty) fire and lets load balancers shift traffic to a healthy region.
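
The decision itself is small enough to sketch independently of Express. The Python function below is a hypothetical illustration, not the express-rate-limit API: `respond`, `PER_IP_LIMIT`, and both parameters are names invented for the example, and the per-IP counter and pool size are assumed to be tracked elsewhere.

```python
from typing import Dict, Tuple

PER_IP_LIMIT = 100  # hypothetical cap on requests per IP per window


def respond(requests_from_ip_this_window: int, db_connections_available: int) -> Tuple[int, Dict[str, str]]:
    """Pick the status and headers for one incoming request."""
    # Rate limiting runs first, mirroring middleware that executes before the handler.
    if requests_from_ip_this_window > PER_IP_LIMIT:
        return 429, {"Retry-After": "60"}
    # The caller is within quota, but the service cannot serve anyone right now.
    if db_connections_available == 0:
        return 503, {"Retry-After": "30"}
    return 200, {}
```

Checking the per-client limit before the global health check is a design choice; a service that prefers to surface the outage to every caller could reverse the order.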
