HTTP 429 vs 503 — Too Many Requests vs Service Unavailable Comparison
HTTP 429 vs 503: when to return Too Many Requests for per-client rate limits and when to return Service Unavailable for service-wide capacity problems, with Retry-After header guidance.
Quick Cheat Sheet
| Aspect | 429 Too Many Requests | 503 Service Unavailable |
|---|---|---|
| Class | 4xx — Client error | 5xx — Server error |
| Whose fault is it? | The caller is over their quota | The server is overloaded or in maintenance |
| Targeting | Per-client / per-key | Global or per-region |
| Retry-After header | Recommended | Recommended |
| Affects status pages? | No, business as usual | Yes, this is an incident |
The Difference in One Sentence
429 says: "You are sending too much." 503 says: "We can't serve anyone right now."
When to Return 429
429 (RFC 6585 § 4) is the correct response for any per-client rate limit: token bucket exceeded, daily quota hit, concurrent-request cap reached. The caller can typically resolve this by slowing down or waiting.
Always include a Retry-After header (either a non-negative integer number of seconds or an HTTP-date), and ideally rate-limit headers such as RateLimit-Reset or X-RateLimit-Remaining, so well-behaved clients can self-regulate.
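As a rough illustration of how a per-client limit can produce a 429 with a useful Retry-After, here is a minimal token-bucket sketch. The names (`Bucket`, `checkRateLimit`) and the capacity and refill numbers are assumptions made for this example, not part of any particular library.

```typescript
// Hypothetical per-client token bucket; all names and numbers are illustrative.
interface Bucket {
  tokens: number;       // tokens currently available to this client
  lastRefillMs: number; // timestamp of the last refill calculation
}

interface LimitDecision {
  status: 200 | 429;
  headers: Record<string, string>;
}

const CAPACITY = 10;      // assumed burst size
const REFILL_PER_SEC = 1; // assumed sustained rate (tokens per second)

function checkRateLimit(bucket: Bucket, nowMs: number): LimitDecision {
  // Refill according to elapsed time, capped at the bucket capacity.
  const elapsedSec = Math.max(0, (nowMs - bucket.lastRefillMs) / 1000);
  bucket.tokens = Math.min(CAPACITY, bucket.tokens + elapsedSec * REFILL_PER_SEC);
  bucket.lastRefillMs = nowMs;

  if (bucket.tokens >= 1) {
    bucket.tokens -= 1;
    return {
      status: 200,
      headers: { "X-RateLimit-Remaining": String(Math.floor(bucket.tokens)) },
    };
  }

  // Seconds until one whole token is available, rounded up so that a
  // client honoring the header never retries too early.
  const retryAfterSec = Math.ceil((1 - bucket.tokens) / REFILL_PER_SEC);
  return {
    status: 429,
    headers: {
      "Retry-After": String(retryAfterSec),
      "X-RateLimit-Remaining": "0",
    },
  };
}
```

In a real service there would be one bucket per API key or IP address, typically kept in a shared store so that all instances agree on the count.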
When to Return 503
503 (RFC 9110 § 15.6.4) is for global unavailability — the service itself is unhealthy, not the caller's fault:
- Database is down for failover
- Load balancer can't find a healthy backend
- Maintenance window in progress
- Auto-scaling can't keep up with traffic spike
Like 429, 503 should include Retry-After when known (e.g., a maintenance window expected to last 10 minutes).
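For a planned maintenance window, the Retry-After value can be derived from the expected end time. The sketch below shows both allowed forms; `maintenanceEndsAt` is a hypothetical value that your deployment tooling would have to supply.

```typescript
// Delay-seconds form: how long until the maintenance window is expected to end.
// `maintenanceEndsAt` is a hypothetical input, not something HTTP provides.
function retryAfterSeconds(maintenanceEndsAt: Date, now: Date): string {
  const remainingMs = maintenanceEndsAt.getTime() - now.getTime();
  // Round up, and never advertise less than one second.
  return String(Math.max(1, Math.ceil(remainingMs / 1000)));
}

// HTTP-date form: Date.prototype.toUTCString() produces the
// "Wed, 21 Oct 2015 07:28:00 GMT" layout that HTTP-date expects.
function retryAfterHttpDate(maintenanceEndsAt: Date): string {
  return maintenanceEndsAt.toUTCString();
}

// Assemble the status and headers for a maintenance-mode 503.
function maintenanceResponse(
  maintenanceEndsAt: Date,
  now: Date
): { status: number; headers: Record<string, string> } {
  return {
    status: 503,
    headers: { "Retry-After": retryAfterSeconds(maintenanceEndsAt, now) },
  };
}
```

The delay-seconds form is usually the safer choice because it does not depend on the client's clock being accurate.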
Why the Distinction Matters
- For SLAs and uptime monitoring: 429 doesn't count as downtime; 503 does. Confusing them inflates your incident metrics.
- For client retry logic: smart clients back off differently — exponential per-client for 429, but they may try a different region or shed entirely on 503.
- For CDNs: Cloudflare returns 429 for its WAF rate limiting and 503 for "host unreachable" / origin-unhealthy scenarios.
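The client-side difference can be sketched as a small retry planner. This is one possible policy, not a standard: the `RetryPlan` shape, the attempt cap, and the backoff constants are assumptions, and jitter is left out to keep the example deterministic.

```typescript
// Parse a Retry-After value, which is either delay-seconds or an HTTP-date.
// Returns a delay in milliseconds, or null when the header is absent or unusable.
function parseRetryAfterMs(value: string | undefined, nowMs: number): number | null {
  if (value === undefined) return null;
  const trimmed = value.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (isNaN(dateMs)) return null;
  return Math.max(0, dateMs - nowMs); // a date in the past means "retry now"
}

// Hypothetical plan shape: wait and retry the same endpoint, try another
// region, or stop retrying.
type RetryPlan =
  | { action: "wait"; delayMs: number }
  | { action: "failover"; delayMs: number }
  | { action: "give-up" };

const MAX_ATTEMPTS = 5; // assumed cap on retries

function planRetry(
  status: number,
  retryAfter: string | undefined,
  attempt: number, // 0 for the first retry
  nowMs: number
): RetryPlan {
  if (attempt >= MAX_ATTEMPTS) return { action: "give-up" };

  const hinted = parseRetryAfterMs(retryAfter, nowMs);
  // Exponential backoff capped at 30 s, used only when the server gave no hint.
  const backoffMs = Math.min(30000, 500 * Math.pow(2, attempt));
  const delayMs = hinted ?? backoffMs;

  if (status === 429) return { action: "wait", delayMs };     // our own traffic is the problem
  if (status === 503) return { action: "failover", delayMs }; // the service is the problem
  return { action: "give-up" };
}
```

A production client would normally add random jitter to the backoff so that many clients do not retry in lockstep.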
Common Mistakes
- Returning 503 for per-IP rate limits — incorrect; that's 429.
- Returning 429 for global capacity issues — incorrect; that's 503.
- Returning 429 without Retry-After: clients then guess, often hammering you faster.
Real-World Examples
- Twitter/X API uses 429 with x-rate-limit-reset (epoch seconds).
- GitHub API uses 429 for secondary rate limits and 503 during major incidents.
- AWS API Gateway returns 429 for throttling and 503 when the integration is unhealthy.
Real-World Use Case
In an Express middleware using express-rate-limit, exceeding the per-IP cap should return 429 with Retry-After: 60. When your Postgres primary fails over and the connection pool is empty, return 503 with Retry-After: 30 so PagerDuty fires and load balancers shift traffic to a healthy region.
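The decision described above can be sketched without any framework. This is not the express-rate-limit API; `decide`, the in-memory `hits` map, the 100-requests-per-minute cap, and the `healthyDbConnections` input are all assumptions for illustration. The global health check runs first, so an outage is never misreported as a per-client limit.

```typescript
interface Decision {
  status: number;
  headers: Record<string, string>;
}

const WINDOW_MS = 60000;    // assumed 60-second fixed window
const MAX_PER_WINDOW = 100; // assumed per-IP cap

// In-memory counters keyed by IP; a real deployment would use a shared store.
const hits = new Map<string, { count: number; windowStart: number }>();

function decide(ip: string, healthyDbConnections: number, nowMs: number): Decision {
  // Global condition first: with no usable database connections nobody
  // can be served, so this is a 503 regardless of who is asking.
  if (healthyDbConnections === 0) {
    return { status: 503, headers: { "Retry-After": "30" } };
  }

  const entry = hits.get(ip);
  if (entry === undefined || nowMs - entry.windowStart >= WINDOW_MS) {
    // First request from this IP, or the previous window has expired.
    hits.set(ip, { count: 1, windowStart: nowMs });
    return { status: 200, headers: {} };
  }

  entry.count += 1;
  if (entry.count > MAX_PER_WINDOW) {
    // Per-client condition: only this caller is over its cap.
    const retryAfterSec = Math.ceil((entry.windowStart + WINDOW_MS - nowMs) / 1000);
    return { status: 429, headers: { "Retry-After": String(retryAfterSec) } };
  }
  return { status: 200, headers: {} };
}
```

In an Express application the returned status and headers would be copied onto the response object inside a middleware function.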