HTTP 429 vs 503 — Too Many Requests vs Service Unavailable Comparison

HTTP 429 vs 503: when to return Too Many Requests for per-client rate limits versus Service Unavailable for overall capacity issues. Includes Retry-After header guidance.

Quick Cheat Sheet

Aspect	429 Too Many Requests	503 Service Unavailable
Class	4xx — Client error	5xx — Server error
Whose fault is it?	The caller is over their quota	The server is overloaded or in maintenance
Targeting	Per-client / per-key	Global or per-region
`Retry-After` header	Recommended	Recommended
Affects status pages?	No, business as usual	Yes, this is an incident

The Difference in One Sentence

429 says: "You are sending too much." 503 says: "We can't serve anyone right now."

When to Return 429

429 (RFC 6585 § 4; it was not folded into RFC 9110) is the correct response for any per-client rate limit: token bucket exceeded, daily quota hit, concurrent-request cap reached. The caller can typically resolve this by slowing down or waiting.
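A per-client token bucket can be sketched in a few lines of Python. This is a minimal illustration, not a production limiter: the names `TokenBucket`, `buckets`, and `handle`, the in-memory dictionary, and the limits (a burst of 2, one token every 2 seconds) are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float     # maximum burst size, in requests
    refill_rate: float  # tokens added per second
    tokens: float       # tokens currently available
    updated_at: float   # time of the last refill, in seconds

    def check(self, now: float) -> tuple[int, int]:
        """Return (status, retry_after_seconds) for one request arriving at `now`."""
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 200, 0
        # Seconds until one whole token has been refilled, rounded up.
        wait = math.ceil((1.0 - self.tokens) / self.refill_rate)
        return 429, wait


# One bucket per API key, so one noisy caller never throttles another.
buckets: dict[str, TokenBucket] = {}


def handle(api_key: str, now: float) -> tuple[int, dict[str, str]]:
    if api_key not in buckets:
        # Illustrative limits (assumed): burst of 2, refilling 1 token every 2 seconds.
        buckets[api_key] = TokenBucket(capacity=2, refill_rate=0.5, tokens=2, updated_at=now)
    status, wait = buckets[api_key].check(now)
    headers = {"Retry-After": str(wait)} if status == 429 else {}
    return status, headers
```

Passing `now` in explicitly keeps the sketch deterministic; a real server would likely read a monotonic clock and keep the buckets in a shared store.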

Always include a Retry-After header (either a non-negative integer number of seconds or an HTTP-date), and ideally rate-limit headers such as RateLimit-Reset or X-RateLimit-Remaining, so well-behaved clients can self-regulate.
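On the client side, both Retry-After forms need handling. The following is a minimal sketch using only the Python standard library; the function name `retry_after_seconds` is hypothetical, and treating an unparseable value as `None` is a design assumption rather than anything the RFCs mandate.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: datetime) -> Optional[float]:
    """Convert a Retry-After value to seconds to wait, or None if unparseable."""
    value = value.strip()
    if value.isascii() and value.isdigit():
        # delay-seconds form, e.g. "Retry-After: 120"
        return float(value)
    try:
        # HTTP-date form, e.g. "Retry-After: Wed, 21 Oct 2015 07:28:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Assume UTC when the parsed date carries no usable offset.
        when = when.replace(tzinfo=timezone.utc)
    # A date in the past means the client may retry immediately.
    return max(0.0, (when - now).total_seconds())
```

The caller supplies a timezone-aware `now`, which keeps the function easy to test.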

When to Return 503

503 (RFC 9110 § 15.6.4) is for global unavailability, where the service itself is unhealthy and the caller is not at fault:

Database is down for failover
Load balancer can't find a healthy backend
Maintenance window in progress
Auto-scaling can't keep up with traffic spike

Like 429, 503 should include Retry-After when known (e.g., a maintenance window expected to last 10 minutes).
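As a rough sketch of the "when known" rule, the helper below adds Retry-After to a 503 only if a maintenance end time is available. The name `unavailable_response` and the idea of passing the window end as a `datetime` are assumptions for illustration.

```python
import math
from datetime import datetime, timedelta, timezone
from typing import Optional


def unavailable_response(
    now: datetime, maintenance_ends: Optional[datetime]
) -> tuple[int, dict[str, str]]:
    """Build a 503, attaching Retry-After only when the end of the outage is known."""
    headers: dict[str, str] = {}
    if maintenance_ends is not None and maintenance_ends > now:
        remaining = (maintenance_ends - now).total_seconds()
        # Round up so clients never come back slightly too early.
        headers["Retry-After"] = str(math.ceil(remaining))
    return 503, headers
```

For a window expected to last 10 minutes this yields `Retry-After: 600`; for an unplanned outage with no estimate, the header is simply omitted.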

Why the Distinction Matters

For SLAs and uptime monitoring: 429 doesn't count as downtime; 503 does. Confusing them either inflates your incident metrics or hides real outages.
For client retry logic: smart clients react differently. On 429 they back off exponentially against the same endpoint; on 503 they may try a different region or shed load entirely.
For CDNs: Cloudflare's rate limiting rules return 429 by default, while an unreachable or unhealthy origin usually surfaces as one of Cloudflare's own 52x codes (521, 522, 523) rather than a 503.
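The client-side difference can be sketched as a small decision function. Everything here is an assumption for illustration: the name `plan_retry`, the action labels, the 1-second base with a 60-second cap, and the premise that the 429 reflects a per-client quota that another region would not lift.

```python
from typing import Optional


def plan_retry(
    status: int,
    attempt: int,
    retry_after: Optional[float],
    has_fallback_region: bool,
) -> tuple[str, float]:
    """Return (action, delay_seconds) for a failed request."""
    backoff = min(60.0, 1.0 * 2 ** attempt)  # 1s, 2s, 4s, ... capped at 60s
    # Prefer the server's hint; fall back to exponential backoff without one.
    delay = retry_after if retry_after is not None else backoff
    if status == 429:
        # Assumed per-client quota: switching regions would not help, so wait.
        return "retry_same_endpoint", delay
    if status == 503:
        if has_fallback_region:
            # The service itself is unhealthy here; try elsewhere right away.
            return "retry_other_region", 0.0
        return "retry_same_endpoint", delay
    return "give_up", 0.0
```

A production client would normally add random jitter to the delay so that many callers do not retry in lockstep; it is left out here to keep the sketch deterministic.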

Common Mistakes

Returning 503 for per-IP rate limits — incorrect; that's 429.
Returning 429 for global capacity issues — incorrect; that's 503.
Returning 429 without Retry-After — clients then guess, often hammering you faster.

Real-World Examples

Twitter/X API uses 429 with x-rate-limit-reset (epoch seconds).
GitHub API uses 403 or 429 for primary and secondary rate limits, and 503 during major incidents.
AWS API Gateway returns 429 for throttling and 503 when the integration is unhealthy.

Real-World Use Case

In an Express middleware using express-rate-limit, exceeding the per-IP cap should return 429 with Retry-After: 60. When your Postgres primary fails over and the connection pool is empty, return 503 with Retry-After: 30 so that 5xx-based alerting (PagerDuty, for example) fires and load balancers can shift traffic to a healthy region.
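The same decision can be sketched in framework-neutral Python rather than Express. This does not use or imitate the express-rate-limit API; the function `respond` and its two boolean inputs are hypothetical stand-ins for a per-IP limiter check and a connection-pool health check.

```python
def respond(ip_over_limit: bool, db_pool_available: bool) -> tuple[int, dict[str, str]]:
    """Pick the status and Retry-After for one incoming request."""
    if ip_over_limit:
        # The rate limiter runs first, as middleware would: the caller's problem.
        return 429, {"Retry-After": "60"}
    if not db_pool_available:
        # Failover in progress and no connections left: the service's problem.
        return 503, {"Retry-After": "30"}
    return 200, {}
```

Checking the limiter first mirrors the usual middleware order, so an over-quota client gets a 429 even while the database is failing over.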
