HTTP 502 vs 503 — Bad Gateway vs Service Unavailable Comparison
http 502 vs 503: 502 is an *unintentional* upstream failure detected by a proxy, while 503 is an *intentional* unavailability signal from the application itself. Here's why this distinction matters for incident triage.
Quick Cheat Sheet
| Aspect | 502 Bad Gateway | 503 Service Unavailable |
|---|---|---|
| Issued by | A proxy/gateway | The application itself |
| Intent | Unintentional (upstream broke) | Intentional (service knows it's down) |
| Retry-After | Rarely sent | Recommended (tells clients when to retry) |
| What to investigate | Upstream process / connectivity | Application logic / dependencies |
The Source of the Response Tells You What's Wrong
This is the key insight: 502 comes from the proxy, 503 comes from the application (in most setups).
502 Bad Gateway: your reverse proxy (Nginx, ALB, Cloudflare) tried to talk to your backend and got something invalid back — or got nothing at all because the backend is dead. Your application typically has no awareness this happened.
503 Service Unavailable: your application (or a healthy proxy in front that knows the situation) is deliberately responding with "no, not now." The application is alive and aware enough to refuse gracefully.
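To make the contrast concrete, here is a minimal sketch of an application deliberately answering 503 (assuming an Express app in TypeScript; the `dependencyHealthy` flag is hypothetical and would be driven by your own health logic). A proxy-generated 502 can never look like this, because the app never sees the request:

```typescript
// Minimal sketch (Express assumed): the application itself decides to answer 503.
// `dependencyHealthy` is a hypothetical flag your own monitoring code would flip.
import express from "express";

const app = express();
let dependencyHealthy = true;

app.use((_req, res, next) => {
  if (!dependencyHealthy) {
    // Deliberate refusal: the app is alive and tells clients when to come back.
    res.set("Retry-After", "30");
    res.status(503).send("Service temporarily unavailable");
    return;
  }
  next();
});

app.get("/", (_req, res) => {
  res.send("ok");
});

app.listen(3000);
```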
Mental Model
Imagine your stack as: Browser → Nginx → Node.js → DB
- Node.js process crashed → Browser sees 502 from Nginx. Nginx has no app logs because the app didn't run.
- Node.js detects the DB is in failover, refuses requests with Retry-After → Browser sees 503. App logs show "DB unhealthy, returning 503."
- Node.js is doing a graceful shutdown for deploy → App returns 503 while in-flight requests drain.
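The graceful-shutdown case in the last bullet looks roughly like this (a sketch assuming Express; port and Retry-After values are illustrative):

```typescript
// Graceful-shutdown sketch (Express assumed): on SIGTERM, new requests get a 503
// while in-flight requests are allowed to drain.
import express from "express";

const app = express();
let shuttingDown = false;

app.use((_req, res, next) => {
  if (shuttingDown) {
    res.set("Retry-After", "5");
    res.status(503).send("Restarting, back shortly");
    return;
  }
  next();
});

app.get("/", (_req, res) => {
  res.send("ok");
});

const server = app.listen(3000);

process.on("SIGTERM", () => {
  shuttingDown = true;
  // server.close() stops accepting new connections and waits for in-flight
  // requests to finish before the callback fires.
  server.close(() => process.exit(0));
});
```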
When You'll See Each in Production
502 typical scenarios:
- A bad deployment crashes the new container; Nginx/ALB returns 502 while old pods drain
- An OOM kill from the kernel reaps your worker process
- A NAT-related connection reset between proxy and backend
- A misconfigured upstream port
503 typical scenarios:
- Maintenance window; you've deployed a "we'll be right back" page
- The app's internal health check fails (DB unreachable, queue full); the app self-reports unavailable
- AWS-style "load shedding": when capacity is exceeded, the app returns 503 instead of crashing
Why the Distinction Matters
For incident response, the distinction tells you where to look:
- 502 spike → check infrastructure: process crashes, OOM events, container health, network issues
- 503 spike → check application logic: what's the app's self-reported health? What dependency is down?
For load balancers such as AWS ALB, a target whose health checks return 503 is marked unhealthy and traffic is routed around it until it recovers; a target that keeps producing 502s usually has a dead or misconfigured backend process and may need to be restarted or replaced.
For client retry behavior, a smart client should retry 503 (especially with Retry-After) and may retry 502 once or twice but not aggressively, since 502 often indicates a more persistent failure.
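A sketch of that retry policy in TypeScript, using the standard fetch API (the attempt counts and backoff values are illustrative assumptions, not a prescribed standard):

```typescript
// Retry 503 patiently (honoring Retry-After), retry 502 once, give up otherwise.
async function fetchWithRetry(url: string): Promise<Response> {
  const maxAttempts = 4;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await fetch(url);

    if (res.status === 503 && attempt < maxAttempts) {
      // Intentional unavailability: wait as instructed, or back off exponentially.
      const retryAfter = Number(res.headers.get("Retry-After")) || 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000));
      continue;
    }

    if (res.status === 502 && attempt === 1) {
      // Proxy-detected upstream failure: one cautious retry, then surface the error.
      await new Promise((resolve) => setTimeout(resolve, 1000));
      continue;
    }

    return res;
  }
  throw new Error("unreachable: the loop always returns before exhausting attempts");
}
```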
CDN-Specific Mappings
- Cloudflare 502: origin returned an invalid response
- Cloudflare 503: usually rate-limiting or origin health-check failure
- AWS ALB 502: target closed connection or sent malformed response
- AWS ALB 503: the target group has no registered targets to route to
Common Mistakes
- Returning 503 from a healthy app for brief planned operations such as a one-second restart: clients still count it as a failure, so prefer zero-downtime deploys
- Misinterpreting 502 as "my code has a bug" — usually it's an infrastructure issue, not an application bug
Real-World Use Case
When your Kubernetes pod is OOMKilled mid-request, the Nginx ingress returns 502 with no app logs. When your app detects Postgres is unreachable in its health probe and starts returning 503 with Retry-After: 30, the load balancer drains traffic from that pod gracefully. The first is an unplanned crash, the second is your app behaving correctly under known-bad conditions.
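A minimal sketch of that readiness check, assuming Express and node-postgres (the /healthz path and pool setup are illustrative; only the 503 with Retry-After: 30 behavior comes from the scenario above):

```typescript
// Readiness endpoint that self-reports 503 when Postgres is unreachable.
import express from "express";
import { Pool } from "pg";

const pool = new Pool(); // connection details taken from the standard PG* env vars
const app = express();

app.get("/healthz", async (_req, res) => {
  try {
    await pool.query("SELECT 1"); // cheap round-trip to confirm the DB is reachable
    res.status(200).send("ok");
  } catch {
    // Intentional 503: the app is up, but it knows its dependency is not.
    res.set("Retry-After", "30");
    res.status(503).send("database unreachable");
  }
});

app.listen(3000);
```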
Related Comparisons
HTTP 502 vs 504 — Bad Gateway vs Gateway Timeout Comparison
http 502 vs 504: both come from your reverse proxy, but 502 means the upstream returned a bad response while 504 means it didn't respond in time. Nginx, ALB, Cloudflare debugging tips.
HTTP 503 vs 500 — Service Unavailable vs Internal Server Error Comparison
http 503 vs 500: 503 is intentional (overload, maintenance) and includes Retry-After, while 500 is an unintentional crash. Why the right choice affects monitoring and SLAs.
HTTP 500 vs 502 — Internal Server Error vs Bad Gateway Comparison
http 500 vs 502: 500 is your application crashing, 502 is your reverse proxy unable to reach the upstream. Learn the debugging workflow for each.
HTTP 429 vs 503 — Too Many Requests vs Service Unavailable Comparison
http 429 vs 503: when to return Too Many Requests for per-client rate limits versus Service Unavailable for overall capacity issues. Includes Retry-After header guidance.