Nginx Health Checks Configuration

Configure passive and active health checks in Nginx to automatically detect failed backend servers and route traffic around them for high availability.

Detailed Explanation

Health checks allow Nginx to detect when backend servers are unhealthy and automatically stop sending traffic to them until they recover. This capability is critical for maintaining application availability in multi-server production environments.

Passive Health Checks (Open Source)

The open-source version of Nginx supports passive health checks, which monitor the results of real client requests to detect backend failures:

upstream backend {
    server 10.0.1.10:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.12:8080 backup;
}

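Nginx counts failed attempts to reach each server. When max_fails attempts fail within the fail_timeout window (the failures do not have to be consecutive), Nginx marks the server unavailable for the next fail_timeout period. Once that period expires, Nginx sends the server live client traffic again; if that attempt fails, the server is taken out of rotation for another fail_timeout period. The server marked backup receives requests only when the primary servers are unavailable.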
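The two parameters can be tuned per server. A minimal sketch of the variations, reusing the addresses above; the choice of which server gets which setting is an assumption made purely for illustration:

```nginx
upstream backend {
    # If both parameters are omitted, the defaults apply:
    # max_fails=1 and fail_timeout=10s.
    server 10.0.1.10:8080;

    # Tolerate 3 failed attempts per 30s window, then skip the server for 30s.
    server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;

    # max_fails=0 disables failure accounting: this server is never marked
    # unavailable by passive checks.
    server 10.0.1.12:8080 max_fails=0;
}
```

Note that when an upstream group contains only a single server, max_fails and fail_timeout are ignored and that server is never considered unavailable.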

Defining Failure Conditions

By default, Nginx counts connection errors, timeouts, and invalid responses from a server as failures. Use the proxy_next_upstream directive to extend that definition to specific HTTP error status codes:

location / {
    proxy_pass http://backend;
    proxy_next_upstream error timeout http_500 http_502 http_503;
    proxy_next_upstream_tries 2;
    proxy_next_upstream_timeout 10s;
}

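With this configuration, Nginx passes a request to the next available server when the current one fails with a connection error or timeout, or returns a 500, 502, or 503 response. At most 2 attempts are made in total, including the first, and no new attempt is started once 10 seconds have elapsed. A retry is possible only if nothing has been sent to the client yet, and once a request with a non-idempotent method (POST, LOCK, PATCH) has been sent to a backend, it is not retried unless the non_idempotent parameter is added. The listed status codes also count as failed attempts toward max_fails.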
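How quickly a timeout registers as a failure depends on the proxy timeout directives, each of which defaults to 60 seconds. A minimal sketch with tighter limits follows; the timeout values are illustrative assumptions, not settings taken from the configuration above:

```nginx
location / {
    proxy_pass http://backend;

    # Illustrative values (assumptions); each directive defaults to 60s.
    proxy_connect_timeout 2s;   # time allowed to establish the connection
    proxy_send_timeout    5s;   # max gap between two writes to the backend
    proxy_read_timeout    5s;   # max gap between two reads from the backend

    proxy_next_upstream error timeout http_500 http_502 http_503;
    proxy_next_upstream_tries 2;
    proxy_next_upstream_timeout 10s;
}
```

Keeping the per-attempt timeouts well below proxy_next_upstream_timeout leaves room for the second attempt to start before the retry window closes.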

Active Health Checks (Nginx Plus)

Nginx Plus extends the open-source version with active health checks that periodically probe backend servers independently of client traffic:

upstream backend {
    zone backend 64k;
    server 10.0.1.10:8080;
    server 10.0.1.11:8080;
}

location / {
    proxy_pass http://backend;
    health_check interval=5s fails=3 passes=2 uri=/health;
}

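Active checks send synthetic requests to a dedicated health endpoint every 5 seconds. A server is marked down after 3 consecutive failures and restored after 2 consecutive successes, so failures are detected even when no client traffic is reaching that server. The zone directive is required here: it places the upstream group's state in shared memory, where the health check results are stored and shared among worker processes.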
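Unless told otherwise, health_check treats any 2xx or 3xx response as a pass. A match block can tighten that condition. The sketch below assumes the /health endpoint returns status 200 with the text "ok" somewhere in its body; the block name backend_ok and the body format are hypothetical:

```nginx
# http context (NGINX Plus only)
match backend_ok {
    status 200;       # require exactly 200, not just any 2xx/3xx
    body ~ "ok";      # assumed response body; adjust to the real endpoint
}

server {
    location / {
        proxy_pass http://backend;
        health_check interval=5s fails=3 passes=2 uri=/health match=backend_ok;
    }
}
```

A server passes the check only when every condition in the match block holds.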

DIY Active Health Checks

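On the open-source version, a basic form of active monitoring can be built from an external script, run periodically (for example from cron), that probes each backend and rewrites the upstream definition. The sketch below assumes the upstream block lives in its own file, at a hypothetical path, that is included from the main configuration. It marks failed servers down and reloads Nginx only when the result changes:

#!/bin/bash
# Regenerate the upstream block, marking servers that fail the probe as "down".
conf=/etc/nginx/conf.d/backend_upstream.conf   # assumed path, included from nginx.conf
new=$(mktemp)
echo "upstream backend {" > "$new"
for server in 10.0.1.10 10.0.1.11; do
    state=""
    curl -sf --max-time 2 "http://$server:8080/health" > /dev/null || state=" down"
    echo "    server $server:8080$state;" >> "$new"
done
echo "}" >> "$new"
# Install the new file and reload only if it differs and passes the syntax test.
cmp -s "$new" "$conf" || { cp "$new" "$conf" && nginx -t && nginx -s reload; }
rm -f "$new"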
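For illustration, an external checker takes a server out of rotation by adding the down parameter to its server line before reloading. A sketch of what the upstream block might look like after 10.0.1.11 fails its probe (the failure scenario is assumed):

```nginx
upstream backend {
    server 10.0.1.10:8080;
    server 10.0.1.11:8080 down;   # marked unavailable by the external checker
}
```

Removing the down parameter and reloading again returns the server to rotation once its probe succeeds.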

Designing Health Check Endpoints

Design your application health check endpoint to verify actual readiness for traffic, not merely that the process is running. Check database connectivity, required external service availability, and sufficient disk space. Return a 200 status when fully healthy and a 503 when the application cannot properly serve requests.

Monitoring Integration

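Combine Nginx health checks with external monitoring tools such as Prometheus and Grafana. The stub_status module reports active connections along with cumulative accepted, handled, and request counters, which an exporter such as nginx-prometheus-exporter can expose to Prometheus. stub_status does not report per-upstream data; for upstream response times, log variables such as $upstream_response_time in the access log, or use the Nginx Plus API.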
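A minimal sketch of both pieces follows. It assumes Nginx was built with the stub_status module (--with-http_stub_status_module); the port, the status path, the log file location, and the format name upstream_timing are all hypothetical:

```nginx
# http context
log_format upstream_timing '$remote_addr [$time_local] "$request" $status '
                           'upstream=$upstream_addr '
                           'upstream_status=$upstream_status '
                           'upstream_time=$upstream_response_time';

server {
    listen 80;
    access_log /var/log/nginx/upstream.log upstream_timing;   # assumed path

    location / {
        proxy_pass http://backend;
    }
}

server {
    listen 127.0.0.1:8081;   # assumed port, reachable from localhost only

    location = /nginx_status {
        stub_status;         # connection and request counters
        allow 127.0.0.1;
        deny all;
    }
}
```

When a request is retried on another server, $upstream_addr, $upstream_status, and $upstream_response_time each contain one comma-separated value per attempt, which makes retries visible in the log.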

Use Case

You are running a high-availability application across multiple servers and need Nginx to automatically detect and route around failed backends without manual intervention.
