Load balancer health checks are tests that confirm the availability of backend servers.
Depending on the protocol, a test takes the form of a request or a connection attempt.
The health check policy includes a time interval you specify, to ensure that the backend
servers are continuously monitored. If a server fails a health check, the load balancer takes
the server temporarily out of rotation. If the server later passes the health check, the load
balancer returns it to the rotation.
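The rotation behavior described above can be sketched in a few lines of Python. This is only an illustration: the Backend class, the apply_check_result and eligible helpers, and the server names are hypothetical, and the sketch acts on a single check result per interval, ignoring any retry or threshold settings a real policy may have.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """A backend server as seen by a hypothetical load balancer."""
    name: str
    in_rotation: bool = True


def apply_check_result(backend: Backend, passed: bool) -> Backend:
    """Update rotation membership from the latest health check result."""
    backend.in_rotation = passed
    return backend


def eligible(backends: list) -> list:
    """Names of the backends that currently receive traffic."""
    return [b.name for b in backends if b.in_rotation]


# Simulated results of three consecutive check intervals for two servers.
servers = [Backend("app-1"), Backend("app-2")]
history = []
for results in [(True, True), (True, False), (True, True)]:
    for backend, passed in zip(servers, results):
        apply_check_result(backend, passed)
    history.append(eligible(servers))
```

In this run, app-2 fails the second check and leaves the rotation, then passes the third check and returns.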
The health check policy is configured when you create a backend set. You can configure
TCP-level or HTTP-level health checks for your backend servers. For backend sets configured
with SSL, the health checks also use SSL encryption.
TCP-level health checks attempt to make a TCP connection with the backend servers and
validate the response based on the connection status.
HTTP-level health checks send requests to the backend servers at a specific URI and
validate the response based on the status code or entity data (body) returned.
Configure your health check protocol to match your application or service. If you run an HTTP
service, then configure an HTTP-level health check. If you run a TCP-level health check
against an HTTP service, then you might not get an accurate response. The TCP handshake can
succeed and indicate that the service is up even when the HTTP service is incorrectly
configured or having other issues. Although the health check returns no errors, you might
experience transaction failures.
For example:
The backend HTTP service has issues when communicating with the health check URL and the health check URL returns 5nn messages. An HTTP health check catches the message from the health check URL and marks the service as down. In this case, a TCP health check handshake succeeds and marks the service as healthy, even though the HTTP service might not be usable.
The backend HTTP service responds with 4nn messages because of authorization
issues or no configured content. A TCP health check does not catch these errors.
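The difference between the two check types can be reproduced locally. The following Python sketch starts a throwaway HTTP server whose health URL always returns 503, then runs a TCP-style check and an HTTP-style check against it. The handler, the function names, and the /health path are assumptions made for illustration; this is not how the Load Balancing service itself implements its checks.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class FailingHandler(BaseHTTPRequestHandler):
    """Hypothetical backend whose health URL always answers 503."""

    def do_GET(self):
        self.send_response(503)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def tcp_check(host, port, timeout=3.0):
    """Pass if a TCP connection can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(host, port, path="/health", expected_status=200, timeout=3.0):
    """Pass only if the health URL returns the expected status code."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status == expected_status
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()


def compare_checks():
    """Run both checks against a local backend that returns 503."""
    server = HTTPServer(("127.0.0.1", 0), FailingHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        host, port = server.server_address
        return tcp_check(host, port), http_check(host, port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling compare_checks() returns one result per check type: the TCP check passes because the port accepts connections, while the HTTP check fails because of the 503 response.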
The service provides application-specific health check capabilities to help you increase
availability and reduce your application maintenance window.
Health status indicators are used to report the general health of a load balancer and its backend servers/sets. The possible statuses are: ok, warning, critical, unknown. Health status is updated every three minutes. No finer granularity is available. Historical health data is not provided.
Interpreting Load Balancer Health Issues
At the highest level, load balancer health reflects the status of its components. The
health status indicators provide information you might need to drill down into and investigate
further. Below are several common issues that the health status indicators can help you detect
and correct.
Health check misconfigured
All the backend servers for one or more of the affected listeners report as unhealthy.
If your investigation finds that the backend servers do not have problems, then a
backend set probably includes a misconfigured health check.
Listener misconfigured
All the backend server health status indicators report OK, but the load balancer does
not pass traffic on a listener. The listener might be configured to listen on the wrong
port, use the wrong protocol, or use the wrong policy. If your investigation shows that
the listener is not at fault, check the security rule configuration.
Security rule misconfigured
Health status indicators help you diagnose these cases of misconfigured security
rules:
All health status indicators report OK, but traffic does not flow (as with
misconfigured listeners). If the listener is not at fault, check the security rule
configuration.
All health status indicators report as unhealthy. You have checked your health
check configuration and your services run properly on your backend servers. In this
case, your security rules might not include the IP range for the source of the
health check requests. The source IP for health check requests belongs to a compute
instance managed by the Load Balancing service.
Note
Traffic may also be blocked because of misconfigured route tables in the compute
instances.
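When all backends report as unhealthy and the services themselves are fine, it can help to check whether the source address of the health check requests falls inside any ingress rule. The sketch below does this with Python's ipaddress module. The rule list, CIDR ranges, and addresses are made-up values for illustration, and real security rules carry more fields (protocol, statefulness, and so on) than this simplified pair of source range and port.

```python
import ipaddress

# Hypothetical ingress rules for the backend subnet: (source CIDR, destination port).
INGRESS_RULES = [
    ("10.0.1.0/24", 80),
    ("10.0.2.0/24", 443),
]


def is_allowed(source_ip, port, rules=INGRESS_RULES):
    """Return True if some rule admits traffic from source_ip to port."""
    address = ipaddress.ip_address(source_ip)
    return any(
        address in ipaddress.ip_network(cidr) and port == rule_port
        for cidr, rule_port in rules
    )


# A health check source inside the first range reaches port 80 ...
inside = is_allowed("10.0.1.15", 80)
# ... but a source outside every configured range is dropped.
outside = is_allowed("10.0.0.7", 80)
```

If the address that sends the health check requests gives the same result as the second call, the rules do not cover it and the checks fail even though the backend is healthy.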
Backend server unhealthy
A backend server might be unhealthy or the health check might be misconfigured. To see
the corresponding error code, check the status field in the backend server's details
through the Console or CLI.
Common Side Effects of Load Balancer Health Check Misconfiguration
Misconfiguration scenarios are known to occur regularly. This page helps with
troubleshooting.
Wrong Port
In this scenario, all backend servers are reported as unhealthy. If you confirmed that
there are no issues with the backend servers, you might have made a mistake setting the
port. Traffic must be allowed, and the backend must be listening on that port.
Wrong Path
In this scenario, all backend servers are reported as unhealthy. If you confirmed that
there are no issues with the backend servers, you might have made a mistake setting the
path for the HTTP health check. The path must match one that the application on the
backend actually serves.
You can use the curl utility to run a test from a system within the same network. For
example: curl -i http://backend_ip_address/health
If the path is correct, the response contains the configured status code.
Wrong Status Code
In this scenario, all backend servers are reported as unhealthy. If you confirmed that
there are no issues with the backend servers, for an HTTP health check you might have
made a mistake setting the status code. It must match the actual status code being
returned from the backend. A typical mismatch is when a backend returns a 302 status
code while a 200 status code is expected. This is often caused by the backend directing
you to a login page or another location on the server. You can either make the backend
return the expected code or use 302 in your health check configuration.
Error message: msg:invalid statusCode, statusCode:nnn,expected:200
(where nnn represents the actual status code returned).
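A status code mismatch is easier to diagnose when the probe does not follow redirects, because a client that follows them would hide the 302 behind the final page. The Python sketch below uses http.client, which returns the first response as-is. The function names, the default /health path, and the wording of the hints are assumptions for illustration only.

```python
import http.client
from typing import Optional, Tuple


def fetch_status(host: str, port: int = 80, path: str = "/health",
                 timeout: float = 3.0) -> Tuple[int, Optional[str]]:
    """Return (status code, Location header) without following redirects."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def explain_status(actual: int, expected: int = 200,
                   location: Optional[str] = None) -> str:
    """Turn an actual/expected status pair into a troubleshooting hint."""
    if actual == expected:
        return "ok: status code matches the health check configuration"
    if 300 <= actual < 400:
        target = location or "an unknown location"
        return (f"redirect: backend answered {actual} and points to {target}; "
                f"return {expected} from the health path or expect {actual}")
    return f"mismatch: backend answered {actual}, health check expects {expected}"
```

For example, explain_status(*fetch_status("backend_ip_address"), ...) is not valid as written because of the argument order; call fetch_status first, then pass its two results as the actual and location arguments.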
Wrong Regex Pattern
In this scenario, all backend servers are reported as unhealthy. If you confirmed that
there are no issues with the backend servers, the regex pattern you configured might
not match the response body, or the backend might not be returning the expected body.
In this case, either change the backend response to match the pattern or correct the
pattern to match the backend response.
Error message: response match resulte: failed.
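A pattern can be tried against a captured response body before it goes into the health check configuration. The following Python sketch uses the re module with search semantics. That is an assumption: the regex engine and matching rules used by the load balancer might differ from Python's, so treat a local match as a sanity check rather than proof. The sample bodies and patterns are invented.

```python
import re


def body_matches(pattern: str, body: str) -> bool:
    """Return True if the pattern is found anywhere in the response body."""
    return re.search(pattern, body) is not None


# Hypothetical response bodies returned by a backend health endpoint.
healthy_body = '{"status": "UP"}'
degraded_body = '{"status": "DOWN"}'

# A pattern that tolerates optional whitespace after the colon.
pattern = r'"status":\s*"UP"'

matches_healthy = body_matches(pattern, healthy_body)
matches_degraded = body_matches(pattern, degraded_body)

# A common mistake: anchoring the pattern to the start of the body
# when the expected text appears later in the response.
anchored_mismatch = body_matches(r"^healthy", "Status: healthy")
```

Here the first pattern matches only the healthy body, and the anchored pattern fails even though the word it looks for is present.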
Misconfigured Security Rules
In this scenario, all or some backend servers report as unhealthy. If you confirmed
that there are no issues with the backend servers, then you might have improperly
configured the network security groups, security lists, or local firewalls (such
as firewalld, iptables, or SELinux).
In this scenario, you can use either the curl or netcat utility to run a test from a
system within the same subnet and network security group (NSG) as your load balancer
instance.
HTTP example: curl -i http://backend_ip_address/health
TCP example: nc -zvw3 backend_ip_address 443
Local firewalls can be verified by using the command: firewall-cmd --list-all
--zone=public. If the expected rules are missing from the firewall configuration, then
add the required service. For example, to add HTTP port 80: