Load Balancing

The traffic cop of your system. A load balancer distributes requests across servers so that no single node is overwhelmed. It is the entry point for scale.

What is a Load Balancer?

Users → Load Balancer (Traffic Cop) → Server 1, Server 2, Server 3
A load balancer sits in front of your servers and routes client requests to them. It handles SSL termination, health checks, and traffic distribution.

L4 vs L7 Load Balancing

Layer 4 (Transport)

Operates at the TCP/UDP level. It sees IP addresses and ports, but NOT the content of the request.

  • Extremely fast (packet forwarding)
  • High throughput
  • "Dumb" routing (Round Robin/IP Hash)
  • Can't cache content
  • Can't route based on URL path
Examples: HAProxy (TCP mode), network load balancers such as AWS NLB

Layer 7 (Application)

Operates at the HTTP level. It decrypts the request and inspects headers, URL, cookies, and payload.

  • Smart routing (/api vs /images)
  • Can terminate SSL (HTTPS)
  • Rate limiting & Auth
  • Slower (CPU intensive)
  • More complex to manage
Example: Nginx, AWS ALB, Envoy
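
The "smart routing" bullet can be sketched as a prefix match on the URL path, which is exactly the information a Layer 4 balancer never sees. This is a minimal illustration, not how Nginx, ALB, or Envoy implement it; the pool names and the `route` function are hypothetical.

```python
# Hypothetical routing table: URL path prefix -> backend pool name.
ROUTES = [("/api", "api-pool"), ("/images", "image-pool")]
DEFAULT_POOL = "web-pool"  # assumed fallback for everything else


def route(path):
    """Return the backend pool for a request path using prefix matching."""
    for prefix, pool in ROUTES:
        # Match the prefix itself or anything below it, but not "/apiary".
        if path == prefix or path.startswith(prefix + "/"):
            return pool
    return DEFAULT_POOL


print(route("/api/users"))        # api-pool
print(route("/images/logo.png"))  # image-pool
print(route("/checkout"))         # web-pool
```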

Standard Algorithms

Round Robin

Requests go to Server 1, then 2, then 3, then back to 1. Simple, and fair if all servers have equal capacity.
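
A minimal sketch of that rotation, assuming three placeholder server names. `RoundRobinBalancer` and `pick` are illustrative names, not part of any real load balancer's API.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed, repeating order."""

    def __init__(self, servers):
        # cycle() loops over the list forever: 1, 2, 3, 1, 2, 3, ...
        self._rotation = cycle(list(servers))

    def pick(self):
        return next(self._rotation)


balancer = RoundRobinBalancer(["server-1", "server-2", "server-3"])
picks = [balancer.pick() for _ in range(4)]
print(picks)  # ['server-1', 'server-2', 'server-3', 'server-1']
```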

Least Connections

Sends each request to the server with the fewest active connections. Good for long-lived sessions (e.g. chat).
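
One way to picture this is a counter per server that goes up when a connection opens and down when it closes. The sketch below assumes the balancer is told about both events; the class and method names are made up for illustration.

```python
class LeastConnectionsBalancer:
    """Tracks open connections and picks the least busy server."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # min() returns the first server with the lowest count,
        # so ties are broken by the order servers were registered.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when the connection closes.
        self.active[server] -= 1


lb = LeastConnectionsBalancer(["server-1", "server-2", "server-3"])
first = lb.acquire()    # server-1
second = lb.acquire()   # server-2
third = lb.acquire()    # server-3
lb.release(second)      # the session on server-2 ends
fourth = lb.acquire()   # server-2 again: it now has the fewest connections
print(first, second, third, fourth)
```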

Weighted Round Robin

Assigns more requests to more powerful servers. With a 5:1 ratio, SuperServer receives five requests for every one sent to MiniServer.
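
The simplest way to get a 5:1 split is to repeat each server in the rotation as many times as its weight. The sketch below does exactly that; `weighted_rotation` is a hypothetical helper, and the weights are the ones from the example above.

```python
from collections import Counter
from itertools import cycle


def weighted_rotation(weights):
    """Build an endless rotation where each server appears `weight` times."""
    expanded = [
        server
        for server, weight in weights.items()
        for _ in range(weight)
    ]
    return cycle(expanded)


rotation = weighted_rotation({"SuperServer": 5, "MiniServer": 1})
counts = Counter(next(rotation) for _ in range(12))
print(counts["SuperServer"], counts["MiniServer"])  # 10 2
```

This naive expansion sends five requests in a row to SuperServer. Production balancers often interleave the picks more evenly, but the long-run ratio is the same.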

IP Hash

Hashes the client's IP to always pick the same server. Useful for sticky sessions.
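
A minimal sketch of the idea, assuming a fixed list of three servers. It uses SHA-256 only because Python's built-in `hash()` is randomized per process for strings; the function name and the sample IP (from a documentation address range) are illustrative.

```python
import hashlib

SERVERS = ["server-1", "server-2", "server-3"]


def pick_server(client_ip, servers=SERVERS):
    """Map a client IP to a server; the same IP always maps to the same one."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap it
    # onto the server list with a modulo.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


a = pick_server("203.0.113.7")
b = pick_server("203.0.113.7")
print(a == b)  # True: repeated requests from one IP land on one server
```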

Consistent Hashing

The "Ring" Problem

Standard hashing, hash(key) % N, breaks down when N changes (a server is added or removed). Almost all keys get remapped to different servers: going from N to N+1 servers moves roughly N/(N+1) of them.
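
The scale of the problem is easy to measure. The sketch below assumes 10,000 made-up keys and counts how many change servers when a fourth server joins a pool of three; `stable_hash` and `moved_fraction` are illustrative helpers.

```python
import hashlib


def stable_hash(key):
    """Deterministic integer hash (Python's hash() varies between runs)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


def moved_fraction(keys, old_n, new_n):
    """Fraction of keys whose server index changes when N changes."""
    moved = sum(
        1 for key in keys
        if stable_hash(key) % old_n != stable_hash(key) % new_n
    )
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, old_n=3, new_n=4)
# With uniformly distributed hashes this is roughly 0.75:
# about three out of four keys end up on a different server.
print(f"{fraction:.2f}")
```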

Consistent Hashing maps both servers and keys onto a circle (0-360°). A key goes to the first server it finds moving clockwise.

Benefit: When a server leaves, only its own keys move, and they go to its immediate clockwise neighbor. On average only about 1/N of the keys need to be moved. Essential for caching systems and distributed databases (Cassandra, Dynamo).
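
A minimal sketch of the ring, assuming one point per server and a 64-bit hash space in place of the 0-360° circle. `HashRing` and its methods are illustrative names, not the API of Cassandra, Dynamo, or any library.

```python
import bisect
import hashlib


def ring_position(name):
    """Place a server name or key on the ring (a 64-bit integer circle)."""
    return int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, servers):
        self._points = sorted((ring_position(s), s) for s in servers)
        self._positions = [pos for pos, _ in self._points]

    def remove(self, server):
        self._points = [(p, s) for p, s in self._points if s != server]
        self._positions = [pos for pos, _ in self._points]

    def lookup(self, key):
        # First server clockwise from the key; wrap to the start of the ring.
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._points[idx % len(self._points)][1]


keys = [f"user:{i}" for i in range(10_000)]
ring = HashRing(["server-1", "server-2", "server-3", "server-4"])
before = {key: ring.lookup(key) for key in keys}
ring.remove("server-2")
after = {key: ring.lookup(key) for key in keys}

moved = [key for key in keys if before[key] != after[key]]
# Only keys that lived on the removed server changed owner.
print(all(before[key] == "server-2" for key in moved))  # True
```

With a single point per server the arcs are uneven, so the share that moves in any one run can differ from 1/N. The 1/N figure is the average.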

Health Checks

A Load Balancer must know if a server is dead. It pings servers periodically.

  • Active: Server responds to /health with 200 OK.
  • Dead: Connection timeout or 500 Error.

"If a server fails 3 pings in a row, stop sending it traffic. When it passes 2, resume."

Server 1: healthy
Server 2: failed
Server 3: healthy
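
The threshold rule quoted above (3 failures to remove a server, 2 passes to restore it) can be sketched as a small state machine. This tracks ping results only and does not perform any network calls; `HealthTracker` and its parameter names are hypothetical.

```python
class HealthTracker:
    """Flips a server's state only after enough consecutive contrary pings."""

    def __init__(self, fail_threshold=3, pass_threshold=2):
        self.fail_threshold = fail_threshold
        self.pass_threshold = pass_threshold
        self.healthy = True
        self._streak = 0  # consecutive pings that contradict the current state

    def record(self, ping_ok):
        """Record one ping result and return whether the server gets traffic."""
        if ping_ok == self.healthy:
            # Result agrees with the current state: reset the streak.
            self._streak = 0
            return self.healthy
        self._streak += 1
        needed = self.pass_threshold if ping_ok else self.fail_threshold
        if self._streak >= needed:
            self.healthy = ping_ok
            self._streak = 0
        return self.healthy


tracker = HealthTracker()
history = [tracker.record(ok) for ok in [False, False, False, True, True]]
print(history)  # [True, True, False, False, True]
```

Requiring several results in a row keeps a single dropped ping from removing a healthy server, and keeps a flapping server from being restored too early.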

Sticky Sessions

Sometimes, a user MUST go to the same server (e.g., local session data). The LB uses a cookie or IP hash to bind a user to a specific node.

The Danger

Sticky sessions make autoscaling hard. If a server dies, all its stuck users lose their session data.

Better Solution: Stateless servers. Store session data in a shared cache like Redis.
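
A minimal sketch of the stateless approach. `SharedSessionStore` is an in-memory stand-in for a shared cache such as Redis (no Redis client is used here), and the `AppServer` class and cart example are assumptions for illustration.

```python
class SharedSessionStore:
    """In-memory stand-in for a shared cache that every server can reach."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, session):
        self._data[session_id] = dict(session)

    def load(self, session_id):
        return dict(self._data.get(session_id, {}))


class AppServer:
    """Keeps no session state of its own; everything lives in the store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.load(session_id)
        session.setdefault("cart", []).append(item)
        self.store.save(session_id, session)

    def view_cart(self, session_id):
        return self.store.load(session_id).get("cart", [])


store = SharedSessionStore()
server_1 = AppServer("server-1", store)
server_2 = AppServer("server-2", store)

server_1.add_to_cart("session-abc", "book")
# The next request can land on any server and still see the cart.
cart_seen_by_server_2 = server_2.view_cart("session-abc")
print(cart_seen_by_server_2)  # ['book']
```

Because no server holds the session, the balancer can use any algorithm and a failed node costs users nothing but a retry.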