Load Balancing
The traffic cop of your system: it distributes requests across servers so that no single node is overwhelmed. The entry point for scale.
What is a Load Balancer?
L4 vs L7 Load Balancing
Layer 4 (Transport)
Operates at the TCP/UDP level. It sees IP addresses and ports, but NOT the content of the request.
- ✓ Extremely fast (packet forwarding)
- ✓ High throughput
- ✓ "Dumb" routing (Round Robin/IP Hash)
- ✗ Can't cache content
- ✗ Can't route based on URL path
Layer 7 (Application)
Operates at the HTTP level. For HTTPS traffic it first decrypts the request (TLS termination), then inspects headers, URL, cookies, and payload.
- ✓ Smart routing (/api vs /images)
- ✓ Can terminate SSL/TLS (HTTPS)
- ✓ Rate limiting & Auth
- ✗ Slower (CPU intensive)
- ✗ More complex to manage
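As a rough illustration of the path-based routing a Layer 7 balancer can do, here is a minimal sketch. The route table, pool names, and the `choose_pool` helper are hypothetical and not taken from any particular product.

```python
# Hypothetical route table: URL path prefix -> backend pool name.
ROUTES = {
    "/api": "api-pool",
    "/images": "static-pool",
}
DEFAULT_POOL = "web-pool"


def choose_pool(path: str) -> str:
    """Pick a backend pool from the request path, trying longer prefixes first."""
    for prefix in sorted(ROUTES, key=len, reverse=True):
        # Match the prefix itself or anything below it, but not "/apiary".
        if path == prefix or path.startswith(prefix + "/"):
            return ROUTES[prefix]
    return DEFAULT_POOL


print(choose_pool("/api/users"))
print(choose_pool("/images/cat.png"))
print(choose_pool("/checkout"))
```

A Layer 4 balancer could not make this decision, because the path is only visible once the HTTP request has been parsed.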
Standard Algorithms
Round Robin
Requests go to Server 1, then 2, then 3, then back to 1. Simple, fair if servers are equal.
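A minimal sketch of the rotation described above, assuming a fixed list of servers. The `RoundRobin` class and server names are illustrative only.

```python
from itertools import cycle


class RoundRobin:
    """Hands out servers in a fixed, repeating order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(list(servers))

    def pick(self) -> str:
        return next(self._rotation)


rr = RoundRobin(["server-1", "server-2", "server-3"])
picks = [rr.pick() for _ in range(4)]
print(picks)  # server-1, server-2, server-3, then back to server-1
```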
Least Connections
Sends each request to the server with the fewest active connections. Good for long-lived sessions (e.g. chat).
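One way to sketch this, assuming the balancer keeps a counter of open connections per server. The `LeastConnections` class and its `acquire`/`release` methods are hypothetical names.

```python
class LeastConnections:
    """Tracks open connections and picks the least-loaded server."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self) -> str:
        # min() returns the first server with the lowest count,
        # so ties are broken by the order servers were registered.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        # Call this when the connection closes.
        self.active[server] -= 1


lb = LeastConnections(["a", "b"])
first = lb.acquire()   # "a": both idle, first registered wins
second = lb.acquire()  # "b": "a" already has one connection
lb.release(second)     # the connection on "b" closes
third = lb.acquire()   # "b" again: it is now the least loaded
```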
Weighted Round Robin
Assigns more requests to more powerful servers. With a 5:1 ratio, SuperServer receives five requests for every one sent to MiniServer.
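The 5:1 ratio can be sketched by repeating each server in the rotation according to its weight. This is the simplest possible approach, not how any specific balancer does it; it sends requests to the heavy server in bursts, whereas production balancers usually interleave the picks more evenly. The `weighted_cycle` helper is a hypothetical name.

```python
from collections import Counter
from itertools import cycle


def weighted_cycle(weights):
    """Return an endless iterator that yields each server `weight` times per round."""
    expanded = [server for server, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


picker = weighted_cycle({"SuperServer": 5, "MiniServer": 1})
counts = Counter(next(picker) for _ in range(12))  # two full rounds of six
print(counts)
```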
IP Hash
Hashes the client's IP to always pick the same server. Useful for sticky sessions.
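A minimal sketch of IP hashing, assuming SHA-256 as the hash function (the choice of hash is an assumption; any stable hash works). Python's built-in `hash()` is avoided because it is randomised per process for strings, which would break stickiness across restarts.

```python
import hashlib


def pick_by_ip(client_ip: str, servers: list) -> str:
    """Map a client IP to a server; the same IP always gets the same server."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["server-1", "server-2", "server-3"]
chosen = pick_by_ip("203.0.113.7", servers)
print(chosen)
```

The mapping only stays stable while the server list is unchanged, because the index depends on `len(servers)`.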
Consistent Hashing
The "Ring" Problem
Standard hashing, hash(key) % N, breaks down when N changes (a server is added or removed): almost ALL keys get remapped to different servers.
Consistent Hashing maps both servers and keys onto a circle (pictured here as 0-360°; in practice the circle spans the output range of the hash function). A key goes to the first server it finds moving clockwise.
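The contrast between modulo hashing and the ring can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 as the hash, and each server placed on the ring at 100 points ("virtual nodes") to even out the distribution. Virtual nodes are not part of the description above, and the `HashRing` class and `_hash` helper are hypothetical names.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, servers, vnodes=100):
        # Assumption: each server gets `vnodes` points on the ring.
        self._points = sorted(
            (_hash(f"{server}#{i}"), server)
            for server in servers
            for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def lookup(self, key: str) -> str:
        # First point clockwise from the key, wrapping past the end.
        idx = bisect.bisect_right(self._hashes, _hash(key)) % len(self._hashes)
        return self._points[idx][1]


keys = [f"key-{i}" for i in range(10_000)]
before = HashRing(["a", "b", "c", "d"])
after = HashRing(["a", "b", "c", "d", "e"])

# Keys that change owner when a fifth server joins.
moved_ring = sum(before.lookup(k) != after.lookup(k) for k in keys)
moved_mod = sum(_hash(k) % 4 != _hash(k) % 5 for k in keys)
print(f"ring: {moved_ring} moved, modulo: {moved_mod} moved")
```

On the ring, every key that moves lands on the new server "e"; with modulo hashing, keys are reshuffled among all servers.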
Only about 1/N of the keys need to move when a server is added or removed. Essential for caching systems and distributed DBs (Cassandra, Dynamo).
Health Checks
A Load Balancer must know if a server is dead. It pings servers periodically.
- Active: Server responds to /health with 200 OK.
- Dead: Connection timeout or 500 Error.
"If a server fails 3 pings in a row, stop sending it traffic. When it passes 2, resume."
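The "fail 3, pass 2" rule above can be sketched as a small state machine. The `HealthTracker` class is a hypothetical name; it only tracks results and leaves the actual probe (for example an HTTP GET to /health) to the caller.

```python
class HealthTracker:
    """Marks a server down after N straight failures, up after M straight passes."""

    def __init__(self, fail_threshold=3, pass_threshold=2):
        self.fail_threshold = fail_threshold
        self.pass_threshold = pass_threshold
        self.healthy = True
        self._streak = 0  # consecutive results that contradict the current state

    def record(self, check_passed: bool) -> bool:
        """Feed in one probe result; returns whether the server should get traffic."""
        if check_passed == self.healthy:
            self._streak = 0  # result agrees with current state, reset the streak
        else:
            self._streak += 1
            needed = self.fail_threshold if self.healthy else self.pass_threshold
            if self._streak >= needed:
                self.healthy = not self.healthy
                self._streak = 0
        return self.healthy


tracker = HealthTracker()
probes = [False, False, True, False, False, False, True, True]
states = [tracker.record(p) for p in probes]
print(states)
```

Requiring several results in a row before changing state keeps a single slow response from removing a server, and a single lucky response from restoring it.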
Sticky Sessions
Sometimes, a user MUST go to the same server (e.g., local session data). The LB uses a cookie or IP hash to bind a user to a specific node.
The Danger
Sticky sessions make autoscaling hard: newly added servers only pick up new users, while existing users stay bound to the old ones. If a server dies, all users pinned to it lose their session data.
Better Solution: Stateless servers. Store session data in a shared cache like Redis.
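A minimal sketch of the stateless approach. `SessionStore` is an in-memory stand-in for a shared cache such as Redis, used here so the example runs without extra dependencies; its `save`/`load` interface, the `AppServer` class, and the shopping-cart data are all hypothetical.

```python
import json
import uuid


class SessionStore:
    """In-memory stand-in for a shared cache (e.g. Redis); not a real client."""

    def __init__(self):
        self._data = {}

    def save(self, session_id: str, session: dict) -> None:
        self._data[session_id] = json.dumps(session)

    def load(self, session_id: str) -> dict:
        raw = self._data.get(session_id)
        return json.loads(raw) if raw is not None else {}


class AppServer:
    """Keeps no session state of its own; everything lives in the shared store."""

    def __init__(self, name: str, store: SessionStore):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id: str, item: str) -> list:
        session = self.store.load(session_id)
        session.setdefault("cart", []).append(item)
        self.store.save(session_id, session)
        return session["cart"]


store = SessionStore()
server_a = AppServer("a", store)
server_b = AppServer("b", store)

session_id = str(uuid.uuid4())
server_a.add_to_cart(session_id, "book")
cart = server_b.add_to_cart(session_id, "pen")  # a different server, same session
print(cart)
```

Because either server can serve any request, the load balancer is free to use any algorithm, and losing a server loses no session data.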