handshake

Consensus
Algorithms

How do distributed nodes agree on a single value (like "Who is the leader?") when messages can be lost and nodes can lie?

The Problem: Reaching Agreement

psychology_alt

Two Generals Problem

Two armies (nodes) need to attack a city at the exact same time. They can only communicate by sending a messenger through enemy territory (unreliable network).

General A: "Attack at dawn?"
General B: "Yes, dawn works." (Messenger captured? Did A get it?)
General A: "I got your Yes. Confirming." (Did B get this?)
// It's impossible to be 100% sure the other side knows.

Consensus algorithms like Paxos and Raft solve a relaxed version of this: agreeing on a value assuming the network eventually delivers messages.

Paxos (The Theorist)

The original consensus algorithm by Leslie Lamport. Proven correct, but famously difficult to understand and implement.

Key Concept: Proposals

Nodes propose values with a sequence number. Higher numbers win. Requires a majority (Quorum) to accept a proposal.

Used by: Google Chubby, Apache Zookeeper (ZAB is similar).

school

"There is only one consensus protocol, and that is Paxos."
— Mike Burrows (Google)

Raft (The Pragmatist)

anchor

Understandable Consensus

Designed explicitly to be easier to understand than Paxos. It decomposes the problem into: Leader Election, Log Replication, and Safety.

1. Leader

Accepts all writes. Handles replication.

2. Follower

Passive. Responds to Leader. Becomes Candidate if Leader dies.

3. Candidate

Active only during election. Asks for votes.

Used by: etcd (Kubernetes), Consul, CockroachDB.

Leader Election

How does Raft pick a leader?

Step 1: Timeout

Followers expect a heartbeat from Leader every 100ms. If silence for 300ms (randomized), a Follower assumes Leader is dead.

Step 2: Request Vote

Follower becomes Candidate. Increments term counter. Sends "Vote for Me!" to all.

Step 3: Majority Wins

If a Candidate gets votes from a majority (N/2 + 1), it becomes Leader. It starts sending heartbeats to suppress other candidates.

Gossip Protocols

Epidemic Algorithms

Unlike Raft (Strong Consistency), Gossip is for Eventual Consistency. Nodes pick random peers and exchange information. Like a virus spreading.

How it works

  • Every second, Node A picks random Node B.
  • They exchange state ("I know X=5", "Oh, I thought X=4").
  • Both update to the latest version.
  • Information propagates exponentially.

Use Cases

• Cassandra (Cluster membership)
• DynamoDB (Failure detection)
• Bitcoin (Peer discovery)