What is a distributed consensus algorithm?
A distributed system runs many nodes in one cluster. Those nodes need a way to agree on the same data, or to reach agreement on a proposal. An algorithm that provides this agreement is called a distributed consensus algorithm.
What is Raft?
Raft is a distributed consensus algorithm designed to be easy to understand and explain. In contrast, Paxos, another distributed consensus algorithm that was first submitted in 1989, is widely considered hard to understand and implement. I’m going to give a short introduction to Raft.
A Raft node is always in one of three states: Leader, Follower, or Candidate. The Leader handles all client requests and replicates log entries to the Followers; in normal operation there is only one Leader at a time. A Candidate is the transitional state between Follower and Leader, used while a new Leader is being elected. A Follower is passive: it only responds to requests from Leaders and Candidates.
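The three states and the moves between them can be sketched as a small lookup table. This is a minimal illustration, not a full Raft node: the event names (`election_timeout`, `won_majority`, and so on) are hypothetical labels chosen for this example, and the step-down and retry transitions come from the Raft paper rather than from the text above.

```python
from enum import Enum


class State(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


# Allowed transitions, keyed by (current state, event).
# The event names are hypothetical labels for this sketch.
TRANSITIONS = {
    # A follower that hears nothing for too long starts an election.
    (State.FOLLOWER, "election_timeout"): State.CANDIDATE,
    # A candidate whose election produces no winner tries again.
    (State.CANDIDATE, "election_timeout"): State.CANDIDATE,
    # A candidate that collects votes from a majority becomes leader.
    (State.CANDIDATE, "won_majority"): State.LEADER,
    # A candidate that discovers a current leader goes back to following.
    (State.CANDIDATE, "discovered_leader"): State.FOLLOWER,
    # A leader that sees a higher term steps down.
    (State.LEADER, "saw_higher_term"): State.FOLLOWER,
}


def next_state(state, event):
    """Return the state after `event`; unknown pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Note that there is no direct Follower to Leader entry in the table: a node has to pass through Candidate and win an election first.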
Raft divides time into terms of arbitrary length, and each term begins with an election. Term numbers increase monotonically, so they act as a logical clock: a leader whose term is lower than another node's current term is stale.
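As a rough sketch of how term numbers expose a stale leader, a node can compare the term carried by each incoming message with its own. The `Node` class and `observe_term` method below are hypothetical names for this example; a real implementation also tracks votes, logs, and persistence.

```python
from dataclasses import dataclass


@dataclass
class Node:
    current_term: int = 0
    is_leader: bool = False

    def observe_term(self, message_term: int) -> bool:
        """Return False if the sender is stale, True if the message is acceptable."""
        if message_term < self.current_term:
            # The sender is behind us: it is stale, so reject its message.
            return False
        if message_term > self.current_term:
            # We are the one who is behind: adopt the newer term and,
            # if we believed we were the leader, step down.
            self.current_term = message_term
            self.is_leader = False
        return True
```

For example, a leader at term 3 rejects a message from term 2, but on seeing a message from term 5 it moves to term 5 and stops acting as leader.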
Election timeout and Heartbeat timeout
A follower starts an election to choose a new leader if it receives no request for a period called the election timeout. The leader sends heartbeat messages to its followers at an interval called the heartbeat timeout, and each follower resets its election timeout whenever it receives one.
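The interplay between the two timeouts can be sketched with a simple deadline that each heartbeat pushes forward. The class name `ElectionTimer`, the millisecond clock passed in by the caller, and the 300 ms default are assumptions made for this example; real Raft implementations pick a randomized election timeout so that followers do not all time out at once.

```python
class ElectionTimer:
    """Tracks when a follower should give up on the leader (hypothetical helper)."""

    def __init__(self, election_timeout_ms=300, now_ms=0):
        self.election_timeout_ms = election_timeout_ms
        self.deadline_ms = now_ms + election_timeout_ms

    def on_heartbeat(self, now_ms):
        # A heartbeat from the leader restarts the countdown.
        self.deadline_ms = now_ms + self.election_timeout_ms

    def should_start_election(self, now_ms):
        # True once a full election timeout has passed without a heartbeat.
        return now_ms >= self.deadline_ms
```

As long as the leader's heartbeat interval is comfortably shorter than the election timeout, `should_start_election` stays false on every follower; it only turns true when heartbeats stop arriving.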
Replicated state machine
A replicated state machine is typically implemented using a replicated log. If every server's log contains the same commands in the same order, and each server's deterministic state machine executes them in that order, all servers end up in the same state.
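A small example may make this concrete. The sketch below assumes a toy key-value store as the state machine and a list of tuples as the log; both are illustrative choices for this example, not part of Raft itself.

```python
class KeyValueStateMachine:
    """A deterministic toy state machine: the same commands give the same data."""

    def __init__(self):
        self.data = {}

    def apply(self, command):
        op, key, *rest = command
        if op == "set":
            self.data[key] = rest[0]
        elif op == "delete":
            self.data.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")


def replay(log):
    """Build a fresh state machine and apply every log entry in order."""
    machine = KeyValueStateMachine()
    for command in log:
        machine.apply(command)
    return machine.data


# The replicated log: every server holds the same entries in the same order.
log = [("set", "x", 1), ("set", "y", 2), ("set", "x", 3), ("delete", "y")]

server_a = replay(log)
server_b = replay(log)
assert server_a == server_b  # identical logs produce identical state
```

Order matters as much as content: replaying the same four commands in reverse leaves different data behind, which is why the consensus algorithm has to agree on the sequence of log entries and not just on the set of commands.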