
Raft and Paxos: Distributed Consensus Algorithms


Introduction

Greetings, fellow tech enthusiasts! Today, we’re setting out on an exciting journey into the world of distributed systems, where we’ll uncover the magic behind two prominent consensus algorithms: Raft and Paxos.

In distributed systems, achieving consensus among multiple nodes is essential for data replication, fault tolerance, and system reliability.

Both Raft and Paxos are designed to tackle this challenging task, but they do so in distinct ways. Let’s learn how - and why - these algorithms work!

Understanding Consensus Algorithms

Consensus algorithms play a pivotal role in distributed systems, ensuring that all nodes in a cluster agree on a single, consistent state despite failures and network partitions.

By establishing a shared understanding of the system’s state, consensus algorithms enable fault tolerance and data replication, making distributed systems resilient and reliable.

Introducing Paxos

Paxos ↗️, one of the pioneering consensus algorithms, was introduced by Leslie Lamport in the late 1980s.

In its basic form, any node can act as a proposer, putting forward changes to the system’s state; in practice, Multi-Paxos deployments elect a distinguished proposer that behaves much like a leader, coordinating proposals.

At its heart, Paxos operates on a few fundamental concepts:

  • Proposal: In Paxos, a proposer initiates a proposal, which contains a value or operation that it wants to reach consensus on.

    Each proposal carries a unique, monotonically increasing number, which nodes use to order proposals.

  • Acceptor: Acceptors are nodes that receive and store proposals. They play a crucial role in deciding which proposals get accepted and in what order.

  • Learner: Learners are nodes that gather the accepted proposals from acceptors and ensure that consensus is achieved.

    They provide the final output to the system.

Here’s a simplified walkthrough of how Paxos achieves consensus:

Phase 1: Prepare (or Prepare & Promise)

In this phase, the node acting as a proposer sends a prepare message to a quorum of acceptors. The prepare message includes the proposal number.

Each acceptor checks whether it has already promised a proposal with a higher number. If not, it promises not to accept any proposal with a lower number and responds with a promise, reporting any proposal it has previously accepted.
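The prepare/promise exchange can be sketched in a few lines of Python (the `Acceptor` class and its method names are illustrative, not from any real library):

```python
class Acceptor:
    """Only the state needed for the prepare phase is shown."""

    def __init__(self):
        self.promised_n = 0         # highest proposal number promised so far
        self.accepted_n = None      # number of a previously accepted proposal
        self.accepted_value = None  # value of a previously accepted proposal

    def on_prepare(self, n):
        """Handle a prepare(n) message from a proposer."""
        if n > self.promised_n:
            self.promised_n = n
            # Promise not to accept anything numbered below n, and report
            # any proposal this acceptor has already accepted.
            return ("promise", n, self.accepted_n, self.accepted_value)
        return ("reject", n)
```

A proposer that collects promises from a quorum must adopt the highest-numbered accepted value reported back, if any, before moving to phase 2.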

Phase 2: Accept (or Accept & Accepted)

If a proposer receives promises from a quorum of acceptors, it can send an accept message with its proposal to those acceptors.

Upon receiving an accept message, an acceptor accepts the proposal unless it has already promised a higher-numbered one. If it accepts, it sends an accepted message.
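The accept side can be sketched similarly (again, illustrative names; only the accept-phase state is shown):

```python
class Acceptor:
    """Only the state needed for the accept phase is shown."""

    def __init__(self):
        self.promised_n = 0  # highest proposal number promised in phase 1
        self.accepted_n = None
        self.accepted_value = None

    def on_accept(self, n, value):
        """Handle an accept(n, value) message from a proposer."""
        # Accept unless a higher-numbered proposal has already been promised.
        if n >= self.promised_n:
            self.promised_n = n
            self.accepted_n, self.accepted_value = n, value
            return ("accepted", n, value)
        return ("reject", n)
```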

Phase 3: Learn

Learner nodes collect the accepted proposals from acceptors. Once they observe that a quorum of acceptors has accepted the same proposal, consensus is achieved.

The learners provide the agreed-upon value or operation as the final output.
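A learner’s quorum check can be sketched as follows (a simplification: it assumes each acceptor reports a given proposal at most once):

```python
from collections import Counter

class Learner:
    """Declares consensus once a quorum accepts the same proposal."""

    def __init__(self, num_acceptors):
        self.quorum = num_acceptors // 2 + 1  # strict majority
        self.counts = Counter()

    def on_accepted(self, n, value):
        """Record an accepted(n, value) message; return the value on quorum."""
        self.counts[(n, value)] += 1
        if self.counts[(n, value)] >= self.quorum:
            return value  # consensus reached
        return None       # still waiting for a quorum
```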


While this is a simplified overview of Paxos, implementing it correctly and efficiently in a real-world distributed system can be challenging.

Over the years, several variants ↗️ and optimizations of Paxos have emerged to address various practical issues.

The Raft Consensus Algorithm

Raft ↗️, developed by Diego Ongaro and John Ousterhout, offers a more intuitive and understandable approach to consensus.

It features a leader election process and emphasizes simplicity, making it easier to comprehend and implement compared to Paxos.

Raft’s design revolves around several key components:

  • Leader: Raft operates in a leader-follower model. One node is elected as the leader, responsible for managing the replication of log entries across all nodes.

  • Follower: Nodes in the system that are not leaders are followers. They replicate log entries sent by the leader.

Relate this to the primary and replica nodes in a primary-replica database architecture.

  • Candidate: When there is a need to elect a new leader (e.g., if the current leader fails), followers transition to candidate status and participate in leader election.

  • Log: Each node maintains a log, which stores a sequence of commands or entries. Log entries represent changes to the system’s state.
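The roles and per-node state above can be sketched as follows (field names are illustrative, loosely following the state every Raft node tracks):

```python
from enum import Enum, auto

class Role(Enum):
    FOLLOWER = auto()
    CANDIDATE = auto()
    LEADER = auto()

class RaftNode:
    """Minimal state a Raft node maintains."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.role = Role.FOLLOWER  # every node starts as a follower
        self.current_term = 0      # latest term this node has seen
        self.voted_for = None      # candidate voted for in the current term
        self.log = []              # list of (term, command) entries
        self.commit_index = 0      # highest log index known to be committed
```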

Raft achieves consensus through a series of phases and roles. Here’s a step-by-step overview of how Raft works:

Leader Election

Raft begins with leader election. Initially, all nodes are in follower state.

If a follower does not receive a heartbeat message from the leader for a specified time (timeout), it transitions to candidate status and requests votes from other nodes.

The first candidate to collect votes from the majority becomes the new leader.
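A bare-bones sketch of this transition (the `Node` class and vote counting here are simplified; real Raft exchanges RequestVote RPCs and compares terms):

```python
import random

class Node:
    def __init__(self):
        self.role = "follower"
        self.current_term = 0

def election_timeout_ms():
    """Timeouts are randomized so followers rarely time out simultaneously."""
    return random.uniform(150, 300)

def start_election(node, cluster_size, votes_received):
    """Follower times out, becomes a candidate, and wins with a majority."""
    node.current_term += 1             # each election starts a new term
    node.role = "candidate"
    total_votes = 1 + votes_received   # a candidate votes for itself
    if total_votes > cluster_size // 2:
        node.role = "leader"
    return node.role
```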

Log Replication

Once a leader is established, it accepts client requests and appends them as log entries. These log entries are then replicated to all followers.

A log entry is considered committed once it’s stored on a majority of nodes. Followers acknowledge each log entry’s receipt.
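The commit rule can be sketched as a function of how far each node’s log has been replicated (a simplification: real Raft additionally requires the committed entry to be from the leader’s current term):

```python
def majority_commit_index(match_indexes, cluster_size):
    """Highest log index stored on a majority of nodes.

    `match_indexes` holds, for every node (leader included), the index of
    the last log entry known to be replicated on that node.
    """
    ranked = sorted(match_indexes, reverse=True)
    # The value at the majority position is stored on a majority of nodes.
    return ranked[cluster_size // 2]
```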

Safety and Consistency

Raft ensures safety and consistency through its election and replication rules: a candidate can only win an election if its log is at least as up-to-date as the logs of a majority of voters, and followers reject entries that conflict with the leader’s log.

This prevents inconsistencies from emerging in the system.
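The election restriction that underpins this guarantee can be sketched as the check a voter applies to a candidate’s log (parameter names are illustrative):

```python
def candidate_log_ok(cand_last_term, cand_last_index,
                     voter_last_term, voter_last_index):
    """Grant a vote only if the candidate's log is at least as up-to-date
    as the voter's: compare the last entries' terms first, then log length."""
    if cand_last_term != voter_last_term:
        return cand_last_term > voter_last_term
    return cand_last_index >= voter_last_index
```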

Handling Failures

Raft handles failures gracefully. If the leader crashes or becomes unreachable, followers detect the absence of heartbeat messages and initiate a new leader election.

This ensures that a new leader quickly takes charge in the event of leader failure.


While Raft simplifies many aspects of distributed consensus, it’s not without its complexities.

Implementing Raft in a real-world system involves addressing various details and optimizations.

One good example is MongoDB’s replica set ↗️, whose election protocol is based on Raft.

Conclusion

Distributed consensus algorithms are a fascinating topic, and we’ve only scratched the surface here.

If you’re interested in learning more, I’d highly recommend checking out the Raft website ↗️ and Paxos website ↗️.

I hope you enjoyed this exploration of Raft and Paxos. Until next time, happy coding!
