Consistent hash ring

Consistent hashing was first described in the 1997 paper Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the World Wide Web by David Karger et al. It is used in distributed storage systems such as Amazon Dynamo, and in memcached clients to spread keys across a pool of cache servers.

Consistent hashing is a very simple solution to a common problem: how can you find a server in a distributed system to store or retrieve a value identified by a key, while at the same time being able to cope with server failures and network partitions?

Simply finding a server for a value is easy: just number your set of s servers from 0 to s – 1. When you want to store or retrieve a value, hash the value’s key, take the hash modulo s, and that gives you the server.
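
To make the naive scheme concrete, here is a minimal sketch (the server list, the key format and the use of std::hash are illustrative assumptions, not part of the implementation described later):

#include <functional>
#include <string>
#include <vector>

// Naive mapping: hash the key and take it modulo the number of servers.
// This works fine until the number of servers changes.
std::string PickServer(const std::vector<std::string>& servers,
                       const std::string& key)
{
    size_t index = std::hash<std::string>()(key) % servers.size();
    return servers[index];
}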

The problem comes when servers fail or become unreachable through a network partition. At that point the remaining servers no longer fill the hash space: s has changed, so almost every key now maps to a different server. The only option is to invalidate the caches on all servers, renumber them, and start again. Given that failures are commonplace in a system with hundreds or thousands of servers, this solution is not feasible.
The solution

In consistent hashing, the servers, as well as the keys, are hashed, and it is by this hash that they are looked up. The hash space is large and is treated as if it wraps around to form a circle – hence hash ring. The process of creating a hash for each server is equivalent to placing it at a point on the circumference of this circle. When a key needs to be looked up, it is hashed, which again corresponds to a point on the circle. To find its server, one then simply moves clockwise around the circle from this point until the next server is found. If no server is found from that point to the end of the hash space, the first server is used – this is the “wrapping round” that makes the hash space circular.

The only remaining problem is that in practice hashing algorithms are likely to result in clusters of servers on the ring (or, to be more precise, some servers with a disproportionately large space before them), and this will result in greater load on the first server in the cluster and less on the remainder. This can be ameliorated by adding each server to the ring a number of times in different places. This is achieved with a replica count that applies to all servers in the ring: when adding a server, loop from 0 to count – 1 and, for each value, hash a string made from both the server and the loop variable to produce the position. This has the effect of distributing the servers more evenly over the ring. Note that this has nothing to do with server replication; each of the replicas represents the same physical server, and replication of data between servers is an entirely unrelated issue.

I’ve written an example implementation of consistent hashing in C++. As you can imagine from the description above, it isn’t terribly complicated. Here is the main class:

#include <map>
#include <string>

template <class Node, class Data, class Hash = HASH_NAMESPACE::hash<const char*> >
class HashRing
{
public:
    // hash -> node, kept sorted so lower_bound finds the next node clockwise
    typedef std::map<size_t, Node> NodeMap;

    HashRing(unsigned int replicas)
        : replicas_(replicas), hash_(Hash())
    {
    }

    HashRing(unsigned int replicas, const Hash& hash)
        : replicas_(replicas), hash_(hash)
    {
    }

    size_t AddNode(const Node& node);
    void RemoveNode(const Node& node);
    const Node& GetNode(const Data& data) const;

private:
    NodeMap ring_;
    const unsigned int replicas_;
    Hash hash_;
};

template <class Node, class Data, class Hash>
size_t HashRing<Node, Data, Hash>::AddNode(const Node& node)
{
    size_t hash = 0;
    std::string nodestr = Stringify(node);
    // Add one point to the ring per replica by hashing "<node><r>"
    for (unsigned int r = 0; r < replicas_; r++) {
        hash = hash_((nodestr + Stringify(r)).c_str());
        ring_[hash] = node;
    }
    return hash;
}

template <class Node, class Data, class Hash>
void HashRing<Node, Data, Hash>::RemoveNode(const Node& node)
{
    std::string nodestr = Stringify(node);
    for (unsigned int r = 0; r < replicas_; r++) {
        size_t hash = hash_((nodestr + Stringify(r)).c_str());
        ring_.erase(hash);
    }
}

template <class Node, class Data, class Hash>
const Node& HashRing<Node, Data, Hash>::GetNode(const Data& data) const
{
    if (ring_.empty()) {
        throw EmptyRingException();
    }
    size_t hash = hash_(Stringify(data).c_str());
    typename NodeMap::const_iterator it;
    // Look for the first node >= hash
    it = ring_.lower_bound(hash);
    if (it == ring_.end()) {
        // Wrapped around; get the first node
        it = ring_.begin();
    }
    return it->second;
}
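
The code above relies on two helpers that aren’t shown: Stringify, which streams a value into an ostringstream to turn it into a string (see the notes below), and EmptyRingException. A minimal sketch of both, which would need to appear before the HashRing member definitions, might look like this (the exact definitions are my assumption, not part of the original listing):

#include <sstream>
#include <stdexcept>
#include <string>

// Thrown by GetNode when the ring has no nodes
class EmptyRingException : public std::runtime_error
{
public:
    EmptyRingException() : std::runtime_error("Empty hash ring") {}
};

// Turn any streamable value into a string so it can be hashed
template <class T>
std::string Stringify(const T& value)
{
    std::ostringstream os;
    os << value;
    return os.str();
}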

A few points to note:

  • The default hash function is the compiler’s non-standard hash<const char*> (on g++ it comes via <ext/hash_map>).
    In practice you probably don’t want to use this. Something with better distribution, like MD5, would probably be best.
  • I had to define HASH_NAMESPACE because g++ puts the non-standard hash in a different namespace from the one other compilers use.
    Hopefully this will all be resolved with the widespread availability of std::unordered_map.
  • The Node and Data types need to have operator << defined for a std::ostream.
    This is because I write them to an ostringstream in order to "stringify" them before getting the hash.
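
Finally, a hypothetical usage sketch under a few assumptions: HASH_NAMESPACE is defined as described above (it appears in the template’s default argument), the class and helpers are all in one translation unit, the host names are made up, and an explicit functor wrapping C++11’s std::hash stands in for the non-standard default:

#include <functional>
#include <iostream>
#include <string>

// Stand-in hash functor: anything callable with a const char*
// that returns a size_t will do
struct CStringHash
{
    size_t operator()(const char* s) const
    {
        return std::hash<std::string>()(s);
    }
};

int main()
{
    // 10 replica points on the ring per server
    HashRing<std::string, std::string, CStringHash> ring(10);
    ring.AddNode("cache1.example.com");
    ring.AddNode("cache2.example.com");
    ring.AddNode("cache3.example.com");

    std::cout << "user:42 -> " << ring.GetNode("user:42") << std::endl;

    // Removing a server only remaps the keys whose nearest clockwise
    // point belonged to it; all other keys stay where they were
    ring.RemoveNode("cache2.example.com");
    std::cout << "user:42 -> " << ring.GetNode("user:42") << std::endl;
    return 0;
}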