In his talk on go-zero's distributed caching system, Kevin focused on the principles of consistent hashing and its practice in distributed caching. This article discusses the principle of consistent hashing and its implementation in go-zero in detail.
Take storage as an example. In a microservice system, our storage cannot be just a single node:

- First, to improve stability: when a single node goes down, the entire storage becomes unavailable;
- Second, for data fault tolerance: when a single node's data is physically damaged, the data is lost, whereas with multiple nodes the nodes back each other up, unless the nodes backing each other up are damaged at the same time.

So the question is: with multiple nodes, which node should the data be written to?
## hash
So in essence, we need an algorithm that converts an input into a smaller value. This value is usually unique and extremely compact in format, such as a `uint64`, and it must satisfy:

- Idempotence: hashing the same input must always produce the same value
This is exactly what a `hash` algorithm provides. However, when a common `hash` algorithm is used for routing, such as `key % N`, a problem arises: if a node exits the cluster due to an exception or a failed heartbeat, the rehash causes a large amount of data to be redistributed to different nodes. When a node receives a request for a key it did not own before, it has to reprocess the logic of obtaining the data; in a caching scenario, this easily causes a cache avalanche.
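To make the problem concrete, here is a small standalone demo (not go-zero code; `fnv` is just a convenient stdlib hash) that routes 10,000 keys with `key % N` and counts how many keys change node when the cluster shrinks from 4 nodes to 3. With modulo routing, roughly three quarters of the keys move in this case:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// route picks a node index for a key using plain modulo hashing.
func route(key string, n int) int {
	h := fnv.New64a()
	h.Write([]byte(key))
	return int(h.Sum64() % uint64(n))
}

func main() {
	const total = 10000
	moved := 0
	for i := 0; i < total; i++ {
		key := fmt.Sprintf("key%d", i)
		// compare routing with 4 nodes vs. 3 nodes (one node removed)
		if route(key, 4) != route(key, 3) {
			moved++
		}
	}
	fmt.Printf("%d of %d keys moved (~%.0f%%)\n",
		moved, total, float64(moved)/float64(total)*100)
}
```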
This is when the `consistent hash` algorithm needs to be introduced.
## consistent hash
Let's take a look at how `consistent hash` solves these problems.
### rehash
First, let's solve the massive `rehash` problem:

As shown in the figure above, when a new node is added, the only affected key is `key31`. After a node is added (or removed), only the data near that node is affected; the data of other nodes is untouched, which solves the problem of node changes.
This is exactly monotonicity, a property that ordinary `hash` algorithms cannot satisfy in distributed scenarios.
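A standalone sketch of a bare hash ring (again illustrative code, not go-zero's) makes this measurable: place three nodes on a ring, route 100,000 keys, then add a fourth node and count how many keys change owner. Every moved key lands on the new node; no key is shuffled between the old nodes:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

func hash64(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// buildRing places `replicas` points per real node on the ring.
func buildRing(nodes []string, replicas int) ([]uint64, map[uint64]string) {
	owner := make(map[uint64]string)
	var keys []uint64
	for _, n := range nodes {
		for i := 0; i < replicas; i++ {
			h := hash64(fmt.Sprintf("%s#%d", n, i))
			keys = append(keys, h)
			owner[h] = n
		}
	}
	sort.Slice(keys, func(i, j int) bool { return keys[i] < keys[j] })
	return keys, owner
}

// lookup walks clockwise to the first point at or after the key's hash.
func lookup(keys []uint64, owner map[uint64]string, key string) string {
	h := hash64(key)
	i := sort.Search(len(keys), func(i int) bool { return keys[i] >= h }) % len(keys)
	return owner[keys[i]]
}

func main() {
	// one point per real node for now; virtual nodes come in the next section
	keysB, ownerB := buildRing([]string{"node1", "node2", "node3"}, 1)
	keysA, ownerA := buildRing([]string{"node1", "node2", "node3", "node4"}, 1)

	moved := 0
	for i := 0; i < 100000; i++ {
		k := fmt.Sprintf("key%d", i)
		if before, after := lookup(keysB, ownerB, k), lookup(keysA, ownerA, k); before != after {
			moved++
			// every moved key now belongs to the new node
			if after != "node4" {
				panic("a key moved between old nodes, which should not happen")
			}
		}
	}
	fmt.Printf("%d of 100000 keys moved, all onto node4\n", moved)
}
```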
### Data skew
In fact, it can also be seen from the figure above that most of the keys are currently concentrated on `node 1`. When the number of nodes is small, most keys may concentrate on one node, which shows up in monitoring as uneven load between nodes.
To solve this problem, `consistent hash` introduces the concept of the `virtual node`. Since the load is uneven, we artificially construct a balanced scene; but there are only so many actual nodes, so we use `virtual node` to divide the ring into regions, while the actual serving nodes remain the original ones.
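Reusing `hash64`, `buildRing`, and `lookup` from the sketch above (this `main` replaces the previous one), we can see how the number of virtual nodes evens out the load; the exact counts depend on the hash function, but the trend is clear:

```go
func main() {
	nodes := []string{"node1", "node2", "node3"}
	for _, replicas := range []int{1, 100} {
		keys, owner := buildRing(nodes, replicas)
		counts := map[string]int{}
		for i := 0; i < 100000; i++ {
			counts[lookup(keys, owner, fmt.Sprintf("key%d", i))]++
		}
		// with replicas=1 the split is typically badly skewed;
		// with replicas=100 each node gets roughly a third of the keys
		fmt.Println("replicas =", replicas, counts)
	}
}
```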
## Implementation
Let's take a look at `Get()` first.
### Get
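Below is a simplified sketch of `Get()`, modeled on the go-zero source linked at the end of this article (the key serialization and the collision salt are simplified here, and `fmt` and `sort` are assumed to be imported; the field names match the struct shown in the Add Node section below):

```go
// Get returns the node responsible for key v, and whether one was found.
func (h *ConsistentHash) Get(v interface{}) (interface{}, bool) {
	h.lock.RLock()
	defer h.lock.RUnlock()

	if len(h.ring) == 0 {
		return nil, false
	}

	// 1. calculate the hash of the key
	hash := h.hashFunc([]byte(fmt.Sprintf("%v", v)))

	// 2. binary-search h.keys (the sorted virtual node hashes) for the
	//    first virtual node >= the key's hash, wrapping around the ring
	index := sort.Search(len(h.keys), func(i int) bool {
		return h.keys[i] >= hash
	}) % len(h.keys)

	// 3. map the virtual node hash back to the actual node(s) via the ring;
	//    several actual nodes may share one hash due to collisions
	nodes := h.ring[h.keys[index]]
	switch len(nodes) {
	case 0:
		return nil, false
	case 1:
		return nodes[0], true
	default:
		// pick among colliding nodes deterministically with a salted hash
		inner := h.hashFunc([]byte("salt" + fmt.Sprintf("%v", v)))
		return nodes[int(inner%uint64(len(nodes)))], true
	}
}
```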
Let me talk about the implementation principle:
- Calculate the hash of the `key`
- Find the index of the first matching `virtual node` and fetch the corresponding `h.keys[index]`: the virtual node's hash value
- Use this hash to look up the `ring` and find the matching `actual node`
In fact, we can see that what we get from the `ring` is a `[]node`. This is because hash conflicts may occur when computing the `virtual node hash`: different `virtual node` may end up with the same hash, so one hash can correspond to more than one actual node.
This also shows that `node` and `virtual node` have a one-to-many relationship, and the `ring` stores that mapping internally. This design embodies the allocation strategy of consistent hash:
- `virtual node` serve as boundaries that divide the ring into value ranges; a `key` obtains its `node` according to which range it falls into
- Hashing the `virtual node` ensures that the keys allocated to different nodes are roughly uniform, that is, it breaks the direct binding between keys and physical nodes
- When a new node is added, multiple `virtual node` are allocated to it accordingly, so the new node can take over load from multiple original nodes; from a global perspective, this makes it easier to achieve load balancing during expansion
### Add Node
After reading `Get()`, you can roughly see the design of the entire consistent hash:
```go
type ConsistentHash struct {
	hashFunc Func                            // hash function
	replicas int                             // virtual node amplification factor
	keys     []uint64                        // stores the hashes of the virtual nodes
	ring     map[uint64][]interface{}        // mapping from virtual nodes to actual nodes
	nodes    map[string]lang.PlaceholderType // actual node storage (a map, for fast lookup)
	lock     sync.RWMutex
}
```
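Adding a node is then mostly bookkeeping on these fields. Here is a minimal sketch of the add path, modeled on the linked go-zero source (the real code also handles weighted nodes and removes a previously added node first; `fmt`, `sort`, `strconv`, and go-zero's `lang` package are assumed to be imported):

```go
// AddWithReplicas inserts a node together with `replicas` virtual nodes.
func (h *ConsistentHash) AddWithReplicas(node interface{}, replicas int) {
	if replicas > h.replicas {
		replicas = h.replicas
	}

	nodeRepr := fmt.Sprintf("%v", node)
	h.lock.Lock()
	defer h.lock.Unlock()
	h.nodes[nodeRepr] = lang.Placeholder

	for i := 0; i < replicas; i++ {
		// each virtual node gets its own position on the ring
		hash := h.hashFunc([]byte(nodeRepr + strconv.Itoa(i)))
		h.keys = append(h.keys, hash)
		// append, since different virtual nodes may collide on one hash
		h.ring[hash] = append(h.ring[hash], node)
	}

	// keep the virtual node hashes sorted so Get() can binary-search them
	sort.Slice(h.keys, func(i, j int) bool {
		return h.keys[i] < h.keys[j]
	})
}
```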
And with that, a basic consistent hash is complete.
Specific code: https://github.com/tal-tech/go-zero/blob/master/core/hash/consistenthash.go
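For completeness, a minimal usage sketch against that package (the API names are taken from the linked file; double-check the repo for the current signatures, as the project keeps evolving):

```go
package main

import (
	"fmt"

	"github.com/tal-tech/go-zero/core/hash"
)

func main() {
	ch := hash.NewConsistentHash()
	ch.Add("10.0.0.1:6379")
	ch.Add("10.0.0.2:6379")
	ch.Add("10.0.0.3:6379")

	// the same key always routes to the same node
	if node, ok := ch.Get("user:42"); ok {
		fmt.Println("user:42 ->", node)
	}
}
```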
## Usage scenarios
As I said at the beginning, consistent hash is widely used in distributed systems:

- Distributed caching: you can build a `cache proxy` on a storage system such as a `redis cluster` and control the routing freely, with the routing rule implemented by the consistent hash algorithm
- Service discovery
- Distributed task scheduling
All of the above distributed systems can use consistent hash in their load balancing modules.
## Project address
https://github.com/tal-tech/go-zero
Welcome to use go-zero and star to support us!
## WeChat Exchange Group
Follow the " practice " public exchange group get the community group QR code.