Preface

Hello everyone, my name is asong. My previous article, "Hands-on implementation of a local cache: Design", introduced the points to consider when designing a local cache. Readers can learn from the storage design of bigcache, which reduces GC pressure; that was something I had not considered before. Excellent open source designs like this are worth studying, so before getting started I read several high-quality local cache libraries and summarized the best design ideas in each of them. Let's take a look together.
Efficient concurrent access
A simple local cache can be implemented with a `map[string]interface{}` plus a `sync.RWMutex`. The read-write lock optimizes reads, but once concurrency grows, access degrades to serial execution and goroutines block while waiting for the lock.

To solve this problem we can split the cache into buckets, each protected by its own lock, which reduces contention. This is also known as sharding: each cached object is assigned to a shard by computing `hash(key) % N`, where `N` is the number of shards. Ideally each request lands on its own shard and there is essentially no lock contention.
A sharding implementation mainly needs to consider two points:

- Choice of hash algorithm. The algorithm should have the following characteristics:
  - The hash results are highly dispersed, i.e. highly random
  - It avoids extra memory allocations, and therefore avoids garbage collection pressure
  - It is computationally efficient
- Choice of the number of shards. More shards mean less contention, but there is no benefit in going overboard; based on experience, the number of shards `N` should be a power of 2. With a power-of-2 shard count we can also replace the modulo with a bitwise AND (`hash(key) & (N-1)`) to improve efficiency.
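To make the sharding idea concrete, here is a minimal sketch (my own illustration, not code from any of the libraries discussed): it picks a shard with the standard library's fnv64a hash and a bitwise AND, with `shardCount` chosen as a power of two.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const shardCount = 256 // a power of two, so we can mask instead of mod

type shard struct {
	mu    sync.RWMutex
	items map[string]interface{}
}

// ShardedCache spreads keys over shardCount independently locked maps,
// so concurrent accesses to different shards never contend.
type ShardedCache struct {
	shards [shardCount]*shard
}

func NewShardedCache() *ShardedCache {
	c := &ShardedCache{}
	for i := range c.shards {
		c.shards[i] = &shard{items: make(map[string]interface{})}
	}
	return c
}

// shardFor selects a shard with hash(key) & (shardCount-1), which is
// equivalent to hash(key) % shardCount when shardCount is a power of two.
func (c *ShardedCache) shardFor(key string) *shard {
	h := fnv.New64a()
	h.Write([]byte(key))
	return c.shards[h.Sum64()&(shardCount-1)]
}

func (c *ShardedCache) Set(key string, val interface{}) {
	s := c.shardFor(key)
	s.mu.Lock()
	s.items[key] = val
	s.mu.Unlock()
}

func (c *ShardedCache) Get(key string) (interface{}, bool) {
	s := c.shardFor(key)
	s.mu.RLock()
	v, ok := s.items[key]
	s.mu.RUnlock()
	return v, ok
}

func main() {
	c := NewShardedCache()
	c.Set("hello", 42)
	v, ok := c.Get("hello")
	fmt.Println(v, ok) // 42 true
}
```

Only the lock for the chosen shard is taken, so goroutines working on different keys usually proceed in parallel.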
The open source local cache libraries bigcache, go-cache, and freecache all implement sharding. For the hash function, bigcache chose the fnv64a algorithm, go-cache chose djb2, and freecache chose xxhash. All three are non-cryptographic hash algorithms. Which one is the better choice? We need to weigh the three points above. First, let's compare running efficiency by benchmarking them on the same string:
```go
func BenchmarkFnv64a(b *testing.B) {
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		fnv64aSum64("test")
	}
	b.StopTimer()
}

func BenchmarkXxxHash(b *testing.B) {
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		hashFunc([]byte("test"))
	}
	b.StopTimer()
}
```
```go
func BenchmarkDjb2(b *testing.B) {
	// Seed setup runs before ResetTimer so that only the hashing
	// itself is measured.
	max := big.NewInt(0).SetUint64(uint64(math.MaxUint32))
	rnd, err := rand.Int(rand.Reader, max)
	var seed uint32
	if err != nil {
		b.Logf("occur err %s", err.Error())
		seed = insecurerand.Uint32()
	} else {
		seed = uint32(rnd.Uint64())
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		djb33(seed, "test")
	}
	b.StopTimer()
}
```
Benchmark results:

```
goos: darwin
goarch: amd64
pkg: github.com/go-localcache
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkFnv64a-16     360577890    3.387 ns/op    0 B/op    0 allocs/op
BenchmarkXxxHash-16    331682492    3.613 ns/op    0 B/op    0 allocs/op
BenchmarkDjb2-16       334889512    3.530 ns/op    0 B/op    0 allocs/op
```
Comparing the results, fnv64a is the fastest of the three, and none of them allocate. Next, let's compare randomness. First we generate 100,000 random strings:
```go
var randString [100000]string

func init() {
	insecurerand.Seed(time.Now().UnixNano())
	for i := 0; i < 100000; i++ {
		randString[i] = RandStringRunes(insecurerand.Intn(10))
	}
}

var letterRunes = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

func RandStringRunes(n int) string {
	b := make([]rune, n)
	for i := range b {
		b[i] = letterRunes[insecurerand.Intn(len(letterRunes))]
	}
	return string(b)
}
```
Then we run a unit test to count the number of conflicts:
```go
func TestFnv64a(t *testing.T) {
	m := make(map[uint64]struct{})
	conflictCount := 0
	for i := 0; i < len(randString); i++ {
		res := fnv64aSum64(randString[i])
		if _, ok := m[res]; ok {
			conflictCount++
		} else {
			m[res] = struct{}{}
		}
	}
	fmt.Printf("Fnv64a conflict count is %d", conflictCount)
}

func TestXxxHash(t *testing.T) {
	m := make(map[uint64]struct{})
	conflictCount := 0
	for i := 0; i < len(randString); i++ {
		res := hashFunc([]byte(randString[i]))
		if _, ok := m[res]; ok {
			conflictCount++
		} else {
			m[res] = struct{}{}
		}
	}
	fmt.Printf("Xxxhash conflict count is %d", conflictCount)
}

func TestDjb2(t *testing.T) {
	max := big.NewInt(0).SetUint64(uint64(math.MaxUint32))
	rnd, err := rand.Int(rand.Reader, max)
	conflictCount := 0
	m := make(map[uint32]struct{})
	var seed uint32
	if err != nil {
		t.Logf("occur err %s", err.Error())
		seed = insecurerand.Uint32()
	} else {
		seed = uint32(rnd.Uint64())
	}
	for i := 0; i < len(randString); i++ {
		res := djb33(seed, randString[i])
		if _, ok := m[res]; ok {
			conflictCount++
		} else {
			m[res] = struct{}{}
		}
	}
	fmt.Printf("Djb2 conflict count is %d", conflictCount)
}
```
Test results:

```
Fnv64a conflict count is 27651  --- PASS: TestFnv64a (0.01s)
Xxxhash conflict count is 27692 --- PASS: TestXxxHash (0.01s)
Djb2 conflict count is 39621    --- PASS: TestDjb2 (0.01s)
```
Taking all three factors into account, fnv64a is the better choice. (Note that the generated strings are not deduplicated, so some of the counted conflicts may come from identical inputs rather than true hash collisions; the relative comparison between the algorithms still holds.)
Reduce GC

Go is a garbage-collected language, and GC work can be time-consuming, so minimizing GC pressure is an important consideration for a high-performance cache. freecache and bigcache are both known as low-GC cache libraries.

bigcache's design relies on a special case in Go's garbage collector: since Go 1.5, if a map's keys and values contain no pointers, the garbage collector skips scanning its contents. bigcache therefore keeps pointers out of both keys and values: it uses the hash of the key as the map key, serializes the cached data into a pre-allocated byte slice, and stores the offset into that slice as the map value. The pre-allocated slice adds only one extra object for the GC to track, and since a byte slice contains no pointers besides the object itself, the GC's marking time for the whole structure is O(1). To really understand the mechanism you should read the source code; I recommend the original author's article: https://dev.to/douglasmakey/how-bigcache-avoids-expensive-gc-cycles-and-speeds-up-concurrent-access-in-go-12bb. That author also wrote a simplified version of the cache on top of the bigcache idea to illustrate the principle in code, which is easier to follow.
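As a rough illustration of this idea, here is my own simplified sketch (not bigcache's actual code; it ignores hash collisions, eviction, and sharding). Values are appended to one pre-allocated byte slice with a length prefix, and the index map holds only pointer-free `uint64 -> uint32` entries, so the GC never has to scan its contents:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// zeroGCCache stores serialized values in a single pre-allocated byte
// slice and keeps only pointer-free types in the index map, so since
// Go 1.5 the garbage collector skips scanning the map's contents.
type zeroGCCache struct {
	index map[uint64]uint32 // hash(key) -> offset into data
	data  []byte            // length-prefixed entries
}

func newZeroGCCache() *zeroGCCache {
	return &zeroGCCache{
		index: make(map[uint64]uint32),
		data:  make([]byte, 0, 1024*1024), // pre-allocate 1 MB
	}
}

func hashKey(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

// Set appends a 4-byte little-endian length prefix followed by the
// value bytes, and records the entry's offset in the index.
func (c *zeroGCCache) Set(key string, val []byte) {
	offset := uint32(len(c.data))
	var hdr [4]byte
	binary.LittleEndian.PutUint32(hdr[:], uint32(len(val)))
	c.data = append(c.data, hdr[:]...)
	c.data = append(c.data, val...)
	c.index[hashKey(key)] = offset
}

// Get looks up the offset, reads the length prefix, and slices the
// value back out of the shared buffer.
func (c *zeroGCCache) Get(key string) ([]byte, bool) {
	offset, ok := c.index[hashKey(key)]
	if !ok {
		return nil, false
	}
	n := binary.LittleEndian.Uint32(c.data[offset : offset+4])
	return c.data[offset+4 : offset+4+n], true
}

func main() {
	c := newZeroGCCache()
	c.Set("user:1", []byte("asong"))
	v, _ := c.Get("user:1")
	fmt.Println(string(v)) // asong
}
```

The important property is that neither the map keys (`uint64`) nor the values (`uint32`) contain pointers, and the byte slice is one flat allocation, so the GC marks the whole cache in constant time regardless of how many entries it holds.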
freecache takes a different approach: it implements a ringbuffer structure itself and achieves a map with zero GC overhead by minimizing the number of pointers. Both keys and values are stored in the ringbuffer, and objects are located through an index. freecache's implementation also differs from a traditional hash table in that it introduces the concept of a slot. I drew a summary diagram of the structure, so I won't walk through the source code in detail here:
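The wrapping behavior that gives the ring buffer its name can be sketched as follows. This is a toy version for illustration only; real freecache also stores entry headers in the buffer, maintains slot indexes, and evicts old entries when the buffer wraps:

```go
package main

import "fmt"

// ringBuffer is a fixed-size byte buffer whose writes wrap around the
// end, sketching the idea behind freecache's storage: one contiguous
// allocation, so the cache adds almost no pointers for the GC to trace.
type ringBuffer struct {
	buf   []byte
	begin int // next write position
}

func newRingBuffer(size int) *ringBuffer {
	return &ringBuffer{buf: make([]byte, size)}
}

// Write copies p into the buffer, wrapping at the end (and silently
// overwriting the oldest bytes), and returns the starting offset.
func (r *ringBuffer) Write(p []byte) int {
	off := r.begin
	for i := 0; i < len(p); i++ {
		r.buf[(off+i)%len(r.buf)] = p[i]
	}
	r.begin = (off + len(p)) % len(r.buf)
	return off
}

// Read copies n bytes starting at off into a fresh slice, following
// the same wrap-around rule as Write.
func (r *ringBuffer) Read(off, n int) []byte {
	out := make([]byte, n)
	for i := 0; i < n; i++ {
		out[i] = r.buf[(off+i)%len(r.buf)]
	}
	return out
}

func main() {
	r := newRingBuffer(8)
	off := r.Write([]byte("hello"))
	fmt.Println(string(r.Read(off, 5))) // hello
	off2 := r.Write([]byte("world"))    // wraps past the end of the buffer
	fmt.Println(string(r.Read(off2, 5))) // world
}
```

Because all data lives in one `[]byte`, the GC sees a single object no matter how many cache entries the buffer holds; the cost is that lookups must go through offset arithmetic instead of native map access.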
Recommended articles
- https://colobu.com/2019/11/18/how-is-the-bigcache-is-fast/
- https://dev.to/douglasmakey/how-bigcache-avoids-expensive-gc-cycles-and-speeds-up-concurrent-access-in-go-12bb
- https://studygolang.com/articles/27222
- https://blog.csdn.net/chizhenlian/article/details/108435024
Summarize

Concurrent access and reduced GC pressure are the two key points of an efficient local cache. After reading the elegant designs in these libraries, I completely scrapped the code I had written before; it really was far from perfect. No matter how you design, something has to be sacrificed somewhere. That is unavoidable, and the road of software development is long and full of obstacles. The code I'm implementing is still being patched up; I'll publish it once it's polished, and then I'll count on everyone to help with the code review.

Well, this is the end of this article. My name is asong, and I'll see you in the next issue.

**Welcome to follow the public account: [Golang DreamWorks]**