Preface

Hello everyone, my name is asong. My previous article, "Hands-on implementation of a local cache: Design", introduced the points to consider when designing a local cache. Readers can learn from the storage design of bigcache, which reduces GC pressure; that was something I had not considered before. Excellent open source designs like this are worth studying, so before getting started I read several high-quality local cache libraries and summarized the best design ideas in each of them. Let's take a look together.
Efficient concurrent access
A simple local cache can be implemented with a `map[string]interface{}` plus a `sync.RWMutex`. The read-write lock optimizes reads, but once concurrency grows, access degrades to serial execution and goroutines block while waiting for the lock.

To solve this problem we can split the cache into buckets, each protected by its own lock, which reduces contention. This is also known as sharding: each cached object is assigned to a shard by computing `hash(key) % N`, where `N` is the number of shards. Ideally each request lands on its own shard and there is essentially no lock contention.
A sharding implementation mainly needs to consider two points:

- Choice of hash algorithm. The algorithm should have the following characteristics:
  - The hash results are highly dispersed, i.e. highly random
  - It avoids extra memory allocations, and therefore avoids garbage collection pressure
  - It is computationally efficient
- Choice of the number of shards. More shards mean less contention, but there is no benefit in going overboard; based on experience, the number of shards `N` should be a power of 2. With a power-of-2 shard count we can also replace the modulo with a bitwise AND (`hash(key) & (N-1)`) to improve efficiency.
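To make the sharding idea concrete, here is a minimal sketch (my own illustration, not code from any of the libraries discussed): it picks a shard with the standard library's fnv64a hash and a bitwise AND, with `shardCount` chosen as a power of two.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const shardCount = 256 // a power of two, so we can mask instead of mod

type shard struct {
	mu    sync.RWMutex
	items map[string]interface{}
}

// ShardedCache spreads keys over shardCount independently locked maps,
// so concurrent accesses to different shards never contend.
type ShardedCache struct {
	shards [shardCount]*shard
}

func NewShardedCache() *ShardedCache {
	c := &ShardedCache{}
	for i := range c.shards {
		c.shards[i] = &shard{items: make(map[string]interface{})}
	}
	return c
}

// shardFor selects a shard with hash(key) & (shardCount-1), which is
// equivalent to hash(key) % shardCount when shardCount is a power of two.
func (c *ShardedCache) shardFor(key string) *shard {
	h := fnv.New64a()
	h.Write([]byte(key))
	return c.shards[h.Sum64()&(shardCount-1)]
}

func (c *ShardedCache) Set(key string, val interface{}) {
	s := c.shardFor(key)
	s.mu.Lock()
	s.items[key] = val
	s.mu.Unlock()
}

func (c *ShardedCache) Get(key string) (interface{}, bool) {
	s := c.shardFor(key)
	s.mu.RLock()
	v, ok := s.items[key]
	s.mu.RUnlock()
	return v, ok
}

func main() {
	c := NewShardedCache()
	c.Set("hello", 42)
	v, ok := c.Get("hello")
	fmt.Println(v, ok) // 42 true
}
```

Only the lock for the chosen shard is taken, so goroutines working on different keys usually proceed in parallel.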
The open source local cache libraries bigcache, go-cache, and freecache all implement sharding. For the hash function, bigcache chose the fnv64a algorithm, go-cache chose djb2, and freecache chose xxhash. All three are non-cryptographic hash algorithms. Which one is the better choice? We need to weigh the three points above. First, let's compare running efficiency by benchmarking them on the same string:
```go
func BenchmarkFnv64a(b *testing.B) {
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		fnv64aSum64("test")
	}
	b.StopTimer()
}

func BenchmarkXxxHash(b *testing.B) {
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		hashFunc([]byte("test"))
	}
	b.StopTimer()
}
```
```go
func BenchmarkDjb2(b *testing.B) {
	// Seed setup runs before ResetTimer so that only the hashing
	// itself is measured.
	max := big.NewInt(0).SetUint64(uint64(math.MaxUint32))
	rnd, err := rand.Int(rand.Reader, max)
	var seed uint32
	if err != nil {
		b.Logf("occur err %s", err.Error())
		seed = insecurerand.Uint32()
	} else {
		seed = uint32(rnd.Uint64())
	}
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		djb33(seed, "test")
	}
	b.StopTimer()
}
```
Benchmark results:

```
goos: darwin
goarch: amd64
pkg: github.com/go-localcache
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkFnv64a-16     360577890    3.387 ns/op    0 B/op    0 allocs/op
BenchmarkXxxHash-16    331682492    3.613 ns/op    0 B/op    0 allocs/op
BenchmarkDjb2-16       334889512    3.530 ns/op    0 B/op    0 allocs/op
```
Comparing the results, fnv64a is the fastest of the three, and none of them allocate. Next, let's compare randomness. First we generate 100,000 random strings:
```go
var randString [100000]string

func init() {
	insecurerand.Seed(time.Now().UnixNano())
	for i := 0; i < 100000; i++ {
		randString[i] = RandStringRunes(insecurerand.Intn(10))
	}
}

var letterRunes = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")

func RandStringRunes(n int) string {
	b := make([]rune, n)
	for i := range b {
		b[i] = letterRunes[insecurerand.Intn(len(letterRunes))]
	}
	return string(b)
}
```
Then we run a unit test to count the number of conflicts:
```go
func TestFnv64a(t *testing.T) {
	m := make(map[uint64]struct{})
	conflictCount := 0
	for i := 0; i < len(randString); i++ {
		res := fnv64aSum64(randString[i])
		if _, ok := m[res]; ok {
			conflictCount++
		} else {
			m[res] = struct{}{}
		}
	}
	fmt.Printf("Fnv64a conflict count is %d", conflictCount)
}

func TestXxxHash(t *testing.T) {
	m := make(map[uint64]struct{})
	conflictCount := 0
	for i := 0; i < len(randString); i++ {
		res := hashFunc([]byte(randString[i]))
		if _, ok := m[res]; ok {
			conflictCount++
		} else {
			m[res] = struct{}{}
		}
	}
	fmt.Printf("Xxxhash conflict count is %d", conflictCount)
}

func TestDjb2(t *testing.T) {
	max := big.NewInt(0).SetUint64(uint64(math.MaxUint32))
	rnd, err := rand.Int(rand.Reader, max)
	conflictCount := 0
	m := make(map[uint32]struct{})
	var seed uint32
	if err != nil {
		t.Logf("occur err %s", err.Error())
		seed = insecurerand.Uint32()
	} else {
		seed = uint32(rnd.Uint64())
	}
	for i := 0; i < len(randString); i++ {
		res := djb33(seed, randString[i])
		if _, ok := m[res]; ok {
			conflictCount++
		} else {
			m[res] = struct{}{}
		}
	}
	fmt.Printf("Djb2 conflict count is %d", conflictCount)
}
```
Test results:

```
Fnv64a conflict count is 27651  --- PASS: TestFnv64a (0.01s)
Xxxhash conflict count is 27692 --- PASS: TestXxxHash (0.01s)
Djb2 conflict count is 39621    --- PASS: TestDjb2 (0.01s)
```
Taking all three factors into account, fnv64a is the better choice. (Note that the generated strings are not deduplicated, so some of the counted conflicts may come from identical inputs rather than true hash collisions; the relative comparison between the algorithms still holds.)
Reduce GC

Go is a garbage-collected language, and GC work can be time-consuming, so minimizing GC pressure is an important consideration for a high-performance cache. freecache and bigcache are both known as low-GC cache libraries.

bigcache's design relies on a special case in Go's garbage collector: since Go 1.5, if a map's keys and values contain no pointers, the garbage collector skips scanning its contents. bigcache therefore keeps pointers out of both keys and values: it uses the hash of the key as the map key, serializes the cached data into a pre-allocated byte slice, and stores the offset into that slice as the map value. The pre-allocated slice adds only one extra object for the GC to track, and since a byte slice contains no pointers besides the object itself, the GC's marking time for the whole structure is O(1). To really understand the mechanism you should read the source code; I recommend the original author's article: https://dev.to/douglasmakey/how-bigcache-avoids-expensive-gc-cycles-and-speeds-up-concurrent-access-in-go-12bb. That author also wrote a simplified version of the cache on top of the bigcache idea to illustrate the principle in code, which is easier to follow.
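As a rough illustration of this idea, here is my own simplified sketch (not bigcache's actual code; it ignores hash collisions, eviction, and sharding). Values are appended to one pre-allocated byte slice with a length prefix, and the index map holds only pointer-free `uint64 -> uint32` entries, so the GC never has to scan its contents:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// zeroGCCache stores serialized values in a single pre-allocated byte
// slice and keeps only pointer-free types in the index map, so since
// Go 1.5 the garbage collector skips scanning the map's contents.
type zeroGCCache struct {
	index map[uint64]uint32 // hash(key) -> offset into data
	data  []byte            // length-prefixed entries
}

func newZeroGCCache() *zeroGCCache {
	return &zeroGCCache{
		index: make(map[uint64]uint32),
		data:  make([]byte, 0, 1024*1024), // pre-allocate 1 MB
	}
}

func hashKey(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

// Set appends a 4-byte little-endian length prefix followed by the
// value bytes, and records the entry's offset in the index.
func (c *zeroGCCache) Set(key string, val []byte) {
	offset := uint32(len(c.data))
	var hdr [4]byte
	binary.LittleEndian.PutUint32(hdr[:], uint32(len(val)))
	c.data = append(c.data, hdr[:]...)
	c.data = append(c.data, val...)
	c.index[hashKey(key)] = offset
}

// Get looks up the offset, reads the length prefix, and slices the
// value back out of the shared buffer.
func (c *zeroGCCache) Get(key string) ([]byte, bool) {
	offset, ok := c.index[hashKey(key)]
	if !ok {
		return nil, false
	}
	n := binary.LittleEndian.Uint32(c.data[offset : offset+4])
	return c.data[offset+4 : offset+4+n], true
}

func main() {
	c := newZeroGCCache()
	c.Set("user:1", []byte("asong"))
	v, _ := c.Get("user:1")
	fmt.Println(string(v)) // asong
}
```

The important property is that neither the map keys (`uint64`) nor the values (`uint32`) contain pointers, and the byte slice is one flat allocation, so the GC marks the whole cache in constant time regardless of how many entries it holds.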
freecache takes a different approach: it implements a ringbuffer structure itself and achieves a map with zero GC overhead by minimizing the number of pointers. Both keys and values are stored in the ringbuffer, and objects are located through an index. freecache's implementation also differs from a traditional hash table in that it introduces the concept of a slot. I drew a summary diagram of the structure, so I won't walk through the source code in detail here:
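The wrapping behavior that gives the ring buffer its name can be sketched as follows. This is a toy version for illustration only; real freecache also stores entry headers in the buffer, maintains slot indexes, and evicts old entries when the buffer wraps:

```go
package main

import "fmt"

// ringBuffer is a fixed-size byte buffer whose writes wrap around the
// end, sketching the idea behind freecache's storage: one contiguous
// allocation, so the cache adds almost no pointers for the GC to trace.
type ringBuffer struct {
	buf   []byte
	begin int // next write position
}

func newRingBuffer(size int) *ringBuffer {
	return &ringBuffer{buf: make([]byte, size)}
}

// Write copies p into the buffer, wrapping at the end (and silently
// overwriting the oldest bytes), and returns the starting offset.
func (r *ringBuffer) Write(p []byte) int {
	off := r.begin
	for i := 0; i < len(p); i++ {
		r.buf[(off+i)%len(r.buf)] = p[i]
	}
	r.begin = (off + len(p)) % len(r.buf)
	return off
}

// Read copies n bytes starting at off into a fresh slice, following
// the same wrap-around rule as Write.
func (r *ringBuffer) Read(off, n int) []byte {
	out := make([]byte, n)
	for i := 0; i < n; i++ {
		out[i] = r.buf[(off+i)%len(r.buf)]
	}
	return out
}

func main() {
	r := newRingBuffer(8)
	off := r.Write([]byte("hello"))
	fmt.Println(string(r.Read(off, 5))) // hello
	off2 := r.Write([]byte("world"))    // wraps past the end of the buffer
	fmt.Println(string(r.Read(off2, 5))) // world
}
```

Because all data lives in one `[]byte`, the GC sees a single object no matter how many cache entries the buffer holds; the cost is that lookups must go through offset arithmetic instead of native map access.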
Recommended articles
- https://colobu.com/2019/11/18/how-is-the-bigcache-is-fast/
- https://dev.to/douglasmakey/how-bigcache-avoids-expensive-gc-cycles-and-speeds-up-concurrent-access-in-go-12bb
- https://studygolang.com/articles/27222
- https://blog.csdn.net/chizhenlian/article/details/108435024
Summarize

Concurrent access and reduced GC pressure are the two key points of an efficient local cache. After reading the elegant designs in these libraries, I completely scrapped the code I had written before; it really was far from perfect. No matter how you design, something has to be sacrificed somewhere. That is unavoidable, and the road of software development is long and full of obstacles. The code I'm implementing is still being patched up; I'll publish it once it's polished, and then I'll count on everyone to help with the code review.

Well, this is the end of this article. My name is asong, and I'll see you in the next issue.

**Welcome to follow the public account: [Golang DreamWorks]**