
Preface

Hello everyone, my name is asong. Recently I wanted to write a local cache as a practice project. I have been working for quite a while and have seen many local caches implemented by colleagues, each with its own strengths, and I have also been thinking about how to implement a high-performance one. Next, I will implement a version of a local cache based on my own understanding. Your valuable comments are welcome, and I will keep improving the design based on them.

This article mainly discusses what to consider when designing a local cache; follow-up articles will cover the implementation. Welcome to follow this series.

Why have a local cache

With the popularization of the Internet, the number of users and visits keeps growing, which requires our applications to support more and more concurrency. Take the home page of a large e-commerce site: a huge number of users open it at the same time, which puts enormous computational pressure on our application servers and database. Each database access also occupies a database connection and incurs network overhead, so under such high concurrency the database quickly becomes the bottleneck. At this point we need to add a cache. Caches come in two flavors: distributed caches and local caches. In most scenarios a distributed cache is enough, and it is also very fast, but its data still has to travel across the network. For traffic as latency-sensitive as the home page, we cannot give up any room for performance optimization, so we can choose a local cache instead: the application and the cache live in the same process, no network transfer is needed, and requests are served very quickly. This suits business scenarios like a home page where the data changes infrequently.

To summarize, the system architecture after introducing a local cache usually looks like this:

<img src="https://p3-juejin.byteimg.com/tos-cn-i-k3u1fbpfcp/95d91bee0d0a40998e971bacb7e44bbc~tplv-k3u1fbpfcp-zoom-1.image" style="zoom:67%;" />

Although a local cache brings performance gains, it also has drawbacks. The cache is coupled to the application, so multiple applications cannot share it directly; every application, and every node in a cluster, has to maintain its own copy of the cache, which wastes memory. It is we programmers who use the cache, so we have to decide precisely which kind of cache fits the data type and the business scenario, and how to use it so that we reach the best result at the lowest cost and with the highest efficiency.

Thinking: How to implement a high-performance local cache

Data structure

The first step is to decide how the data should be stored, and lookups have to be fast. The first structure that comes to mind is the hash table: its lookups run in O(1) time, which meets our needs. Next we have to decide what type to store, because different business scenarios cache different data types. For generality, Java would use generics; Go does not have generics yet, so we can use the interface type instead and leave it to the caller to assert the concrete type. This improves extensibility but also adds some risk. A minimal type sketch follows the summary below.

To summarize:

  • Data structure: hash table;
  • key : string type;
  • value : interface type;
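
As a rough sketch of this storage layer (the names `Cache` and `entry` here are placeholders of my own, not from a finished implementation), it could start out like this:

```go
package localcache

import "time"

// entry wraps a cached value together with its expiration time.
type entry struct {
	value    interface{}
	expireAt time.Time // zero value means "never expires"
}

// Cache is the top-level structure: a hash table keyed by string,
// holding interface{} values so callers assert the concrete type.
type Cache struct {
	data map[string]entry
}

// New creates an empty cache.
func New() *Cache {
	return &Cache{data: make(map[string]entry)}
}
```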

Concurrency safety

An application using a local cache will certainly face concurrent reads and writes, so concurrency safety is an issue. Since we chose a hash structure, Go mainly gives us two options: the plain map, and the thread-safe sync.Map. For convenience we could simply pick sync.Map, or we can combine map with sync.RWMutex ourselves to guarantee thread safety. People have benchmarked these two approaches: when reads far outnumber writes, sync.Map performs much better than the map + sync.RWMutex combination. In a local cache reads do far outnumber writes, but our cache does not only take the lock when storing data; it also needs to lock for operations such as expiration cleanup, so map + sync.RWMutex is more flexible. That is the approach we choose to guarantee concurrency safety.
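
Continuing the sketch above, a minimal map + sync.RWMutex wrapper might look like this (again just an illustration; `safeCache` and its methods are assumed names, and it reuses the `entry` type from the previous sketch):

```go
package localcache

import (
	"sync"
	"time"
)

// safeCache guards a plain map with a sync.RWMutex: reads take the
// read lock, writes take the write lock.
type safeCache struct {
	mu   sync.RWMutex
	data map[string]entry
}

// Set stores a value; a ttl of 0 means the entry never expires.
func (c *safeCache) Set(key string, value interface{}, ttl time.Duration) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e := entry{value: value}
	if ttl > 0 {
		e.expireAt = time.Now().Add(ttl)
	}
	c.data[key] = e
}

// Get returns the value and whether the key was present.
func (c *safeCache) Get(key string) (interface{}, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	e, ok := c.data[key]
	if !ok {
		return nil, false
	}
	return e.value, true
}
```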

High-performance concurrent access

Locking keeps reads and writes safe, but it introduces lock contention. A local cache is built to improve performance and must not become a performance bottleneck itself, so we need to reduce lock contention. For the local cache scenario we can shard by key to reduce contention.

Our keys are strings, so we can use the djb2 hash algorithm to spread keys across buckets and then lock each bucket separately, refining the lock granularity to reduce contention.
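
A possible sketch of djb2 bucketing, building on the `safeCache` and `entry` types above (the `bucketCount` of 256 is an arbitrary power of two I picked for illustration, not a value from the article):

```go
package localcache

const bucketCount = 256 // a power of two so we can mask instead of mod

// djb2 is the classic string hash: h = h*33 + c, starting from 5381.
func djb2(key string) uint32 {
	var h uint32 = 5381
	for i := 0; i < len(key); i++ {
		h = h*33 + uint32(key[i])
	}
	return h
}

// shardedCache splits the key space into buckets, each with its own
// lock, so operations on different buckets do not contend.
type shardedCache struct {
	buckets [bucketCount]*safeCache
}

func newShardedCache() *shardedCache {
	s := &shardedCache{}
	for i := range s.buckets {
		s.buckets[i] = &safeCache{data: make(map[string]entry)}
	}
	return s
}

// bucket routes a key to its shard using the djb2 hash.
func (s *shardedCache) bucket(key string) *safeCache {
	return s.buckets[djb2(key)&(bucketCount-1)]
}
```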

Object limit

Because the local cache lives in memory and memory is limited, we cannot store objects without bound, so we should cap the number of cached objects. The upper limit can be estimated from the specific application scenario; we choose a default of 1024 cached objects.
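
One way to express this limit is a functional-options constructor with 1024 as the default; apart from that default, everything here is my own naming, sketched for illustration:

```go
package localcache

const defaultMaxEntries = 1024 // default object limit discussed above

// Option tweaks cache construction.
type Option func(*config)

type config struct {
	maxEntries int
}

// WithMaxEntries overrides the default object limit.
func WithMaxEntries(n int) Option {
	return func(c *config) { c.maxEntries = n }
}

// newConfig applies options on top of the defaults.
func newConfig(opts ...Option) config {
	cfg := config{maxEntries: defaultMaxEntries}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}
```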

Elimination strategy

Because we cap the number of cached objects, we need an eviction strategy to remove entries once the limit is reached. Common cache eviction algorithms are:

LFU

LFU (Least Frequently Used) evicts data according to its historical access frequency. The core idea is that data which has been used least frequently recently is unlikely to be used again, so the entry with the lowest access frequency is the one replaced.

Existing problems:

Data that is accessed frequently within a short period and then not accessed for a long time keeps its inflated access count, so it will not be evicted for quite a while and occupies the top of the queue. This makes entries that are actually in more frequent use easier to evict, and data that has only just entered the cache may also be deleted quickly.

LRU

LRU (Least Recently Used) evicts data according to its historical access records. The core idea is that recently used data has a high probability of being used again, while data that has not been accessed for the longest time is unlikely to be needed, so that is the data we replace.

There is a problem:

When a client accesses a large amount of historical data, the data in the cache may be replaced by historical data, reducing the cache hit rate.
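
For illustration only, here is a minimal LRU sketch built on container/list. It is not the eviction strategy this series finally settles on (ARC is, as discussed below), but ARC layers its ghost lists on top of structures very much like this; the names `lru` and `lruEntry` are mine:

```go
package localcache

import "container/list"

// lru keeps the most recently used entries at the front of a doubly
// linked list and evicts from the back when capacity is exceeded.
type lru struct {
	capacity int
	ll       *list.List               // front = most recently used
	items    map[string]*list.Element // key -> node in ll
}

type lruEntry struct {
	key   string
	value interface{}
}

func newLRU(capacity int) *lru {
	return &lru{capacity: capacity, ll: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the value and marks the key as recently used.
func (l *lru) Get(key string) (interface{}, bool) {
	if el, ok := l.items[key]; ok {
		l.ll.MoveToFront(el)
		return el.Value.(*lruEntry).value, true
	}
	return nil, false
}

// Set inserts or updates a key and evicts the least recently used
// entry once the capacity is exceeded.
func (l *lru) Set(key string, value interface{}) {
	if el, ok := l.items[key]; ok {
		el.Value.(*lruEntry).value = value
		l.ll.MoveToFront(el)
		return
	}
	l.items[key] = l.ll.PushFront(&lruEntry{key: key, value: value})
	if l.ll.Len() > l.capacity {
		oldest := l.ll.Back()
		l.ll.Remove(oldest)
		delete(l.items, oldest.Value.(*lruEntry).key)
	}
}
```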

FIFO

FIFO (First In First Out) evicts data in arrival order. The core idea is that data accessed most recently is more likely to be accessed again, so the data that entered the cache first is evicted first.

Existing problems:

Because replacement is done in an absolutely fair, arrival-order way, this algorithm easily leads to cache misses (the page-fault problem).

Two Queues

Two Queues is FIFO + LRU. The core idea: when data is accessed for the first time it is cached in the FIFO queue; when it is accessed a second time it is moved from the FIFO queue to the LRU queue. Each queue then evicts data in its own way.

There is a problem:

Like LRU-2, this algorithm adapts poorly: data in the LRU queue needs a large number of accesses before its history records are cleared out.

ARC

ARC (Adaptive Replacement Cache) is an adaptive cache replacement algorithm that combines LRU and LFU. The core idea is to adjust the sizes of the LRU and LFU lists according to how evicted data is accessed again. ARC maintains four linked lists: LRU and LRU Ghost, LFU and LFU Ghost. The Ghost lists record only metadata (IDs and the like) for evicted entries, not the data itself.


When data is accessed for the first time it is added to the LRU queue; if it is accessed again it is also placed in the LFU list. If data is evicted from the LRU queue it enters the LRU Ghost queue; if that data is accessed again later, the LRU queue is enlarged and the LFU queue is shrunk accordingly.

There is a problem:

Because four queues need to be maintained, more memory space will be occupied.

Our choice

Each algorithm has its own characteristics. Given our local cache usage scenarios, we choose the ARC algorithm as the eviction strategy: the sizes of its LRU and LFU lists can be adjusted dynamically to adapt to the current access pattern and achieve the best hit rate.

Expiration cleanup

Besides relying on the eviction strategy, we can also attach an expiration time to entries as a second safeguard, so that rarely accessed data does not occupy memory forever. There are two ways to handle expiration:

  • Delete the data directly once it expires
  • Keep the data after it expires and refresh it asynchronously

Both approaches have their own advantages and disadvantages. Asynchronous refreshing depends on the specific business, so to suit most businesses we take the friendlier approach of deleting data when it expires. We check expiration lazily when data is read, and at the same time set up a scheduled task that sweeps out expired data once a day.
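
A sketch of lazy expiration plus a background sweep, extending the `safeCache` type from earlier; the method names are my own, and the sweep interval is left as a parameter since the article only suggests a daily cleanup:

```go
package localcache

import "time"

// GetWithExpiry checks expiration lazily: an expired entry is treated
// as a miss and removed on the spot.
func (c *safeCache) GetWithExpiry(key string) (interface{}, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.data[key]
	if !ok {
		return nil, false
	}
	if !e.expireAt.IsZero() && time.Now().After(e.expireAt) {
		delete(c.data, key) // lazy deletion on read
		return nil, false
	}
	return e.value, true
}

// startJanitor runs a periodic sweep that deletes expired entries.
func (c *safeCache) startJanitor(interval time.Duration) {
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for range ticker.C {
			now := time.Now()
			c.mu.Lock()
			for k, e := range c.data {
				if !e.expireAt.IsZero() && now.After(e.expireAt) {
					delete(c.data, k)
				}
			}
			c.mu.Unlock()
		}
	}()
}
```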

Cache monitoring

Monitoring is something many people ignore for caches: once the code runs without errors they assume it works, but then there is no way to tell whether the cache is actually effective. So monitoring the cache's metrics matters; it lets us tune the cache parameters and improve the cache. In an enterprise application we can report metrics to Prometheus; for a quick self-test we can write a small component that periodically prints the number of cached entries, the hit rate, and other indicators.
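
For the simple self-test variant, a couple of atomic counters and a periodic log line are enough. This is an assumed layout of my own, not an existing component; in an enterprise setting these counters would be exported to Prometheus instead:

```go
package localcache

import (
	"log"
	"sync/atomic"
	"time"
)

// stats collects hit/miss counters with atomic operations.
type stats struct {
	hits   uint64
	misses uint64
}

func (s *stats) hit()  { atomic.AddUint64(&s.hits, 1) }
func (s *stats) miss() { atomic.AddUint64(&s.misses, 1) }

// report periodically prints the hit rate and the entry count supplied
// by lenFn (a hypothetical callback returning the current cache size).
func (s *stats) report(interval time.Duration, lenFn func() int) {
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for range ticker.C {
			h := atomic.LoadUint64(&s.hits)
			m := atomic.LoadUint64(&s.misses)
			rate := 0.0
			if total := h + m; total > 0 {
				rate = float64(h) / float64(total)
			}
			log.Printf("localcache: entries=%d hits=%d misses=%d hit-rate=%.2f",
				lenFn(), h, m, rate)
		}
	}()
}
```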

GC tuning

For applications that rely heavily on a local cache, GC problems are inevitable because of constant cache eviction. Too many GC cycles, or long STW pauses, will hurt service availability. This usually has to be analysed case by case; after the local cache goes live, remember to check the GC metrics regularly.
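
One rough way to keep an eye on GC after the cache goes live is to periodically sample runtime.MemStats and log the GC count and pause time (GODEBUG=gctrace=1 gives richer output); this is only a sketch, not part of the cache itself:

```go
package main

import (
	"log"
	"runtime"
	"time"
)

// watchGC samples runtime.MemStats and logs GC counts, total pause
// time and the current heap size.
func watchGC(interval time.Duration) {
	var last uint32
	for range time.Tick(interval) {
		var m runtime.MemStats
		runtime.ReadMemStats(&m)
		log.Printf("gc: num=%d (+%d) totalPause=%s heap=%dMB",
			m.NumGC, m.NumGC-last, time.Duration(m.PauseTotalNs), m.HeapAlloc>>20)
		last = m.NumGC
	}
}

func main() {
	go watchGC(30 * time.Second)
	select {} // block forever; real code would run the service here
}
```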

Cache penetration

When using a cache we must also consider cache penetration. A local cache library generally does not solve this itself; it is left to the caller. When an element is not found in the cache, take a lock on that cache key so that other goroutines wait for the value to be filled in instead of all hitting the database (externally this can be wrapped with singleflight).
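
A sketch of that caller-side singleflight pattern, using golang.org/x/sync/singleflight; `loadUser` and `queryDB` are hypothetical stand-ins for real business code, and sync.Map stands in for the local cache:

```go
package main

import (
	"errors"
	"fmt"
	"sync"

	"golang.org/x/sync/singleflight"
)

var (
	cache sync.Map // stand-in for the local cache; any thread-safe store works
	group singleflight.Group
)

// loadUser collapses concurrent misses on the same key into a single
// backing query, then fills the cache so later reads are hits.
func loadUser(key string) (interface{}, error) {
	if v, ok := cache.Load(key); ok {
		return v, nil
	}
	v, err, _ := group.Do(key, func() (interface{}, error) {
		u, err := queryDB(key)
		if err != nil {
			return nil, err
		}
		cache.Store(key, u)
		return u, nil
	})
	return v, err
}

// queryDB is a hypothetical database lookup.
func queryDB(key string) (interface{}, error) {
	if key == "" {
		return nil, errors.New("not found")
	}
	return "user:" + key, nil
}

func main() {
	u, _ := loadUser("42")
	fmt.Println(u)
}
```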


Summary

Designing a truly high-performance local cache is not easy. Since my own knowledge is limited, the design ideas in this article are just my personal, practical take. Your valuable comments are welcome, and together we can build a genuinely high-performance local cache.

In the next article I will share a local cache I wrote myself, so please look forward to it!

Welcome to follow the public account: Golang DreamWorks

