1 The basic idea of caching
1. Different storage media have different access latencies and offer different capacities at the same cost:
Across the five tiers SSD/disk, main memory, L3 cache, L2 cache, and L1 cache, access latency decreases step by step, but the capacity available at the same cost also decreases.
2. The principle of temporal locality
Data that has been accessed once is likely to be accessed again in the near future.
3. Trading space for time
Set aside a separate, faster storage area to provide high-speed access.
4. Performance/cost trade-off
Lower access latency means higher performance, but also a higher cost for the same capacity.
2 Cache advantages
- Improve access performance
- Reduce network congestion
- Reduce service load
- Improve scalability
3 Cache costs
- Increased system complexity
- Higher storage and deployment costs
- Data consistency issues
4 Three modes of caching
4.1 Cache Aside
Write path: update the DB first, then delete the key from the cache directly.
Read path: read the cache first; on a miss, read the DB and write the result back into the cache.
Features: the business code handles all data-access details itself, following a lazy-loading approach. Deleting the cache entry after updating the DB makes the DB the source of truth, which greatly reduces the probability of cache/DB inconsistency.
Shortcomings:
1. If deleting the cache entry fails, the cache can be left stale.
Workaround: monitor failed deletions and retry.
2. If data that has just been inserted or updated is hit with high QPS at the same time, the resulting cache misses may overwhelm the DB.
Workaround: warm the cache asynchronously, e.g. have a background thread pre-load the value after the write.
Scenario: read-heavy, write-light workloads. For example, user profile data: users rarely modify their information, but many business scenarios read it.
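The two paths above can be sketched as follows. This is a minimal illustration, with plain Python dicts standing in for Redis and the database:

```python
db = {"user:1": {"name": "Alice"}}   # stand-in for the database
cache = {}                           # stand-in for Redis

def read(key):
    # Read path: try the cache first, fall back to the DB and backfill.
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value           # lazy-load into the cache
    return value

def write(key, value):
    # Write path: update the DB first, then invalidate the cache entry.
    db[key] = value
    cache.pop(key, None)
```

Note that the write path deletes rather than updates the cache entry; the next read reloads it from the DB, keeping the DB authoritative.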
4.2 Read/Write Through
Core idea: an intermediate data-service layer proxies all read and write operations against the cache and the DB.
Write path: check the cache first; if the key is not in the cache, update only the DB; if it is, update the cache first, then the DB, and return.
Read path: check the cache first and return directly on a hit; otherwise load from the DB, seed the value back into the cache, and return.
Features: the business code does not need to care about data-access details, and the system is well isolated.
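A toy sketch of such a proxy, with plain dicts standing in for the cache and the DB (not a production implementation):

```python
class CacheProxy:
    """Read/Write-Through proxy: callers only talk to this class."""

    def __init__(self):
        self.cache = {}  # stand-in for Redis
        self.db = {}     # stand-in for the database

    def read(self, key):
        # Return from the cache on a hit; otherwise load from the
        # DB, seed the cache, and return.
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # If the key is cached, update the cache first, then the DB;
        # otherwise update only the DB.
        if key in self.cache:
            self.cache[key] = value
        self.db[key] = value
```

The business side calls `read`/`write` and never touches the cache or the DB directly, which is the isolation the pattern is after.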
4.3 write-Back or Write-Behind
Undertake Write Through, after the write operation updates the cache, asynchronously write data back to DB or batch write data back to DB.
Disadvantages: Asynchronous or batch write-back, which may lead to data loss.
Features: Merge or write DB asynchronously, with low DB pressure.
Usage scenario: The writing frequency is very high, but the business that does not require high data consistency.
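A toy sketch of the write-behind idea with an explicit `flush()`; in a real system the flush would run on a background thread or timer, and the dicts below stand in for the cache and the DB:

```python
class WriteBehindCache:
    """Writes land in the cache immediately; DB writes are deferred."""

    def __init__(self):
        self.cache = {}
        self.db = {}
        self.dirty = set()   # keys written to the cache but not yet to the DB

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)  # defer (and merge) the DB write

    def flush(self):
        # Batch-write all dirty keys back to the DB. If the process
        # dies before this runs, those writes are lost -- the pattern's
        # main drawback.
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
```

Note how two writes to the same key before a flush cost only one DB write, which is where the low DB pressure comes from.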
5 Common Redis interview questions
5.1 Redis Avalanche
Concept: a large number of requests cannot be served from the Redis cache, so the application sends them all to the database layer, causing a surge of pressure on the database.
Causes:
1. A large amount of cached data expires at the same time, so a flood of requests misses the cache.
2. The Redis cache instance fails and goes down.
Solutions:
For cause 1: Method 1: avoid setting the same expiration time on a large batch of keys; Method 2: degrade gracefully by returning predefined information, a null value, or an error.
For cause 2: Method 1: implement circuit breaking or request rate limiting in the business system; Method 2: rate-limit at the server level.
Precaution: build a highly reliable Redis cache cluster with master-slave (replica) nodes.
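Method 1 for cause 1 amounts to adding random jitter to the TTL, so keys loaded together do not all expire at the same instant. A minimal sketch (the base TTL and jitter window are illustrative values):

```python
import random

def ttl_with_jitter(base=3600, jitter=300):
    # Spread expirations over [base, base + jitter) seconds so that a
    # batch of keys cached at the same moment expires at different times.
    return base + random.randint(0, jitter - 1)

# Would be used as the TTL argument, e.g. redis.set(key, value,
# ex=ttl_with_jitter()) with the redis-py client.
```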
5.2 Breakdown
Concept: when a hot key expires, a large number of requests hit the DB directly, DB pressure spikes, and business responses are delayed.
Cause: the hotspot key expires, or several hotspot keys expire at the same time.
Solution: do not set an expiration time on hotspot keys; or set the expiration to a base time plus a random offset.
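A further common mitigation (not mentioned above, but widely used) is a mutex around the rebuild: only one thread reloads the expired hot key while the others wait and re-check. A minimal sketch, where `load_from_db` is a placeholder for the real query:

```python
import threading

cache = {}
rebuild_lock = threading.Lock()

def load_from_db(key):
    # Placeholder for the real (slow) database query.
    return f"value-of-{key}"

def get_hot(key):
    value = cache.get(key)
    if value is not None:
        return value
    # Only one thread rebuilds the key; the rest block here briefly
    # instead of stampeding the DB.
    with rebuild_lock:
        value = cache.get(key)   # double-check after acquiring the lock
        if value is None:
            value = load_from_db(key)
            cache[key] = value
    return value
```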
5.3 Penetration
Concept: the requested data exists neither in the Redis cache nor in the database, so every request first misses the cache and then also misses the DB.
Causes: 1. The business layer deleted the data by mistake. 2. Malicious attack: deliberately requesting data that is not in the database.
Solutions:
1. Cache a default or empty value for missing keys.
2. Use a Bloom filter to quickly determine whether the data can exist,
avoiding a DB query for data that definitely does not, which reduces database pressure.
3. Validate requests at the front end.
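Solution 2 can be illustrated with a minimal Bloom filter; the bit-array size and hash count below are toy values, and real deployments would use a library or the RedisBloom module instead:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash functions over an m-bit array."""

    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = 0  # m-bit array packed into a Python int

    def _positions(self, item):
        # Derive k bit positions by salting the item with the hash index.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means definitely absent (skip the DB); True means
        # possibly present (false positives are possible).
        return all((self.bits >> pos) & 1 for pos in self._positions(item))
```

On startup, existing DB keys are added to the filter; a request whose key fails `might_contain` can be rejected without touching the cache or the DB.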
5.4 Bigkey
Concept: a cache key that stores too much data, e.g. a List with tens of thousands of entries.
Harms:
1. Uneven memory distribution. For example, in a Redis Cluster or Codis deployment, memory usage across nodes becomes unbalanced.
2. Timeout blocking. Because Redis executes commands on a single thread, a long-running operation on a bigkey blocks all subsequent requests.
3. Network congestion: transferring the value consumes a lot of bandwidth.
4. Expiration deletion is very slow and blocks Redis. If a bigkey has an expiration time set, deleting it on expiry can block Redis unless the asynchronous expiration introduced in Redis 4.0 is used, and the stall does not show up in the slow log (the deletion runs inside an internal loop event).
How to find them:
Redis command: `redis-cli --bigkeys`
Solutions:
1. Delete the bigkey safely.
- On Redis 4.0+, delete asynchronously (UNLINK);
- Set types: use SSCAN to read and delete members in batches;
- Hash types: use HSCAN to read and delete fields in batches.
2. Split the key.
- String type: split into multiple keys.
- Set or hash types: split into multiple smaller sets or hashes.
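The splitting idea can be sketched as a hash-based sharding function that routes each field of an oversized hash to one of N smaller hashes. The shard count and key format below are illustrative choices, not a Redis API:

```python
import hashlib

NUM_SHARDS = 16  # illustrative; pick based on the bigkey's size

def shard_key(big_key, field):
    # Route a field deterministically to one of NUM_SHARDS sub-keys,
    # e.g. "user:1000:tags" -> "user:1000:tags:7". Reads and writes
    # then use HGET/HSET on the sub-key instead of the bigkey.
    h = int(hashlib.md5(field.encode()).hexdigest(), 16)
    return f"{big_key}:{h % NUM_SHARDS}"
```

Because the routing is deterministic, any client can locate a field without extra lookups; the trade-off is that whole-key operations (e.g. getting all fields) now need to touch every shard.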
5.5 Hot keys
Concept: the hot-key problem is a sudden flood of hundreds of thousands of requests for one specific key on a Redis node. The traffic is so concentrated that it can saturate the node's network card and bring the Redis server down.
Solutions:
1. Second-level (local) cache. For example, use Ehcache, or even a plain HashMap. Once a hot key is detected, load it into the application's JVM; requests for it are then served directly from the JVM without touching the Redis layer.
2. Back up the hot key. Instead of sending all its traffic to one Redis node, store copies of the key on multiple nodes; when a request for the hot key arrives, randomly pick one of the nodes holding a copy, read the value there, and return it.
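The backup scheme in point 2 boils down to key replication plus random read routing. A minimal sketch, where the replica count and the suffix-based key format are assumptions for illustration:

```python
import random

NUM_REPLICAS = 3  # illustrative; one copy per Redis node

def replica_keys(hot_key):
    # All copies of the hot key; a write must update every one of them.
    return [f"{hot_key}:{i}" for i in range(NUM_REPLICAS)]

def replica_key(hot_key):
    # A read picks one copy at random, spreading traffic across the
    # nodes that the suffixed keys hash to.
    return f"{hot_key}:{random.randint(0, NUM_REPLICAS - 1)}"
```

The suffix changes which slot (and hence which cluster node) each copy hashes to; the cost is fan-out on writes and brief inconsistency between copies during an update.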
Mind map: https://www.processon.com/embed/614c387e5653bb2ea6ddd4b4
Reference: https://segmentfault.com/a/1190000040709794