1 The basic idea of caching
1. Different storage media have different access latencies and offer different capacities at the same cost:
Across the five tiers SSD/disk, main memory, L3 cache, L2 cache, and L1 cache, access latency decreases step by step, but the capacity available at the same cost also decreases.
2. The principle of temporal locality
Data that has been accessed once is likely to be accessed again in the near future.
3. Trading space for time
Set aside a separate, faster storage area to provide high-speed access.
4. Performance/cost trade-off
Lower access latency means higher performance, but also a higher cost for the same capacity.
2 Cache advantages
- Improve access performance
- Reduce network congestion
- Reduce service load
- Improve scalability
3 Cache costs
- Increased system complexity
- Higher storage and deployment costs
- Data consistency issues
4 Three modes of caching
4.1 Cache Aside
Write path: update the DB first, then delete the key from the cache directly.
Read path: read the cache first; on a miss, read the DB and write the result back into the cache.
Features: the business code handles all data-access details itself, following a lazy-loading approach. Deleting the cache entry after updating the DB makes the DB the source of truth, which greatly reduces the probability of cache/DB inconsistency.
Shortcomings:
1. If deleting the cache entry fails, the cache can be left stale.
Workaround: monitor failed deletions and retry.
2. If data that has just been inserted or updated is hit with high QPS at the same time, the resulting cache misses may overwhelm the DB.
Workaround: warm the cache asynchronously, e.g. have a background thread pre-load the value after the write.
Scenario: read-heavy, write-light workloads. For example, user profile data: users rarely modify their information, but many business scenarios read it.
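The two paths above can be sketched as follows. This is a minimal illustration, with plain Python dicts standing in for Redis and the database:

```python
db = {"user:1": {"name": "Alice"}}   # stand-in for the database
cache = {}                           # stand-in for Redis

def read(key):
    # Read path: try the cache first, fall back to the DB and backfill.
    if key in cache:
        return cache[key]
    value = db.get(key)
    if value is not None:
        cache[key] = value           # lazy-load into the cache
    return value

def write(key, value):
    # Write path: update the DB first, then invalidate the cache entry.
    db[key] = value
    cache.pop(key, None)
```

Note that the write path deletes rather than updates the cache entry; the next read reloads it from the DB, keeping the DB authoritative.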
4.2 Read/Write Through
Core idea: an intermediate data-service layer proxies all read and write operations against the cache and the DB.
Write path: check the cache first; if the key is not in the cache, update only the DB; if it is, update the cache first, then the DB, and return.
Read path: check the cache first and return directly on a hit; otherwise load from the DB, seed the value back into the cache, and return.
Features: the business code does not need to care about data-access details, and the system is well isolated.
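A toy sketch of such a proxy, with plain dicts standing in for the cache and the DB (not a production implementation):

```python
class CacheProxy:
    """Read/Write-Through proxy: callers only talk to this class."""

    def __init__(self):
        self.cache = {}  # stand-in for Redis
        self.db = {}     # stand-in for the database

    def read(self, key):
        # Return from the cache on a hit; otherwise load from the
        # DB, seed the cache, and return.
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # If the key is cached, update the cache first, then the DB;
        # otherwise update only the DB.
        if key in self.cache:
            self.cache[key] = value
        self.db[key] = value
```

The business side calls `read`/`write` and never touches the cache or the DB directly, which is the isolation the pattern is after.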
4.3 write-Back or Write-Behind
Undertake Write Through, after the write operation updates the cache, asynchronously write data back to DB or batch write data back to DB.
Disadvantages: Asynchronous or batch write-back, which may lead to data loss.
Features: Merge or write DB asynchronously, with low DB pressure.
Usage scenario: The writing frequency is very high, but the business that does not require high data consistency.
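A toy sketch of the write-behind idea with an explicit `flush()`; in a real system the flush would run on a background thread or timer, and the dicts below stand in for the cache and the DB:

```python
class WriteBehindCache:
    """Writes land in the cache immediately; DB writes are deferred."""

    def __init__(self):
        self.cache = {}
        self.db = {}
        self.dirty = set()   # keys written to the cache but not yet to the DB

    def write(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)  # defer (and merge) the DB write

    def flush(self):
        # Batch-write all dirty keys back to the DB. If the process
        # dies before this runs, those writes are lost -- the pattern's
        # main drawback.
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()
```

Note how two writes to the same key before a flush cost only one DB write, which is where the low DB pressure comes from.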
5 Common Redis interview questions
5.1 Redis Avalanche
Concept: a large number of requests cannot be served from the Redis cache, so the application sends them all to the database layer, causing a surge of pressure on the database.
Causes:
1. A large amount of cached data expires at the same time, so a flood of requests misses the cache.
2. The Redis cache instance fails and goes down.
Solutions:
For cause 1: Method 1: avoid setting the same expiration time on a large batch of keys; Method 2: degrade gracefully by returning predefined information, a null value, or an error.
For cause 2: Method 1: implement circuit breaking or request rate limiting in the business system; Method 2: rate-limit at the server level.
Precaution: build a highly reliable Redis cache cluster with master-slave (replica) nodes.
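Method 1 for cause 1 amounts to adding random jitter to the TTL, so keys loaded together do not all expire at the same instant. A minimal sketch (the base TTL and jitter window are illustrative values):

```python
import random

def ttl_with_jitter(base=3600, jitter=300):
    # Spread expirations over [base, base + jitter) seconds so that a
    # batch of keys cached at the same moment expires at different times.
    return base + random.randint(0, jitter - 1)

# Would be used as the TTL argument, e.g. redis.set(key, value,
# ex=ttl_with_jitter()) with the redis-py client.
```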
5.2 Breakdown
Concept: when a hot key expires, a large number of requests hit the DB directly, DB pressure spikes, and business responses are delayed.
Cause: the hotspot key expires, or several hotspot keys expire at the same time.
Solution: do not set an expiration time on hotspot keys; or set the expiration to a base time plus a random offset.
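A further common mitigation (not mentioned above, but widely used) is a mutex around the rebuild: only one thread reloads the expired hot key while the others wait and re-check. A minimal sketch, where `load_from_db` is a placeholder for the real query:

```python
import threading

cache = {}
rebuild_lock = threading.Lock()

def load_from_db(key):
    # Placeholder for the real (slow) database query.
    return f"value-of-{key}"

def get_hot(key):
    value = cache.get(key)
    if value is not None:
        return value
    # Only one thread rebuilds the key; the rest block here briefly
    # instead of stampeding the DB.
    with rebuild_lock:
        value = cache.get(key)   # double-check after acquiring the lock
        if value is None:
            value = load_from_db(key)
            cache[key] = value
    return value
```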
5.3 Penetration
Concept: the requested data exists neither in the Redis cache nor in the database, so every request first misses the cache and then also misses the DB.
Causes: 1. The business layer deleted the data by mistake. 2. Malicious attack: deliberately requesting data that is not in the database.
Solutions:
1. Cache a default or empty value for missing keys.
2. Use a Bloom filter to quickly determine whether the data can exist,
avoiding a DB query for data that definitely does not, which reduces database pressure.
3. Validate requests at the front end.
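Solution 2 can be illustrated with a minimal Bloom filter; the bit-array size and hash count below are toy values, and real deployments would use a library or the RedisBloom module instead:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash functions over an m-bit array."""

    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = 0  # m-bit array packed into a Python int

    def _positions(self, item):
        # Derive k bit positions by salting the item with the hash index.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means definitely absent (skip the DB); True means
        # possibly present (false positives are possible).
        return all((self.bits >> pos) & 1 for pos in self._positions(item))
```

On startup, existing DB keys are added to the filter; a request whose key fails `might_contain` can be rejected without touching the cache or the DB.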
5.4 Bigkey
Concept: a cache key that stores too much data, e.g. a List with tens of thousands of entries.
Harms:
1. Uneven memory distribution. For example, in a Redis Cluster or Codis deployment, memory usage across nodes becomes unbalanced.
2. Timeout blocking. Because Redis executes commands on a single thread, a long-running operation on a bigkey blocks all subsequent requests.
3. Network congestion: transferring the value consumes a lot of bandwidth.
4. Expiration deletion is very slow and blocks Redis. If a bigkey has an expiration time set, deleting it on expiry can block Redis unless the asynchronous expiration introduced in Redis 4.0 is used, and the stall does not show up in the slow log (the deletion runs inside an internal loop event).
How to find them:
Redis command: `redis-cli --bigkeys`
Solutions:
1. Delete the bigkey safely.
- On Redis 4.0+, delete asynchronously (UNLINK);
- Set types: use SSCAN to read and delete members in batches;
- Hash types: use HSCAN to read and delete fields in batches.
2. Split the key.
- String type: split into multiple keys.
- Set or hash types: split into multiple smaller sets or hashes.
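The splitting idea can be sketched as a hash-based sharding function that routes each field of an oversized hash to one of N smaller hashes. The shard count and key format below are illustrative choices, not a Redis API:

```python
import hashlib

NUM_SHARDS = 16  # illustrative; pick based on the bigkey's size

def shard_key(big_key, field):
    # Route a field deterministically to one of NUM_SHARDS sub-keys,
    # e.g. "user:1000:tags" -> "user:1000:tags:7". Reads and writes
    # then use HGET/HSET on the sub-key instead of the bigkey.
    h = int(hashlib.md5(field.encode()).hexdigest(), 16)
    return f"{big_key}:{h % NUM_SHARDS}"
```

Because the routing is deterministic, any client can locate a field without extra lookups; the trade-off is that whole-key operations (e.g. getting all fields) now need to touch every shard.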
5.5 Hot keys
Concept: the hot-key problem is a sudden flood of hundreds of thousands of requests for one specific key on a Redis node. The traffic is so concentrated that it can saturate the node's network card and bring the Redis server down.
Solutions:
1. Second-level (local) cache. For example, use Ehcache, or even a plain HashMap. Once a hot key is detected, load it into the application's JVM; requests for it are then served directly from the JVM without touching the Redis layer.
2. Back up the hot key. Instead of sending all its traffic to one Redis node, store copies of the key on multiple nodes; when a request for the hot key arrives, randomly pick one of the nodes holding a copy, read the value there, and return it.
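The backup scheme in point 2 boils down to key replication plus random read routing. A minimal sketch, where the replica count and the suffix-based key format are assumptions for illustration:

```python
import random

NUM_REPLICAS = 3  # illustrative; one copy per Redis node

def replica_keys(hot_key):
    # All copies of the hot key; a write must update every one of them.
    return [f"{hot_key}:{i}" for i in range(NUM_REPLICAS)]

def replica_key(hot_key):
    # A read picks one copy at random, spreading traffic across the
    # nodes that the suffixed keys hash to.
    return f"{hot_key}:{random.randint(0, NUM_REPLICAS - 1)}"
```

The suffix changes which slot (and hence which cluster node) each copy hashes to; the cost is fan-out on writes and brief inconsistency between copies during an update.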
Mind map: https://www.processon.com/embed/614c387e5653bb2ea6ddd4b4
Reference: https://segmentfault.com/a/1190000040709794