"How to prevent cache breakdown?"
This question comes up frequently in interviews at many first- and second-tier tech companies.
In high-concurrency systems, caching improves data-query performance and relieves the concurrency pressure on the back-end storage system. It is a tried-and-true tool.
Check out the expert's answer below.
Expert:
In practice, we add a caching layer between the application and the database.
On the one hand it improves data-retrieval efficiency and program performance; on the other hand it relieves the concurrency pressure on the database.
Cache breakdown means that, for some reason, all requests hit the database directly and the cache no longer buffers the traffic.
I think there are two situations that can cause cache breakdown.
- A hot key stored in Redis expires, and a large number of requests arrive at that moment, so all of them hit the database.
- A client maliciously sends a large number of requests for keys that do not exist. Since the requested data is absent from the cache, every request inevitably penetrates to the database, reducing the cache to a decoration.
In short, when Redis serves as a traffic buffer, we must consider that a cache failure can send a surge of concurrent requests crashing into the back-end storage.
So I think this can be solved in several ways.
- For hotspot data, we can avoid setting an expiration time, or renew the expiration time each time the data is accessed.
- For cached data with high access volume, we can design multi-level caches to minimize the pressure on the back-end storage.
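The renew-on-access idea from the first bullet can be sketched as a sliding TTL. Below is a minimal sketch using an in-memory dict as a stand-in for Redis; with real Redis you would call `EXPIRE` on every hit (e.g. `r.expire(key, ttl)` in redis-py). The class name and TTL value are illustrative, not from the original article.

```python
import time

class SlidingTTLCache:
    """In-memory stand-in for Redis that renews a key's TTL on every read,
    so actively accessed hot keys never expire under load."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: evict
            return None
        # Renew the TTL on every hit; a hot key stays alive while it is read.
        self._store[key] = (value, time.monotonic() + self.ttl)
        return value

cache = SlidingTTLCache(ttl_seconds=0.2)
cache.set("hot:item", "v1")
time.sleep(0.15)
assert cache.get("hot:item") == "v1"  # the hit renews the TTL
time.sleep(0.15)
assert cache.get("hot:item") == "v1"  # still alive thanks to the renewal
```

A key that stops being read simply expires after one idle TTL window, so cold data still gets evicted.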
We can also use a distributed lock: when a cache miss is detected, a thread does not load from the database immediately but first tries to acquire the distributed lock. Only the thread that holds the lock queries the database and writes the result back to the cache.
Threads that fail to acquire the lock simply wait and retry.
This solution sacrifices some performance, but it ensures the database is not overwhelmed.
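The lock-protected rebuild described above can be sketched as follows. This is a single-process illustration using `threading.Lock` as a stand-in for a real distributed lock (which in Redis would typically be `SET lock_key token NX PX ttl`); the function and variable names are mine, not from the article.

```python
import threading
import time

db_calls = 0          # counts how often the "database" is actually hit

def load_from_db(key):
    global db_calls
    db_calls += 1
    return f"value-for-{key}"

cache = {}
rebuild_lock = threading.Lock()  # stand-in for a distributed lock

def get_with_lock(key):
    value = cache.get(key)
    if value is not None:
        return value                      # cache hit: fast path
    if rebuild_lock.acquire(blocking=False):
        try:
            # Double-check: another thread may have rebuilt the entry
            # between our miss and our lock acquisition.
            value = cache.get(key)
            if value is None:
                value = load_from_db(key)
                cache[key] = value        # write back before releasing
            return value
        finally:
            rebuild_lock.release()
    # Threads that lost the race wait and retry instead of hitting the DB.
    while True:
        value = cache.get(key)
        if value is not None:
            return value
        time.sleep(0.01)

threads = [threading.Thread(target=get_with_lock, args=("hot",)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert db_calls == 1  # only the lock holder touched the database
```

The double-check after acquiring the lock matters: without it, a thread that grabs the lock just after the first rebuild would query the database a second time.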
For malicious-attack scenarios, we can use a Bloom filter: when the application starts, load the keys of all existing data into the filter.
Every incoming request checks the Bloom filter first.
If the key is not in the filter, the data definitely does not exist in the database, and we can reject the request without touching the database at all.
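A minimal Bloom filter illustrating this pre-check is sketched below. In production you would more likely use the RedisBloom module or an existing library; the sizes, key names, and `get_user` helper here are illustrative assumptions.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash positions over a fixed bit array.
    May report false positives, never false negatives."""

    def __init__(self, size_bits=1 << 16, num_hashes=4):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key):
        # Derive k independent positions by salting the hash input.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

# At application start: warm the filter with the keys that really exist.
bloom = BloomFilter()
for existing in ("user:1", "user:2", "user:3"):
    bloom.add(existing)

def get_user(key):
    if not bloom.might_contain(key):
        return None  # definitely absent: skip both cache and database
    # ...otherwise fall through to the cache, then the database...
    return "lookup"

assert get_user("user:2") == "lookup"
assert get_user("user:999") is None  # unknown key rejected up front
```

Because a Bloom filter can return false positives, a small fraction of bogus keys may still reach the database; the filter bounds the attack's impact rather than eliminating it.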
In addition, when designing the overall cache architecture, beyond avoiding cache penetration as much as possible, we also need to think from a global perspective:
business isolation, multi-level caching, deployment isolation, security considerations, and so on.
Summary
In my opinion, many interview questions are really probing a candidate's technical background and the boundaries of their thinking. Some questions may have no single answer, and you may not be able to produce a great solution on the spot; it is enough to state the general direction and your ideas.
Copyright notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless otherwise stated. Please indicate the source: Mic带你学架构.