Consistency between a cache and a database is a question interviewers love to ask. How many solutions do you know, and what are the advantages and disadvantages of each? Which one do you think is best in the end?
Author: Ye Buwen
Background
Caching is a very useful concept in software development, and putting a cache in front of a database is a scenario that almost every project runs into. How to keep the cache consistent comes up even more often in interviews. This article summarizes the options and helps you choose the right consistency scheme for different requirements.
What is a cache
Different kinds of storage run at different speeds. Caching is a technique that temporarily keeps results from slow storage in faster storage.
As shown in the figure, each storage layer higher up the pyramid can act as a cache for the layers below it.
This discussion focuses on the database caching scenario and uses Redis as a cache for MySQL as the running example.
Why you need a cache
Storage such as MySQL usually supports full ACID semantics. Because it has to guarantee reliability and durability, its performance is generally relatively low. Highly concurrent queries put pressure on MySQL, make the database unstable, and tend to increase latency. According to the principle of locality, roughly 80% of requests fall on 20% of the data (the hot data). In read-heavy, write-light scenarios, adding a cache layer does a lot to improve system throughput and robustness.
The problem
The stored data may change over time, and when it does, the copy in the cache becomes inconsistent with it. How long an inconsistency can be tolerated has to be analyzed for each specific business, but most businesses require at least eventual consistency.
Redis as a MySQL cache
In the usual setup, MySQL is the storage and Redis is the cache that speeds up reads and shields MySQL from load. But when data in MySQL is updated, how does Redis stay in sync?
Strongly consistent synchronization is too expensive. If you need strong consistency, there is no point in using a cache; just read from MySQL. The usual target is eventual consistency.
Solutions
Option One
Rely only on the key's expiration time: when MySQL is updated, Redis is not touched. This is simple to implement, but the window of inconsistency can be long. If reads are very frequent and the expiration time is relatively long, stale data will be served many times before the key expires.
Advantages
- Low development cost and easy to implement;
- Low operational cost, and relatively few things that can go wrong.
Disadvantages
- Everything depends on the expiration time: too short and the cache is invalidated too often, too long and updates take a long time to become visible (inconsistency).
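The expiry-only approach can be sketched as follows. This is a minimal illustration, not a production implementation: `TTLCache`, `read_through`, and `FakeClock` are hypothetical names, a plain dict stands in for MySQL, and an in-memory class stands in for Redis so the example runs without any servers.

```python
import time


class TTLCache:
    """In-memory stand-in for Redis with per-key expiry (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def set(self, key, value, ttl_seconds):
        # Store the value together with its absolute expiry time.
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: behave like a cache miss
            return None
        return value


def read_through(cache, db, key, ttl_seconds):
    """Read from the cache; on a miss, load from the database and cache it."""
    value = cache.get(key)
    if value is None:
        value = db[key]
        cache.set(key, value, ttl_seconds)
    return value


class FakeClock:
    """Controllable clock so the example does not need to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now


clock = FakeClock()
cache = TTLCache(clock=clock)
db = {"a": 1}  # stand-in for a MySQL row

first = read_through(cache, db, "a", ttl_seconds=60)  # miss: loads from db
db["a"] = 5                                           # db updated, cache untouched
stale = read_through(cache, db, "a", ttl_seconds=60)  # within TTL: old value
clock.now = 61.0                                      # the key has now expired
fresh = read_through(cache, db, "a", ttl_seconds=60)  # miss again: reloads
```

In this run `stale` still holds the old value until the 60-second window passes, which is exactly the trade-off described above: the TTL is the upper bound on how long dirty data can be served.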
Option Two
Build on Option 1: keep the key's expiration time as a fallback, and also update Redis whenever MySQL is updated.
Advantages
- Compared with Option 1, the update delay is smaller.
Disadvantages
- If the MySQL update succeeds but the Redis update fails, the scheme degrades to Option 1;
- In high-concurrency scenarios, every business server has to hold connections to both MySQL and Redis. This doubles the consumption of connection resources and can easily lead to too many connections.
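A rough sketch of this write path, including the degradation case from the first disadvantage. `FlakyCache` and `update` are hypothetical names, a dict stands in for MySQL, and the cache is an in-memory stand-in for Redis whose writes can be made to fail on demand.

```python
class FlakyCache:
    """In-memory stand-in for Redis whose writes can be made to fail (hypothetical)."""

    def __init__(self):
        self.data = {}
        self.fail_writes = False

    def set(self, key, value, ttl_seconds):
        if self.fail_writes:
            raise ConnectionError("simulated Redis outage")
        self.data[key] = value  # TTL bookkeeping omitted for brevity


def update(db, cache, key, value, ttl_seconds=60):
    """Write to the database first, then try to refresh the cache.

    Returns True if both writes succeeded. Returns False if only the database
    was updated; the cache then stays stale until its key expires.
    """
    db[key] = value
    try:
        cache.set(key, value, ttl_seconds)
        return True
    except ConnectionError:
        return False


db, cache = {}, FlakyCache()

ok = update(db, cache, "a", 1)        # both stores now hold 1
cache.fail_writes = True
degraded = update(db, cache, "a", 5)  # db holds 5, cache still holds 1
```

After the second call the database and the cache disagree, and only the expiration time will eventually repair that, which is the "degrades to Option 1" behavior.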
Option Three
Improve on the synchronous Redis write in Option 2 by adding a message queue. The Redis update is handed to Kafka, the message queue guarantees reliable delivery, and a separate consumer service updates Redis asynchronously.
Advantages
- The message queue client can share a single handle, and many message queue clients also buffer outgoing messages locally, which effectively solves the too-many-connections problem of Option 2;
- The message queue decouples the business logic from the cache update;
- The message queue itself is reliable, and with techniques such as manual offset commits the Redis update can be guaranteed at-least-once consumption.
Disadvantages
- It still does not solve the ordering problem. Suppose two business servers handle two requests for the same row, for example a = 1 and a = 5. If MySQL executes the first one first but the messages enter Kafka with the second one first, the data ends up inconsistent.
- Introducing a message queue, plus an extra service to consume its messages, is costly.
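The ordering problem can be reproduced with a small simulation. This is only an illustration under simplifying assumptions: a `deque` stands in for a Kafka partition, dicts stand in for MySQL and Redis, and the interleaving of the two servers is written out by hand.

```python
from collections import deque

db = {}          # stand-in for MySQL
cache = {}       # stand-in for Redis
queue = deque()  # stand-in for a message queue partition

# Server A writes a = 1 and server B writes a = 5.
# MySQL happens to apply A's write first, then B's.
db["a"] = 1
db["a"] = 5

# The two servers publish their messages independently, so the queue
# may receive them in the opposite order: B's message first, then A's.
queue.append(("a", 5))
queue.append(("a", 1))

# The consumer service applies messages strictly in queue order.
while queue:
    key, value = queue.popleft()
    cache[key] = value

final_db_value = db["a"]
final_cache_value = cache["a"]
```

Each component behaved correctly on its own, yet the database ends at 5 while the cache ends at 1. The queue preserves its own order, not the order in which MySQL committed the writes.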
Option Four
Update Redis by subscribing to the binlog. The consumer service we build acts as a MySQL slave: it subscribes to the binlog, parses out the changed rows, and applies them to Redis.
Advantages
- When MySQL is under light load, the delay is low;
- Completely decoupled from the business code;
- Solves the ordering problem.
Disadvantages
- Building a separate synchronization service and introducing the binlog synchronization mechanism is costly.
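The idea behind binlog-driven synchronization can be sketched as follows. This is a toy model, not a real binlog client: `FakeMySQL`, `BinlogSyncer`, and the event format are hypothetical, and the "binlog" is just a Python list that the fake database appends to in commit order. A real service would speak the MySQL replication protocol instead.

```python
class FakeMySQL:
    """Stand-in for MySQL that records every committed change in order."""

    def __init__(self):
        self.rows = {}
        self.binlog = []  # ordered change events, oldest first

    def update(self, key, value):
        self.rows[key] = value
        # The log entry is appended in the same order the write is applied.
        self.binlog.append({"key": key, "value": value})


class BinlogSyncer:
    """Consumer that replays change events into the cache from a saved position."""

    def __init__(self, db, cache):
        self.db = db
        self.cache = cache
        self.position = 0  # index of the next event to apply

    def poll(self):
        """Apply all events not yet seen and return how many were applied."""
        events = self.db.binlog[self.position:]
        for event in events:
            self.cache[event["key"]] = event["value"]
        self.position += len(events)
        return len(events)


db = FakeMySQL()
cache = {}  # stand-in for Redis
syncer = BinlogSyncer(db, cache)

db.update("a", 1)
db.update("a", 5)

applied = syncer.poll()    # replays both events in commit order
nothing_new = syncer.poll()  # no new events since the last poll
```

Because the log is written by the database itself, the consumer sees changes in the order they were committed, so the cache converges to the same final value as the database. The business code never touches the cache on the write path.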
Summary
Choosing a scheme
- First, confirm the product's latency requirements. If they are extremely strict and the data may change, do not use a cache.
- Generally speaking, Option 1 is sufficient. I asked four or five teams, and almost all of them use Option 1. A cache is usually introduced for read-heavy, write-light scenarios where the business can tolerate some delay. Option 1 has almost no development cost, which makes it the most practical choice.
- If you want updates to show up sooner, choose Option 2, but there is no need to add retry guarantees and the like.
- Options 3 and 4 are aimed at businesses with stricter latency requirements. One works in push mode and the other in pull mode, and Option 4 is more reliable. If you are already willing to invest in message-processing logic, you may as well go all the way and use Option 4.
Conclusion
In general, Option 1 is sufficient. If the latency requirement is strict, go straight to Option 4. In an interview, work from the simple scheme to the complex ones: the interviewer follows up step by step, you derive each scheme in turn, and both sides come away happy.
(Finish)