Hello everyone, I'm Xiaocai.
A guy who hopes to talk about architecture! If you also want to become that person, follow along and keep Xiaocai company, so that Xiaocai is no longer alone!
This article mainly introduces service rate limiting; refer to it as needed. If it helps, don't forget to like and share ❥
The WeChat public account is open; students who haven't followed it yet, remember to follow!
The weather has turned cool, perfect for hot pot! Let's go! Xiaocai arrived at the Haidilao hot pot restaurant and, as expected, it was packed. Want to leave? My mouth wouldn't agree. All I could do was take a number and queue! Seeing every store overcrowded, I couldn't help wondering: if I were the owner of one of these stores, would I still need to write code?
Daydreaming costs nothing, so I couldn't help thinking about my future restaurant. A restaurant is always constrained by the size of its venue and the number of guests its staff can serve, which is why many popular restaurants require queuing at peak times. Once the restaurant is full, it hands out numbers to waiting guests, and only after seated customers finish their meals may the guests holding the next numbers enter to dine. Why is it designed this way? It is, in fact, a rate-limiting measure: strictly controlling the flow of customers keeps it within the restaurant's operating capacity, so a sudden surge of guests cannot prevent the restaurant from operating normally. This approach keeps the total number of customers dining (the concurrency) steady. Only when one guest leaves may another enter, and everything runs in an orderly, sensible way!
Service rate limiting is something every concurrent program should consider! Its purpose is not only to cap the total number of concurrent requests, but also to smooth the incoming traffic so that the system load does not fluctuate wildly; for this reason it is also called "traffic shaping".
Of course, in today's era of distributed services, rate limiting is no longer only about a single service. We also need to understand clearly how to perform rate limiting in a distributed scenario.
1. Single-node rate limiting
Common single-node approaches are the counter, the leaky bucket, and the token bucket. Let's get to know them one by one!
1. Counter rate limiting
The counter is a relatively simple and crude approach!
The design idea is as follows:
We limit the number of requests that may pass within one second (for example, 50) and start counting from the first incoming request. Within the following 1s, every incoming request increments the counter by 1; once the count reaches 50, all subsequent requests are rejected. After the 1s has passed, the counter resets to 0 and starts over.
A counter can cap the total concurrency within a fixed period, but in the end it is a crude method rather than true average-rate limiting, so it only fits some scenarios. Consider a special case: the system can bear only 50 requests per second, yet 50 requests arrive in the 59th second and another 50 arrive right at 1:00. That is 100 requests within roughly 1 second, instantly exceeding the total load, and it may well bring our application down!
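The fixed-window counter described above can be sketched in a few lines (class and method names here are illustrative):

```java
/** Minimal fixed-window counter: allows `limit` requests per window, then rejects. */
public class CounterLimiter {
    private final int limit;          // max requests per window, e.g. 50
    private final long windowMillis;  // window length, e.g. 1000 ms
    private int count = 0;
    private long windowStart = System.currentTimeMillis();

    public CounterLimiter(int limit, long windowMillis) {
        this.limit = limit;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryPass() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= windowMillis) {  // window elapsed: reset the count
            windowStart = now;
            count = 0;
        }
        if (count >= limit) {
            return false;                         // over the limit: reject
        }
        count++;
        return true;
    }
}
```

Note that the reset is abrupt, which is exactly the boundary problem described above.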
Of course, nothing is absolute; we can use a sliding window to solve this problem. Sliding windows should sound familiar to some friends, since the TCP protocol uses a sliding window for flow control. Friends who are unclear on it, read on!
The sliding window algorithm takes the current moment as the cut-off time and looks back over a fixed interval, for example 60 seconds, with a maximum of 50 requests allowed within those 60 seconds. The algorithm's logic is: first clear all request records older than 60 seconds, then check whether the number of requests remaining in the current set exceeds the configured maximum of 50. If it does, the rejection strategy is executed; otherwise the new request is recorded and processed normally.
As shown in the figure above, the region circled in red can be regarded as one time window (1 minute). We then divide the window into 5 small cells, so each cell covers 12s. Every 12s the window slides forward by one cell, and every cell has its own independent counter. Suppose a request arrives at 0:35; the counter of the 0:25~0:36 cell then increases by 1.
Now look back at the problem the plain counter ran into. When 50 requests arrive at 0:59, they fall into the purple area of the figure; when another 50 requests arrive at 1:00, they fall into the pink area. Because the time window slides forward, a total of 100 requests fall within the same window, the limit is detected, and rate limiting is triggered. That is the sliding window. Next, let's briefly demonstrate it with Redis:
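A minimal in-memory sketch of the sliding-window check looks like this; a Redis version would typically keep request timestamps in a sorted set, trim old entries with ZREMRANGEBYSCORE, and count with ZCARD (all names below are illustrative):

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Sliding-window limiter: keeps the timestamps of recent requests and
 *  rejects once the window already holds `maxRequests` of them. */
public class SlidingWindowLimiter {
    private final int maxRequests;
    private final long windowMillis;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public SlidingWindowLimiter(int maxRequests, long windowMillis) {
        this.maxRequests = maxRequests;
        this.windowMillis = windowMillis;
    }

    public synchronized boolean tryPass() {
        long now = System.currentTimeMillis();
        // drop records that have slid out of the window
        while (!timestamps.isEmpty() && now - timestamps.peekFirst() >= windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() >= maxRequests) {
            return false;  // window is full: apply rate limiting
        }
        timestamps.addLast(now);
        return true;
    }
}
```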
Execution result:
Thread-0 executed normally
Thread-2 executed normally
Thread-3 executed normally
Thread-6 executed normally
Thread-7 executed normally
Thread-10 executed normally
Thread-11 executed normally
Thread-14 executed normally
Thread-1 executed normally
Thread-4 executed normally
Thread-5 executed normally
Thread-8 exceeded the maximum system load, applying rate limit
Thread-8 executed normally
Thread-9 exceeded the maximum system load, applying rate limit
Thread-9 executed normally
Thread-12 exceeded the maximum system load, applying rate limit
Thread-12 executed normally
Thread-13 exceeded the maximum system load, applying rate limit
Thread-13 executed normally
Of course, this simple code has many loopholes; it is only meant to give you an idea of the implementation!
2. Leaky bucket algorithm
The restaurant queue we mentioned at the beginning is actually a kind of leaky bucket. The restaurant's capacity is like a bucket of fixed size: water keeps flowing out of the bottom (customers who finish dining and leave), while water keeps pouring in at the top (customers waiting to dine). If the inflow (request volume) exceeds the outflow rate (maximum concurrency), the bucket fills up and newly arriving water simply overflows. This is the leaky bucket algorithm commonly used for rate limiting.
In fact, Java already ships with a good tool for implementing the leaky bucket idea: Semaphore, which can effectively cap the maximum concurrency of a service and prevent overload. The following is a typical usage of Semaphore:
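For instance, a service handler that admits at most N concurrent requests might look like this (class and method names are illustrative):

```java
import java.util.concurrent.Semaphore;

/** Caps the number of requests allowed inside the service at once. */
public class BoundedService {
    private final Semaphore permits;

    public BoundedService(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs `work` if a slot is free; rejects immediately otherwise. */
    public boolean handle(Runnable work) {
        if (!permits.tryAcquire()) {
            return false;          // bucket is full: overflow / reject
        }
        try {
            work.run();            // do the real work
            return true;
        } finally {
            permits.release();     // free the slot for a waiting request
        }
    }
}
```

A blocking `permits.acquire()` could be used instead of `tryAcquire()` when callers should queue rather than be rejected.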
From the example above it is not hard to see that the leaky bucket approach is mainly concerned with the current total concurrency (the total number of permits). Only when a permit is released for some resource (the release operation) can a waiting request obtain a "pass" and enter: one out, one in. In this way, the load on the system stays controllable.
3. Token bucket algorithm
Another commonly used rate-limiting algorithm is the token bucket. Its principle: the system puts tokens into a bucket at a constant rate, and a request must take a token from the bucket before it can be processed. Once no token is available in the bucket, service is denied.
We can use third-party tools to implement this algorithm. For example, Google Guava's RateLimiter component uses the token bucket algorithm. The following is a simple usage example:
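With Guava you would typically create a limiter with `RateLimiter.create(2.0)` (2 permits per second) and call `acquire()` before each request. As a self-contained illustration of the underlying token-bucket idea, here is a minimal sketch (class and field names are illustrative, not Guava's API):

```java
/** Minimal token bucket: refills at a fixed rate, capped at `capacity`. */
public class TokenBucket {
    private final double refillPerSecond;
    private final double capacity;
    private double tokens;
    private long lastRefillNanos = System.nanoTime();

    public TokenBucket(double refillPerSecond, double capacity) {
        this.refillPerSecond = refillPerSecond;
        this.capacity = capacity;
        this.tokens = capacity;  // start with a full bucket
    }

    private void refill() {
        long now = System.nanoTime();
        double added = (now - lastRefillNanos) / 1e9 * refillPerSecond;
        tokens = Math.min(capacity, tokens + added);
        lastRefillNanos = now;
    }

    /** Takes one token if available; denies service otherwise. */
    public synchronized boolean tryAcquire() {
        refill();
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```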
OUTPUT:
Thread-1 2021-08-01 00:09:14
Thread-10 2021-08-01 00:09:14
Thread-9 2021-08-01 00:09:15
Thread-8 2021-08-01 00:09:15
Thread-6 2021-08-01 00:09:16
Thread-7 2021-08-01 00:09:16
Thread-5 2021-08-01 00:09:17
Thread-3 2021-08-01 00:09:17
Thread-4 2021-08-01 00:09:18
Thread-2 2021-08-01 00:09:18
From the results above we can see that exactly 2 tokens are indeed generated per second, and that the acquire() method blocks while waiting for a token. It can take an int argument specifying how many tokens to obtain. There is also an alternative method, tryAcquire(), which returns false immediately when no token is available instead of blocking. tryAcquire() can additionally be given a timeout: within the maximum waiting time it blocks waiting for a token, and if the timeout elapses without a token becoming available, it returns false.
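The timeout variant can be pictured as polling a non-blocking check until a deadline. A rough self-contained sketch of that idea follows (Guava's real implementation reserves permits and computes exact wait times, so this is only an approximation, with illustrative names):

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: build tryAcquire-with-timeout on top of a non-blocking token check. */
public class TimeoutAcquire {
    private final AtomicInteger tokens;

    public TimeoutAcquire(int initialTokens) {
        this.tokens = new AtomicInteger(initialTokens);
    }

    /** Non-blocking: take a token if one is available. */
    public boolean tryAcquire() {
        int t;
        do {
            t = tokens.get();
            if (t <= 0) return false;
        } while (!tokens.compareAndSet(t, t - 1));
        return true;
    }

    /** Blocking with a deadline: retry until a token arrives or time runs out. */
    public boolean tryAcquire(long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (tryAcquire()) return true;
            if (System.currentTimeMillis() >= deadline) return false;  // timed out
            Thread.sleep(5);  // back off briefly before retrying
        }
    }

    public void addToken() { tokens.incrementAndGet(); }  // refill hook for the demo
}
```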
OUTPUT:
limit
limit
limit
limit
limit
limit
limit
limit
Thread-10 2021-08-01 00:08:05
Thread-4 2021-08-01 00:08:05
From the above examples we can conclude that RateLimiter can not only limit the rate of normal traffic but also absorb sudden bursts of requests, achieving smooth rate limiting.
2. Distributed rate limiting
In the single-node scenario, each node limits only its own machine, paying no attention to other nodes, let alone to the cluster's total! But backend resources are limited. In a distributed scenario our focus can no longer be on any one node: sometimes no single node exceeds its own limit, yet the sum of traffic across all nodes exceeds what the backend resources can bear. So the total traffic of the service across all nodes must be controlled. This is distributed rate limiting.
Speaking of distributed systems, the concept of a gateway comes to mind.
For background, see: "Thoroughly Understand Microservices: Gateway (service gateways)".
Once we understand what a gateway is and what it does, it is naturally clear that total traffic can be limited at the gateway level. But what about a P2P direct-connection service cluster that has no gateway at all? What do we do then?
We mentioned above that sometimes no single node exceeds its limit, yet the sum of node traffic exceeds the total capacity. So we might as well first aggregate the traffic of every service node and compare the aggregate against the preset total. If it exceeds the total, limiting is needed.
Simply put, if our cluster's capacity is 1000 but the aggregated traffic is 1200, we are 200 over and must limit. Each node needs to scale its traffic down by a factor of (1 - 200/1200) ≈ 0.83. This 0.83 is the rate-limit ratio, applied per machine: every node multiplies its current threshold by it.
Each node computes its own threshold from the ratio and then applies one of the single-node algorithms described above using that threshold. So rate limiting in a cluster environment is still built on single-point limiting; only the traffic decision differs. That is the general direction of one rate-limiting idea. Next, let's talk about two concrete implementations.
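The threshold arithmetic above can be captured in a few lines (method names are illustrative):

```java
/** Sketch: derive a per-node threshold from aggregated cluster traffic. */
public class ClusterLimit {
    /** Scale factor each node should apply to its own threshold. */
    static double limitRatio(double clusterCapacity, double aggregatedTraffic) {
        if (aggregatedTraffic <= clusterCapacity) {
            return 1.0;                            // under capacity: no limiting needed
        }
        double excess = aggregatedTraffic - clusterCapacity;
        return 1.0 - excess / aggregatedTraffic;   // e.g. 1 - 200/1200 ≈ 0.83
    }

    /** A node's new threshold: its current traffic scaled by the ratio. */
    static double nodeThreshold(double nodeTraffic, double ratio) {
        return nodeTraffic * ratio;
    }
}
```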
1. Redis + Lua
This rate-limiting strategy centers on writing a Lua script. What is a Lua script? Some students may be getting nervous already~ Those who understand distributed locks will know that Redis + Lua can be used to implement distributed locks.
Lua is a lightweight, compact scripting language written in standard C and released as open source. It is designed to be embedded in applications, providing them with flexible extension and customization capabilities.
Since Lua is the core here, let's first clarify the logic the Lua script needs to implement.
Once you see the flowchart, the code follows naturally~
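A counter-style limit script along those lines might look like this (the key layout and argument order are assumptions for illustration, not a canonical script):

```lua
local key    = KEYS[1]                  -- e.g. "rate:limit:orderService" (name assumed)
local limit  = tonumber(ARGV[1])        -- max requests allowed in the window
local expire = tonumber(ARGV[2])        -- window length in seconds

local current = tonumber(redis.call('get', key) or '0')
if current + 1 > limit then
    return 0                            -- over the limit: caller should reject
end

current = redis.call('incr', key)
if current == 1 then
    redis.call('expire', key, expire)   -- first hit in this window: start the timer
end
return 1                                -- allowed
```

Because Redis executes the whole script atomically, the check and the increment cannot interleave across clients.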
1. First, define two global variables to receive the KEYS and ARGV passed in from the application via Redis.
KEYS is a list on the application side; inside the Lua script its values are read from the array by index.
The ARGV parameter on the application side is more flexible: it can be one or more separate arguments, which the Lua script receives as the ARGV array, again accessed by index.
After writing the Lua script, we can happily use it from Java.
In concrete business scenarios, we can define a custom rate-limit annotation and combine it with AOP to apply the limiting effect!
2. Nginx + Lua
Using Nginx + Lua is less invasive to the system! Let's look directly at the code.
The Lua part:
You can refer to the rate-limiting example provided by the official OpenResty project.
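For reference, here is a sketch modeled on the official lua-resty-limit-traffic `resty.limit.req` example (the shared-dict name and rate numbers are placeholders):

```lua
-- access-phase script: limit to 200 req/s with a burst of 100 (numbers illustrative)
local limit_req = require "resty.limit.req"

local lim, err = limit_req.new("my_limit_req_store", 200, 100)
if not lim then
    ngx.log(ngx.ERR, "failed to instantiate limiter: ", err)
    return ngx.exit(500)
end

local key = ngx.var.binary_remote_addr        -- limit per client IP
local delay, err = lim:incoming(key, true)
if not delay then
    if err == "rejected" then
        return ngx.exit(503)                  -- over the limit: reject
    end
    ngx.log(ngx.ERR, "failed to limit request: ", err)
    return ngx.exit(500)
end

if delay > 0 then
    ngx.sleep(delay)                          -- smooth bursts by delaying
end
```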
Then we modify the nginx.conf configuration file: add a shared dictionary in the http block, and wire the script into the server block that needs rate limiting:
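A corresponding nginx.conf sketch might look like this (paths and names are placeholders):

```nginx
http {
    # shared memory that backs the limiter state (name and size assumed)
    lua_shared_dict my_limit_req_store 100m;

    server {
        listen 80;

        # the location that needs rate limiting
        location / {
            access_by_lua_file /path/to/limit_req.lua;
            proxy_pass http://backend;
        }
    }
}
```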
Of course, besides the two implementation ideas above, we can also use ready-made middleware such as Hystrix and Sentinel.
For Sentinel, see the related reading: "Service Fault Tolerance with Sentinel".
A few words at the end
Having talked through rate limiting in both scenarios, admittedly only at a fairly superficial level, let me close with a reflection: when we are used to single-node applications, simple single-point limiting is not hard, because our focus is on that one node. Once that "one" becomes many, things easily grow uncontrollable. For a single application we still have plenty of ready-made components to choose from, but when we must consider rate limiting for an entire distributed cluster, we are often at a loss: we have to think about call monitoring on service nodes, log collection and aggregation, computation and analysis, rate-limit decision making, and many other links. But don't be afraid. Look at it from another angle: the more there is to think through, the more you will grow. What should be feared is not difficulty and challenge, but standing still in ignorance!
I'm Xiaocai, walking this road with you~
Don't just talk about it, and don't be lazy. Work on architecture together with Xiaocai~ Follow me and keep Xiaocai company, so that Xiaocai is no longer alone. See you next time!
Work a little harder today, and you'll have fewer favors to beg for tomorrow!
I'm Xiaocai, a man who grows stronger with you. 💋