Delay queue: a delay capability on top of a message queue
- Delay → some uncertain point of time in the future
- mq → messages are consumed sequentially
With this explanation, the whole design becomes clear: the purpose is to delay, and the carrying container is an mq.
Background
Typical scenarios from my daily business:
- A class schedule is created in advance, and the teacher needs to be reminded before class starts
- Delayed push → push the announcements and assignments the teacher needs at a scheduled time
To solve these problems, the simplest and most direct way is to scan a table periodically:
When the service starts, launch an asynchronous goroutine that scans the msg table on a schedule and calls the corresponding handler when an event is due (a minimal sketch follows).
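For concreteness, here is a rough sketch of that polling approach in Go. `Msg`, `msgStore`, and the one-second interval are hypothetical stand-ins for the business's own msg table and handler, not part of any framework.

```go
package scan

import (
	"context"
	"log"
	"time"
)

// Msg is a hypothetical row in the business msg table.
type Msg struct {
	ID    int64
	Body  []byte
	RunAt time.Time
}

// msgStore abstracts the msg table: LoadDue returns unhandled messages whose
// RunAt has passed; MarkDone marks a message as handled.
type msgStore interface {
	LoadDue(ctx context.Context, now time.Time) ([]Msg, error)
	MarkDone(ctx context.Context, id int64) error
}

// StartScanner launches the asynchronous polling goroutine described above.
func StartScanner(ctx context.Context, store msgStore, handle func(Msg) error) {
	go func() {
		ticker := time.NewTicker(time.Second) // poll interval: trade-off between latency and DB pressure
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case now := <-ticker.C:
				msgs, err := store.LoadDue(ctx, now)
				if err != nil {
					log.Printf("scan msg table: %v", err)
					continue
				}
				for _, m := range msgs {
					if err := handle(m); err != nil {
						log.Printf("handle msg %d: %v", m.ID, err)
						continue
					}
					_ = store.MarkDone(ctx, m.ID)
				}
			}
		}
	}()
}
```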
This has several disadvantages:
- Every service that needs timed/delayed tasks has to maintain its own msg table → storage is coupled with the business
- Periodic scanning → the timing is hard to control, and the trigger time may be missed
- It is a burden on the database instance holding the msg table: a service polling over and over puts continuous pressure on the database
What is the biggest problem?
The scheduling model is essentially the same everywhere, so the same logic should not be re-implemented in every business.
We can extract this logic from the concrete business logic and turn it into a shared component.
And this scheduling model is the delay queue.
In fact, to put it plainly:
the delay queue model stores events to be executed in the future, keeps scanning that storage, and executes the corresponding task logic once the execution time is reached.
So is there a ready-made solution in the open source world? Yes: Beanstalk (https://github.com/beanstalkd/beanstalkd) basically meets the requirements above.
Design goals
- Consumption behavior: at least once
- High availability
- Real-time
- Support for message deletion
Let's go through the design direction for each of these goals in turn.
Consumption behavior
This concept is borrowed from mq. mq provides several delivery semantics:
- at most once → the message may be lost, but it is never duplicated
- at least once → the message is never lost, but it may be duplicated
- exactly once → the message is neither lost nor duplicated, and is consumed exactly once
exactly once should be guaranteed at both ends, producer and consumer, as far as possible. When the producer cannot guarantee it, the consumer needs to deduplicate before consuming, so that nothing is processed twice. The delay queue guarantees this directly.
The simplest way: use redis SETNX on the job id so that each job id is consumed only once; a minimal sketch follows.
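A minimal sketch of that idea using go-redis; the key prefix and TTL are assumptions for illustration, not what the framework actually uses.

```go
package dedup

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

// ConsumeOnce runs handle only if jobID has not been seen before.
// SETNX succeeds only for the first caller, so concurrent or repeated
// deliveries of the same job id are dropped. Key prefix and TTL are
// illustrative values only.
func ConsumeOnce(ctx context.Context, rdb *redis.Client, jobID string, handle func() error) error {
	ok, err := rdb.SetNX(ctx, "dq:job:"+jobID, 1, 24*time.Hour).Result()
	if err != nil {
		return err
	}
	if !ok {
		return nil // already consumed; skip
	}
	return handle()
}
```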
High availability
Support multi-instance deployment: if one instance goes down, backup instances continue to provide the service.
The externally exposed API follows a cluster model, which encapsulates multiple nodes internally and stores data redundantly across them.
Why not Kafka?
We considered storage solutions based on message queues such as kafka/rocketmq, but gave them up because of their storage design model.
For example, suppose Kafka is used as the storage to implement the delay function: each delay duration would need its own topic (e.g. Q1-1s, Q1-2s, ...). This design is not much of a problem when the delay times are fairly fixed, but when the delay times vary widely the number of topics explodes, turning sequential disk reads and writes into random ones and degrading performance. It also brings other problems, such as long recovery times after a restart.
- Too many topics → storage pressure
- A topic stores data in real-time order, while scheduling reads from different times (topics): sequential reads become random reads
- Likewise for writes: sequential writes become random writes
Architecture design
API design
producer
- producer.At(msg []byte, at time.Time)
- producer.Delay(body []byte, delay time.Duration)
- producer.Revoke(ids string)
consumer
- consumer.Consume(consume handler) (usage sketch below)
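Putting the API above together, here is a hedged end-to-end usage sketch based on the go-queue README. The beanstalkd/redis endpoints and tube name are placeholders, and the import paths follow the tal-tech repositories linked in this article (newer releases live under zeromicro).

```go
package main

import (
	"fmt"
	"time"

	"github.com/tal-tech/go-queue/dq"
	"github.com/tal-tech/go-zero/core/stores/redis"
)

func main() {
	// Producer: two beanstalkd nodes for redundancy, writing to the same tube.
	producer := dq.NewProducer([]dq.Beanstalk{
		{Endpoint: "localhost:11300", Tube: "tube"},
		{Endpoint: "localhost:11301", Tube: "tube"},
	})
	// Run 5 seconds from now; producer.At works the same way with an absolute time.
	producer.Delay([]byte("hello"), time.Second*5)

	// Consumer: redis backs the SETNX-based deduplication described above.
	consumer := dq.NewConsumer(dq.DqConf{
		Beanstalks: []dq.Beanstalk{
			{Endpoint: "localhost:11300", Tube: "tube"},
			{Endpoint: "localhost:11301", Tube: "tube"},
		},
		Redis: redis.RedisConf{
			Host: "localhost:6379",
			Type: redis.NodeType,
		},
	})
	consumer.Consume(func(body []byte) {
		fmt.Println(string(body))
	})
}
```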
After introducing the delay queue, the overall structure of the service and the state transitions of a job in the queue are as follows (summarized in a small sketch after the list):
- service → producer.At(msg []byte, at time.Time) → inserts a delayed job into the tube
- when the timer fires → the job status is updated to ready
- the consumer fetches the ready job → takes the job out and starts consuming, changing its status to reserved
- the handler logic passed into the consumer is executed
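The transitions can be summarized in a small sketch; the state names follow beanstalkd's job lifecycle, and the TTR fallback is beanstalkd behavior described further in the consumer section.

```go
package tube

// jobState enumerates the states a job moves through in the tube,
// using beanstalkd's terminology.
type jobState string

const (
	stateDelayed  jobState = "delayed"  // stored, waiting for its execution time
	stateReady    jobState = "ready"    // execution time reached, waiting to be reserved
	stateReserved jobState = "reserved" // reserved by a consumer, being processed within TTR
	stateDeleted  jobState = "deleted"  // removed from the tube
)

// transitions sketches where each state can go next.
var transitions = map[jobState][]jobState{
	stateDelayed:  {stateReady},               // the delay elapses
	stateReady:    {stateReserved},            // a consumer reserves the job
	stateReserved: {stateDeleted, stateReady}, // deleted on success; back to ready if TTR expires
}
```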
Production Practice
This section mainly introduces which functions of the delay queue we use in day-to-day development.
Production side
To produce a delayed task, development only needs to determine the task execution time:
- Pass it to At(): producer.At(msg []byte, at time.Time)
- Internally the time difference is calculated and the job is inserted into the tube
If the task time or the task content needs to be modified:
- In production, it may be necessary to maintain an additional logic_id → job_id mapping table
- Query the job_id, call producer.Revoke(ids string) to delete the old job, then insert it again (a sketch follows this list)
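A hedged sketch of that modify flow. DelayProducer just models the producer API shown above (the job id returned by At is an assumption based on go-queue's dq producer), and lookup/save stand in for the hypothetical logic_id → job_id mapping table.

```go
package reschedule

import "time"

// DelayProducer models the producer API described in this article.
type DelayProducer interface {
	At(body []byte, at time.Time) (jobID string, err error)
	Revoke(ids string) error
}

// Reschedule moves a task identified by a business-level logicID to a new
// execution time (and optionally a new body).
// lookup and save are hypothetical helpers over the logic_id → job_id table.
func Reschedule(
	p DelayProducer,
	logicID string,
	newBody []byte,
	newAt time.Time,
	lookup func(logicID string) (jobID string, err error),
	save func(logicID, jobID string) error,
) error {
	oldID, err := lookup(logicID)
	if err != nil {
		return err
	}
	// Revoke the old job so it will not fire at the old time.
	if err := p.Revoke(oldID); err != nil {
		return err
	}
	// Insert the task again with the new time, and remember the new job id.
	newID, err := p.At(newBody, newAt)
	if err != nil {
		return err
	}
	return save(logicID, newID)
}
```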
Consumer side
First of all, the framework guarantees exactly-once behavior when delivering a job to the consumer. But the upper-level business logic may still fail to consume it, whether because of network problems or anything else, and all of those details are left to business development. The reasons:
- The framework and basic components only guarantee the correctness of the job status flow
- The framework's consumer side only guarantees uniform consumption behavior
Delayed tasks do not behave uniformly across businesses:
- Some emphasize that the task must run: if consumption fails, keep retrying until it succeeds
- Some emphasize punctuality: if consumption fails and the business is not sensitive to the miss, the task can simply be discarded
Here is how the framework's consumer side guarantees uniform consumption behavior. It is divided into cluster and node.
cluster:
https://github.com/tal-tech/go-queue/blob/master/dq/consumer.go#L45
- Inside the cluster, the consume handler is wrapped in an extra layer (a rough sketch follows this list)
- The consume body is hashed, and the hash is used as the redis deduplication key
- If the key already exists, the job is discarded without processing
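A rough sketch of that wrapping, assuming an MD5 hash of the body as the dedup key and go-redis for SETNX; the real cluster uses go-zero's redis client and its own key and expiry choices.

```go
package cluster

import (
	"context"
	"crypto/md5"
	"encoding/hex"
	"time"

	"github.com/redis/go-redis/v9"
)

// WrapHandler decorates a consume handler so each distinct body is processed
// at most once across instances, keyed by a hash of the body.
func WrapHandler(rdb *redis.Client, handle func(body []byte)) func(body []byte) {
	return func(body []byte) {
		sum := md5.Sum(body)
		key := "dq:body:" + hex.EncodeToString(sum[:])
		ok, err := rdb.SetNX(context.Background(), key, 1, 24*time.Hour).Result()
		if err != nil {
			// Sketch choice: if redis is unavailable, process rather than lose the job.
			handle(body)
			return
		}
		if !ok {
			return // key already exists: another instance consumed this body; discard
		}
		handle(body)
	}
}
```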
node:
https://github.com/tal-tech/go-queue/blob/master/dq/consumernode.go#L36
- The consuming node gets a ready job: it first executes Reserve(TTR) to subscribe to the job, then runs the job's logical processing
- The node calls Delete(job) first, then consumes
- If consumption fails, the error is thrown up to the business layer to handle and retry
So on the consumer side, developers need to implement consumption idempotence themselves; a hedged example follows.
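For example, business-side idempotence can be keyed on an id carried inside the message body. This sketch uses a hypothetical MySQL table executed_task with a unique index on logic_id; the table, columns, and Task shape are made up for illustration.

```go
package biz

import (
	"database/sql"
	"encoding/json"
)

// Task is a hypothetical message body carrying a business-level id.
type Task struct {
	LogicID string `json:"logic_id"`
	Payload string `json:"payload"`
}

// HandleOnce makes the consume handler idempotent: the unique index on
// executed_task.logic_id turns the INSERT into a no-op on a repeated
// delivery, so the side effect in `do` runs at most once per logic id.
func HandleOnce(db *sql.DB, body []byte, do func(Task) error) error {
	var t Task
	if err := json.Unmarshal(body, &t); err != nil {
		return err
	}
	res, err := db.Exec(`INSERT IGNORE INTO executed_task (logic_id) VALUES (?)`, t.LogicID)
	if err != nil {
		return err
	}
	if n, _ := res.RowsAffected(); n == 0 {
		return nil // already executed on an earlier delivery; skip
	}
	return do(t)
}
```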
Project address
go-queue is implemented based on go-zero.
go-zero is hosted on github, is used by 300+ companies, and gained 11k+ stars within one year of being open sourced.
- go-zero: https://github.com/zeromicro/go-zero
- go-queue: https://github.com/tal-tech/go-queue
Welcome to use them and support us with a star!
WeChat Exchange Group
Follow the "Practice" official account and click "exchange group" to get the community group QR code.