vivo Internet Server Team - Wang Zhi
1. Business Background
The choice of a technical solution is constrained by the actual business scenario and exists to solve the problems of that scenario.
In our business scenario, behavioral data must be collected and reported per game. Given the volume of data, delivery is best-effort and part of the data may be discarded.
Data is reported in batches per game; one batch carries up to 128 behaviors of the same game.
Reporting is also time-limited: reported data must have been generated within the 3 minutes before the report time.
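The two reporting constraints above (batches of at most 128 behaviors, and only data from the last 3 minutes) can be sketched as follows. This is a minimal illustration, not the production code: the `ReportBatcher` and `Behavior` classes and their field names are hypothetical.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

final class ReportBatcher {
    static final int MAX_BATCH = 128;                      // batch limit stated in the text
    static final Duration MAX_AGE = Duration.ofMinutes(3); // time limit stated in the text

    /** Hypothetical behavior event; the field names are illustrative only. */
    static final class Behavior {
        final String gamePackage;
        final Instant occurredAt;

        Behavior(String gamePackage, Instant occurredAt) {
            this.gamePackage = gamePackage;
            this.occurredAt = occurredAt;
        }
    }

    /** Drops events older than MAX_AGE, then splits the rest into batches of at most MAX_BATCH. */
    static List<List<Behavior>> toBatches(List<Behavior> events, Instant reportTime) {
        Instant cutoff = reportTime.minus(MAX_AGE);
        List<Behavior> fresh = new ArrayList<>();
        for (Behavior b : events) {
            if (!b.occurredAt.isBefore(cutoff)) {
                fresh.add(b);
            }
        }
        List<List<Behavior>> batches = new ArrayList<>();
        for (int i = 0; i < fresh.size(); i += MAX_BATCH) {
            int end = Math.min(i + MAX_BATCH, fresh.size());
            batches.add(new ArrayList<>(fresh.subList(i, end)));
        }
        return batches;
    }
}
```

With 130 fresh events and one stale event for the same game, `toBatches` yields one batch of 128 and one batch of 2; the stale event is discarded.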
The business form of the overall data is shown in the following figure:
2. Technical selection
The business consists of data collection and data reporting. If we treat data collection as the producer and data reporting as the consumer, this is a typical producer-consumer model.
Within a single JVM process, the producer-consumer model is usually implemented with a queue plus locks or with the lock-free Disruptor; across processes, it is implemented and decoupled through an MQ (RocketMQ/Kafka).
However, our business scenario places several restrictions on message consumption, including batch reporting of behaviors per game and a time limit on the reported behaviors. The candidate solutions are compared below against these restrictions.
Option 1
Use a message queue such as RocketMQ or Kafka to store the reported messages. The consumer side then has to aggregate messages per game in its business logic: it must split messages by game and trigger a report only when the timeliness and batch requirements are met. In this option the message middleware is essentially a transfer station for messages; it does not address any of the per-game splitting, batching, or timeliness requirements of the business scenario.
Option 2
Building on Option 1, look for a solution that handles per-game message grouping, batch consumption, and timeliness:
- Per-game grouping: implement the queue with the Redis List structure (which additionally requires implementing a fixed-length queue).
- Batch consumption: use the LRANGE command supported by the Redis List.
- Timeliness: use multiple threads on the business side, with a separate thread pool for high-frequency games.
Together, batch consumption and multi-threading ensure that the consumption speed exceeds the production speed.
Option comparison
After comparing the two options, we decided to use Redis to implement a pseudo message middleware:
- A fixed-length queue is implemented with a List object to hold the behavior messages of each game (one List per game, keyed by the game, storing user behaviors);
- A second List holds all games that currently have behavior data;
- A Set is used for deduplication, so that each game appears only once in the game list of the previous item.
The overall technical solution is shown in the figure below:
Production process
Step 1: Push the behavior data to the behavior queue of its game.
Step 2: Check whether the game is already in the game set. If it is, return directly; if not, go to Step 3.
Step 3: Push the game to the game list.
Consumption process
Step 1: Take a game from the game list, cycling through the list.
Step 2: For the game obtained in Step 1, fetch a batch of data from that game's behavior queue and process it.
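The production and consumption steps above can be modeled with ordinary Java collections. This is an in-memory sketch for illustration only, not the Redis implementation; the class name `InMemoryGameQueues` is hypothetical. It assumes that a game is also recorded in the set when it is pushed to the game list, and that the game list is cycled rather than consumed.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class InMemoryGameQueues {
    private final Map<String, Deque<String>> behaviorQueues = new HashMap<>(); // one queue per game
    private final Deque<String> gameList = new ArrayDeque<>();                 // games that have data
    private final Set<String> gameSet = new HashSet<>();                       // dedup guard for gameList

    /** Production steps 1 to 3. */
    void produce(String game, String behavior) {
        behaviorQueues.computeIfAbsent(game, g -> new ArrayDeque<>()).addLast(behavior); // step 1
        if (gameSet.add(game)) {    // step 2: add() returns false if the game is already known
            gameList.addLast(game); // step 3
        }
    }

    /** Consumption steps 1 and 2: pick the next game, then take up to batchSize behaviors. */
    List<String> consumeNext(int batchSize) {
        List<String> batch = new ArrayList<>();
        String game = gameList.pollFirst();
        if (game == null) {
            return batch; // no game has produced data yet
        }
        gameList.addLast(game); // assumption: the game list is rotated, not drained
        Deque<String> queue = behaviorQueues.get(game);
        while (batch.size() < batchSize && !queue.isEmpty()) {
            batch.add(queue.pollFirst());
        }
        return batch;
    }

    int gameCount() {
        return gameList.size();
    }
}
```

Producing three behaviors for one game and one for another leaves two entries in the game list, and successive `consumeNext` calls alternate between the two games.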
3. Technical principle
The whole solution is built by combining the basic Redis List and Set commands with Lua scripts.
At the data level, the games that have data to consume are maintained in a separate List that is traversed cyclically, and each game stores its messages in its own fixed-length List.
On the production side, a per-game fixed-length queue is implemented by combining the List commands LLEN + LPOP + RPUSH, which keeps the queue length under control.
On the consumption side, batch consumption of a game's messages is implemented by combining the List commands LRANGE + LTRIM.
In terms of complexity, every operation must stay within O(N), where N is bounded by the fixed queue length or the batch size, so that execution time remains controllable.
3.1 Lua scripts
EVAL script numkeys key [key ...] arg [arg ...]
Time complexity: depends on the time complexity of the script itself.
> eval "return {KEYS[1],KEYS[2],ARGV[1],ARGV[2]}" 2 key1 key2 first second
1) "key1"
2) "key2"
3) "first"
4) "second"
Redis uses the same Lua interpreter to run all the commands.
Also Redis guarantees that a script is executed in an atomic way:
no other script or Redis command will be executed while a script is being executed.
This semantic is similar to the one of MULTI / EXEC.
From the point of view of all the other clients the effects of a script are either still not visible or already completed.
In other words, a script runs atomically on a single Lua interpreter, and the effect is similar to wrapping its commands in MULTI/EXEC.
- Multiple commands in a Lua script execute atomically, so commands from other clients cannot interleave with them.
- Lua scripts combined with the List commands implement the fixed-length queue and batch consumption.
- In Redis Cluster, all keys used by a Lua script must hash to the same slot, so we use a single key per script.
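To see why a script is easiest to run with a single key in cluster mode, it helps to look at how Redis Cluster maps a key to one of its 16384 hash slots: CRC16 of the key (or of its non-empty `{hash tag}`, if present) modulo 16384. The sketch below reimplements that mapping for illustration; the class name `ClusterKeySlot` and the sample key names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

final class ClusterKeySlot {
    /** CRC16, XMODEM variant (polynomial 0x1021, initial value 0), as used for Redis Cluster key slots. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = (crc & 0x8000) != 0 ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF; // keep the register at 16 bits
            }
        }
        return crc;
    }

    /** Slot of a key: CRC16 of the key, or of its non-empty {hash tag} when present, modulo 16384. */
    static int slot(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) { // at least one character between the braces
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }
}
```

Two unrelated keys usually land in different slots, while keys that share a hash tag, for example `{nppa}:game.a` and `{nppa}:game.b`, always land in the same slot. Funneling every key into one hash tag would also concentrate all load on one node, which is one reason a single key per script is the simpler choice here.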
3.2 List objects
LLEN key
Returns the length of the List.
Time complexity: O(1).
LPOP key [count]
Removes elements from the left (head) of the List.
Time complexity: O(N), where N is the number of elements removed.
RPUSH key element [element ...]
Appends elements to the right (tail) of the List.
Time complexity: O(N), where N is the number of elements appended.
- These basic List commands compute the length of the List, remove data, and add data. Each is O(1) per element, so pushing or popping a single message takes constant time.
- Combining the three commands gives a fixed-length queue: check whether the queue has reached its fixed length, remove the oldest element if it has, and then append the new element.
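The fixed-length queue logic described above can be modeled in plain Java as follows. This is an in-memory sketch of the semantics, not the Redis implementation; the class name `FixedLengthQueue` is hypothetical, and `synchronized` stands in for the atomicity that a Lua script provides on the Redis side.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class FixedLengthQueue {
    private final Deque<String> items = new ArrayDeque<>();
    private final int maxLength;

    FixedLengthQueue(int maxLength) {
        if (maxLength <= 0) {
            throw new IllegalArgumentException("maxLength must be positive");
        }
        this.maxLength = maxLength;
    }

    /** Appends a value; returns 1 if the oldest element was evicted to make room, otherwise 0. */
    synchronized int offer(String value) {
        int evicted = 0;
        if (items.size() >= maxLength) { // corresponds to LLEN
            items.pollFirst();           // corresponds to LPOP
            evicted = 1;
        }
        items.addLast(value);            // corresponds to RPUSH
        return evicted;
    }

    /** Returns a copy of the current contents, oldest first. */
    synchronized List<String> snapshot() {
        return new ArrayList<>(items);
    }
}
```

With a maximum length of 2, offering "a", "b", and "c" in turn evicts "a" on the third call and leaves "b" and "c" in the queue, which is the best-effort, discard-oldest behavior the business scenario allows.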
LRANGE key start stop
Time complexity: O(S+N), where S is the offset start and N is the number of elements in the specified range.
The index parameters start and stop are both zero-based: 0 is the first element of the list, 1 is the second element, and so on.
Negative indexes are also allowed: -1 is the last element of the list, -2 is the second-to-last element, and so on.
LTRIM key start stop
Time complexity: O(N), where N is the number of elements to be removed by the operation.
Trims an existing list so that it contains only the specified range of elements.
- These basic List commands return data in batches and trim data. Their cost is O(N), where N is bounded by the batch size when reading from the head of the queue.
- Combining the two commands gives batch consumption with removal: LRANGE returns a batch of data, and LTRIM keeps only the remaining data.
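The LRANGE + LTRIM combination described above behaves like "take the first n elements and keep the rest". A minimal in-memory sketch of that behavior, assuming a mutable `java.util.List` as the queue; the class name `BatchTake` is hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

final class BatchTake {
    /** Removes and returns up to batchSize elements from the head of the queue. */
    static List<String> takeBatch(List<String> queue, int batchSize) {
        int n = Math.max(0, Math.min(batchSize, queue.size()));
        List<String> head = queue.subList(0, n);
        List<String> batch = new ArrayList<>(head); // like LRANGE key 0 n-1 (both bounds inclusive)
        head.clear();                               // like LTRIM key n -1
        return batch;
    }
}
```

Note that both LRANGE bounds are inclusive, so a batch of n elements corresponds to the index range 0 to n-1. In Redis the two commands must run inside one Lua script; otherwise another consumer could read the same elements between the LRANGE and the LTRIM.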
3.3 Set objects
SADD key member [member ...]
Adds members to the Set.
Time complexity: O(1) for each member added.
SISMEMBER key member
Checks whether a member exists in the Set.
Time complexity: O(1).
4. Technical application
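4.1 Producing messages
Define the Lua script
// KEYS[1]: per-game queue key; ARGV[1]: maximum queue length;
// ARGV[2]: serialized behavior message; ARGV[3]: key TTL in seconds.
// Returns 1 if the oldest message was evicted to make room, otherwise 0.
private static final String CACHE_NPPA_EVENT_LUA =
"local retVal = 0 " +
"local key = KEYS[1] " +
"local num = tonumber(ARGV[1]) " +
"local val = ARGV[2] " +
"local expire = tonumber(ARGV[3]) " +
"if (redis.call('llen', key) < num) then redis.call('rpush', key, val) " +
"else redis.call('lpop', key) redis.call('rpush', key, val) retVal = 1 end " +
"redis.call('expire', key, expire) return retVal";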
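Execute the Lua script
String data = JSON.toJSONString(nppaBehavior);
Long retVal = (Long) jedisClusterTemplate.eval(
    CACHE_NPPA_EVENT_LUA,
    1,                                            // number of keys
    NPPA_PREFIX + nppaBehavior.getGamePackage(),  // KEYS[1]
    String.valueOf(MAX_GAME_EVENT_PER_GAME),      // ARGV[1]
    data,                                         // ARGV[2]
    String.valueOf(NPPA_TTL_MINUTE * 60));        // ARGV[3]
Result
Data is stored in a fixed-length queue, and an expiration time is set on the queue.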
- A fixed-length queue is implemented by combining the three commands llen + rpush + lpop.
- The Lua script guarantees that these commands execute atomically.
- The overall execution process is shown in the figure above. The core idea is that the atomicity of the Lua script makes the queue length check (llen), the removal of old data (lpop), and the storage of new data (rpush) execute as one atomic step.
4.2 Consuming messages
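Define the Lua script
// KEYS[1]: per-game queue key; ARGV[1]: batch size (must be at least 1).
// LRANGE bounds are inclusive, so 0..num-1 returns at most num messages.
private static final String QUERY_NPPA_EVENT_LUA =
"local data = {} " +
"local key = KEYS[1] " +
"local num = tonumber(ARGV[1]) " +
"data = redis.call('lrange', key, 0, num - 1) redis.call('ltrim', key, num, -1) return data";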
Execute the Lua script
Integer batchSize = NppaConfigUtils.getInteger("nppa.report.batch.size", 1);
Object result = jedisClusterTemplate.eval(QUERY_NPPA_EVENT_LUA, 1, NPPA_PREFIX + gamePackage, String.valueOf(batchSize));
Result
A fixed number of messages is fetched, and the remaining messages stay in the queue.
- Batch consumption of messages is implemented by combining the two commands lrange + ltrim.
- The Lua script guarantees that these commands execute atomically.
- The overall execution process is shown in the figure above. The core idea is that the atomicity of the Lua script makes data retrieval (lrange) and data trimming (ltrim) execute as one atomic step.
- Consumption uses the pull mode: multiple threads loop over the consumable queues and poll them. Compared with a push mode built on the Redis pub/sub notification mechanism, the pull mode costs less and gives better results.
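The consumer receives the script result as a plain `Object`, so it has to be converted into messages before processing. The sketch below shows one defensive way to do that. It assumes the client returns a `List` whose items are `String` or `byte[]` values; the source does not specify the return type of `jedisClusterTemplate.eval`, and the class name `EvalResults` is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class EvalResults {
    /** Converts a script result into a list of messages; unexpected shapes yield an empty list. */
    static List<String> toMessages(Object result) {
        List<String> messages = new ArrayList<>();
        if (!(result instanceof List<?>)) {
            return messages; // null or a non-list reply: nothing to consume
        }
        for (Object item : (List<?>) result) {
            if (item instanceof String) {
                messages.add((String) item);
            } else if (item instanceof byte[]) {
                messages.add(new String((byte[]) item, StandardCharsets.UTF_8));
            }
            // null or other item types are skipped
        }
        return messages;
    }
}
```

Each returned string can then be deserialized back into a behavior object and added to the batch to be reported.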
4.3 Notes
- In Redis cluster mode, it is recommended that a Lua script operate on a single key. If a script uses multiple keys that hash to different slots, Redis rejects it with a cross-slot error.
- Different Redis versions handle null (nil) return values from Lua scripts differently; refer to the official documentation.
- The consumer loops through the game list and then fetches the message objects of each game. Because games differ in popularity, we configure separate consumption threads for popular games, which is equivalent to giving different games consumers of different priorities.
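The idea of giving popular games their own consumption threads can be sketched as a small router that sends consume tasks either to a dedicated pool or to a shared pool. This is an illustrative sketch only; the class name `ConsumerRouter`, the pool sizes, and the thread name prefixes are hypothetical.

```java
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

final class ConsumerRouter implements AutoCloseable {
    private final Set<String> hotGames;
    private final ExecutorService hotPool;    // dedicated threads for popular games
    private final ExecutorService sharedPool; // threads shared by all other games

    ConsumerRouter(Set<String> hotGames, int hotThreads, int sharedThreads) {
        this.hotGames = Set.copyOf(hotGames);
        this.hotPool = Executors.newFixedThreadPool(hotThreads, named("hot-consumer"));
        this.sharedPool = Executors.newFixedThreadPool(sharedThreads, named("shared-consumer"));
    }

    private static ThreadFactory named(String prefix) {
        AtomicInteger counter = new AtomicInteger();
        return task -> new Thread(task, prefix + "-" + counter.incrementAndGet());
    }

    /** Submits a consume task to the pool that matches the game's popularity. */
    Future<?> submit(String gamePackage, Runnable consumeTask) {
        ExecutorService pool = hotGames.contains(gamePackage) ? hotPool : sharedPool;
        return pool.submit(consumeTask);
    }

    @Override
    public void close() {
        hotPool.shutdown();
        sharedPool.shutdown();
    }
}
```

A slow or very busy game then only occupies its own pool, so the remaining games keep being polled at their usual rate.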
5. Online effect
- Message production and consumption each run at about 10,000 QPS. Because behaviors are reported in batches, the QPS of the actual reporting calls is much lower than that.
- All data is stored with the game package name as the key, and no performance hot spot has been observed.
6. Applicable scenarios
Having described the principle and implementation details of the solution, we can summarize the business scenarios it applies to. The solution builds a pseudo message queue on the basic data structures of Redis to handle the case where messages are produced one at a time and consumed in batches. Using multiple keys gives the equivalent of the multi-topic mode of a message queue, and batch consumption completes in O(N) time, where N is the batch size. The solution can also be reduced to a FIFO fixed-length log queue.
7. Summary
This article explores how to implement MQ-like functions with native Redis commands in a specific business scenario. It combines the basic Redis List commands through Lua scripts to implement message grouping, fixed-length message queues, and batch consumption of messages. The solution has been deployed and runs smoothly in the online environment, and it offers a general approach for scenarios of this kind.