Why is Redis so fast?

First of all, Redis is an open-source, networked, in-memory key-value store with optional persistence, written in ANSI C.

1 The history of Redis
Initially released in 2009 by Salvatore Sanfilippo (the father of Redis)
Sponsored by VMware until May 2013
Sponsored by Pivotal from May 2013 to June 2015
Sponsored by Redis Labs since June 2015

Redis remains by far the most popular key-value store according to the rankings on db-engines.com.

2 Major versions of Redis
The initial version of Redis was released in May 2009
Redis 2.6.0 was released in 2012
Redis 2.8.0 was released in November 2013
Redis 3.0.0 was released in April 2015 and introduced Redis Cluster
Redis 4.0.0 was released in July 2017 and introduced the module system
Redis 5.0.0 was released in October 2018 and introduced the Streams data structure
Redis 6.0.1 (stable version) was released on May 2, 2020; the 6.0 series introduced multi-threaded I/O, the RESP3 protocol, and diskless replication on replicas
Redis 7.0 RC1 was released on January 31, 2022; this version mainly optimizes performance and memory usage and adds a new AOF mode

3 How fast is Redis?

Redis ships with a tool, redis-benchmark, for performance testing. It simulates multiple clients sending requests at the same time and measures how long Redis takes to process them.

According to the official documentation, the number of client connections is an important factor on high-end configurations. Because it is based on epoll/kqueue, the Redis event loop is quite scalable: Redis has been benchmarked at more than 60,000 connections and was still able to sustain 50,000 q/s under those conditions. As a rule of thumb, an instance with 30,000 connections can process only half the throughput achievable with 100 connections. A disk-backed database such as MySQL would be unlikely to withstand the same request volume.

[Figure: throughput of a Redis instance per number of connections]

4 Why is Redis so fast?

What gives Redis such high performance? The main reasons fall into the following areas.

4.1 Memory-based implementation

MySQL persists its data on disk and serves reads from memory. If the requested data is not already in memory, a disk I/O is needed to load it into memory before it can be read. Redis, on the other hand, keeps the data in memory in the first place, which removes the cost of that disk I/O.

4.2 Efficient Data Structures

A well-chosen data structure makes an application faster. MySQL, for example, uses B+ trees for its indexes to improve efficiency. Redis likewise maps each of its data types onto one or more internal encodings.

[Figure: Redis data types and their internal encodings]

4.2.1 SDS Simple Dynamic String

Redis does not use the native C string type but implements its own string structure, the Simple Dynamic String (SDS).

The differences between SDS and C strings:
Getting the string length: O(N) for a C string, O(1) for SDS, which stores the length.
Preventing buffer overflow: a C string has to be grown manually, and writing to it without first allocating enough space overflows the buffer. SDS records the length of the string and expands automatically when the space is not enough.
Reducing the number of memory reallocations when a string is modified, through space pre-allocation:
Rule 1: if the length after modification is < 1MB, pre-allocate unused space of the same size (free = len);
Rule 2: if the length after modification is >= 1MB, pre-allocate 1MB of unused space.
Lazy space release: when an SDS is shortened, the excess memory is not reclaimed. Instead, free records the excess space, and later modifications use the space recorded in free directly, which reduces allocations.
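The rules above can be sketched in a few lines of C. This is a simplified illustration and not the Redis source: the type sds_t and the functions sds_new, sds_len, sds_cat, and sds_truncate are hypothetical names, and the header uses the len/free layout described in this section.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SDS_MAX_PREALLOC (1024 * 1024)  /* the 1MB threshold from the rules above */

/* Simplified header: length in use, spare space, then the bytes. */
typedef struct {
    size_t len;   /* bytes in use, excluding the trailing '\0' */
    size_t free;  /* allocated but unused bytes */
    char buf[];   /* string data, always '\0'-terminated */
} sds_t;

/* Create a string holding a copy of init, with no spare space. */
sds_t *sds_new(const char *init) {
    size_t len = strlen(init);
    sds_t *s = malloc(sizeof(sds_t) + len + 1);
    if (s == NULL) return NULL;
    s->len = len;
    s->free = 0;
    memcpy(s->buf, init, len + 1);
    return s;
}

/* O(1) length: read the stored field instead of scanning for '\0'. */
size_t sds_len(const sds_t *s) { return s->len; }

/* Append t, growing with pre-allocation when the spare space is too small.
   Returns the (possibly moved) string, or NULL if allocation fails. */
sds_t *sds_cat(sds_t *s, const char *t) {
    size_t add = strlen(t);
    if (s->free < add) {
        size_t newlen = s->len + add;
        /* Rule 1: below 1MB, reserve as much spare space as the new length.
           Rule 2: at 1MB or more, reserve a fixed 1MB. */
        size_t spare = newlen < SDS_MAX_PREALLOC ? newlen : SDS_MAX_PREALLOC;
        sds_t *grown = realloc(s, sizeof(sds_t) + newlen + spare + 1);
        if (grown == NULL) return NULL;
        s = grown;
        s->free = spare + add;  /* the appended bytes are subtracted below */
    }
    memcpy(s->buf + s->len, t, add + 1);
    s->len += add;
    s->free -= add;
    return s;
}

/* Lazy release: shorten the string but keep the memory as spare space. */
void sds_truncate(sds_t *s, size_t newlen) {
    assert(newlen <= s->len);
    s->free += s->len - newlen;
    s->len = newlen;
    s->buf[newlen] = '\0';
}
```

Appending " world" to "hello" grows the buffer once and leaves 11 spare bytes, so a later short append needs no reallocation.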

4.2.2 embstr & raw

Redis strings have two storage forms. Short strings are stored in embstr (embedded) form, and when the length exceeds 44 bytes they are stored in raw form.

Why is the dividing line 44?
Between the CPU and main memory there is a high-speed cache hierarchy: the L1, L2, and L3 caches. L1 is the closest to the CPU, so the CPU looks for data in L1 first, followed by L2 and L3.

L1 is the fastest, but data moves in and out of the cache in cache lines, which are typically 64 bytes. Subtract the space occupied by the fixed attributes of the object (a 16-byte object header and a 3-byte string header) and the trailing '\0', and at most 44 bytes remain. A string of up to 44 bytes therefore fits together with its headers in a single 64-byte block, while a longer string does not.
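The arithmetic behind the 44-byte limit can be checked with a small sketch. It assumes a 64-bit build and a 64-byte block, with struct layouts modeled on the Redis object header and its smallest string header. The names robj_sketch, sdshdr8_sketch, ALLOC_CLASS, and embstr_max_len are illustrative and are not Redis identifiers.

```c
#include <stddef.h>
#include <stdint.h>

/* Object header as laid out on a 64-bit build: 4 + 4 + 8 = 16 bytes. */
typedef struct {
    unsigned type : 4;
    unsigned encoding : 4;
    unsigned lru : 24;
    int refcount;
    void *ptr;
} robj_sketch;

/* Smallest string header: three one-byte fields before the string bytes. */
typedef struct {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
} sdshdr8_sketch;

#define ALLOC_CLASS 64  /* assumed block size, equal to a typical cache line */

/* Bytes left for string data once both headers and the '\0' are placed. */
size_t embstr_max_len(void) {
    return ALLOC_CLASS - sizeof(robj_sketch) - sizeof(sdshdr8_sketch) - 1;
}
```

Under these assumptions the result is 64 - 16 - 3 - 1 = 44. On a 32-bit build the pointer is smaller and the numbers differ.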

4.2.3 Dictionary (DICT)
Redis is a key-value in-memory database, and all keys and values are stored in dictionaries. A dictionary is a hash table, similar to a HashMap: the value is looked up directly by its key, and thanks to the properties of the hash table the lookup takes O(1) time on average.

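    // Dictionary structure
    typedef struct dict {
        dictType *type;   // interface implementation, gives the dictionary polymorphism
        void *privdata;   // stores some extra data
        dictht ht[2];     // two hash tables
        long rehashidx;   // current position during an incremental rehash
    } dict;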

Usually only one of the two hash tables holds data, and the other is used only during a rehash. During expansion, entries are migrated gradually from one hash table to the other. After the migration is complete, the old hash table is emptied.

The structure of the hash table is as follows:

    typedef struct dictht {
        dictEntry **table;       // points to the bucket array
        unsigned long size;      // length of the array
        unsigned long sizemask;  // mask used to map a hash to an index quickly
        unsigned long used;      // number of elements in the array
    } dictht;

    typedef struct dictEntry {
        void *key;
        union {
            void *val;
            uint64_t u64;
            int64_t s64;
            double d;            // used by zset to store the score
        } v;
        struct dictEntry *next;
    } dictEntry;
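A minimal sketch of how two tables, sizemask, and rehashidx work together is shown below. It is a toy under stated assumptions and not the Redis implementation: keys are plain integers used directly as hash values, growth starts when the load factor reaches 1, and each operation migrates one bucket. The names mini_dict, table_t, entry, dict_new, dict_set, and dict_get are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct entry {
    unsigned long key;
    int val;
    struct entry *next;      /* chaining for collisions */
} entry;

typedef struct {
    entry **table;
    unsigned long size;      /* always a power of two */
    unsigned long sizemask;  /* size - 1, so index = hash & sizemask */
    unsigned long used;
} table_t;

typedef struct {
    table_t ht[2];
    long rehashidx;          /* -1 when no rehash is in progress */
} mini_dict;

static void table_init(table_t *t, unsigned long size) {
    t->table = calloc(size, sizeof(entry *));
    assert(t->table != NULL);
    t->size = size;
    t->sizemask = size - 1;
    t->used = 0;
}

mini_dict *dict_new(void) {
    mini_dict *d = malloc(sizeof(*d));
    assert(d != NULL);
    table_init(&d->ht[0], 4);
    d->ht[1] = (table_t){0};
    d->rehashidx = -1;
    return d;
}

/* Move one non-empty bucket from ht[0] to ht[1]; swap tables when done. */
static void rehash_step(mini_dict *d) {
    if (d->rehashidx < 0) return;
    table_t *from = &d->ht[0], *to = &d->ht[1];
    while ((unsigned long)d->rehashidx < from->size &&
           from->table[d->rehashidx] == NULL)
        d->rehashidx++;
    if ((unsigned long)d->rehashidx < from->size) {
        entry *e = from->table[d->rehashidx];
        while (e != NULL) {
            entry *next = e->next;
            unsigned long i = e->key & to->sizemask;
            e->next = to->table[i];
            to->table[i] = e;
            from->used--;
            to->used++;
            e = next;
        }
        from->table[d->rehashidx++] = NULL;
    }
    if (from->used == 0) {           /* migration finished */
        free(from->table);
        d->ht[0] = d->ht[1];
        d->ht[1] = (table_t){0};
        d->rehashidx = -1;
    }
}

/* Look in ht[0], and also in ht[1] while a rehash is in progress. */
static entry *find(mini_dict *d, unsigned long key) {
    int tables = d->rehashidx >= 0 ? 2 : 1;
    for (int t = 0; t < tables; t++)
        for (entry *e = d->ht[t].table[key & d->ht[t].sizemask]; e; e = e->next)
            if (e->key == key) return e;
    return NULL;
}

void dict_set(mini_dict *d, unsigned long key, int val) {
    rehash_step(d);
    entry *e = find(d, key);
    if (e != NULL) { e->val = val; return; }
    if (d->rehashidx < 0 && d->ht[0].used >= d->ht[0].size) {
        table_init(&d->ht[1], d->ht[0].size * 2);  /* start growing */
        d->rehashidx = 0;
    }
    /* New entries go to the new table while rehashing. */
    table_t *t = d->rehashidx >= 0 ? &d->ht[1] : &d->ht[0];
    e = malloc(sizeof(*e));
    assert(e != NULL);
    e->key = key;
    e->val = val;
    unsigned long i = key & t->sizemask;
    e->next = t->table[i];
    t->table[i] = e;
    t->used++;
}

/* Returns 1 and stores the value in *out if the key exists, else 0. */
int dict_get(mini_dict *d, unsigned long key, int *out) {
    rehash_step(d);
    entry *e = find(d, key);
    if (e != NULL) *out = e->val;
    return e != NULL;
}
```

Spreading the migration over many operations means no single request pays for moving the whole table.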

4.2.4 ziplist

To save memory, Redis stores zset and hash objects in a ziplist when they hold relatively little data. A ziplist is a contiguous block of memory that supports traversal in both directions.

4.2.5 Skip list

The skip list is one of the signature data structures of Redis. It is built on a linked list, with multiple levels of index added to improve search efficiency. A skip list supports node lookup in O(log N) on average and O(N) in the worst case, and nodes can be processed in batches by traversing them in order.
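The multi-level search can be sketched as follows. This is a minimal illustration and not the Redis zskiplist: it stores only scores, the maximum of 16 levels and the 1/4 promotion probability are assumed values, and the names skiplist, sl_new, sl_insert, sl_contains, and sl_count_range are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_LEVEL 16  /* assumed cap on the number of index levels */

typedef struct node {
    double score;
    struct node *forward[MAX_LEVEL];  /* forward[i] is the next node on level i */
} node;

typedef struct {
    node *head;  /* sentinel node, holds no score */
    int level;   /* number of levels currently in use */
} skiplist;

skiplist *sl_new(void) {
    skiplist *sl = malloc(sizeof(*sl));
    assert(sl != NULL);
    sl->head = calloc(1, sizeof(node));
    assert(sl->head != NULL);
    sl->level = 1;
    return sl;
}

/* Each extra level is kept with probability 1/4 (an assumed value). */
static int random_level(void) {
    int lvl = 1;
    while (lvl < MAX_LEVEL && (rand() & 3) == 0) lvl++;
    return lvl;
}

void sl_insert(skiplist *sl, double score) {
    node *update[MAX_LEVEL];
    node *x = sl->head;
    for (int i = sl->level - 1; i >= 0; i--) {
        while (x->forward[i] && x->forward[i]->score < score) x = x->forward[i];
        update[i] = x;  /* last node before the insert point on level i */
    }
    int lvl = random_level();
    for (int i = sl->level; i < lvl; i++) update[i] = sl->head;
    if (lvl > sl->level) sl->level = lvl;
    node *n = calloc(1, sizeof(node));
    assert(n != NULL);
    n->score = score;
    for (int i = 0; i < lvl; i++) {
        n->forward[i] = update[i]->forward[i];
        update[i]->forward[i] = n;
    }
}

/* Search: start at the top level, move right while the next score is
   smaller, then drop down one level. */
int sl_contains(const skiplist *sl, double score) {
    const node *x = sl->head;
    for (int i = sl->level - 1; i >= 0; i--)
        while (x->forward[i] && x->forward[i]->score < score) x = x->forward[i];
    x = x->forward[0];
    return x != NULL && x->score == score;
}

/* Batch processing: locate the start, then walk the bottom level in order. */
int sl_count_range(const skiplist *sl, double min, double max) {
    const node *x = sl->head;
    for (int i = sl->level - 1; i >= 0; i--)
        while (x->forward[i] && x->forward[i]->score < min) x = x->forward[i];
    int n = 0;
    for (x = x->forward[0]; x != NULL && x->score <= max; x = x->forward[0]) n++;
    return n;
}
```

The upper levels skip over many nodes at once, which is where the average O(log N) lookup comes from. The bottom level is an ordinary sorted linked list, which makes range scans cheap.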

4.3 Reasonable data encoding

Redis supports multiple data types, and each type may be backed by more than one data structure. When to use which data structure and which encoding is the result of the Redis designers' experience and optimization.

String: an integer value is stored with int encoding; a non-numeric string of at most 44 bytes uses embstr encoding; a longer string uses raw encoding. (Versions before Redis 3.2 used a 39-byte limit.)
List: if the list has no more than 512 elements and each element is no larger than 64 bytes (the defaults), ziplist encoding is used; otherwise linkedlist encoding is used. (Redis 3.2 and later use quicklist for lists.)
Hash: if the hash has no more than 512 fields and every value is no larger than 64 bytes, ziplist encoding is used; otherwise hashtable encoding is used.
Set: if all elements of the set are integers and there are no more than 512 of them, intset encoding is used; otherwise hashtable encoding is used.
Zset: if the sorted set has no more than 128 elements and each element is no larger than 64 bytes, ziplist encoding is used; otherwise skiplist encoding is used.
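The string rule can be sketched as a small decision function. This is a simplified illustration rather than the Redis source: string_encoding and is_canonical_integer are hypothetical helpers, the integer check is an assumed approximation (canonical decimal text that fits in a long long), and the 44-byte boundary is the one from section 4.2.2.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define EMBSTR_LIMIT 44  /* boundary discussed in section 4.2.2 */

/* True if s is the canonical decimal text of a value that fits in long long.
   Formatting the parsed value back and comparing rejects inputs such as
   "007", "+1", "1.5", " 1", and out-of-range numbers. */
static int is_canonical_integer(const char *s) {
    char buf[32];
    size_t len = strlen(s);
    if (len == 0 || len > 20) return 0;
    long long v = strtoll(s, NULL, 10);
    snprintf(buf, sizeof buf, "%lld", v);
    return strcmp(buf, s) == 0;
}

/* Pick the encoding name for a string value. */
const char *string_encoding(const char *s) {
    if (is_canonical_integer(s)) return "int";
    if (strlen(s) <= EMBSTR_LIMIT) return "embstr";
    return "raw";
}
```

On a running server, the OBJECT ENCODING command reports the encoding actually chosen for a key, which is a convenient way to compare against this sketch.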

4.4 Reasonable threading model

The first factor is the single-threaded model, which avoids the time wasted on context switching. Single-threaded here refers only to the network request module: one thread handles all network requests, while other modules still use multiple threads. Multi-threading is not automatically better. Without a good design, throughput tends to rise as threads are added at first, but the gains become much less noticeable later on.

With multiple threads, some resources are usually shared. When several threads modify a shared resource at the same time, an additional mechanism such as locking is needed to keep it consistent, and that mechanism brings additional overhead.

The other factor is the I/O multiplexing model. Before looking at the principle, consider an analogy: 30 students in a class do an assignment at the same time, and the class can be dismissed only after the teacher has checked all 30 assignments. How can the class be dismissed as quickly as possible with limited resources?

The first: one teacher checks the students one by one in order: A first, then B, then C, D, and so on. If one student gets stuck, the whole class is delayed. This is like processing sockets one by one in a loop, with no concurrency at all. It needs only one teacher, but it takes a long time.

The second: arrange 30 teachers, each checking one student's assignment. This is like creating a process or thread for each socket to handle the connection. It is the fastest, but it needs 30 teachers and is the most resource-intensive.

The third: one teacher stands on the podium, and whoever finishes raises a hand. Suppose C and D raise their hands to show they have finished. The teacher goes down, checks the answers of C and D in turn, and returns to the podium. Then E and A raise their hands, and the teacher goes to deal with E and A. In this way, tasks are processed as quickly as possible with minimal resource consumption.

I/O multiplexing allows a single thread to handle many connection requests efficiently, and Redis uses epoll (on Linux) as its I/O multiplexing implementation. In addition, Redis's own event processing model converts the connect, read, write, and close operations reported by epoll into events, so little time is wasted waiting on network I/O.
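The third approach in the analogy can be sketched with one thread waiting on many descriptors. To keep the example portable, it uses the POSIX poll() call as a stand-in for epoll. It is an illustration of the idea, not Redis's event loop, and serve_ready and MAX_FDS are hypothetical names.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

#define MAX_FDS 64  /* assumed upper bound for this sketch */

/* Wait once for readable descriptors and read from each ready one.
   Returns the number of descriptors served, 0 on timeout, -1 on error. */
int serve_ready(const int *fds, int nfds, int timeout_ms) {
    struct pollfd pfds[MAX_FDS];
    assert(nfds > 0 && nfds <= MAX_FDS);
    for (int i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;  /* "raise a hand" when data arrives */
        pfds[i].revents = 0;
    }
    /* The teacher waits on the podium until at least one hand goes up. */
    int ready = poll(pfds, (nfds_t)nfds, timeout_ms);
    if (ready <= 0) return ready;
    int served = 0;
    for (int i = 0; i < nfds; i++) {
        if (pfds[i].revents & POLLIN) {
            char buf[256];
            /* Visit only the descriptors that reported data. */
            ssize_t n = read(pfds[i].fd, buf, sizeof buf);
            if (n >= 0) served++;
        }
    }
    return served;
}
```

With three pipes of which two have pending data, one call serves exactly those two and never blocks on the idle one. A real server would call a function like this in a loop and dispatch each ready descriptor to a handler.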

5 Usage scenarios

6 Reference documents
DB-Engines Ranking: https://db-engines.com/en/ranking
Redis benchmarks: https://redis.io/topics/benchmarks/
IO multiplexing mechanism in Redis: https://www.cnblogs.com/reecelin/p/13538382.html

京东云开发者 (JD Cloud Developers)
