Preface

Redis has five basic data types, but do you know how they are implemented under the hood? This article looks at how String, List, Hash, Set, and Sorted Set are implemented. Along the way it covers the underlying data structures they are built from: simple dynamic string (SDS), linked list, dictionary, skip list, integer set, and compressed list. These are the basic building blocks of the Redis data types.

Underlying implementations of the five data types

1. String

  • If a string object stores an integer value that can be represented by the long type, the integer is stored directly in the ptr attribute of the string object structure (the void* is cast to long), and the encoding of the string object is set to int.
  • If a string object stores a string value longer than 39 bytes, the value is stored in a simple dynamic string (SDS), and the encoding of the object is set to raw.
  • If a string object stores a string value of 39 bytes or fewer, the value is stored with the embstr encoding. (The 39-byte limit matches the Redis 3.0 sources; Redis 3.2 and later use 44 bytes.)
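The three rules above can be sketched as a small decision function. This is only an illustration, not the Redis source: `choose_string_encoding`, `fits_in_long`, and the `STR_ENC_*` names are hypothetical, and the 39-byte limit is the value quoted in this article.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for illustration; not the Redis source. */
typedef enum { STR_ENC_INT, STR_ENC_EMBSTR, STR_ENC_RAW } str_encoding;

#define EMBSTR_LIMIT 39 /* limit quoted in this article */

/* Returns 1 if s is a canonical decimal integer that fits in a long. */
static int fits_in_long(const char *s) {
    char *end, buf[32];
    long v;
    if (*s == '\0') return 0;
    errno = 0;
    v = strtol(s, &end, 10);
    if (errno == ERANGE || *end != '\0') return 0;
    /* Reject forms such as " 12", "+12" or "007" that do not round-trip. */
    snprintf(buf, sizeof buf, "%ld", v);
    return strcmp(buf, s) == 0;
}

static str_encoding choose_string_encoding(const char *s) {
    assert(s != NULL);
    if (fits_in_long(s)) return STR_ENC_INT;
    if (strlen(s) <= EMBSTR_LIMIT) return STR_ENC_EMBSTR;
    return STR_ENC_RAW;
}
```

On a real server, `OBJECT ENCODING key` reports which encoding was actually chosen.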

2. List

  • The encoding of a list object can be ziplist or linkedlist.
  • A list object uses the ziplist encoding when every string element it stores is shorter than 64 bytes and it stores fewer than 512 elements; otherwise it uses linkedlist.

3. Hash

  • The encoding of a hash object can be ziplist or hashtable.
  • A hash object uses the ziplist encoding when the keys and values of all its key-value pairs are shorter than 64 bytes and it stores fewer than 512 key-value pairs; otherwise it uses hashtable.

4. Set

  • The encoding of a set object can be intset or hashtable.
  • A set object uses the intset encoding when all of its elements are integer values and it stores no more than 512 elements; otherwise it uses hashtable.

5. Sorted Set

  • The encoding of a sorted set object can be ziplist or skiplist.
  • A sorted set object uses the ziplist encoding when it stores fewer than 128 elements and every member is shorter than 64 bytes; otherwise it uses skiplist.

Next, let's look at each of these underlying data structures in turn.

1. Simple dynamic string (SDS)

Redis does not use plain C strings directly. It defines its own abstract type called simple dynamic string (SDS) and uses SDS as its default string representation. For example:

 set msg "hello world"

Here both the key and the value are backed by SDS.
The structure of SDS:

struct sdshdr {

    // Number of bytes used in the buf array;
    // equal to the length of the string the SDS holds
    int len;

    // Number of unused bytes in the buf array
    int free;

    // Byte array that holds the string
    char buf[];

};

For example, in an SDS that holds the string "Redis":

  • The value of the free attribute is 0, which means this SDS has no unused space allocated.
  • The value of the len attribute is 5, which means this SDS stores a five-byte string.
  • The buf attribute is an array of type char. The first five bytes of the array store the characters 'R', 'e', 'd', 'i', and 's', and the last byte stores the null character '\0'.

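SDS is similar to a C string but has several advantages:

1. Getting the length of an SDS is O(1): SDS stores the length in the len field, so it can be read directly. Getting the length of a C string requires traversing the whole string, which is O(N).
2. SDS prevents buffer overflows: before an SDS API modifies an SDS, it checks whether the SDS has enough space and grows it automatically if not, so no overflow can occur. C strings have no such safeguard.
3. SDS reduces the number of memory reallocations caused by modifying a string:
    - Space preallocation: when an SDS grows, it is given not only the space it needs but also some extra unused space. The rule is: if the length of the SDS after the modification is less than 1 MB, unused space equal to that length is allocated. For example, if the SDS is 16 KB after the modification, another 16 KB of unused space is allocated. If the length is 1 MB or more, 1 MB of unused space is allocated.
    - Lazy space release: this optimizes operations that shorten an SDS. When an SDS API shortens the string, it does not immediately reallocate memory to reclaim the extra bytes; it records them in free as unused space.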
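The preallocation rule can be sketched as follows. This is a simplified illustration rather than the real sds.c code: `sds_sketch`, `sds_new`, and `sds_append` are hypothetical names, and the buffer is allocated separately from the header to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PREALLOC_LIMIT (1024 * 1024) /* the 1 MB threshold described above */

/* Simplified stand-in for struct sdshdr; field names follow the article. */
typedef struct {
    size_t len;  /* bytes used in buf, excluding the trailing '\0' */
    size_t free; /* unused bytes available in buf */
    char *buf;   /* allocated separately here to keep the sketch simple */
} sds_sketch;

static sds_sketch sds_new(const char *init) {
    sds_sketch s;
    s.len = strlen(init);
    s.free = 0;
    s.buf = malloc(s.len + 1);
    assert(s.buf != NULL);
    memcpy(s.buf, init, s.len + 1);
    return s;
}

/* Append t, growing the buffer first if the free space is too small. */
static void sds_append(sds_sketch *s, const char *t) {
    size_t add = strlen(t);
    if (s->free < add) {
        size_t newlen = s->len + add;
        /* Below 1 MB: reserve as much again. Otherwise: reserve 1 MB. */
        size_t newfree = newlen < PREALLOC_LIMIT ? newlen : PREALLOC_LIMIT;
        char *p = realloc(s->buf, newlen + newfree + 1);
        assert(p != NULL);
        s->buf = p;
        s->free = newfree + add; /* add bytes are consumed just below */
    }
    memcpy(s->buf + s->len, t, add + 1); /* copies the trailing '\0' too */
    s->len += add;
    s->free -= add;
}
```

Appending " Cluster" to "Redis" triggers a reallocation and leaves 13 bytes free, so a following append of " Tutorial" fits without touching the allocator.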

2. Linked list

A linked list provides efficient node rearrangement and sequential node access, and its length can be adjusted flexibly by adding and removing nodes. Linked lists are widely used in Redis. For example, one underlying implementation of List is a linked list: when a List contains a large number of elements, or its elements are all relatively long strings, Redis uses a linked list as the underlying implementation. Beyond List, linked lists are also used by features such as publish/subscribe, the slow log, and monitors. The Redis server itself also uses linked lists to store the state of its clients and to build client output buffers.
Each linked list node uses an adlist.h/listNode structure to represent:

typedef struct listNode {

    // Previous node
    struct listNode *prev;

    // Next node
    struct listNode *next;

    // Value of the node
    void *value;

} listNode;

Multiple listNodes can form a double-ended linked list through prev and next pointers, as shown in the figure below.

Although a linked list can be formed using only multiple listNode structures, it will be more convenient to operate if you use adlist.h/list to hold the linked list:

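typedef struct list {

    // Head node
    listNode *head;

    // Tail node
    listNode *tail;

    // Number of nodes in the linked list
    unsigned long len;

    // Function that duplicates a node value
    void *(*dup)(void *ptr);

    // Function that frees a node value
    void (*free)(void *ptr);

    // Function that compares a node value with a key
    int (*match)(void *ptr, void *key);

} list;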

The list structure provides the linked list with a head pointer head, a tail pointer tail, and a length counter len. The dup, free, and match members are the type-specific functions needed to implement a polymorphic linked list:

  • The dup function is used to copy the value saved by the linked list node;
  • The free function is used to release the value saved by the linked list node;
  • The match function is used to compare whether the value stored in the linked list node is equal to another input value.

The following figure is a linked list composed of a list structure and three listNode structures:

Redis's linked list implementation can be summarized as follows:
  • Double-ended: The linked list node has prev and next pointers, and the complexity of obtaining the pre-node and post-node of a certain node is O(1).
  • Acyclic: The prev pointer of the head node and the next pointer of the tail node both point to NULL, and the access to the linked list ends with NULL.
  • With head pointer and tail pointer: Through the head pointer and tail pointer of the list structure, the complexity of the program to obtain the head node and the tail node of the linked list is O(1).
  • Linked list length counter: The program uses the len attribute of the list structure to count the linked list nodes held by the list. The complexity of the program to obtain the number of nodes in the linked list is O(1).
  • Polymorphism: Linked list nodes use void* pointers to store node values, and can set type-specific functions for node values through the dup, free, and match attributes of the list structure, so the linked list can be used to store various types of values.
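As a rough illustration of why the tail pointer and the len counter give O(1) operations, here is a minimal tail-append sketch. The `node`, `dlist`, and `dlist_add_tail` names are hypothetical simplifications, not the adlist.c API, and the dup/free/match callbacks are left out.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    struct node *prev, *next;
    void *value; /* void* lets the list hold any type of value */
} node;

typedef struct {
    node *head, *tail;
    unsigned long len;
} dlist;

/* Append value at the tail in O(1). Returns the new node, or NULL on failure. */
static node *dlist_add_tail(dlist *l, void *value) {
    assert(l != NULL);
    node *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->value = value;
    n->next = NULL;    /* acyclic: the tail's next is NULL */
    n->prev = l->tail; /* NULL when the list was empty */
    if (l->tail != NULL) l->tail->next = n;
    else l->head = n;
    l->tail = n;
    l->len++;          /* keeps the length query O(1) */
    return n;
}
```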

3. Dictionary

A dictionary, also known as a symbol table, an associative array, or a map, is an abstract data structure used to store key-value pairs. Every key in a dictionary is unique, much like a Map in Java.
Dictionaries are mainly used in Redis for the following:

  • The bottom layer of the Redis database is implemented with a dictionary. The addition, deletion, modification, and query operations of the database are all built on the operation of the dictionary, such as:

    > set msg "hello world"
    OK

    This is to create a key-value pair with the key "msg" and the value "hello world" and save it in the dictionary representing the database.

  • The dictionary is also one of the underlying implementations of the hash key: when a hash key contains many key-value pairs, or the elements of its key-value pairs are all relatively long strings, Redis uses a dictionary as the underlying implementation of the hash key.

Redis's dictionary uses a hash table as the underlying implementation. There can be multiple hash table nodes in a hash table, and each hash table node stores a key-value pair in the dictionary.
The next three sections will introduce the implementation of Redis hash table, hash table node, and dictionary respectively.

Hash table

The hash table used by the Redis dictionary is defined by the dict.h/dictht structure:

typedef struct dictht {

    // Hash table array
    dictEntry **table;

    // Size of the hash table
    unsigned long size;

    // Size mask used to compute index values;
    // always equal to size - 1
    unsigned long sizemask;

    // Number of nodes the hash table currently holds
    unsigned long used;

} dictht;

The table attribute is an array. Each element in the array is a pointer to the dict.h/dictEntry structure. Each dictEntry structure stores a key-value pair.

The size attribute records the size of the hash table, that is, the size of the table array, while the used attribute records the number of nodes (key-value pairs) that the hash table currently has.

The value of the sizemask attribute is always equal to size-1. This attribute and the hash value together determine which index a key should be placed on the table array.

The figure below shows an empty hash table of size 4 (does not contain any key-value pairs).

Hash table node

The hash table node is represented by the dictEntry structure, and each dictEntry structure holds a key-value pair:

typedef struct dictEntry {

    // Key
    void *key;

    // Value
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;

    // Pointer to the next hash table node, forming a linked list
    struct dictEntry *next;

} dictEntry;

The key attribute holds the key in the key-value pair, and the v attribute holds the value in the key-value pair, where the value of the key-value pair can be a pointer, or a uint64_t integer, or an int64_t integer.

The next attribute is a pointer to another hash table node. It links key-value pairs that hash to the same index into a list, which solves the problem of key collisions.

For example, the following figure shows how to connect two keys k1 and k0 with the same index value through the next pointer.

Dictionary

The dictionary in Redis is represented by the dict.h/dict structure:

typedef struct dict {

    // Type-specific functions
    dictType *type;

    // Private data
    void *privdata;

    // Hash tables
    dictht ht[2];

    // Rehash index;
    // -1 when no rehash is in progress
    int rehashidx; /* rehashing not in progress if rehashidx == -1 */

} dict;

The type attribute and the privdata attribute are set for different types of key-value pairs to create a polymorphic dictionary:

  • The type attribute is a pointer to a dictType structure. Each dictType structure stores a set of functions for operating a specific type of key-value pair. Redis will set different type-specific functions for dictionaries with different purposes.
  • The privdata attribute holds the optional parameters that need to be passed to those type-specific functions.

The ht attribute is an array of two items, each of which is a dictht hash table. Normally the dictionary uses only the ht[0] hash table; ht[1] is used only while ht[0] is being rehashed.

In addition to ht[1], another attribute related to rehash is rehashidx: it records the current progress of rehash, and if rehash is not currently in progress, its value is -1.

The following figure shows a dictionary in a normal state (without rehash):

Hash algorithm

When inserting a new key-value pair into the dictionary, you need to calculate the index value. Redis calculates the index value as follows:

// Compute the hash value of key with the hash function set on the dictionary
hash = dict->type->hashFunction(key);

// Compute the index value from the hash value and the table's sizemask;
// depending on the situation, ht[x] is either ht[0] or ht[1]
index = hash & dict->ht[x].sizemask;

This is similar to Java's HashMap, which computes the key's hash value and then takes hash & (len - 1); the Redis sizemask is likewise size - 1.
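A small sketch of the masking step shows why it only works for power-of-two table sizes: in that case `hash & (size - 1)` gives the same result as `hash % size` without a division. `bucket_index` is a hypothetical helper name, not a Redis function.

```c
#include <assert.h>
#include <stdint.h>

/* Map a hash value to a bucket index; size must be a power of two. */
static unsigned long bucket_index(uint64_t hash, unsigned long size) {
    assert(size != 0 && (size & (size - 1)) == 0);
    unsigned long sizemask = size - 1; /* e.g. size 4 gives mask 0b11 */
    return (unsigned long)(hash & sizemask);
}
```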

Resolving hash conflicts

When a hash conflict occurs, Redis resolves it with separate chaining (the chain address method): conflicting nodes are linked into a list stored at that index, and Redis inserts new nodes at the head of the list. There are three other ways to resolve hash conflicts: open addressing (linear probing, quadratic probing, or pseudo-random probing), rehashing with another hash function, and a common overflow area. These four methods deserve a separate introduction of their own.
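Head insertion can be sketched in a few lines. The `entry`, `chain_insert_head`, and `chain_find` names are hypothetical, and the sketch uses C strings as keys for brevity. Inserting at the head is O(1) because the chain never has to be walked to find its end.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct entry {
    const char *key;
    int val;
    struct entry *next; /* next entry in the same bucket */
} entry;

/* Insert at the head of the bucket's chain; no traversal needed. */
static entry *chain_insert_head(entry **bucket, const char *key, int val) {
    assert(bucket != NULL && key != NULL);
    entry *e = malloc(sizeof *e);
    if (e == NULL) return NULL;
    e->key = key;
    e->val = val;
    e->next = *bucket; /* the old head becomes the second entry */
    *bucket = e;
    return e;
}

/* Walk the chain to find a key; O(chain length). */
static entry *chain_find(entry *bucket, const char *key) {
    for (; bucket != NULL; bucket = bucket->next)
        if (strcmp(bucket->key, key) == 0) return bucket;
    return NULL;
}
```

If k0 is inserted first and k1 second into the same bucket, k1 ends up in front of k0.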

rehash

As operations are executed, the number of key-value pairs in a hash table grows or shrinks. To keep the load factor of the hash table within a reasonable range, the hash table must be expanded or shrunk accordingly. This expanding and shrinking is done through rehash. The rehash process is as follows:

  1. Allocate space for the dictionary's ht[1] hash table. Its size depends on the operation to be performed and on the number of key-value pairs currently in ht[0] (that is, the value of the ht[0].used attribute; ht is the hash table array in the dictionary, as described above):

    • For an expansion, the size of ht[1] is the first 2^n (2 raised to the power of n) greater than or equal to ht[0].used * 2;
    • For a shrink, the size of ht[1] is the first 2^n greater than or equal to ht[0].used.
  2. Rehash all key-value pairs stored in ht[0] into ht[1]: rehashing means recomputing the hash value and index value of each key and placing the key-value pair at the corresponding position in the ht[1] hash table.
  3. When all key-value pairs in ht[0] have been migrated to ht[1] (ht[0] has become an empty table), release ht[0], make ht[1] the new ht[0], and create a new blank hash table at ht[1] in preparation for the next rehash.
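The sizing rule in step 1 can be sketched like this. `next_power`, `expand_size`, and `shrink_size` are hypothetical helper names, and the minimum table size of 4 is an assumption of the sketch. Overflow for extremely large inputs is not handled beyond the assertion.

```c
#include <assert.h>
#include <limits.h>

#define INITIAL_SIZE 4 /* assumption: smallest table size used by this sketch */

/* Smallest power of two that is >= n (and >= INITIAL_SIZE). */
static unsigned long next_power(unsigned long n) {
    assert(n <= ULONG_MAX / 2); /* keeps the doubling loop from overflowing */
    unsigned long size = INITIAL_SIZE;
    while (size < n) size *= 2;
    return size;
}

/* Expansion: first 2^n >= used * 2. */
static unsigned long expand_size(unsigned long used) {
    return next_power(used * 2);
}

/* Shrink: first 2^n >= used. */
static unsigned long shrink_size(unsigned long used) {
    return next_power(used);
}
```

For instance, a table holding 5 entries expands to 16 buckets, because 16 is the first power of two that is at least 10.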

When either of the following conditions is met, the program automatically starts expanding the hash table:

  • The server is not currently executing the BGSAVE or BGREWRITEAOF command, and the load factor of the hash table is greater than or equal to 1;
  • The server is currently executing the BGSAVE or BGREWRITEAOF command, and the load factor of the hash table is greater than or equal to 5.

The load factor of the hash table is calculated with the formula:

# load factor = number of nodes stored in the hash table / size of the hash table
load_factor = ht[0].used / ht[0].size

For example, for a hash table with a size of 4 and containing 4 key-value pairs, the load factor of the hash table is:

load_factor = 4 / 4 = 1

As another example, for a hash table with a size of 512 that contains 256 key-value pairs, the load factor is:

load_factor = 256 / 512 = 0.5

The load factor required to trigger an expansion depends on whether a BGSAVE or BGREWRITEAOF command is running. While executing either command, Redis has to create a child process of the current server process, and most operating systems use copy-on-write to make child processes efficient. So while the child process exists, the server raises the load factor required for expansion, to avoid expanding hash tables during that time as far as possible. This avoids unnecessary memory writes and saves as much memory as possible.

On the other hand, when the load factor of the hash table is less than 0.1, the program automatically starts to shrink the hash table.
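The expansion and shrink thresholds described in this section can be written as two small predicates. This is a sketch under the thresholds quoted here (1, 5, and 0.1); `should_expand`, `should_shrink`, and the `child_running` flag are hypothetical names, and the comparisons are rearranged to avoid floating-point arithmetic.

```c
#include <assert.h>
#include <stdbool.h>

/* child_running: true while a BGSAVE or BGREWRITEAOF child process exists. */
static bool should_expand(unsigned long used, unsigned long size,
                          bool child_running) {
    assert(size > 0);
    unsigned long threshold = child_running ? 5 : 1;
    /* used / size >= threshold, written without division */
    return used >= threshold * size;
}

static bool should_shrink(unsigned long used, unsigned long size) {
    assert(size > 0);
    /* used / size < 0.1, written without floating point */
    return used * 10 < size;
}
```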

Progressive rehash

During a rehash, all key-value pairs of ht[0] are migrated to ht[1], but this is not done in one go; it is done gradually, in multiple steps. The reason is that when the amount of data is large, a one-time migration would make the server stop serving requests for a period of time. Progressive rehash exists to avoid this.
The following are the detailed steps of progressively rehashing a hash table:
1) Allocate space for ht[1], so that the dictionary holds both ht[0] and ht[1] at the same time.
2) Set the index counter variable rehashidx maintained in the dictionary to 0, indicating that the rehash has officially started.
3) While the rehash is in progress, every time an add, delete, lookup, or update is performed on the dictionary, the program also rehashes all key-value pairs in the ht[0] bucket at index rehashidx into ht[1], in addition to performing the requested operation. When that bucket is done, the program increments rehashidx by one.
4) As dictionary operations keep executing, at some point all key-value pairs of ht[0] will have been rehashed to ht[1]. The program then sets rehashidx to -1, indicating that the rehash is complete.

The advantage of progressive rehash is its divide-and-conquer approach: the work needed to rehash the key-value pairs is spread across the add, delete, lookup, and update operations on the dictionary, which avoids the large burst of computation a centralized rehash would cause. During the rehash, delete, lookup, and update operations are applied to both ht[0] and ht[1]. For example, a lookup searches ht[0] first and, if the key is not found there, continues in ht[1]. Note that new key-value pairs are saved only to ht[1], never to ht[0]. This guarantees that the number of key-value pairs in ht[0] only decreases, so ht[0] eventually becomes an empty table as the rehash proceeds.
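The four steps can be sketched with a deliberately tiny dictionary. Everything here is a hypothetical simplification and not dict.c: keys are unsigned integers that double as their own hash values, entries are owned by the caller, and each add or lookup migrates exactly one non-empty bucket.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified types; the key doubles as its own hash value. */
typedef struct pentry { unsigned long key; struct pentry *next; } pentry;
typedef struct { pentry **table; unsigned long size, used; } ptable;
typedef struct { ptable ht[2]; long rehashidx; } pdict;

static ptable table_new(unsigned long size) {
    ptable t = { NULL, size, 0 };
    if (size > 0) {
        t.table = calloc(size, sizeof *t.table);
        assert(t.table != NULL);
    }
    return t;
}

/* Head insertion into the bucket selected by key & (size - 1). */
static void table_insert(ptable *t, pentry *e) {
    unsigned long idx = e->key & (t->size - 1);
    e->next = t->table[idx];
    t->table[idx] = e;
    t->used++;
}

static void dict_init(pdict *d, unsigned long size) {
    d->ht[0] = table_new(size);
    d->ht[1] = table_new(0);
    d->rehashidx = -1;
}

/* Steps 1 and 2: allocate ht[1] and set rehashidx to 0. */
static void dict_start_rehash(pdict *d, unsigned long new_size) {
    d->ht[1] = table_new(new_size);
    d->rehashidx = 0;
}

/* Step 3: migrate one non-empty bucket. Step 4: finish when ht[0] is empty. */
static void rehash_step(pdict *d) {
    if (d->rehashidx == -1) return;
    if (d->ht[0].used > 0) {
        /* Buckets below rehashidx are already empty, so this stays in range. */
        while (d->ht[0].table[d->rehashidx] == NULL) d->rehashidx++;
        pentry *e = d->ht[0].table[d->rehashidx];
        while (e != NULL) {
            pentry *next = e->next;
            table_insert(&d->ht[1], e);
            d->ht[0].used--;
            e = next;
        }
        d->ht[0].table[d->rehashidx++] = NULL;
    }
    if (d->ht[0].used == 0) {
        free(d->ht[0].table);
        d->ht[0] = d->ht[1]; /* ht[1] becomes the new ht[0] */
        d->ht[1] = table_new(0);
        d->rehashidx = -1;
    }
}

static void dict_add(pdict *d, pentry *e) {
    rehash_step(d);
    /* New entries go to ht[1] while a rehash is in progress. */
    table_insert(d->rehashidx != -1 ? &d->ht[1] : &d->ht[0], e);
}

static pentry *dict_find(pdict *d, unsigned long key) {
    rehash_step(d);
    for (int i = 0; i < 2; i++) { /* ht[0] first, then ht[1] */
        if (d->ht[i].size == 0) continue;
        for (pentry *e = d->ht[i].table[key & (d->ht[i].size - 1)]; e; e = e->next)
            if (e->key == key) return e;
    }
    return NULL;
}
```

With four entries in a 4-bucket table and a rehash to 8 buckets, four further operations are enough to drain ht[0], after which the 8-bucket table takes over as ht[0].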

Redis's dictionary implementation features can be summarized as follows:

  • Dictionaries are widely used to implement various functions of Redis, including databases and hash keys.
  • The dictionary in Redis uses a hash table as the underlying implementation. Each dictionary has two hash tables, one for normal use, and the other only for rehash.
  • When the dictionary is used as the underlying implementation of the database, or the underlying implementation of the hash key, Redis uses the MurmurHash2 algorithm to calculate the hash value of the key.
  • The hash table uses separate chaining to resolve key conflicts: key-value pairs assigned to the same index are linked into a singly linked list.
  • When expanding or shrinking a hash table, the program rehashes all key-value pairs of the existing hash table into the new one, and this rehash is not done all at once but progressively.

4. Skip list

A skip list (skiplist) is an ordered data structure. Each node maintains multiple pointers to other nodes, which makes it possible to reach nodes quickly.
A skip list supports node lookup in O(log N) on average and O(N) in the worst case, and nodes can also be processed in batches through sequential operations.

In most cases a skip list is as efficient as a balanced tree, and because it is simpler to implement, many programs use skip lists instead of balanced trees.

Redis uses the skip list as one of the underlying implementations of the sorted set key: if a sorted set contains a large number of elements, or the members of its elements are relatively long strings, Redis uses a skip list as the underlying implementation of the sorted set key.

Redis uses skip lists in only two places: to implement sorted set keys, and as an internal data structure in cluster nodes. Skip lists have no other use in Redis.

Redis's skip list is defined by two structures, redis.h/zskiplistNode and redis.h/zskiplist. The zskiplistNode structure represents a skip list node, and the zskiplist structure stores information about the skip list nodes, such as the number of nodes and pointers to the head and tail nodes.

The figure above shows an example of a skip list. The zskiplist structure at the far left of the figure contains the following attributes:

  • header: points to the head node of the skip list.
  • tail: points to the tail node of the skip list.
  • level: records the level count of the node with the most levels in the skip list (the head node's levels are not counted).
  • length: records the length of the skip list, that is, the number of nodes it currently contains (the head node is not counted).

To the right of the zskiplist structure are four zskiplistNode structures, which contain the following attributes:

  • Level: the labels L1, L2, L3, and so on mark the levels of a node. L1 is the first level, L2 the second, and so on. Each level has two attributes: a forward pointer and a span. The forward pointer is used to reach other nodes in the direction of the tail, and the span records the distance between the current node and the node the forward pointer points to. In the figure above, each arrow with a number on it is a forward pointer, and the number is its span. When the program traverses from the head to the tail, it follows the forward pointers of the levels.
  • Backward pointer: labeled BW in the node, it points to the node immediately before the current node. It is used when the program traverses from the tail to the head.
  • Score: the values 1.0, 2.0, and 3.0 in the nodes are the scores they store. Nodes in a skip list are ordered by score from smallest to largest.
  • Member object (obj): o1, o2, and o3 in the nodes are the member objects they store.

Note that the head node has the same structure as the other nodes: it also has a backward pointer, a score, and a member object. These attributes of the head node are never used, so the figure omits them and shows only the levels of the head node.

The rest of this section will give a more detailed introduction to the two structures of zskiplistNode and zskiplist.

  1. Skip list node

The skip list node is defined by the redis.h/zskiplistNode structure:

typedef struct zskiplistNode {

    // Backward pointer
    struct zskiplistNode *backward;

    // Score
    double score;

    // Member object
    robj *obj;

    // Levels
    struct zskiplistLevel {

        // Forward pointer
        struct zskiplistNode *forward;

        // Span
        unsigned int span;

    } level[];

} zskiplistNode;

Level
The level array of a skip list node can contain multiple elements. Each element holds a pointer to another node, and the program can use these levels to speed up access to other nodes. Generally speaking, the more levels there are, the faster other nodes can be reached.

Every time a new skip list node is created, the program randomly generates a value between 1 and 32 as the size of the level array, following a power law (the larger the number, the lower its probability). This size is the "height" of the node.
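A common way to get such a power-law distribution is to keep promoting the node one level at a time with a fixed probability. The sketch below assumes a promotion probability of 0.25 and uses the C library's `rand()`; `random_level`, `MAX_LEVEL`, and `LEVEL_P` are hypothetical names, and the constants are assumptions of the sketch rather than a statement about any particular Redis version.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_LEVEL 32 /* upper bound quoted in this article */
#define LEVEL_P 0.25 /* assumed probability of adding one more level */

/* Height h is returned with probability proportional to LEVEL_P^(h-1),
 * so tall nodes are exponentially rarer than short ones. */
static int random_level(void) {
    int level = 1;
    while (level < MAX_LEVEL &&
           rand() / ((double)RAND_MAX + 1.0) < LEVEL_P)
        level++;
    assert(level >= 1 && level <= MAX_LEVEL);
    return level;
}
```

With these constants about three quarters of all nodes have height 1, and each extra level is four times less likely than the one below it.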

The following figure shows three nodes with heights of 1, 3, and 5. Because C array indexes always start from 0, the first level of a node is level[0], the second level is level[1], and so on.

Span
The span of a level (the level[i].span attribute) records the distance between two nodes:

  • The greater the span between two nodes, the farther apart they are.
  • Every forward pointer that points to NULL has a span of 0, because it is not connected to any node.

At first glance the span looks related to traversal, but it is not: traversal can be done with forward pointers alone. The span is actually used to compute rank: while searching for a node, the program adds up the spans of all levels it passes along the way, and the total is the rank of the target node in the skip list.

For example, the following figure uses a dotted line to mark the levels passed while searching the skip list for the node with score 3.0 and member object o3: the search passes through only one level, and that level's span is 3, so the target node's rank in the skip list is 3.
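The rank computation can be sketched as a search that sums spans on the way down. This is a simplified illustration: `sknode` and `rank_of_score` are hypothetical names, nodes have a small fixed number of levels, scores are assumed unique, member objects are left out, and the list is assumed to be already built with correct spans.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAXLEVEL 4 /* small fixed height to keep the sketch short */

typedef struct sknode {
    double score;
    struct {
        struct sknode *forward;
        unsigned int span; /* nodes skipped by following forward */
    } level[SK_MAXLEVEL];
} sknode;

/* Return the 1-based rank of the node with the given score, or 0 if absent.
 * header is the head node; levels is the number of levels in use. */
static unsigned long rank_of_score(const sknode *header, int levels,
                                   double score) {
    const sknode *x = header;
    unsigned long rank = 0;
    assert(header != NULL && levels >= 1 && levels <= SK_MAXLEVEL);
    for (int i = levels - 1; i >= 0; i--) { /* start at the top level */
        while (x->level[i].forward != NULL &&
               x->level[i].forward->score <= score) {
            rank += x->level[i].span; /* accumulate the distance covered */
            x = x->level[i].forward;
        }
    }
    return (x != header && x->score == score) ? rank : 0;
}
```

In a three-node list where the head's top level points straight at the last node with span 3, looking up that node's score follows a single pointer and returns rank 3.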

Backward pointer
The backward pointer of a node (the backward attribute) is used to access nodes from the tail toward the head. Unlike forward pointers, which can skip several nodes at a time, each node has only one backward pointer, so each step can only move back to the previous node.

The following figure uses a dotted line to show how to traverse all nodes of the skip list from tail to head: the program first reaches the tail node through the skip list's tail pointer, then follows the backward pointer to the second-to-last node, follows the backward pointer again to the third-to-last node, and then encounters a backward pointer pointing to NULL, so the traversal ends.

Score and member

The score of a node (the score attribute) is a floating-point number of type double. All nodes in a skip list are ordered by score from smallest to largest.

The member object of a node (the obj attribute) is a pointer to a string object, and the string object holds an SDS value.

Within one skip list, the member object of every node must be unique, but several nodes may store the same score. Nodes with the same score are ordered by the lexicographical order of their member objects: nodes with smaller member objects come first (toward the head), and nodes with larger member objects come later (toward the tail).

For example, in the skip list shown in the figure below, all three nodes store the same score, 10086.0, but the node holding member object o1 comes before the nodes holding o2 and o3, and the node holding o2 comes before the node holding o3. So the lexicographical order of the three member objects is o1 <= o2 <= o3.
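The ordering rule, score first and member second, can be expressed as a comparison function. `zset_order_cmp` is a hypothetical name, and the sketch compares members as plain C strings with `strcmp`, which is an assumption made for brevity (real members are SDS values).

```c
#include <assert.h>
#include <string.h>

/* Negative if (s1, m1) sorts before (s2, m2), positive if after, 0 if equal. */
static int zset_order_cmp(double s1, const char *m1,
                          double s2, const char *m2) {
    assert(m1 != NULL && m2 != NULL);
    if (s1 < s2) return -1;
    if (s1 > s2) return 1;
    return strcmp(m1, m2); /* same score: fall back to the member */
}
```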

  2. Skip list

A skip list can be formed from skip list nodes alone, as shown in the figure below.

But by using a zskiplist structure to hold these nodes, the program can handle the whole skip list more conveniently, for example by quickly accessing its head and tail nodes or quickly obtaining the number of nodes (that is, the length of the skip list), as shown in the figure below.

The definition of the zskiplist structure is as follows:
typedef struct zskiplist {

    // Head node and tail node
    struct zskiplistNode *header, *tail;

    // Number of nodes in the skip list
    unsigned long length;

    // Level count of the node with the most levels
    int level;

} zskiplist;

The header and tail pointers point to the head and tail nodes of the skip list. Through these two pointers, the program can locate the head node and the tail node in O(1).

By using the length attribute to record the number of nodes, the program can return the length of the skip list in O(1).

The level attribute makes it possible to obtain, in O(1), the level count of the node with the most levels in the skip list. Note that the head node's levels are not included.

Summary of the skip list:

  • The skip list is one of the underlying implementations of sorted sets. Apart from that and its internal use in cluster nodes, it has no other application in Redis.
  • Redis's skip list implementation consists of two structures, zskiplist and zskiplistNode: zskiplist stores information about the skip list (such as the head node, tail node, and length), and zskiplistNode represents a skip list node.
  • The height of each skip list node is a random number between 1 and 32.
  • Within one skip list, several nodes may have the same score, but the member object of every node must be unique.
  • Nodes in a skip list are ordered by score; nodes with the same score are ordered by member object.

5. Integer set

The integer set (intset) is one of the underlying implementations of set keys: when a set contains only integer elements and the number of elements is small, Redis uses an integer set as the underlying implementation of the set key.

For example, if we create a set key with only five elements, all of which are integer values, the underlying implementation of the set key will be an integer set:

redis> SADD numbers 1 3 5 7 9
(integer) 5

redis> OBJECT ENCODING numbers
"intset"

The integer set (intset) is an abstract data structure that Redis uses to store integer values. It can store integers of type int16_t, int32_t, or int64_t, and it guarantees that the set contains no duplicate elements.

Each intset.h/intset structure represents an integer set:

typedef struct intset {

    // Encoding
    uint32_t encoding;

    // Number of elements in the set
    uint32_t length;

    // Array that holds the elements
    int8_t contents[];

} intset;

The contents array is the underlying implementation of the integer set: each element of the set is an item of the contents array, the items are kept sorted by value, and the array contains no duplicates.

The length attribute records the number of elements in the integer set, which is also the length of the contents array.

Although the intset structure declares contents as an array of type int8_t, the contents array does not actually hold any int8_t values. Its real type depends on the value of the encoding attribute: if encoding is INTSET_ENC_INT16, contents is an array of type int16_t, and each item is an int16_t integer (minimum -32,768, maximum 32,767).
The figure below shows an integer set containing five integer values of type int16_t.

Whenever a new element is added to an integer set and its type is wider than the type of every existing element, the integer set must be upgraded before the new element can be added.

Upgrading an integer set and adding the new element takes three steps:

  • Expand the underlying array according to the type of the new element, and allocate space for the new element.
  • Convert all existing elements of the underlying array to the type of the new element and place the converted elements in their correct positions, keeping the underlying array sorted throughout.
  • Add the new element to the underlying array.

The integer set does not support downgrades. Once the array has been upgraded, the encoding stays in its upgraded state.
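Deciding whether an upgrade is needed comes down to finding the narrowest type that can hold the new value. In the sketch below, `encoding_for`, `needs_upgrade`, and the `SET_ENC_*` constants are hypothetical names, and setting each constant to the element width in bytes is an assumption made so that encodings can be compared with `>`.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants: each encoding equals its element width in bytes. */
enum { SET_ENC_INT16 = 2, SET_ENC_INT32 = 4, SET_ENC_INT64 = 8 };

/* Narrowest encoding that can represent v. */
static int encoding_for(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX) return SET_ENC_INT64;
    if (v < INT16_MIN || v > INT16_MAX) return SET_ENC_INT32;
    return SET_ENC_INT16;
}

/* An upgrade is needed when v requires a wider type than the set uses now. */
static int needs_upgrade(int current_encoding, int64_t v) {
    assert(current_encoding == SET_ENC_INT16 ||
           current_encoding == SET_ENC_INT32 ||
           current_encoding == SET_ENC_INT64);
    return encoding_for(v) > current_encoding;
}
```

Adding 65535 to a set that currently uses the 16-bit encoding would trigger an upgrade to 32 bits, while adding 7 to a 32-bit set would not change anything.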

A summary of the integer set:

  • The integer set is one of the underlying implementations of set keys.
  • Underneath, the integer set is an array that stores the set's elements in sorted order without duplicates. When necessary, the program changes the type of the array according to the type of a newly added element.
  • The upgrade operation makes the integer set more flexible to use and saves as much memory as possible.
  • The integer set supports only upgrades, not downgrades.

Six, compressed list

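The compressed list (ziplist) is one of the underlying implementations of list keys and hash keys. This describes older Redis versions: since Redis 3.2 list keys use quicklist, and since Redis 7.0 small hashes use listpack, so OBJECT ENCODING reports different names there.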

When a list key contains only a small number of list items, and each list item is either a small integer value or a string with a relatively short length, Redis will use a compressed list as the underlying implementation of the list key.

For example, executing the following command will create a list key implemented by a compressed list:

redis> RPUSH lst 1 3 5 10086 "hello" "world"
(integer) 6

redis> OBJECT ENCODING lst
"ziplist"

The list key is implemented by a compressed list because it contains only small integer values such as 1, 3, 5, and 10086, and short strings such as "hello" and "world".

In addition, when a hash key contains only a small number of key-value pairs, and the key and value of each pair are either small integer values or relatively short strings, Redis will use a compressed list as the underlying implementation of the hash key.

For example, executing the following command will create a hash key implemented by a compressed list:

redis> HMSET profile "name" "Jack" "age" 28 "job" "Programmer"
OK

redis> OBJECT ENCODING profile
"ziplist"

The hash key is implemented by a compressed list because all the keys and values it contains are small integer values or short strings.

Compressed list:
The compressed list was developed by Redis to save memory. It is a sequential data structure composed of a series of specially encoded contiguous memory blocks.

A compressed list can contain any number of nodes (entries), and each node can store a byte array or an integer value.

The following figure shows the various components of the compressed list.

Table 7-1 below records the type, length, and purpose of each component.

Attribute | Type | Length | Purpose
zlbytes | uint32_t | 4 bytes | Records the number of bytes of memory occupied by the entire compressed list; used when reallocating the compressed list or calculating the position of zlend.
zltail | uint32_t | 4 bytes | Records how many bytes the tail node of the compressed list is from the start address of the compressed list. With this offset, the program can determine the address of the tail node without traversing the entire compressed list.
zllen | uint16_t | 2 bytes | Records the number of nodes contained in the compressed list. When the value of this attribute is less than UINT16_MAX (65535), it is the number of nodes in the compressed list; when it is equal to UINT16_MAX, the actual number of nodes can only be obtained by traversing the entire compressed list.
entryX | list node | variable | Each node contained in the compressed list; the length of a node is determined by the content stored in it.
zlend | uint8_t | 1 byte | The special value 0xFF (decimal 255), used to mark the end of the compressed list.
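
As a rough illustration of the layout in the table, the following Python sketch builds and reads the header of a compressed list with no nodes. The little-endian byte order, the choice to point zltail just past the header when the list is empty, and the helper names new_empty_ziplist and read_header are assumptions of this sketch rather than details stated above.

```python
import struct

# zlbytes (uint32), zltail (uint32), zllen (uint16); little-endian is assumed here.
ZIPLIST_HEADER = struct.Struct("<IIH")
ZLEND = 0xFF

def new_empty_ziplist():
    """Build the bytes of a compressed list that has no nodes (hypothetical helper)."""
    total = ZIPLIST_HEADER.size + 1                  # header plus the 1-byte zlend
    header = ZIPLIST_HEADER.pack(total, ZIPLIST_HEADER.size, 0)
    return header + bytes([ZLEND])

def read_header(buf):
    """Decode the three header fields and check the end marker (hypothetical helper)."""
    zlbytes, zltail, zllen = ZIPLIST_HEADER.unpack_from(buf, 0)
    if buf[zlbytes - 1] != ZLEND:                    # zlbytes locates zlend directly
        raise ValueError("missing zlend marker")
    return {
        "zlbytes": zlbytes,
        "zltail": zltail,
        "zllen": zllen,
        "zllen_is_exact": zllen < 0xFFFF,            # 65535 means "count by traversal"
    }

zl = new_empty_ziplist()
print(len(zl))
print(read_header(zl))
```

The check on buf[zlbytes - 1] shows why zlbytes is useful for finding zlend, and zllen_is_exact reflects the UINT16_MAX rule from the table.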

Compressed list node:
Each compressed list node consists of three parts: previous_entry_length, encoding, and content, as shown in the figure below.

  • The previous_entry_length attribute of the node is in bytes and records the length of the previous node in the compressed list.
  • The encoding attribute of the node records the type and length of the data stored in the content attribute of the node.
  • The content attribute of the node is responsible for storing the value of the node. The value of the node can be a byte array or an integer. The type and length of the value are determined by the encoding attribute of the node.
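
The size of previous_entry_length is not spelled out above. In the ziplist layout as commonly documented, it takes 1 byte when the previous node is shorter than 254 bytes, and otherwise 5 bytes: a 0xFE marker followed by a 4-byte length. The Python sketch below works under that assumption; the helper names and the little-endian order are illustrative, not taken from the article.

```python
import struct

def encode_prev_len(n):
    """Encode the previous node's length in 1 or 5 bytes (assumed rule)."""
    if n < 254:
        return bytes([n])
    return b"\xfe" + struct.pack("<I", n)    # marker byte + 4-byte length

def decode_prev_len(buf, offset=0):
    """Return (previous node length, number of bytes the field occupies)."""
    first = buf[offset]
    if first < 254:
        return first, 1
    return struct.unpack_from("<I", buf, offset + 1)[0], 5

print(encode_prev_len(5))
print(len(encode_prev_len(300)))
print(decode_prev_len(encode_prev_len(300)))
```

Because every node records the length of the node before it, a program holding the address of the current node can subtract that length to reach the previous node, which is what makes traversal from the tail toward the head possible.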

A summary of the compressed list:
  • The compressed list is a sequential data structure developed to save memory.
  • The compressed list is used as one of the underlying implementations of the list key and hash key.
  • The compressed list can contain multiple nodes, and each node can store a byte array or integer value.
  • Adding new nodes to the compressed list, or deleting nodes from the compressed list, may trigger chain update operations, but the probability of such operations is not high.
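
To see why a chain update can happen, consider this small Python model. It assumes the rule that previous_entry_length occupies 1 byte when the previous node is shorter than 254 bytes and 5 bytes otherwise; the payload sizes are made up for illustration, and the model recomputes all sizes from scratch instead of reallocating memory in place as a real implementation would.

```python
def prev_len_size(prev_node_len):
    """Bytes needed by previous_entry_length (assumed 1-or-5-byte rule)."""
    return 1 if prev_node_len < 254 else 5

def node_sizes(payloads):
    """Total size of each node, given the size of its encoding + content part."""
    sizes = []
    prev = 0
    for payload in payloads:
        total = prev_len_size(prev) + payload
        sizes.append(total)
        prev = total
    return sizes

# Four nodes that are each 253 bytes long, just under the 254-byte threshold.
payloads = [252, 252, 252, 252]
before = node_sizes(payloads)

# Inserting a large node at the head forces the next node's
# previous_entry_length to grow to 5 bytes, which pushes that node past
# 253 bytes, which in turn forces the node after it to grow, and so on.
after = node_sizes([300] + payloads)

print(before)
print(after)
print(sum(1 for b, a in zip(before, after[1:]) if a > b), "nodes had to grow")
```

The cascade only occurs when several consecutive nodes sit right at the threshold, which is consistent with the statement that the probability of such operations is not high.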

At last

If you found this article useful, please like, follow, and share. I would be very grateful.

For more content, search WeChat for: Mushroom can't sleep

