[Redis series] Redis learning, part sixteen: the Redis dictionary (map) and its core encoding structures

Redis is written in C, and C has no built-in dictionary data structure, so Redis defines its own dictionary out of structs

typedef struct redisDb

The Redis database structure is defined in src/server.h:

 /* Redis database representation. There are multiple databases identified
 * by integers from 0 (the default database) up to the max configured
 * database. The database number is the 'id' field in the structure. */
typedef struct redisDb {
    dict *dict;                 /* The keyspace for this DB */
    dict *expires;              /* Timeout of keys with a timeout set */
    dict *blocking_keys;        /* Keys with clients waiting for data (BLPOP)*/
    dict *ready_keys;           /* Blocked keys that received a PUSH */
    dict *watched_keys;         /* WATCHED keys for MULTI/EXEC CAS */
    int id;                     /* Database ID */
    long long avg_ttl;          /* Average TTL, just for stats */
    unsigned long expires_cursor; /* Cursor of the active expire cycle. */
    list *defrag_later;         /* List of key names to attempt to defrag one by one, gradually. */
} redisDb;

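redisDb is the underlying data structure of a Redis database. Its fields are:

dict

The keyspace of this database, a dictionary holding every key-value pair

expires

Expiration times of the keys that have a timeout set

blocking_keys

Keys that clients are blocked on while waiting for data (for example BLPOP)

ready_keys

Blocked keys that have received a PUSH

watched_keys

Keys watched for MULTI/EXEC CAS, used by transactions

id

ID of the database, 0 - 15 with the default configuration

avg_ttl

Average TTL, used only for statistics

expires_cursor

Cursor of the active expire cycle

defrag_later

List of key names to defragment gradually, one by one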

typedef struct dict

The dictionary data structure is defined in src/dict.h:

 typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    int16_t pauserehash; /* If >0 rehashing is paused (<0 indicates coding error) */
} dict;

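dict is the data structure that stores a dictionary

type

Type of the dictionary, a set of functions specific to this kind of dictionary

privdata

Private data passed to those type-specific functions

ht

Two hash tables, an old one (ht[0]) and a new one (ht[1]). The new table ht[1] is used only while the hash table is being resized

rehashidx

Rehash progress, -1 when no rehash is in progress

pauserehash

Greater than 0 while rehashing is paused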

typedef struct dictType

 typedef struct dictType {
    uint64_t (*hashFunction)(const void *key);
    void *(*keyDup)(void *privdata, const void *key);
    void *(*valDup)(void *privdata, const void *obj);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    void (*keyDestructor)(void *privdata, void *key);
    void (*valDestructor)(void *privdata, void *obj);
    int (*expandAllowed)(size_t moreMem, double usedRatio);
} dictType;

dictType defines a set of function pointers, so that each kind of dictionary can supply its own implementation of these operations

Take the keyCompare function pointer as an example. It points to a function that takes 3 parameters and returns 1 value:

3 parameters

privdata

Private data of the dictionary

key1

The first key to compare

key2

The second key to compare

The function that keyCompare points to checks whether the two keys are equal; a non-zero return value means they match
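To make the callback idea concrete, here is a minimal sketch of a compare function with the same signature as keyCompare, plus a helper that falls back to pointer comparison when no callback is set. The names demo_str_key_compare, demo_dict_type and demo_keys_equal are made up for this illustration and are not part of Redis, and plain C strings stand in for sds keys.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback with the keyCompare signature. privdata is unused
 * here. Returns non-zero when the two C-string keys are equal. */
int demo_str_key_compare(void *privdata, const void *key1, const void *key2) {
    (void)privdata;
    assert(key1 != NULL && key2 != NULL);
    return strcmp((const char *)key1, (const char *)key2) == 0;
}

/* Cut-down stand-in for dictType that keeps only the compare callback. */
typedef struct demo_dict_type {
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
} demo_dict_type;

/* Use the callback if one is set, otherwise compare the pointers. */
int demo_keys_equal(const demo_dict_type *t, void *privdata,
                    const void *k1, const void *k2) {
    return t->keyCompare ? t->keyCompare(privdata, k1, k2) : k1 == k2;
}
```

With the callback installed, two different buffers that both contain "name" compare as equal; without it, only the very same pointer does.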

typedef struct dictht

 /* This is our hash table structure. Every dictionary has two of this as we
 * implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

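dictht is the data structure of a single hash table

table

Array of buckets; each bucket points to a chain of dictEntry key-value pairs

size

Capacity of the hash table (number of buckets)

sizemask

Equal to size - 1, used as a bit mask to turn a hash value into a bucket index

used

Number of entries stored in the hash table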
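Why keep sizemask around when it is just size - 1? Because the table size is a power of two, the bucket index can be computed with a single AND instead of a modulo. A small sketch, where bucket_index is a hypothetical helper and not a Redis function:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: maps a 64-bit hash value to a bucket index for a
 * table whose size is a power of two. */
unsigned long bucket_index(uint64_t hash, unsigned long size) {
    assert(size != 0 && (size & (size - 1)) == 0); /* size must be 2^n */
    unsigned long sizemask = size - 1;             /* e.g. 8 -> binary 0111 */
    return (unsigned long)(hash & sizemask);       /* same result as hash % size */
}
```

For example, bucket_index(13, 8) is 5, the same as 13 % 8.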

typedef struct dictEntry

 typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;

dictEntry is the actual data structure of a key-value pair

key

The key, which is actually an sds string

v

The value, which is a union

next

Pointer to the next dictEntry in the same bucket, mainly used to resolve hash collisions

For example, in the hash table introduced in the previous article and shown in the figure below, the entry at index 1 is (k3, v3) and its next points to (k2, v2). In general, next is NULL by default
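The collision handling described above can be sketched in a few lines of C. This is a simplified model and not the Redis code: demo_entry, demo_hash, demo_add and demo_find are hypothetical names, keys are plain C strings, and a toy djb2 hash stands in for the real hash function. New entries are linked in at the head of the bucket, which is why a later key such as k3 ends up in front of k2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for dictEntry: string key, int value. */
typedef struct demo_entry {
    const char *key;
    int val;
    struct demo_entry *next; /* next entry in the same bucket, or NULL */
} demo_entry;

/* Toy hash (djb2), used only for illustration. */
static unsigned long demo_hash(const char *s) {
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Link the entry in at the head of its bucket. size must be a power of two. */
void demo_add(demo_entry **table, unsigned long size, demo_entry *e) {
    assert(size != 0 && (size & (size - 1)) == 0);
    unsigned long idx = demo_hash(e->key) & (size - 1);
    e->next = table[idx];
    table[idx] = e;
}

/* Walk the chain of the key's bucket until a matching key is found. */
demo_entry *demo_find(demo_entry **table, unsigned long size, const char *key) {
    unsigned long idx = demo_hash(key) & (size - 1);
    for (demo_entry *e = table[idx]; e != NULL; e = e->next) {
        if (strcmp(e->key, key) == 0) return e;
    }
    return NULL;
}
```

With a one-bucket table every key collides, so adding k2 and then k3 leaves k3 at the head of the chain with its next pointing to k2.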

The first member of the union v above is void *val;

This member is a pointer to the real value. The structure it points to looks like this:

 typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;
    void *ptr;
} robj;

type

Object type, occupying 4 bits. It constrains which client commands may be used on the object, for example string, list, hash, set, zset

encoding

Encoding, occupying 4 bits. The values in use are 0 - 10, each representing a different internal encoding such as raw, int, or embstr

lru

Occupies 24 bits (3 bytes), used by the memory eviction algorithm

refcount

Reference count, int type, 4 bytes

ptr

Pointer to the actual data; on a 64-bit operating system, ptr occupies 8 bytes
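Adding the fields up gives 4 + 4 + 24 bits = 4 bytes, plus 4 bytes for refcount and 8 bytes for ptr, 16 bytes in total. The sketch below lets the compiler confirm this. demo_robj is a copy of the field layout made for this illustration, LRU_BITS is assumed to be 24 as in Redis 6.2, and bit-field packing is implementation-defined, so the numbers are what common 64-bit compilers such as GCC and Clang produce.

```c
#include <assert.h>
#include <stddef.h>

#define LRU_BITS 24 /* assumed value, as in Redis 6.2 */

/* Copy of the redisObject field layout, for measuring only. */
typedef struct demo_robj {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS;
    int refcount;
    void *ptr;
} demo_robj;

/* Size of the whole object header in bytes. */
size_t demo_robj_size(void) {
    /* The three bit-fields share one 32-bit unit: 4 + 4 + 24 = 32. */
    assert(4 + 4 + LRU_BITS == 32);
    return sizeof(demo_robj);
}

/* Byte offset of the data pointer inside the header. */
size_t demo_robj_ptr_offset(void) {
    return offsetof(demo_robj, ptr);
}
```

On a typical 64-bit build demo_robj_size() returns 16 and ptr starts at offset 8.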

Small case of bitmap

Set a bitmap key that marks which users were online on the 11th; each bit that is set stands for one online user

 127.0.0.1:6379> SETBIT login:9:11 25 1
(integer) 0
127.0.0.1:6379> SETBIT login:9:11 26 1
(integer) 0
127.0.0.1:6379> SETBIT login:9:11 27 1
(integer) 0
127.0.0.1:6379> BITCOUNT login:9:11
(integer) 3
127.0.0.1:6379> strlen login:9:11
(integer) 4

BITCOUNT key [start end]

BITCOUNT shows that 3 people were online on the 11th, and STRLEN shows that login:9:11 occupies 4 bytes, because the highest bit set (offset 27) falls in the fourth byte.

 127.0.0.1:6379> SETBIT login:9:12 26 1
(integer) 0
127.0.0.1:6379> SETBIT login:9:12 25 0
(integer) 0
127.0.0.1:6379> SETBIT login:9:12 27 1
(integer) 0
127.0.0.1:6379> STRLEN login:9:12
(integer) 4

Here 2 bits end up set, so 2 people were online on the 12th, and STRLEN shows that login:9:12 also occupies 4 bytes.

Next, we AND login:9:11 with login:9:12 to count the people who were online on both the 11th and the 12th

 127.0.0.1:6379> BITOP and login:and login:9:11 login:9:12
(integer) 4
127.0.0.1:6379> BITCOUNT login:and
(integer) 2

BITOP operation destkey key [key ...]

The result shows that 2 people were online on both the 11th and the 12th, which matches the bits set above

Now let's count the people who were online on at least one of the two days

 127.0.0.1:6379> BITOP or login:or login:9:11 login:9:12
(integer) 4
127.0.0.1:6379> BITCOUNT login:or
(integer) 3

The result shows that 3 people were online on at least one of the two days, which also checks out
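The same bookkeeping can be reproduced with plain byte arrays, which shows what the commands above do to the underlying string. This is only a sketch: demo_setbit, demo_bitcount and demo_bitop are hypothetical helpers, the caller is assumed to provide buffers that are large enough, and bit 0 is treated as the most significant bit of the first byte, which is how SETBIT numbers its offsets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Set or clear the bit at offset; returns the previous value of that bit.
 * buf must be at least offset / 8 + 1 bytes long. */
int demo_setbit(uint8_t *buf, size_t offset, int on) {
    assert(on == 0 || on == 1);
    size_t byte = offset >> 3;
    uint8_t mask = (uint8_t)(1u << (7 - (offset & 7)));
    int old = (buf[byte] & mask) != 0;
    if (on) buf[byte] |= mask;
    else buf[byte] &= (uint8_t)~mask;
    return old;
}

/* Count the bits set in the first len bytes of buf. */
size_t demo_bitcount(const uint8_t *buf, size_t len) {
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        /* Clear the lowest set bit until the byte is zero. */
        for (uint8_t b = buf[i]; b != 0; b &= (uint8_t)(b - 1)) n++;
    }
    return n;
}

/* Byte-wise AND ('&') or OR ('|') of two equally long bitmaps into dst. */
void demo_bitop(char op, uint8_t *dst, const uint8_t *a, const uint8_t *b,
                size_t len) {
    assert(op == '&' || op == '|');
    for (size_t i = 0; i < len; i++) {
        dst[i] = (uint8_t)(op == '&' ? (a[i] & b[i]) : (a[i] | b[i]));
    }
}
```

Setting offsets 25, 26 and 27 in a 4-byte buffer touches only the fourth byte (it becomes 0x70), which is why STRLEN reports 4 for these keys.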

 127.0.0.1:6379> type login:or
string
127.0.0.1:6379> OBJECT encoding login:or
"raw"
127.0.0.1:6379> OBJECT encoding login:9:12
"raw"
127.0.0.1:6379> OBJECT encoding login:and
"raw"

The commands above check which data type and encoding the keys used earlier actually have in Redis

OBJECT encoding [arguments [arguments ...]]

All of them report the "raw" encoding, that is, a string stored as a Redis sds

cache line

Let's look at a small example, set a string key in redis

 127.0.0.1:6379> set name xiaoming
OK
127.0.0.1:6379> OBJECT encoding name
"embstr"

We can see that the encoding of name is "embstr". So how is "embstr" implemented underneath, and how many bytes of data can it hold?

As mentioned above, a key-value pair in Redis is stored in a dictEntry structure, and the val pointer in dictEntry points to a redisObject structure, the one shown earlier

On a 64-bit machine, the CPU reads data from memory one cache line at a time.

A cache line is typically 64 bytes

A redisObject structure occupies 16 bytes

That leaves 48 bytes. Which sds header type does Redis use to store the data in them?

It uses sdshdr8. The first 3 fields of sdshdr8 (len, alloc, flags) occupy 3 bytes, so the remaining buf could store 45 bytes (64 - 16 - 3) of data

But that overlooks one detail: to stay compatible with C strings, "embstr" appends a '\0' to the end of the string. So the number of bytes "embstr" can actually store is:

44 bytes
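The arithmetic 64 - 16 - 3 - 1 = 44 can be written down as a small sketch. demo_sdshdr8 copies the field layout of sdshdr8 (Redis additionally marks the real struct as packed, which makes no difference here because every field is one byte wide), and demo_embstr_max_len is a hypothetical helper, not a Redis function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as sdshdr8: a 3-byte header followed by the string bytes. */
struct demo_sdshdr8 {
    uint8_t len;         /* bytes used */
    uint8_t alloc;       /* bytes allocated, excluding header and '\0' */
    unsigned char flags; /* header type */
    char buf[];          /* string data, not counted by sizeof */
};

/* Longest string that fits in block_size bytes next to an object header
 * of robj_size bytes, an sdshdr8 header and the trailing '\0'. */
size_t demo_embstr_max_len(size_t block_size, size_t robj_size) {
    size_t overhead = robj_size + sizeof(struct demo_sdshdr8) + 1;
    assert(block_size >= overhead);
    return block_size - overhead;
}
```

demo_embstr_max_len(64, 16) gives 44, the same number as above.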

Recalling the previous article:

When the length is in the range 0 to 2^5 - 1, the sdshdr5 type is used

When it is in the range 2^5 to 2^8 - 1, the sdshdr8 type is used

little practice

We set the key test in Redis to a 44-byte value and check its encoding, which is embstr

 127.0.0.1:6379> set test 99999999991111111111222222222233333333334444
OK
127.0.0.1:6379> OBJECT encoding test
"embstr"
127.0.0.1:6379> STRLEN test
(integer) 44

Then set test2 to a value longer than 44 bytes (45 bytes here) and check its encoding, which is raw

 127.0.0.1:6379> set test2 999999999911111111112222222222333333333344449
OK
127.0.0.1:6379> OBJECT encoding test2
"raw"

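The rule this practice demonstrates fits in one function. This is a rough sketch: demo_string_encoding and DEMO_EMBSTR_LIMIT are names invented here, and the sketch deliberately ignores other string encodings such as "int", which Redis can use for short numeric values.

```c
#include <assert.h>
#include <string.h>

#define DEMO_EMBSTR_LIMIT 44 /* the 44-byte limit derived above */

/* Returns the encoding name expected for a non-numeric string value. */
const char *demo_string_encoding(const char *value) {
    assert(value != NULL);
    return strlen(value) <= DEMO_EMBSTR_LIMIT ? "embstr" : "raw";
}
```

Passing the 44-byte value of test returns "embstr", and the 45-byte value of test2 returns "raw", matching the output above.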
Finally, here is a diagram of how the data structures above relate to each other

References:

Redis documentation
Redis source code, redis-6.2.5 (the latest stable version at the time of writing)

Welcome - like, follow, favorite

Friends, your support and encouragement are what motivate me to keep sharing and to improve the quality

Okay, that's all for this time

Technology is open, and our mindset should be open too. Embrace change, live in the sun, and keep moving forward.

I'm Little Devil Boy Nezha. You're welcome to like, follow, and bookmark. See you next time~
