
"Code Gebyte" sweeps the core knowledge of Redis with everyone from high-frequency interview questions, understands Redis fundamentally, and does not be a tool man of eight-legged essays, but a god who turns things around.

Brother Code has written nine Redis articles so far, and readers have been asking for some interview-oriented pieces, so the "Mianba" (interview-ace) series makes its debut.

If you read " Redis Series 160d19896b8865" carefully and understand, it is not a problem to

  1. Redis Core: The Secret of "Only Speed Is Unbreakable"
  2. Redis Logging: AOF and RDB for Fast Crash Recovery Without Data Loss
  3. Redis High Availability: Master-Slave Data Consistency and Synchronization Principles
  4. Redis in Action: Building a Sentinel Cluster on 6.x
  5. Redis High Availability: Sentinel Cluster Principles
  6. Redis in Action: Building a Cluster on 6.x
  7. Redis High Availability: Can Cluster Scale Infinitely? How Does It Work?
  8. Redis in Action: Massive Data Statistics with Bitmap
  9. Redis in Action: Clever Use of Data Structures for Billion-Scale Statistics

Why is Redis so fast?

Many people only know that Redis is a single-threaded K/V NoSQL in-memory database... They lack a comprehensive understanding of Redis and get stuck as soon as the interviewer digs deeper.

This is a basic comprehension question. We can answer it from several angles: the underlying data structures behind each data type, the fully in-memory design, the I/O multiplexing network model, the threading model, progressive rehash, and so on.

How fast is it?

We can first quantify how fast it is. According to official data, Redis can reach a QPS of about 100,000 (queries per second). If you are interested, see the official benchmark "How fast is Redis?": https://redis.io/topics/benchmarks

Benchmark

The horizontal axis is the number of connections, and the vertical axis is QPS.

This chart conveys the order of magnitude; quantifying the claim shows the interviewer that you have read the official documentation and are rigorous.

Based on memory

Redis is an in-memory database. Compared with a disk-based database, it completely avoids the speed penalty of disk access.

Both reads and writes are performed in memory, so let's compare memory access with disk access.

Disk access: a read or write goes through a system call and the disk controller, and a mechanical disk additionally pays seek and rotational latency.

Memory access: memory is controlled directly by the CPU through the memory controller integrated into the CPU, so memory is directly attached to the CPU and enjoys the best possible bandwidth when communicating with it.

Finally, here is a chart quantifying typical system latencies (some of the figures are quoted from Brendan Gregg):

Efficient data structures

When learning MySQL you learn that it uses the B+ Tree data structure to speed up retrieval; likewise, the speed of Redis is closely tied to its data structures.

Redis has five data types: String, List, Hash, Set, and SortedSet.

Under the hood, each data type is backed by one or more data structures, all in pursuit of speed.

Brother Code's tip: explain the advantages of the underlying data structure behind each data type separately. Many candidates only know the data types; talking about the underlying data structures is what makes you shine.

Advantages of SDS (Simple Dynamic String)

C strings vs. SDS

  1. The len field in SDS stores the string length, so querying the length of the string is O(1).
  2. Space pre-allocation: after an SDS is modified, the program allocates not only the space the SDS needs but also extra unused space.
  3. Lazy space release: when an SDS is shortened, the program does not reclaim the excess memory; it records the spare bytes in the free field instead of releasing them. If a later append needs space, the unused space recorded in free is used directly, reducing memory allocations.
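
To make this concrete, here is a minimal sketch of what such a header looks like, modeled on the classic pre-3.2 sdshdr layout (the real Redis source uses several size-optimized variants):

struct sdshdr {
    int  len;    /* bytes in use: strlen() becomes an O(1) field read */
    int  free;   /* pre-allocated but unused bytes: appends can reuse them
                    without another malloc */
    char buf[];  /* character data, still NUL-terminated for C compatibility */
};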

ziplist (compressed list)

The compressed list is one of the underlying implementations of three data types: List, Hash, and Sorted Set.

When a list holds only a small number of items, and each item is either a small integer or a short string, Redis uses a ziplist as the underlying implementation of the list key.

ziplist

This compact, contiguous layout saves memory.

quicklist

Later versions changed the List implementation, using quicklist in place of ziplist and linkedlist.

quicklist is a hybrid of ziplist and linkedlist: it splits the linked list into segments, each segment stored compactly as a ziplist, and the ziplists are chained together with doubly linked pointers.
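
A rough sketch of the idea (field names loosely follow quicklistNode, but this is a simplified illustration, not the actual Redis definition):

typedef struct quicklistNode {
    struct quicklistNode *prev;   /* doubly linked pointers chain the segments */
    struct quicklistNode *next;
    unsigned char *zl;            /* this segment's ziplist, stored compactly */
    unsigned int count;           /* number of entries inside the ziplist */
} quicklistNode;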

skiplist (skip list)

The sorted set type implements its ordering through the skip list data structure.

A skip list is an ordered data structure that achieves fast node access by keeping multiple pointers to other nodes in each node.

On top of an ordinary linked list, the skip list adds multi-level indexes; a few jumps through the index levels quickly locate the data, as shown in the figure below:

Skip list
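
A simplified sketch of a skip list node (the real Redis zskiplistNode also stores the member string, a backward pointer, and per-level spans):

typedef struct zskiplistNode {
    double score;                     /* sort key of the sorted-set member */
    int level;                        /* how many index levels this node joins */
    struct zskiplistNode *forward[];  /* forward[i] = next node at level i */
} zskiplistNode;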

Integer set (intset)

When a set contains only integer elements and the number of elements is small, Redis uses an integer set (intset) as the underlying implementation of the set key to save memory.
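
A sketch of the intset layout: a sorted, contiguous array of integers searched with binary search; the encoding is upgraded (int16 to int32 to int64) only when a larger value is inserted:

#include <stdint.h>

typedef struct intset {
    uint32_t encoding;   /* INTSET_ENC_INT16 / INT32 / INT64 */
    uint32_t length;     /* number of elements */
    int8_t   contents[]; /* elements stored back to back, sorted ascending */
} intset;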

Single thread model

Brother Code's note: when we say Redis is single-threaded, we mean that network I/O (multi-threaded since version 6.x) and the reading and writing of key-value commands are executed by one thread. Persistence, cluster data synchronization, and asynchronous deletion are executed by other threads.

Don't say that Redis has only one thread.

"Single-threaded" means that the execution of Redis key-value read and write commands is done by a single thread.

Let's start with the official answer, which sounds rigorous, rather than reciting some blog post.

Official answer: because Redis operates entirely in memory, the CPU is not its bottleneck; the bottleneck is more likely machine memory or network bandwidth. Since a single-threaded implementation is simple and the CPU will not become the bottleneck, it is natural to adopt the single-threaded design. Original source: https://redis.io/topics/faq

Why not make full use of the CPU with multi-threaded execution?

Before running a task, the CPU needs to know where the task was loaded and where to start executing; that is, the system must set up the CPU registers and program counter for it in advance. This state is called the CPU context.

Switching contexts means saving and restoring this state, which is a resource-consuming operation.

Introducing multi-threading also requires synchronization primitives to protect concurrent reads and writes of shared resources, increasing code complexity and making debugging harder.

What are the benefits of a single thread?
  1. No performance cost of creating threads;
  2. No CPU cost of context switching between threads;
  3. No contention between threads: no locking and unlocking, no deadlocks, no lock-related issues to worry about;
  4. Clearer code and simpler processing logic.

I/O multiplexing model

Redis uses I/O multiplexing to handle connections concurrently, with epoll plus a simple event framework implemented by Redis itself.

Reads, writes, closes, and new connections are all turned into events, and epoll's multiplexing capability ensures no time is wasted waiting on I/O.

High-performance I/O multiplexing

The Redis thread is never blocked on one particular listening or connected socket; that is, it never gets stuck processing a single client's request. Because of this, Redis can connect to many clients at once and process their requests, improving concurrency.

Redis global hash dictionary

Redis as a whole uses one hash table to store all key-value pairs, regardless of which of the five data types a value is. A hash table is essentially an array whose elements are called hash buckets; whatever the data type, the entry in each bucket holds a pointer to the actual value.

Redis global hash table

Hash table lookups are O(1): compute the hash of a key to find the corresponding bucket, then locate the entry inside the bucket to reach the data. This is another reason why Redis is fast.

Redis uses objects (redisObject) to represent the keys and values in the database. Whenever we create a key-value pair, at least two objects are created: a key object and a value object.

In other words, each entry stores the redisObject of a key-value pair, and the actual data is reached through the redisObject's pointer.

typedef struct redisObject {
    unsigned type:4;      /* type */
    unsigned encoding:4;  /* encoding */
    void *ptr;            /* pointer to the underlying data structure */
    /* ... */
} robj;

What about hash conflicts?

Redis resolves conflicts with chained hashing: elements that fall into the same bucket are stored in a linked list. When such a list grows too long, lookup performance degrades, so in pursuit of speed Redis keeps two global hash tables and uses them for rehash operations that increase the number of buckets and reduce conflicts.
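
A minimal sketch of chained hashing, simplified from the shape of Redis's dictEntry/dictht (not the exact source): entries that hash to the same bucket are linked together, so a lookup walks a normally very short list inside the bucket.

typedef struct dictEntry {
    void *key;
    void *val;
    struct dictEntry *next;   /* next entry in the same bucket */
} dictEntry;

typedef struct dictht {
    dictEntry **table;        /* array of buckets */
    unsigned long size;       /* number of buckets, always a power of two */
    unsigned long used;       /* number of entries stored */
} dictht;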

Initially, "hash table 1" is used to save key-value pair data by default, and "hash table 2" has no space allocated at the moment. When more and more data triggers the rehash operation, perform the following operations:

  1. Allocate more space to "hash table 2";
  2. Remap and copy the data of "hash table 1" to "hash table 2";
  3. Free up the space of hash table 1.

It is worth noting that the process of remapping hash table 1 data to hash table 2 is not a one-time process, which will cause Redis to block and fail to provide services.

Instead, progressive rehash . Each time a client request is processed, start with the first index in "hash table 1" and copy all the data at this position to "hash table 2", and that's it Distribute rehash to multiple requests to avoid time-consuming blocking.
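
Here is a sketch of one progressive-rehash step, reusing the dictEntry/dictht sketch above (illustrative only; the hash() helper is a hypothetical placeholder, and the real dictRehash in Redis differs in detail). Each call moves the entries of a single bucket from the old table to the new one, so the total rehash cost is spread across many requests.

unsigned long hash(const void *key);   /* hypothetical hash function, assumed elsewhere */

void rehash_one_bucket(dictht *ht0, dictht *ht1, unsigned long *rehashidx) {
    while (*rehashidx < ht0->size && ht0->table[*rehashidx] == NULL)
        (*rehashidx)++;                            /* skip empty buckets */
    if (*rehashidx >= ht0->size) return;           /* rehash finished */

    dictEntry *entry = ht0->table[*rehashidx];
    while (entry) {                                /* move the whole chain */
        dictEntry *next = entry->next;
        unsigned long idx = hash(entry->key) & (ht1->size - 1);
        entry->next = ht1->table[idx];             /* push onto the new bucket */
        ht1->table[idx] = entry;
        ht0->used--;
        ht1->used++;
        entry = next;
    }
    ht0->table[*rehashidx] = NULL;
    (*rehashidx)++;                                /* next request handles the next bucket */
}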

How does Redis implement persistence? How does it restore data after a crash?

Redis persistence uses "RDB data snapshots" to achieve fast recovery after a crash. However, taking full snapshots too frequently has two serious performance costs:

  1. Frequently generating RDB files and writing them to disk puts too much pressure on the disk; the next snapshot may start before the previous one has finished, creating a vicious cycle.
  2. Forking the bgsave child process blocks the main thread, and the larger the main process's memory, the longer the block.

Therefore, Redis also designed the AOF write-after log, which records the commands that modified memory.

Interviewer: What is an RDB memory snapshot?

As Redis executes write commands, the data in memory keeps changing. A memory snapshot is the state of the data in Redis memory at one specific moment.

It is like taking a photo: time is frozen at that instant, and the photo records the moment completely.

Redis does something similar: the data at a given moment is captured into a file and written to disk. This snapshot file is called an RDB file; RDB stands for Redis DataBase.

RDB memory snapshot

For data recovery, the RDB file is simply read back into memory.

Interviewer: While the RDB is being generated, can Redis handle write requests at the same time?

Yes. Redis uses the operating system's multi-process copy-on-write (COW) mechanism to perform snapshot persistence while keeping the data consistent.

During persistence, Redis calls glibc's fork function to create a child process. The snapshot is handled entirely by the child, while the parent continues to process client requests.

When the main thread executes a write command that modifies a piece of data, that data is duplicated first (copy-on-write), and the bgsave child process reads the snapshot-time copy and writes it to the RDB file.

This preserves the integrity of the snapshot while allowing the main thread to keep modifying data, so normal traffic is not affected.

Copy-on-write keeps data modifiable during the snapshot
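
A minimal sketch of the fork-based snapshot idea (names such as write_rdb_file are hypothetical stand-ins, not the actual rdb.c code): the child inherits a copy-on-write view of memory frozen at fork() time and serializes it, while the parent keeps serving writes.

#include <sys/types.h>
#include <unistd.h>

/* hypothetical stand-in for the real RDB serializer */
static void write_rdb_file(const char *path) { (void)path; /* ... */ }

void bgsave_sketch(void) {
    pid_t pid = fork();
    if (pid == 0) {
        /* child: sees memory as it was at fork() time; pages the parent
         * modifies afterwards are duplicated by the kernel (copy-on-write) */
        write_rdb_file("dump.rdb");
        _exit(0);
    }
    /* pid > 0: parent returns immediately and keeps handling client commands */
}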

Interviewer: So what is AOF?

The AOF log records every modification command issued since the Redis instance was created, so the in-memory state of the current instance can be restored by sequentially executing ("replaying") all those commands on an empty Redis instance.

The write-back strategy configured with Redis's appendfsync option directly determines the efficiency and safety of AOF persistence.

  • always: synchronous write-back; after each write command executes, the contents of the aof_buf buffer are immediately flushed to the AOF file.
  • everysec: write back every second; after a write command executes, the log is only written to the AOF file's memory buffer, and the buffer is synced to disk once per second.
  • no: controlled by the operating system; after a write executes, the log is written to the AOF file's memory buffer, and the OS decides when to flush it to disk.

No strategy offers the best of both worlds; we have to trade performance off against reliability.
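
The trade-off is easier to see as code. Below is an illustrative sketch (not Redis's aof.c) of the decision made after a command has been appended and written to the AOF file descriptor: when do we call fsync?

#include <time.h>
#include <unistd.h>

enum fsync_policy { AOF_FSYNC_ALWAYS, AOF_FSYNC_EVERYSEC, AOF_FSYNC_NO };

void flush_aof(int fd, enum fsync_policy policy, time_t *last_fsync) {
    time_t now = time(NULL);
    switch (policy) {
    case AOF_FSYNC_ALWAYS:       /* safest, slowest: sync after every command */
        fsync(fd);
        break;
    case AOF_FSYNC_EVERYSEC:     /* lose at most about one second of writes */
        if (now - *last_fsync >= 1) {
            fsync(fd);
            *last_fsync = now;
        }
        break;
    case AOF_FSYNC_NO:           /* fastest: let the OS decide when to flush */
        break;
    }
}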

Interviewer: Since RDB has those two performance problems, why not just use AOF?

The AOF log records each write command. It does not carry the cost of a full RDB snapshot, but recovery by replay is slower than loading an RDB, and an oversized log file itself becomes a performance problem.

So Redis designed a killer feature, the AOF rewrite mechanism, and provides the bgrewriteaof command to slim down the AOF log.

The principle: fork a child process that traverses memory and converts it into a series of Redis commands, serialized into a new AOF log file. Once serialization is done, the incremental AOF log produced while the rewrite was running is appended to the new file; after the append completes, the old AOF file is replaced immediately, and the slimming is done.

AOF rewrite mechanism (three commands become one)

Interviewer: How can we lose as little data as possible while still keeping performance?

When restarting Redis, we rarely use the RDB alone to restore the in-memory state, because too much data would be lost; we usually replay the AOF log instead. But replaying the AOF is much slower than loading an RDB, so a large Redis instance takes a long time to start.

To solve this, Redis 4.0 introduced a new persistence option: mixed persistence. The contents of the RDB file are stored together with an incremental AOF log. Here the AOF is no longer the full history, but only the increment produced from the start of the persistence until it finishes, and this part of the AOF log is usually small.

Therefore, on restart Redis can first load the RDB contents and then replay the incremental AOF log, completely replacing the old full-AOF replay and greatly improving restart efficiency.

Redis master-slave architecture data synchronization

Redis provides a master-slave mode, which replicates data redundantly to other Redis servers through master-slave replication.

Interviewer: How to ensure data consistency between master and slave?

To keep replica data consistent, the master-slave architecture uses read-write separation.

  • Read operations: can be executed on both master and slave;
  • Write operations: executed on the master first, then synchronized to the slaves.

Redis read-write separation

Interviewer: Does master-slave replication serve any other purposes?
  1. Failure recovery: when the master node goes down, other nodes can still provide service;
  2. Load balancing: the master handles writes while the slaves handle reads, sharing the load;
  3. High-availability cornerstone: it is the foundation on which Sentinel and Cluster are built.
Interviewer: How is master-slave replication achieved?

Synchronization is divided into three situations:

  1. The first full copy of the master and slave libraries;
  2. Synchronization during normal operation of master and slave;
  3. Network disconnection and reconnection synchronization between master and slave libraries.
Interviewer: How to achieve the first synchronization?

The first full replication between master and slave can be roughly divided into three phases: establishing the connection (the preparation phase), the master synchronizing data to the slave, and the master sending the new write commands received during synchronization to the slave.

Redis full synchronization

  1. Establish the connection: the slave connects to the master, executes replicaof and sends the psync command, telling the master that synchronization is about to begin. Once the master confirms, the master and slave start synchronizing.
  2. The master synchronizes data to the slave: the master runs bgsave to generate an RDB file and sends it to the slave. At the same time, the master opens a replication buffer for each slave, which records every write command received after RDB generation began. The slave saves the RDB file, clears its database, and then loads the RDB data into memory.
  3. Send the slave the new write commands received after the RDB: the writes executed after the RDB was generated are not in that RDB file, so to keep master and slave consistent the master records all such writes in the in-memory replication buffer and then sends its contents to the slave.
Interviewer: What if the network between the master and slave drops? Does everything have to be copied again after the disconnection?

Before Redis 2.8, if the network between master and slave was interrupted during command propagation, the slave would perform another full copy from the master, which is very expensive.

Starting from Redis 2.8, after the network is disconnected, the master and slave libraries will continue to synchronize using incremental replication.

Incremental replication: used for replication after network interruption and other situations. Only the write commands executed by the master node during the interruption are sent to the slave node, which is more efficient than full replication .

The secret behind incremental replication after a reconnect is the repl_backlog_buffer. The master records every write operation in repl_backlog_buffer; because memory is limited, repl_backlog_buffer is a fixed-length circular buffer, and once it is full it wraps around and overwrites the oldest content.

The master uses master_repl_offset to record the write position it has reached, and the slave uses slave_repl_offset to record the position it has read up to.

repl_backlog_buffer

When master and slave reconnect after a disconnection, the slave first sends the psync command to the master, passing along its own runID and slave_repl_offset.

The master then only needs to send the slave the commands between slave_repl_offset and master_repl_offset.
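
An illustrative sketch of the ring-buffer idea (hypothetical names, not the actual replication.c): master_repl_offset only ever grows, and the physical write position is simply the offset modulo the buffer size.

typedef struct {
    char *buf;                     /* fixed-size backing array */
    long long size;                /* capacity of the ring */
    long long master_repl_offset;  /* total bytes ever written */
} repl_backlog;

void backlog_append(repl_backlog *b, const char *data, long long len) {
    for (long long i = 0; i < len; i++)
        b->buf[(b->master_repl_offset + i) % b->size] = data[i];  /* wraps around,
                                              overwriting the oldest content */
    b->master_repl_offset += len;
}

/* After a reconnect: if master_repl_offset - slave_repl_offset still fits
 * inside the ring, the master resends only that range; otherwise the gap
 * has been overwritten and a full resynchronization is required. */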

The execution flow of incremental copy is as follows:

Redis incremental replication

Interviewer: After the full synchronization is complete, how is data synchronized during normal operation?

Once full replication is finished, master and slave keep a persistent network connection between them, through which the master propagates every subsequent write command to the slave. This is called command propagation over a long-lived connection; the long-lived connection avoids the overhead of repeatedly establishing connections.

Sentinel principles: a chain of questions

Interviewer: Not bad, you know quite a lot. Do you know how the Sentinel cluster works?

Sentinel is a running mode of Redis that focuses on monitoring Redis instances. When the master node fails, a series of mechanisms elect a new master and switch master and slave (failover), ensuring the availability of the whole Redis system.

Its architecture is shown below:

Redis Sentinel cluster

The capabilities of Redis Sentinel are as follows:

  • Monitoring: continuously checks whether master and slaves are in their expected working state.
  • Automatic failover: when the master fails, Sentinel starts the automatic recovery process and selects one of the slaves as the new master.
  • Notification: tells the slaves to execute replicaof to synchronize with the new master, and notifies clients to connect to the new master.
Interviewer: How do the sentries know each other?

Each sentinel establishes communication with the master and uses the publish/subscribe mechanism the master provides to publish its own information, such as height, weight, whether it is single, IP, port...

The master has a dedicated channel, __sentinel__:hello, for publishing and subscribing to messages between sentinels. It is like a group chat named __sentinel__:hello: each sentinel uses this group set up by the master to post its own news while following the news posted by the other sentinels.

Interviewer: The sentinels are now connected to one another, but they still need connections to the slaves, otherwise the slaves cannot be monitored. How do they discover the slaves and monitor them?

Again, the master is the key. A sentinel sends the INFO command to the master; the master naturally knows all the slaves under it, so after receiving the command it returns the slave list to the sentinel.

Based on the slave list returned by the master, the sentinel establishes a connection with each slave and continuously monitors the slaves over these connections.

Obtaining slave information with the INFO command

Cluster: a chain of questions

Interviewer: In addition to the sentinel, are there other high-availability methods?

Yes, there is Redis Cluster. The Redis deployment monitored by a Sentinel cluster is still a master-slave architecture and cannot be scaled out. Redis Cluster mainly solves the slowness that comes with storing very large amounts of data, and it also makes horizontal scaling convenient.

When facing millions or tens of millions of users, a horizontally scalable Redis sharded cluster is a very good choice.

Interviewer: What is a Cluster?

Redis Cluster is a distributed database solution: the cluster manages data through sharding (a divide-and-conquer approach) and provides replication and failover.

The data is divided into 16384 slots, and each node is responsible for a part of the slots. The slot information is stored in each node.

It is decentralized. As shown in the figure, the cluster is composed of three Redis nodes. Each node is responsible for a part of the data of the entire cluster, and the amount of data each node is responsible for may be different.

Redis Cluster architecture

The three nodes connect to one another to form a peer-to-peer cluster. They exchange cluster information via the Gossip protocol, so in the end every node stores the slot allocation of the other nodes.

Interviewer: How does the hash slot map to the Redis instance?
  1. From the key of the key-value pair, compute a 16-bit value with the CRC16 algorithm;
  2. Take that 16-bit value modulo 16384; the result, a number from 0 to 16383, is the hash slot for the key;
  3. Locate the corresponding instance according to the slot information.
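
A minimal sketch of the slot computation (crc16() stands in for the CRC16-XMODEM routine Redis ships in crc16.c; hash-tag handling of "{...}" is omitted):

#include <stddef.h>

unsigned int crc16(const char *buf, size_t len);   /* assumed provided, as in Redis crc16.c */

/* map a key to one of the 16384 hash slots */
unsigned int key_to_slot(const char *key, size_t keylen) {
    return crc16(key, keylen) % 16384;   /* result is in 0 .. 16383 */
}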

The mapping relationship between key-value pair data, hash slot, and Redis instance is as follows:

Mapping between data, slots, and instances

Interviewer: How does Cluster implement failover?

Redis Cluster nodes use the Gossip protocol to broadcast their own state and their view of changes in the cluster. For example, if a node detects that another node has lost connectivity (PFAIL), it broadcasts this information to the whole cluster, so the other nodes also receive the report.

If the number of such lost-connection reports (the PFAIL count) collected about a node reaches the majority of the cluster, that node can be marked as definitively offline (FAIL) and this is broadcast to the entire cluster, forcing the other nodes to accept that the node is down; a master-slave switch is then performed immediately for the failed node.

Interviewer: How does the client determine which instance the data it wants to access lives on?

Each Redis instance sends its own hash-slot information to the other instances in the cluster via the Gossip protocol, spreading the slot allocation information across the cluster.

In this way, each instance in the cluster has all the mapping relationship information between the hash slots and the instance.

When the client connects to any instance, the instance responds to the client with the mapping relationship between the hash slot and the instance, and the client caches the mapping information between the hash slot and the instance locally.

When the client requests, it will calculate the hash slot corresponding to the key, locate the instance where the data is located through the locally cached hash slot instance mapping information, and then send the request to the corresponding instance.

How the Redis client locates the node holding the data

Interviewer: What is the Redis redirection mechanism?

The mapping between hash slots and instances changes when new instances are added or slots are redistributed for load balancing. When the client sends a request to an instance that no longer holds the corresponding data, that instance tells the client to send the request to another instance.

Redis informs the client via MOVED errors and ASK errors.

MOVED

MOVED error (rebalancing: the data has already been migrated to another instance): when a client sends a key-value operation to an instance and the slot the key belongs to is not handled by that instance, the instance returns a MOVED error redirecting the client to the node responsible for the slot.

At the same time, the client updates its local cache, correcting the mapping between that slot and the Redis instance.

MOVED redirection

ASK

If a slot holds a lot of data, part of it may already have been migrated to the new instance while the rest has not.

If the requested key is found on the current node, the command executes directly; otherwise an ASK error response is returned.

Suppose the slot migration is not yet complete: the slot containing the key being accessed is in the middle of migrating from instance 1 to instance 2, and the key is no longer on instance 1. In that case instance 1 returns an ASK error to the client, meaning: "the hash slot of the key you requested is being migrated to instance 2; first send an ASKING command to instance 2, then send your operation command."

For example, the client's key = "公众号:码哥字节" (the author's official account) maps to slot 16330 on instance 172.17.18.1. If node 1 finds the key, it executes the command directly; otherwise it responds with an ASK error directing the client to the migration target node 172.17.18.2.

ASK error

Note: unlike MOVED, an ASK error does not update the client's cached hash slot allocation information.

To be continued

This article has walked through the core of Redis: data structures, the memory model, the I/O model, RDB and AOF persistence, master-slave replication, Sentinel, and Cluster.

The "Mianba" series will come in several installments, covering Redis from every angle: core principles, high availability, hands-on practice, and how to avoid common pitfalls.

If you found this helpful, please like, share, bookmark, or leave a comment; you can also add "MageByte1024" on WeChat to join the exclusive reader group and discuss more interview questions. Brother Code answers everything.

