Understanding Redis persistence at the source level

The article was first published on the public account "Mushroom can't sleep", welcome to visit~

Preface

Redis is an in-memory database: all data lives in memory, which is one of the reasons Redis is so fast. The cost of that speed is that data held only in memory is easy to lose. If the server is shut down or crashes, everything in memory is gone. To solve this problem, Redis provides two persistence mechanisms: RDB and AOF.

RDB

What is RDB persistence?

RDB persistence generates point-in-time snapshots of the data set at specified intervals.

What are the advantages of RDB?

An RDB file is a compact, single-file representation of the Redis data at a certain point in time, which makes it well suited for backups. For example, you might archive an RDB file every hour for the last 24 hours and keep a daily RDB snapshot for the last 30 days. This lets you easily restore different versions of the data set when disaster strikes.
RDB is very suitable for disaster recovery: it is a single compact file that can be transferred to a remote data center.
RDB maximizes Redis performance, because the only thing the Redis parent process has to do in order to persist is fork a child process. The child does all the remaining work, and the parent never performs disk I/O itself.
RDB restarts faster than AOF for instances holding large data sets.

Disadvantages of RDB?

RDB is not a good fit if you need to minimize data loss when Redis stops working (for example, after a power outage). You can configure multiple save points (for example, save after at least 5 minutes and 100 writes to the data set). However, an RDB snapshot is usually created every 5 minutes or more, so if Redis stops without a clean shutdown for any reason, you should be prepared to lose the last few minutes of data.
RDB needs to fork() a child process regularly in order to persist to disk. If the data set is large, fork() can be time-consuming. With a very large data set and a CPU that is not fast enough, Redis may stop serving clients for some milliseconds or even for a second. AOF also requires fork(), but you can tune how often the log is rewritten without any trade-off in durability.

RDB file creation and loading

Two Redis commands generate RDB files: SAVE and BGSAVE.
The SAVE command blocks the Redis server process until the RDB file has been created. While the server process is blocked, it cannot handle any command requests.

> SAVE     // blocks until the RDB file has been created
OK

Unlike SAVE, which blocks the server process directly, the BGSAVE command forks a child process. The child is responsible for creating the RDB file, while the server process (the parent) continues to handle command requests.
When fork is executed, the operating system (a Unix-like operating system) uses a copy-on-write strategy. At the moment of the fork, the parent and child processes share the same memory data. When the parent process wants to change a piece of data (for example, when executing a write command), the operating system first copies that piece of data, so the child's view is not affected. The new RDB file therefore stores the in-memory data as it was at the moment fork was executed.

> BGSAVE  // forks a child process, which creates the RDB file
Background saving started
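
The snapshot semantics of fork described above can be observed with a small experiment. This is only a sketch under stated assumptions: it runs on a Unix-like system, the helper name snapshot_via_fork is made up for illustration, and JSON over a pipe stands in for the RDB file. It is not how Redis serializes data.

```python
import json
import os


def snapshot_via_fork(data, mutate):
    """Return the view of `data` seen by a child forked before `mutate` runs."""
    result_r, result_w = os.pipe()  # child -> parent: serialized snapshot
    ready_r, ready_w = os.pipe()    # parent -> child: "I have written" signal
    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the BGSAVE child process.
        try:
            os.close(result_r)
            os.close(ready_w)
            os.read(ready_r, 1)  # wait until the parent has modified its copy
            os.write(result_w, json.dumps(data).encode())
        finally:
            os._exit(0)
    # Parent: keeps "serving writes" while the child holds the old view.
    os.close(result_w)
    os.close(ready_r)
    mutate(data)
    os.write(ready_w, b"x")
    os.close(ready_w)
    chunks = []
    while True:
        chunk = os.read(result_r, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(result_r)
    os.waitpid(pid, 0)
    return json.loads(b"".join(chunks))
```

Even though the child serializes only after the parent has changed the dictionary, it still reports the value from the moment of the fork, which is the property BGSAVE relies on.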

There are two ways to generate RDB files: manually, using the commands described above, and automatically.
The rest of this section describes how RDB files are generated automatically.
Redis lets users set the save option in the server configuration so that the server automatically executes the BGSAVE command at regular intervals.
Users can set multiple save conditions through the save option under SNAPSHOTTING in the redis.conf configuration file. As soon as any one of the conditions is met, the server executes the BGSAVE command.
For example, take the following configuration:
save 900 1
save 300 10
save 60 10000
With these three settings, BGSAVE is executed as soon as any one of the following conditions is met:

The server made at least 1 modification to the database within 900 seconds.
The server made at least 10 modifications to the database within 300 seconds.
The server made at least 10,000 modifications to the database within 60 seconds.

If you do not configure the save option yourself, the server falls back to default save conditions. In the Redis version this article is based on, the defaults are:
save 900 1
save 300 10
save 60 10000
The server then sets the saveparams attribute of the server state structure redisServer according to the save option:

struct redisServer {

  // ...

  // array holding the save conditions
  struct saveparam *saveparams;

  // ...
};

The saveparams attribute is an array. Each element of the array is a saveparam structure, and each saveparam structure holds one save condition set by the save option:

struct saveparam {

  // number of seconds
  time_t seconds;

  // number of modifications
  int changes;
};
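
To make the mapping from configuration lines to save conditions concrete, here is a minimal sketch that turns save lines into records shaped like the saveparam structure. The names SaveParam and parse_save_lines are made up for illustration, and the parsing is deliberately naive. It is not the parser Redis uses.

```python
from collections import namedtuple

# Mirrors the two fields of the saveparam structure shown above.
SaveParam = namedtuple("SaveParam", ["seconds", "changes"])


def parse_save_lines(config_text):
    """Collect every `save <seconds> <changes>` line into a list of SaveParam."""
    params = []
    for line in config_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == "save":
            params.append(SaveParam(int(fields[1]), int(fields[2])))
    return params
```

Feeding it the three save lines from the example configuration yields three records, one per save condition.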

In addition to the saveparams array, the server state also maintains a dirty counter and a lastsave attribute:

struct redisServer {
    // ...

    // modification counter
    long long dirty;

    // time of the last successful save
    time_t lastsave;

    // ...
};

The dirty counter records how many modifications (writes, deletes, updates, and so on) the server has made to the database state (all databases in the server) since the last successful SAVE or BGSAVE.
The lastsave attribute is a UNIX timestamp recording the last time the server successfully executed SAVE or BGSAVE.

Check whether the conditions are met to trigger RDB

Redis's periodic server function, serverCron, runs every 100 milliseconds by default. It maintains the running server, and one of its tasks is to check whether any save condition set by the save option has been met. If so, it executes the BGSAVE command.
Redis serverCron source code analysis is as follows:

The program will traverse and check all the save conditions in the saveparams array. As long as any one of the conditions is met, the server will execute the BGSAVE command.
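
The check described above can be sketched in a few lines. This is an illustrative model, not the Redis source: should_bgsave and SaveParam are hypothetical names, and the real serverCron also takes other factors into account before starting a background save.

```python
import time
from collections import namedtuple

SaveParam = namedtuple("SaveParam", ["seconds", "changes"])


def should_bgsave(saveparams, dirty, lastsave, now=None):
    """Return True if at least one save condition is satisfied.

    dirty    -- modifications since the last successful save
    lastsave -- UNIX timestamp of the last successful save
    """
    if now is None:
        now = time.time()
    elapsed = now - lastsave
    # One satisfied condition is enough to trigger BGSAVE.
    return any(dirty >= sp.changes and elapsed > sp.seconds for sp in saveparams)
```

For example, with the three save conditions used earlier, 10 modifications and 301 seconds since the last save satisfy the second condition, while 9 modifications in the same period satisfy none.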
The following is the source code flow of rdbSaveBackground:

RDB file structure

The following figure shows the parts of a complete RDB file. In order, they are REDIS, db_version, databases, EOF, and check_sum.

An RDB file begins with the REDIS part. It is 5 bytes long and stores the five characters "REDIS". With these five characters, the program can quickly check whether the file being loaded is an RDB file.

db_version is 4 bytes in length, and its value is an integer represented by a string. This integer records the version number of the RDB file. For example, "0006" means that the version of the RDB file is the sixth version.

The databases part contains zero or more databases, together with the key-value pairs in each database:

If the server's database state is empty (all databases are empty), this part is also empty, with a length of 0 bytes.
If the server's database state is non-empty (at least one database is not empty), this part is also non-empty. Its length depends on the number, type, and content of the key-value pairs stored in the databases.

The EOF constant is 1 byte long and marks the end of the body of the RDB file. When the loading program encounters this value, it knows that all key-value pairs of all databases have been loaded.

check_sum is an 8-byte unsigned integer that stores a checksum computed over the four preceding parts: REDIS, db_version, databases, and EOF. When the server loads an RDB file, it compares the checksum computed from the loaded data with the value recorded in check_sum to detect whether the file is corrupted.

For example, the figure below shows an RDB file containing database 0 and database 3. It starts with "REDIS", which marks it as an RDB file. The following "0006" means the file uses version 6 of the RDB format. Then come the two databases, followed by the EOF marker, and finally check_sum.
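
The outer layout of the file can be checked without decoding the databases part. The sketch below is hedged in two places that the text above does not specify: it assumes the EOF constant is the byte 0xFF and that check_sum is stored in little-endian order. The function name split_rdb_envelope is made up, and the checksum is only extracted, not verified.

```python
def split_rdb_envelope(blob):
    """Split an RDB image into (version, body, checksum) without decoding the body."""
    if len(blob) < 5 + 4 + 1 + 8:
        raise ValueError("too short to be an RDB file")
    if blob[:5] != b"REDIS":
        raise ValueError("missing REDIS signature")
    version = int(blob[5:9].decode("ascii"))        # e.g. "0006" -> 6
    if blob[-9] != 0xFF:                            # assumption: EOF is byte 0xFF
        raise ValueError("missing EOF marker")
    checksum = int.from_bytes(blob[-8:], "little")  # assumption: little-endian
    body = blob[9:-9]                               # the databases part
    return version, body, checksum
```

An image with an empty databases part is exactly 18 bytes long: 5 for REDIS, 4 for db_version, 1 for EOF, and 8 for check_sum.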

AOF persistence

What is AOF persistence

AOF persistence logs every write operation the server receives. When the server restarts, these commands are re-executed to rebuild the original data set. Commands are appended to the end of the AOF file in the Redis protocol format. Redis can also rewrite the AOF file in the background so that it does not grow too large.

The advantages of AOF?

AOF makes Redis more durable. You can choose between different fsync policies: no fsync, fsync every second, or fsync on every write. With the default policy of fsync every second, Redis performance is still very good (fsync is performed by a background thread, and the main thread keeps doing its best to serve client requests), and if a failure occurs you lose up to 1 second of data.
The AOF file is an append-only log, so writing it requires no seeks. Even if the log ends with an incomplete command for some reason (disk full, a crash in the middle of a write, and so on), the redis-check-aof tool can fix it.
Redis can automatically rewrite the AOF in the background when the file grows too large. The rewritten AOF file contains the minimum set of commands needed to rebuild the current data set. The whole rewrite is safe, because Redis keeps appending commands to the existing AOF file while the new one is being created. Even if the server goes down during the rewrite, the existing AOF file is not lost. Once the new AOF file is ready, Redis switches from the old file to the new one and starts appending to it.
The AOF file stores every write operation performed on the database, in order and in the Redis protocol format, so its content is easy for people to read and easy to parse. Exporting an AOF file is also simple. For example, if you accidentally execute FLUSHALL, then as long as the AOF file has not been rewritten, you can stop the server, remove the FLUSHALL command from the end of the AOF file, and restart Redis to restore the data set to its state before FLUSHALL.

Disadvantages of AOF?

For the same data set, the volume of the AOF file is usually larger than the volume of the RDB file.
Depending on the fsync policy in use, AOF may be slower than RDB. In general, performance with fsync every second is still very high, and with fsync disabled AOF can be as fast as RDB even under high load. However, under heavy write load, RDB can offer a more predictable maximum latency.

Implementation of AOF persistence

The implementation of the AOF persistence function can be divided into three steps: command append (append), file writing, and file synchronization (sync).

Command append

When AOF persistence is enabled, after executing a write command the server appends that command, in protocol format, to the end of the aof_buf buffer in the server state.

struct redisServer {
  // ...
  // AOF buffer
  sds aof_buf;

  // ...
};

If the client sends the following command to the server:

> set KEY VALUE
OK

Then, after executing the SET command, the server appends the following protocol content to the end of the aof_buf buffer:

*3\r\n$3\r\nSET\r\n$3\r\nKEY\r\n$5\r\nVALUE\r\n
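
The protocol text above follows a simple pattern: an argument count prefixed with *, then each argument as a length prefixed with $ followed by the argument itself. A minimal sketch of that encoding follows. The function name encode_command is made up for illustration.

```python
def encode_command(*args):
    """Encode a command and its arguments in the Redis protocol format."""
    parts = [b"*%d\r\n" % len(args)]          # number of arguments
    for arg in args:
        data = arg if isinstance(arg, bytes) else str(arg).encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))  # length, then payload
    return b"".join(parts)
```

Calling encode_command("SET", "KEY", "VALUE") reproduces the bytes shown above.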

AOF file writing and synchronization

The Redis server process is an event loop. File events in this loop are responsible for receiving command requests from clients and sending command replies back to them, while time events are responsible for running functions that need to execute periodically, such as serverCron.
Because the server may execute write commands while processing file events, and those commands append content to the aof_buf buffer, the server calls the flushAppendOnlyFile function each time before it finishes an iteration of the event loop. That function decides whether the contents of aof_buf need to be written and saved to the AOF file. The process can be expressed in the following pseudocode:

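def eventLoop():
    while True:

        # handle file events: receive command requests and send command replies;
        # handling a command request may append new content to the aof_buf buffer
        processFileEvents()

        # handle time events
        processTimeEvents()

        # decide whether to write and save the contents of aof_buf to the AOF file
        flushAppendOnlyFile()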

The behavior of the flushAppendOnlyFile function is determined by the appendfsync option in the server configuration. The behavior for each value is shown in the following table.

Value of the appendfsync option	Behavior of the flushAppendOnlyFile function
always	Write all the contents of the aof_buf buffer to the AOF file and synchronize the file.
everysec	Write all the contents of the aof_buf buffer to the AOF file. If more than one second has passed since the AOF file was last synchronized, synchronize it again. This synchronization is performed by a dedicated thread.
no	Write all the contents of the aof_buf buffer to the AOF file, but do not synchronize it. When to synchronize is decided by the operating system.

If the user does not explicitly set the appendfsync option, its default value is everysec.
Some readers may confuse the writing and synchronization mentioned above, so here is the difference:
write: write the data in aof_buf to the AOF file. After a write, the data may still sit in the operating system's buffers.
synchronization: call the fsync or fdatasync function to force the data of the AOF file onto the disk.
In plain terms, writing hands the data to the file, while synchronization makes sure the file's data has actually reached the disk.
You have probably read in other articles that AOF loses at most one second of data. That is because Redis AOF uses the everysec policy by default. With this policy, synchronization happens once per second, so AOF persistence can lose up to one second of data.
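
The difference between writing and synchronizing, and the effect of the three appendfsync values, can be sketched as follows. This is a simplified model and not the Redis implementation: the class name AofWriter and its fields are made up, and the synchronization runs inline here, whereas the table above says Redis hands it to a separate thread under everysec.

```python
import os
import time


class AofWriter:
    """Toy model of flushAppendOnlyFile under the three appendfsync values."""

    def __init__(self, fd, appendfsync="everysec"):
        self.fd = fd
        self.appendfsync = appendfsync
        self.buf = bytearray()   # plays the role of aof_buf
        self.last_fsync = 0.0
        self.fsync_calls = 0

    def append(self, data):
        self.buf += data

    def flush(self, now=None):
        now = time.time() if now is None else now
        if self.buf:
            os.write(self.fd, bytes(self.buf))  # "write": hand the data to the OS
            self.buf.clear()
        sync_due = self.appendfsync == "always" or (
            self.appendfsync == "everysec" and now - self.last_fsync > 1.0
        )
        if sync_due:
            os.fsync(self.fd)                   # "synchronize": force data to disk
            self.last_fsync = now
            self.fsync_calls += 1
```

With everysec, three flushes at 100.0, 100.5, and 101.5 seconds all write their data, but only the first and the third call fsync. With no, the data is written but fsync is never called by the program.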

AOF file loading and data restoration

Because the AOF file contains all the write commands needed to rebuild the database state, the server only needs to read in and re-execute the write commands saved in the AOF file to restore the database state before the server shuts down. The detailed steps for Redis to read the AOF file and restore the database state are as follows:

1. Create a pseudo client without a network connection. Redis commands can only be executed in the context of a client, and the commands used when loading the AOF file come from the AOF file rather than from a network connection, so the server uses a pseudo client without a network connection to execute the write commands saved in the AOF file. A command executed by the pseudo client has exactly the same effect as one executed by a client with a network connection.
2. Parse and read one write command from the AOF file.
3. Use the pseudo client to execute the command that was just read.
4. Repeat steps 2 and 3 until all write commands in the AOF file have been processed.
After these steps are completed, the database state saved in the AOF file is fully restored. The whole process is shown in the figure below.
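
The read-and-replay loop above can be sketched with a toy keyspace. Everything here is illustrative: parse_commands and replay are made-up names, a plain dictionary stands in for the database and the pseudo client, and only SET and RPUSH are understood.

```python
def parse_commands(data):
    """Yield each command stored in `data` as a list of bytes arguments."""
    pos = 0
    while pos < len(data):
        if data[pos:pos + 1] != b"*":
            raise ValueError("expected '*' at offset %d" % pos)
        end = data.index(b"\r\n", pos)
        argc = int(data[pos + 1:end])
        pos = end + 2
        argv = []
        for _ in range(argc):
            if data[pos:pos + 1] != b"$":
                raise ValueError("expected '$' at offset %d" % pos)
            end = data.index(b"\r\n", pos)
            length = int(data[pos + 1:end])
            start = end + 2
            argv.append(data[start:start + length])
            pos = start + length + 2  # skip the argument and its trailing \r\n
        yield argv


def replay(data):
    """Rebuild a toy keyspace by re-executing every command in order."""
    db = {}
    for argv in parse_commands(data):
        name = argv[0].upper()
        if name == b"SET":
            db[argv[1]] = argv[2]
        elif name == b"RPUSH":
            db.setdefault(argv[1], []).extend(argv[2:])
        else:
            raise ValueError("command not supported by this sketch: %r" % name)
    return db
```

Replaying a file that contains a SET followed by an RPUSH produces a dictionary with one string key and one list key, which is the state the commands originally created.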

AOF rewrite

Because AOF persistence records the database state by saving the write commands that were executed, the AOF file keeps accumulating content as the server runs, and the file grows larger and larger. If left unchecked, an oversized AOF file can affect the Redis server and even the entire host. In addition, the larger the AOF file, the longer it takes to restore data from it.
Suppose the client executes the following commands:

> rpush list "A" "B"
(integer) 2
> rpush list "C"
(integer) 3
> rpush list "D"
(integer) 4
> rpush list "E" "F"
(integer) 6

Then, just to record the state of the list key, the AOF file needs to save four commands.
In real applications, write commands are executed far more often than in this simple example, so the problem is much more serious. To solve the problem of AOF file growth, Redis provides the AOF rewrite function. With it, the Redis server creates a new AOF file to replace the existing one. The old and new AOF files describe the same database state, but the new file contains no redundant commands that waste space, so it is usually much smaller than the old one. The following sections introduce how AOF rewriting works and how the BGREWRITEAOF command is implemented.
Although Redis calls the function that generates a new AOF file to replace the old one "AOF file rewriting", the rewrite does not need to read, parse, or write the existing AOF file at all. It is implemented by reading the current database state of the server.
In the example above, the server can merge the four commands into one:

> rpush list "A" "B" "C" "D" "E" "F"

Besides the list key above, keys of every other type can use the same approach to reduce the number of commands in the AOF file: first read the current value of the key from the database, then record the key-value pair with a single command in place of the multiple commands that previously recorded it. This is how the AOF rewrite function works.
In practice, to avoid overflowing the client input buffer when the commands are executed, the rewrite program first checks the number of elements in keys that may hold multiple elements: lists, hash tables, sets, and sorted sets. If the number of elements exceeds the value of the redis.h/REDIS_AOF_REWRITE_ITEMS_PER_CMD constant, the rewrite program records the value of the key with multiple commands instead of one. In the current version, the value of REDIS_AOF_REWRITE_ITEMS_PER_CMD is 64. This means that if a set key contains more than 64 elements, the rewrite program uses multiple SADD commands to record the set, each carrying at most 64 elements.
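
The rewrite rule for a list key can be sketched as a function that reads the current elements and emits as few RPUSH commands as the per-command limit allows. The names rewrite_list and ITEMS_PER_CMD are made up for illustration. The value 64 is taken from the text above.

```python
ITEMS_PER_CMD = 64  # value of REDIS_AOF_REWRITE_ITEMS_PER_CMD quoted in the text


def rewrite_list(key, items, per_cmd=ITEMS_PER_CMD):
    """Return the RPUSH commands needed to recreate `key` from its current items."""
    return [
        ["RPUSH", key] + list(items[i:i + per_cmd])
        for i in range(0, len(items), per_cmd)
    ]
```

For the six-element list from the example this yields a single RPUSH command, while a list of 130 elements is split into three commands carrying 64, 64, and 2 elements.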

AOF background rewrite

AOF rewriting performs a large number of write operations, which would block the main thread, so Redis runs the AOF rewrite in a child process. This achieves two goals:

While the child process is performing the AOF rewrite, the server process (the parent) can continue to handle command requests.
The child process works on its own copy of the server's data. Using a child process instead of a thread keeps the data safe without the need for locks.

But there is a problem. While the child process is rewriting, the server process keeps handling new writes, so the current database state can diverge from the state captured in the rewritten AOF file.
To solve this inconsistency, the Redis server sets up an AOF rewrite buffer, which comes into use once the server has created the child process. After the Redis server executes a write command, it sends that command to both the AOF buffer and the AOF rewrite buffer, as shown below:

This means that during the AOF rewrite performed by the child process, the server process needs to perform the following three tasks:

Execute the command sent by the client.
Append the executed write command to the AOF buffer.
Append the executed write command to the AOF rewrite buffer.

In this way, you can guarantee:

The contents of the AOF buffer will be periodically written and synchronized to the AOF file, and the processing of the existing AOF file will proceed as usual.
Starting from the creation of the child process, all write commands executed by the server will be recorded in the AOF rewrite buffer.

When the child process finishes the AOF rewrite, it sends a signal to the parent process. On receiving the signal, the parent process calls a signal handler that performs the following tasks:

Write all the contents in the AOF rewrite buffer to the new AOF file. At this time, the database status saved in the new AOF file will be consistent with the current database status of the server.
Rename the new AOF file, atomically overwrite the existing AOF file, and complete the replacement of the old and new AOF files.

After the signal handler finishes, the parent process can continue to accept command requests as usual.
In the whole AOF background rewrite, the server process (the parent) is blocked only while the signal handler runs. At all other times the background rewrite does not block the parent, which minimizes the impact of AOF rewriting on server performance.
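
The interplay between the existing AOF file and the AOF rewrite buffer can be modelled in memory. This is a toy sketch under loud assumptions: the class ToyServer and its methods are made up, Python lists stand in for files and buffers, a copied dictionary stands in for the forked child's view, and only an RPUSH-like write is supported.

```python
class ToyServer:
    def __init__(self):
        self.db = {}
        self.aof_file = []       # stands in for the existing AOF file
        self.rewrite_buf = None  # None means no rewrite is in progress

    def execute(self, key, *values):
        """Apply an RPUSH-like write and log it."""
        self.db.setdefault(key, []).extend(values)
        cmd = ("RPUSH", key) + values
        self.aof_file.append(cmd)         # the existing AOF keeps growing as usual
        if self.rewrite_buf is not None:
            self.rewrite_buf.append(cmd)  # also remembered for the new file

    def start_rewrite(self):
        """Model the fork: the child gets a frozen copy of the data set."""
        self.rewrite_buf = []
        return {k: list(v) for k, v in self.db.items()}

    def finish_rewrite(self, snapshot):
        """Model the parent's work after the child reports completion."""
        # What the child would have written: one command per key.
        new_file = [("RPUSH", k) + tuple(v) for k, v in snapshot.items()]
        new_file.extend(self.rewrite_buf)  # append writes made during the rewrite
        self.aof_file = new_file           # replace the old file with the new one
        self.rewrite_buf = None
```

A write that arrives between start_rewrite and finish_rewrite ends up in both the old file and the rewrite buffer, so the new file still describes the current database state after the swap.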

Redis hybrid persistence

Redis can also use AOF persistence and RDB persistence at the same time. In that case, when Redis restarts it restores the data set from the AOF file first, because the data set saved in the AOF file is usually more complete than the one saved in the RDB file. However, AOF recovery is relatively slow, so Redis 4.0 introduced hybrid persistence.

Hybrid persistence stores the contents of the RDB file and an incremental AOF log together. The AOF log here is no longer the full log, but only the incremental log of writes that happened between the start and the end of persistence. This part of the AOF log is usually very small.

When Redis restarts, it can therefore load the RDB content first and then replay the incremental AOF log. This completely replaces replaying the full AOF file, and restart efficiency is greatly improved.
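
The choice between the two loading paths can be sketched as a check on the first bytes of the file. The detection rule is an assumption of this sketch and is not stated in the article: it supposes that a hybrid file opens with the same five-character REDIS signature as an RDB file. The name plan_aof_load is made up, and the function only describes the steps without performing them.

```python
RDB_SIGNATURE = b"REDIS"  # the five characters that open an RDB file


def plan_aof_load(aof_bytes):
    """Describe how a file would be loaded, based only on its first bytes."""
    if aof_bytes[:len(RDB_SIGNATURE)] == RDB_SIGNATURE:
        # Hybrid file: snapshot first, then the short incremental log.
        return ["load RDB content", "replay incremental AOF commands"]
    # Plain AOF file: every command has to be replayed.
    return ["replay all AOF commands"]
```

A file that starts with protocol text such as *3 takes the second path, which is the slower full replay that hybrid persistence is meant to avoid.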

