

Introduction

Redis is a high-performance key-value database. Redis operations on data are all atomic.

Pros and cons

Advantages:

  1. Based on memory operations, memory reads and writes are fast.
  2. Redis is single-threaded, which avoids the overhead of thread switching and lock contention. "Single-threaded" means one thread handles network requests (from one or more client connections); the Redis process itself contains more than one thread, and extra threads are started for tasks such as persistence and syncing the AOF to slaves.
  3. Support multiple data types, including String, Hash, List, Set, ZSet, etc.
  4. Support persistence. Redis supports two persistence mechanisms, RDB and AOF, and the persistence function effectively avoids the problem of data loss.
  5. Redis uses I/O multiplexing. "Multi" refers to the multiple socket connections being handled; "plexing" (reuse) refers to reusing a single thread. Redis uses one thread to poll the descriptors, converting opens, closes, reads, and writes into events. The three main multiplexing mechanisms are select, poll, and epoll; epoll is the newest and most efficient of the three.

Disadvantages: poor support for joins or other structured queries.

I/O multiplexing

Register the file descriptors of the client sockets with epoll, and epoll monitors which sockets have data arriving, notifying you when a socket becomes readable or writable. A read is performed only after the system reports that a descriptor is readable, which guarantees that every read returns valid data. In this way, the I/O operations of multiple descriptors can proceed concurrently, interleaved within one thread. This is I/O multiplexing, where "multiplexing" means reusing the same thread.
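The readiness-notification model described above can be sketched with Python's selectors module, which wraps epoll/kqueue/select. This is an illustrative simulation using local socket pairs, not Redis's actual event loop:

```python
import selectors
import socket

# One thread registers several sockets with the OS (via selectors, which
# uses epoll/kqueue/select under the hood) and is only woken up when a
# descriptor is actually readable.
sel = selectors.DefaultSelector()

# Two independent "client" connections, simulated with socket pairs.
pairs = [socket.socketpair() for _ in range(2)]
for i, (server_side, _client_side) in enumerate(pairs):
    server_side.setblocking(False)
    sel.register(server_side, selectors.EVENT_READ, data=f"conn-{i}")

# Only conn-1's client writes, so only conn-1 should be reported ready.
pairs[1][1].send(b"hello")

ready = sel.select(timeout=1)
messages = [(key.data, key.fileobj.recv(1024)) for key, _ in ready]
print(messages)  # only the socket with pending data is returned

for server_side, client_side in pairs:
    server_side.close()
    client_side.close()
sel.close()
```

Because the selector reports only ready descriptors, every recv here returns valid data without blocking, which is exactly the guarantee described above.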

Application scenario

  1. Cache hot data to relieve the pressure on the database.
  2. Using Redis's atomic increment operations, you can implement counters, such as counting likes or page views. If MySQL were used for this kind of workload, the frequent reads and writes would put considerable pressure on it.
  3. For simple message queues, when high reliability is not required, you can use Redis's own publish/subscribe mode or List to implement a queue for asynchronous operation.
  4. Social relations: use set commands such as intersection, union, and difference to conveniently implement features like mutual friends and shared interests.
  5. Rate limiting: a typical scenario is limiting how frequently a user may call an API, commonly used during flash sales to shield the backend from a flood of repeated clicks.
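The rate-limiter scenario uses the INCR + EXPIRE fixed-window pattern. Below is a minimal sketch; the Redis server is simulated with an in-memory dict so the example is self-contained, and the class and key names are made up for illustration:

```python
import time

# Fixed-window rate limiter: the same logic you would run against Redis
# with INCR key + EXPIRE key window, simulated with a local dict.
class FixedWindowLimiter:
    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}  # key -> (window_start, count), stands in for Redis

    def allow(self, user_id, now=None):
        now = time.time() if now is None else now
        window_start, count = self.counters.get(user_id, (now, 0))
        if now - window_start >= self.window:     # key has "expired"
            window_start, count = now, 0
        count += 1                                # the INCR step
        self.counters[user_id] = (window_start, count)
        return count <= self.limit

limiter = FixedWindowLimiter(limit=3, window_seconds=60)
results = [limiter.allow("user:42", now=0) for _ in range(4)]
print(results)  # [True, True, True, False]: the 4th request is rejected
```

With a real server the counter key would expire on its own, so rejected users regain access automatically at the start of the next window.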

The difference between Memcached and Redis

  1. Redis only uses a single core, while Memcached can use multiple cores.
  2. MemCached has a single data structure and is only used to cache data, while Redis supports richer data types, and can also perform rich operations on data directly on the server side, which can reduce the number of network IOs and data volume.
  3. Memcached does not support data persistence; its data disappears after a power failure or restart. Redis supports persistence and data recovery, so data can be restored after a failure or restart.

Data types

Redis supports five data types:

  • string
  • hash
  • list
  • set
  • zset(sorted set)

String type

The value of a string key can be a string, a number, or binary data, but a single value cannot exceed 512MB.

Common commands: set, get, incr, incrby, decr, keys, append, strlen

  • Assignment and value
SET name tyson
GET name
  • Increment numbers
INCR num            //errors if the value is not an integer
INCRBY num 2        //increase by the given integer
DECR num            //decrement the number
INCRBYFLOAT num 2.7 //increase by the given float
  • other

keys list* lists matching keys

APPEND name " dai" additional value

STRLEN name Get string length

MSET name tyson gender male set multiple values at the same time

MGET name gender Get multiple values at the same time

GETBIT name 0 get the value of the binary bit at 0 index

FLUSHDB delete all keys of the current database

FLUSHALL delete all keys in all databases

SETNX and SETEX

SETNX key value : When the key does not exist, set the value of the key to value. If the given key already exists, SETNX does nothing.

SETEX key seconds value : compared with SET it takes an extra seconds parameter; it is equivalent to SET key value followed by EXPIRE key seconds , except that SETEX is atomic.
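SETNX's "set only if absent" semantics is the classic building block for a simple lock: only the client that successfully sets the key owns it. Below is a self-contained simulation of that semantics with a dict (the lock key name is made up); against a real server you would prefer the atomic form SET key value NX EX seconds so the lock also expires:

```python
# Simulation of SETNX semantics: returns 1 if the key was set,
# 0 if it already existed (in which case nothing happens).
class FakeRedis:
    def __init__(self):
        self.store = {}

    def setnx(self, key, value):
        if key in self.store:
            return 0          # key exists: SETNX does nothing
        self.store[key] = value
        return 1

    def delete(self, key):
        self.store.pop(key, None)

r = FakeRedis()
print(r.setnx("lock:order:1001", "client-A"))  # 1: client A acquires the lock
print(r.setnx("lock:order:1001", "client-B"))  # 0: client B fails to acquire it
r.delete("lock:order:1001")                    # A releases the lock
print(r.setnx("lock:order:1001", "client-B"))  # 1: now B can acquire it
```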

keys and scan

Because Redis is single-threaded, the keys command blocks the thread for some time, and the service cannot respond until it finishes. scan solves this potential blocking by traversing incrementally: each scan call runs in O(1) time, but reproducing the effect of keys requires executing scan multiple times.

Disadvantage of scan: if keys change (are added, deleted, or modified) during the scan, the traversal may miss newly added keys or return duplicate keys. In other words, scan does not guarantee a complete, exact traversal of all keys.

The scan command is used to iterate the database keys in the current database: SCAN cursor [MATCH pattern] [COUNT count]

scan 0 match * count 10 //return 10 elements

SCAN has sibling commands SSCAN, HSCAN, and ZSCAN, used for sets, hashes, and sorted sets respectively.
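The cursor mechanics can be sketched as follows. This toy version iterates a stable snapshot of the keyspace, so it is simpler than the real thing; actual Redis walks the hash-table buckets directly, which is why it gives only the weaker guarantees described above:

```python
# Each call returns at most `count` keys plus a cursor to resume from;
# a returned cursor of 0 means the iteration is complete.
def scan(db, cursor, count=10, match=None):
    keys = sorted(db.keys())          # snapshot; real SCAN has no such snapshot
    batch = keys[cursor:cursor + count]
    if match is not None:
        batch = [k for k in batch if k.startswith(match.rstrip("*"))]
    next_cursor = cursor + count
    if next_cursor >= len(keys):
        next_cursor = 0
    return next_cursor, batch

db = {f"list{i}": i for i in range(25)}
cursor, seen = 0, []
while True:
    cursor, batch = scan(db, cursor, count=10, match="list*")
    seen.extend(batch)
    if cursor == 0:
        break
print(len(seen))  # 25: all keys visited, but across three short calls
```

Each call is cheap, so a single-threaded server can serve other clients between calls instead of blocking for the whole traversal as KEYS does.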

expire

SET password 666
EXPIRE password 5
TTL password //show the key's remaining time to live; -1 means it never expires
SETEX password 60 123abc //SETEX sets the key and its time to live in one step

The unit of EXPIRE time is seconds, and the unit of PEXPIRE time is milliseconds. The expiration time can be reset before the key expires, and the key is destroyed after expiration.

In Redis 2.6 and earlier, TTL returns -1 if the key does not exist or has expired.

Starting from Redis 2.8, the error return values changed as follows:

  • If the key does not exist or has expired, return -2
  • If the key exists and the expiration time is not set (permanent effective), return -1 .

type

The TYPE command is used to return the type of the value stored in the key.

127.0.0.1:6379> type NEWBLOG
list

Hash type

Common commands: hset, hget, hmset, hmget, hgetall, hdel, hkeys, hvals

  • Assignment and value
HSET car price 500 //HSET key field value
HGET car price

Set and get the values of multiple fields at once

HMSET car price 500 name BMW
HMGET car price name
HGETALL car

When using the HGETALL command, if the number of hash elements is large, Redis may be blocked. If you only need to get part of the field, you can use hmget. If you must get all the field-values, you can use the hscan command, which will gradually traverse the hash type.

HSETNX car price 400 //Assign value when the field does not exist, HSETNX is an atomic operation, there is no race condition

  • Increase the number
    HINCRBY person score 60
  • Delete field
    HDEL car price
  • other
HKEYS car //get all fields
HVALS car //get all values
HLEN car  //number of fields

List type

Common commands: lpush, rpush, lpop, rpop, lrange, lrem

Add and delete elements

LPUSH numbers 1
RPUSH numbers 2 3
LPOP numbers
RPOP numbers

Get a list segment

LRANGE numbers 0 2
LRANGE numbers -2 -1 //negative indexes are supported; -1 is the rightmost element
LRANGE numbers 0 -1

Insert a value

LINSERT searches for the value pivot from head to tail, then inserts value before or after it

LINSERT numbers AFTER 5 8 //insert 8 after 5
LINSERT numbers BEFORE 6 9 //insert 9 before 6

Delete elements

LTRIM numbers 1 2 remove all elements except those at indexes 1 to 2

LPUSH is often used together with LTRIM to limit the number of elements in the list, such as keeping the most recent 100 logs

LPUSH logs $newLog
LTRIM logs 0 99
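The capped-log pattern above (keep only the 100 most recent entries) behaves like this sketch, where a Python list stands in for the Redis list:

```python
# LPUSH adds at the head; LTRIM 0 99 then keeps only indexes 0..99,
# so the list never grows beyond 100 entries.
class FakeList:
    def __init__(self):
        self.items = []

    def lpush(self, value):
        self.items.insert(0, value)              # insert at the head

    def ltrim(self, start, stop):
        self.items = self.items[start:stop + 1]  # keep only [start, stop]

logs = FakeList()
for i in range(150):
    logs.lpush(f"log-{i}")
    logs.ltrim(0, 99)

print(len(logs.items))   # 100
print(logs.items[0])     # most recent entry: log-149
```

Because the trim runs after every push, old entries fall off the tail automatically and memory usage stays bounded.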

Delete specified values from the list

LREM key count value

  1. count < 0: remove the first |count| elements equal to value, scanning from the tail
  2. count > 0: remove the first count elements equal to value, scanning from the head
  3. count = 0: remove all elements equal to value, e.g. `LREM numbers 0 2`

Other

LLEN numbers       //get the number of elements
LINDEX numbers -1  //return the element at the given index; a negative index counts from the tail
LSET numbers 1 7   //set the element at index 1 to 7

Set type

Common commands: sadd, srem, smembers, scard, sismember, sdiff

There cannot be the same elements in the set.

add/delete elements

SADD letters a b c
SREM letters c d

Get element

SMEMBERS letters
SCARD letters   //get the number of elements

Determine whether the element is in the set
SISMEMBER letters a

Operations between sets

SDIFF setA setB  //difference
SINTER setA setB //intersection
SUNION setA setB //union

All three commands accept multiple keys, e.g. SDIFF setA setB setC

Other

SDIFFSTORE result setA setB performs the set operation and stores the result

SRANDMEMBER key count

Randomly returns elements from the set. If count is greater than 0, count distinct elements are returned; if count is less than 0, the |count| returned elements may repeat.

SPOP letters //randomly remove and return an element

Sorted set type

Common commands: zadd, zrem, zscore, zrange

zadd zsetkey 50 e1 60 e2 30 e3

A zset (sorted set) is an ordered collection of strings. Like a set, a zset contains string elements and does not allow duplicate members. The difference is that every zset member is associated with a score of type double (numbers with more than 17 significant digits are stored in scientific notation and may lose precision), and the members are ordered by score. Members are unique, but scores may repeat.

Similarities between sorted sets and lists:

  1. All in order;
  2. All elements within a certain range can be obtained.

Differences between sorted sets and lists:

  1. Lists are implemented on linked lists: access at either end is fast, access in the middle is slow;
  2. Sorted sets are implemented with a hash table plus a skip list; accessing a middle element is O(log N);
  3. A list cannot simply move an element to a different position; a sorted set can (by changing the element's score);
  4. Sorted sets consume more memory.

add/delete elements

The time complexity is O(log N).

ZADD scoreboard 89 Tom 78 Sophia
ZADD scoreboard 85.5 Tyson      //double-precision floats are supported
ZREM scoreboard Tyson
ZREMRANGEBYRANK scoreboard 0 2  //delete elements by rank range
ZREMRANGEBYSCORE scoreboard (80 100 //delete elements by score range; "(" means exclusive

Get element score

The time complexity is O(1).

ZSCORE scoreboard Tyson

Get a list of elements ranked in a certain range

The time complexity of the ZRANGE command is O(log(n)+m), n is the number of ordered set elements, and m is the number of returned elements.

ZRANGE scoreboard 0 2
ZRANGE scoreboard 1 -1  //-1 means the last element
ZRANGE scoreboard 0 -1 WITHSCORES  //also return the scores

Get the elements of the specified score range

The time complexity of the ZRANGEBYSCORE command is O(log(n)+m), n is the number of ordered set elements, and m is the number of returned elements.

ZRANGEBYSCORE scoreboard 80 100
ZRANGEBYSCORE scoreboard 80 (100  //100 exclusive
ZRANGEBYSCORE scoreboard (60 +inf LIMIT 1 3 //among members scoring above 60, skip the first and return the next 3

Increase the score of an element

The time complexity is O(log N).

ZINCRBY scoreboard 10 Tyson

Other

ZCARD scoreboard          //number of elements; O(1)
ZCOUNT scoreboard 80 100  //number of elements in the score range
ZRANK scoreboard Tyson    //rank of the element, in ascending score order
ZREVRANK scoreboard Tyson //rank of the element, in descending score order

Bitmaps

Bitmaps is not a separate data structure; it is really a string, but one whose individual bits can be manipulated. You can think of a bitmap as a bit array in which each cell stores 0 or 1.

The length of a bitmap is unrelated to the number of elements stored; it depends on the upper bound of the values. To count values with an upper bound of 100 million, you need a 12.5MB bitmap, even if the set holds only 10 elements.
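A sketch of how a bitmap stores membership, mirroring the SETBIT / GETBIT / BITCOUNT commands with a bytearray. As noted above, the size depends only on the highest offset, not on how many bits are set:

```python
# Bit i of the buffer represents element i.
class Bitmap:
    def __init__(self, max_offset):
        self.buf = bytearray((max_offset // 8) + 1)

    def setbit(self, offset, value):
        byte, bit = offset // 8, 7 - (offset % 8)  # Redis numbers bits from the MSB
        if value:
            self.buf[byte] |= (1 << bit)
        else:
            self.buf[byte] &= ~(1 << bit)

    def getbit(self, offset):
        byte, bit = offset // 8, 7 - (offset % 8)
        return (self.buf[byte] >> bit) & 1

    def bitcount(self):
        return sum(bin(b).count("1") for b in self.buf)

# Mark users 0, 5 and 100 as "visited today".
bm = Bitmap(max_offset=1000)   # buffer size fixed by the upper bound alone
for user_id in (0, 5, 100):
    bm.setbit(user_id, 1)
print(bm.getbit(5), bm.getbit(6), bm.bitcount(), len(bm.buf))  # 1 0 3 126
```

Scaling the same arithmetic up, an upper bound of 100 million requires 100,000,000 / 8 = 12.5MB, regardless of how few users actually visited.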

HyperLogLog

HyperLogLog is an algorithm for cardinality estimation. Its advantage is that even when the number or volume of input elements is very large, the space needed to estimate the cardinality stays fixed and small.

Cardinality: For example, the data set {1, 3, 5, 7, 5, 7, 8}, then the cardinality set of this data set is {1, 3, 5 ,7, 8}, and the cardinality is 5.

Application scenario: statistics of unique visitor (uv).

Data structures

Dynamic string

SDS definition:

struct sdshdr {

    // number of used bytes in buf,
    // equal to the length of the stored string
    int len;

    // number of unused bytes in buf
    int free;

    // byte array that holds the string
    char buf[];

};
C string vs. SDS:

  • Getting the length of a C string is O(N); for SDS it is O(1).
  • The C string API is unsafe and may cause buffer overflows; the SDS API is safe and prevents them.
  • Modifying a C string N times always requires N memory reallocations; SDS needs at most N.
  • A C string can hold only text; SDS can hold text or binary data.
  • A C string works with every function in the <string.h> library; SDS works with only some of them.
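The "at most N reallocations" property comes from SDS's space pre-allocation: when the buffer grows, extra free space is reserved (Redis pre-allocates an amount equal to the new length, capped at 1MB), so later appends reuse the slack. A sketch of that policy:

```python
# Toy SDS: tracks len/free like the struct above and pre-allocates
# slack on growth, so most appends need no reallocation.
class SDS:
    def __init__(self, s=b""):
        self.buf = bytearray(s)
        self.len = len(s)
        self.free = 0
        self.reallocations = 0

    def append(self, s):
        needed = self.len + len(s)
        if needed > self.len + self.free:          # not enough slack: realloc
            new_free = min(needed, 1024 * 1024)    # pre-allocate, capped at 1 MB
            self.buf = bytearray(self.buf[:self.len]) + bytearray(s) + bytearray(new_free)
            self.free = new_free
            self.reallocations += 1
        else:                                      # slack is enough: write in place
            self.buf[self.len:needed] = s
            self.free -= len(s)
        self.len = needed

s = SDS(b"redis")
for _ in range(100):
    s.append(b"x")
print(s.len, s.reallocations)  # 105 4: 100 appends, only 4 reallocations
```

A plain C string would have reallocated on every one of the 100 appends; the pre-allocated free bytes absorb most of them here.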

Dictionary

The dictionary uses hashtable as the underlying implementation. The value of a key-value pair can be a pointer, or a uint64_t integer, or an int64_t integer.

typedef struct dictEntry {

    // key
    void *key;

    // value
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;

    // pointer to the next node in the same bucket, forming a chain
    struct dictEntry *next;

} dictEntry;
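A sketch of the chained hash table behind the dictEntry struct: each bucket holds the head of a linked list, and colliding keys are prepended to the chain via the `next` pointer. This is a simplified model; the real dict also supports incremental rehashing, which is omitted here:

```python
class DictEntry:
    def __init__(self, key, value, next_entry=None):
        self.key = key
        self.value = value
        self.next = next_entry  # next node in the same bucket's chain

class Dict:
    def __init__(self, size=4):
        self.table = [None] * size

    def _bucket(self, key):
        return hash(key) % len(self.table)

    def set(self, key, value):
        i = self._bucket(key)
        node = self.table[i]
        while node:                  # update in place if the key already exists
            if node.key == key:
                node.value = value
                return
            node = node.next
        self.table[i] = DictEntry(key, value, self.table[i])  # prepend to chain

    def get(self, key):
        node = self.table[self._bucket(key)]
        while node:
            if node.key == key:
                return node.value
            node = node.next
        return None

d = Dict(size=4)
for k, v in [("a", 1), ("b", 2), ("e", 5)]:
    d.set(k, v)
d.set("a", 10)
print(d.get("a"), d.get("b"), d.get("missing"))  # 10 2 None
```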

Integer set

Integer set (intset) is an abstract data structure used by Redis to store integer values. It can store integer values of type int16_t, int32_t or int64_t, and it is guaranteed that no duplicate elements will appear in the set.

Compressed list

Ziplist is developed by Redis to save memory. It is a sequential data structure composed of a series of specially coded contiguous memory blocks. Each compressed list node consists of three parts: previous_entry_length, encoding, and content.

The previous_entry_length attribute of the node is in bytes and records the length of the previous node in the compressed list.
The encoding attribute of the node records the type and length of the data stored in the content attribute of the node. There are two encoding methods, byte array encoding and integer encoding.

Tail-to-head traversal of a ziplist relies on this design: given a pointer to the start of any node, the program can use that pointer together with the node's previous_entry_length to step back to the previous node, and so on until it reaches the head of the list.

Skip list

The skip list can be regarded as a multi-layer linked list, which has the following properties:

  • Multi-layer structure, each layer is an ordered linked list
  • The lowest linked list contains all the elements
  • The number of steps in a lookup is proportional to the number of levels; lookup is O(log n), and insertion and deletion are O(log n) as well
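The properties above can be sketched with a minimal skip list supporting insert and search. This uses the textbook promotion probability of 0.5 (Redis's own zskiplist uses 0.25 and a higher level cap):

```python
import random

# Every element lives in level 0; each higher level is a sparser
# "express lane". Search starts at the top level and drops down.
class Node:
    def __init__(self, value, level):
        self.value = value
        self.forward = [None] * level  # one next-pointer per level

class SkipList:
    MAX_LEVEL = 8

    def __init__(self):
        self.head = Node(None, self.MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        level = 1
        while level < self.MAX_LEVEL and random.random() < 0.5:
            level += 1
        return level

    def insert(self, value):
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):   # walk from the top level down
            while node.forward[i] and node.forward[i].value < value:
                node = node.forward[i]
            update[i] = node                      # last node before the insert point
        level = self._random_level()
        self.level = max(self.level, level)
        new = Node(value, level)
        for i in range(level):                    # splice into each of its levels
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def contains(self, value):
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].value < value:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.value == value

sl = SkipList()
for v in [30, 10, 50, 20, 40]:
    sl.insert(v)
print(sl.contains(20), sl.contains(25))  # True False
```

Because each level roughly halves the number of nodes, a search skips over large runs of elements at the top and only descends when it overshoots, giving the O(log n) behavior stated above.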

Object

Redis's object system implements memory reclamation through reference counting: when an object is no longer used by any program, its memory is released automatically. Reference counting also enables an object-sharing mechanism, which saves memory by letting multiple database keys share the same object under suitable conditions.

Low-level implementation

string

The encoding of string objects can be int, raw or embstr.

  1. If a string object holds an integer value that can be represented by a long, the encoding is set to int.
  2. If a string object holds a string value longer than 39 bytes, it stores the value in a simple dynamic string (SDS) and the encoding is set to raw.
  3. If a string object holds a string value of 39 bytes or less, it stores the value with the embstr encoding.
Value → encoding:

  • An integer that fits in a long → int
  • A floating-point number that fits in a long double → embstr or raw
  • A string value, or a number too long to fit in a long or long double → embstr or raw

hash

There are two types of internal coding for hash types:

  1. ziplist, compressed list. When the number of hash type elements is less than 512, and all values are less than 64 bytes, Redis will use ziplist as the internal implementation of the hash. Ziplist uses a more compact structure to achieve continuous storage of multiple elements, which saves more memory.
  2. hashtable. When the hash type cannot meet the conditions of the ziplist, Redis will use the hashtable as the internal implementation of the hash, because at this time the read and write efficiency of the ziplist will decrease, and the read and write time complexity of the hashtable is O(1).

When using ziplist as the underlying implementation of hash, when adding elements, the two nodes of the same key-value pair are always close to each other, the node that saves the key is first, and the node that saves the value is behind.

Usage scenario: Record the number of blog likes. hset MAP_BLOG_LIKE_COUNT blogId likeCount , the key is MAP_BLOG_LIKE_COUNT, the field is the blog id, and the value is the number of likes.

list

There are two types of internal codes for the list type:

  1. ziplist, compressed list. When the number of elements in the list is less than 512, and the value of each element in the list is less than 64 bytes, Redis will use ziplist as the internal implementation of the list to reduce memory usage.
  2. When the list type cannot meet the conditions of the ziplist, Redis will use the linkedlist as the internal implementation of the list.

Redis 3.2 introduced the quicklist internal encoding. Simply put, it is a linkedlist whose nodes are ziplists; it combines the advantages of both and provides a better internal encoding for the list type.

Usage scenarios:

  1. message queue. The combination of Redis's lpush+brpop command can realize the blocking queue.

set

The encoding of the collection object can be intset or hashtable.

  1. The intset encoded collection object uses the integer collection as the underlying implementation, and all the elements contained in the collection object are stored in the integer collection (array).
  2. The collection object encoded by hashtable uses a dictionary as the underlying implementation. Each key of the dictionary is a string object, and the values of the dictionary are all set to NULL.

zset

The encoding of ordered sets can be ziplist or skiplist. When the number of elements in an ordered set is less than 128, and the value of each element is less than 64 bytes, Redis will use ziplist as the internal implementation of the ordered set. Ziplist can effectively reduce memory usage. Otherwise, use skiplist as the internal implementation of the ordered set.

  1. A ziplist-encoded sorted set uses a compressed list as its underlying implementation. Each element is stored as two adjacent ziplist nodes: the first holds the member, and the second holds the score. Elements in the ziplist are ordered from smallest to largest score.
  2. The ordered set of objects encoded by skiplist is implemented using dictionaries and skip lists. Use the dictionary to find the score of a given member, and the time complexity is O(1) (the time complexity of looking up the skip table is O(logN)). Use the skip list to perform range operations on ordered sets.
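The dual structure in point 2 can be sketched with a dict plus a sorted array (the array stands in for the skip list; its insertion cost differs, but the two read paths are the same). The dict answers ZSCORE in O(1), while the ordered structure answers range queries:

```python
import bisect

class ZSet:
    def __init__(self):
        self.scores = {}   # member -> score, like the dict
        self.sorted = []   # (score, member) kept ordered, like the skip list

    def zadd(self, score, member):
        if member in self.scores:  # re-adding a member: remove its old entry
            old = (self.scores[member], member)
            self.sorted.pop(bisect.bisect_left(self.sorted, old))
        self.scores[member] = score
        bisect.insort(self.sorted, (score, member))

    def zscore(self, member):
        return self.scores.get(member)           # O(1), via the dict

    def zrangebyscore(self, lo, hi):
        i = bisect.bisect_left(self.sorted, (lo, ""))
        j = bisect.bisect_right(self.sorted, (hi, "\uffff"))
        return [m for _, m in self.sorted[i:j]]  # range query, via the order

z = ZSet()
z.zadd(89, "Tom")
z.zadd(78, "Sophia")
z.zadd(85.5, "Tyson")
print(z.zscore("Tyson"))         # 85.5
print(z.zrangebyscore(80, 100))  # ['Tyson', 'Tom']
```

Keeping both structures in sync is the price paid for getting both O(1) score lookup and efficient ordered range operations.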

Usage scenarios

string: 1. General key-value caching, and counters such as post counts or follower counts on Weibo. 2. Distributed locks.

hash: Store structured data, such as user information (nickname, age, gender, points, etc.).

list: a list of popular blogs, a message queue system. Use lists to build a queue system. For example, Redis is used as a log collector, multiple endpoints write log information to Redis, and then a worker uniformly writes all logs to disk.

set: 1. Social relations, such as common follows, common interests, and mutual friends on Weibo; 2. Using the uniqueness of sets to count all the distinct IPs visiting a website.

zset: 1. Ranking; 2. Priority queue.

Database management

Switch database: select 1 . There are 16 databases in the default configuration of Redis. The data between the 0th database and the 15th database is not related in any way, and the same key can exist. It is not recommended to use the Redis multi-database function. You can deploy multiple Redis instances on one machine and use the port number to distinguish them to realize the multi-database function.

The flushdb/flushall command is used to clear the database. The difference between the two is that flushdb only clears the current database, and flushall clears all databases. If the number of key values in the current database is relatively large, flushdb/flushall may block Redis, and these two commands will clear all data, and the consequences of misoperation will be disastrous.

Sort

LPUSH myList 4 8 2 3 6
SORT myList DESC
LPUSH letters f l d n c
SORT letters ALPHA

BY parameter

LPUSH list1 1 2 3
SET score:1 50
SET score:2 100
SET score:3 10
SORT list1 BY score:* DESC

GET parameter

The GET parameter makes the SORT command return the values of the keys named by the GET pattern, rather than the sorted elements themselves.

SORT tag:Java:posts BY post:*->time DESC GET post:*->title GET post:*->time GET #

GET # returns the element itself (here, the article ID).

STORE parameter

SORT tag:Java:posts BY post:*->time DESC GET post:*->title STORE resultCache

EXPIRE resultCache 10 //combining STORE with EXPIRE caches the sort result

Transactions

The principle of transaction is to send several commands within the scope of a transaction to Redis, and then let Redis execute these commands in turn.

The life cycle of the transaction:

  1. Use MULTI to start a transaction
  2. When the transaction is started, the command for each operation will be inserted into a queue, and the command will not be actually executed
  3. EXEC command to commit the transaction

DISCARD: Abandon the transaction, that is, all commands in the transaction will be cancelled

An error in a command within the scope of a transaction will not affect the execution of other commands, and atomicity is not guaranteed:

127.0.0.1:6379> multi
OK
127.0.0.1:6379> set a 1
QUEUED
127.0.0.1:6379> set b 1 2
QUEUED
127.0.0.1:6379> set c 3
QUEUED
127.0.0.1:6379> exec
1) OK
2) (error) ERR syntax error
3) OK

Commands inside a transaction read the latest value of a key at execution time, including changes made by other clients after MULTI.

WATCH command

The WATCH command can monitor one or more keys. Once one of the keys is modified, subsequent transactions will not be executed (similar to optimistic locking). After executing the EXEC command, the monitoring will be cancelled automatically.

127.0.0.1:6379> watch name
OK
127.0.0.1:6379> set name 1
OK
127.0.0.1:6379> multi
OK
127.0.0.1:6379> set name 2
QUEUED
127.0.0.1:6379> set gender 1
QUEUED
127.0.0.1:6379> exec
(nil)
127.0.0.1:6379> get gender
(nil)

UNWATCH: To cancel the monitoring of multiple keys by the WATCH command, all monitoring locks will be cancelled.
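WATCH's optimistic-locking behavior can be sketched as check-and-set: record the watched key's state at WATCH time, and at EXEC time apply the queued commands only if nothing has changed. The version counters below are an illustrative model, not Redis's internal mechanism:

```python
class MiniRedis:
    def __init__(self):
        self.data = {}
        self.versions = {}  # key -> modification counter

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1

    def watch(self, key):
        return (key, self.versions.get(key, 0))  # snapshot for comparison at EXEC

    def exec(self, watched, queued):
        key, seen_version = watched
        if self.versions.get(key, 0) != seen_version:
            return None                  # like (nil): transaction aborted
        for k, v in queued:              # otherwise run the queued commands
            self.set(k, v)
        return "OK"

r = MiniRedis()
r.set("name", "1")
w = r.watch("name")
r.set("name", "x")  # another client modifies the watched key before EXEC
print(r.exec(w, [("name", "2"), ("gender", "1")]))  # None: nothing executed
print(r.data.get("gender"))                         # None
```

This mirrors the session above: because the watched key changed between WATCH and EXEC, EXEC returns (nil) and none of the queued writes take effect.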

message queue

Use a list to let the producer put tasks into the list using the LPUSH command, and the consumer keeps using RPOP to take out tasks from the list.

BRPOP is similar to RPOP; the only difference is that when the list has no elements, BRPOP blocks the connection until a new element is pushed.

BRPOP queue 0 //0 means wait indefinitely

Priority queue

BLPOP queue:1 queue:2 queue:3 0
If more than one of the keys has elements, the element is popped from the leftmost such key

Publish/subscribe model

PUBLISH channel1 hi
SUBSCRIBE channel1
UNSUBSCRIBE channel1 //unsubscribe from a channel subscribed via SUBSCRIBE

PSUBSCRIBE channel?* subscribe by pattern
PUNSUBSCRIBE channel?* unsubscribe from channels subscribed via PSUBSCRIBE. The pattern must match the subscription string exactly: PUNSUBSCRIBE * cannot cancel a subscription made with channel?*.

Disadvantages: When the consumer goes offline, the produced message will be lost.

Delay queue

Use a sorted set: take the timestamp as the score and the message content as the member, and call zadd to produce messages. The consumer polls with zrangebyscore for messages whose score (timestamp) has already passed and processes them.
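The delay-queue idea can be sketched with a priority queue keyed by due timestamp (heapq stands in here for the sorted set; popping ready entries corresponds to the zrangebyscore step):

```python
import heapq

class DelayQueue:
    def __init__(self):
        self.heap = []  # (due_timestamp, message), like (score, member)

    def zadd(self, due, message):
        heapq.heappush(self.heap, (due, message))

    def poll(self, now):
        ready = []
        while self.heap and self.heap[0][0] <= now:  # score <= now: due
            ready.append(heapq.heappop(self.heap)[1])
        return ready

q = DelayQueue()
q.zadd(100, "send-email")
q.zadd(50, "expire-order")
q.zadd(200, "retry-payment")
print(q.poll(now=120))  # ['expire-order', 'send-email']
print(q.poll(now=120))  # []: already consumed
```

Messages scheduled for the future stay in the structure until their timestamp passes, so the consumer can simply poll on an interval.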

Persistence

Redis supports two ways of persistence, one is RDB and the other is AOF. The former will regularly store the data in the memory on the hard disk according to the specified rules, while the latter will record the command every time it is executed. Generally use a combination of the two.

RDB way

RDB is the default persistence scheme of Redis. When RDB persists, the data in the memory is written to the disk, and a dump.rdb file is generated in the specified directory. Redis restart will load the dump.rdb file to restore the data.

The process of RDB persistence (except for executing the SAVE command):

  • Create a child process;
  • The parent process continues to receive and process client requests, while the child process begins to write the data in the memory into the temporary file on the hard disk;
  • When the child process finishes writing all the data, it will replace the old RDB file with the temporary file.

When Redis starts, it will read the RDB snapshot file and load the data from the hard disk into the memory. Through RDB persistence, once Redis exits abnormally, the data changed after the last persistence will be lost.

Trigger RDB snapshot:

  1. Manual trigger:

    • The user executes the SAVE or BGSAVE command. The process of executing the snapshot by the SAVE command will block all requests from the client. Avoid using this command in a production environment. The BGSAVE command can perform the snapshot operation asynchronously in the background, and the server can continue to respond to the client's request while the snapshot is taken. Therefore, it is recommended to use the BGSAVE command when you need to execute the snapshot manually;
  2. Passive trigger:

    • Automatic snapshots are taken according to the configuration rules, such as SAVE 300 10 , if at least 10 keys are modified within 300 seconds, a snapshot will be taken.
    • If the slave node performs a full copy operation, the master node automatically executes bgsave to generate the RDB file and send it to the slave node.
    • When the shutdown command is executed by default, bgsave is automatically executed if the AOF persistence function is not enabled.
    • When the debug reload command is executed to reload Redis, the save operation will also be automatically triggered.

Advantages: Redis loads RDB to recover data much faster than AOF.

Disadvantages:

  1. There is no way to achieve real-time persistence/second-level persistence for RDB data. Because bgsave has to perform a fork operation to create a child process every time it runs, it is a heavyweight operation, and the cost of frequent execution is too high.
  2. There is a compatibility issue between the old version of the Redis service and the new version of the RDB format. RDB files are saved in a specific binary format. There are multiple RDB versions in the Redis version evolution process, and there is a problem that the old version of Redis services cannot be compatible with the new version of RDB format.

AOF method

AOF (append only file) persistence: every write command is recorded in a standalone log. When Redis restarts, it re-executes the commands in the AOF file to restore the data. AOF's main strength is solving real-time persistence, and it is currently the mainstream Redis persistence method.

AOF persistence is disabled by default; enable it by setting appendonly yes . Once enabled, every time a write command executes, Redis appends the command to the aof_buf buffer, and the AOF buffer is synced to disk according to the configured policy.

By default the operating system flushes the buffer to disk roughly every 30 seconds. To avoid losing buffered data, you can ask the system to sync the buffer to disk right after Redis writes the AOF file. The sync timing is controlled by the appendfsync parameter:

appendfsync always    //fsync after every write to the AOF; safest but slowest, supporting only a few hundred TPS; not recommended
appendfsync everysec  //balances performance and safety; recommended
appendfsync no        //let the operating system decide when to sync

Rewriting mechanism:

As commands continue to be written into the AOF, the file will become larger and larger. In order to solve this problem, Redis introduces the AOF rewrite mechanism to compress the file size. AOF file rewriting is the process of converting the data in the Redis process into write commands and syncing to the new AOF file.
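The compaction idea behind rewriting can be sketched as follows: replay the log into the current state, then emit one command per live key, which is the minimal log that rebuilds the same dataset. This is a toy command log, not Redis's actual rewrite code:

```python
# Replay a command log into a final key-value state.
def replay(commands):
    state = {}
    for cmd in commands:
        op, key, *args = cmd
        if op == "SET":
            state[key] = args[0]
        elif op == "INCR":
            state[key] = int(state.get(key, 0)) + 1
        elif op == "DEL":
            state.pop(key, None)
    return state

# Rewrite: one SET per surviving key reproduces the same state.
def rewrite(commands):
    state = replay(commands)
    return [("SET", key, value) for key, value in state.items()]

log = [
    ("SET", "name", "tyson"),
    ("INCR", "counter"),
    ("INCR", "counter"),
    ("SET", "tmp", "1"),
    ("DEL", "tmp"),
]
compact = rewrite(log)
print(len(compact))                    # 2: five commands compacted to two
print(replay(compact) == replay(log))  # True: same final dataset
```

Two INCRs collapse into one final value and the deleted key disappears entirely, which is why a rewritten AOF is much smaller than the raw history.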

Advantages:
(1) AOF protects data better against loss. With the usual configuration, AOF performs an fsync every second, so if the Redis process crashes, at most one second of data is lost.
(2) AOF is written in append-only mode, so there is no disk-seek overhead and write performance is very high.
Disadvantages:
(1) For the same dataset, the AOF file is larger than the RDB snapshot.
(2) It is less suitable for write-heavy, read-light scenarios.
(3) Data recovery is relatively slow.

How to choose between RDB and AOF:
(1) Using only RDB risks losing a lot of data.
(2) Using only AOF is not recommended either, for two reasons: first, recovery from an AOF is slow; second, periodically generating RDB snapshots is simpler, safer, and more robust.
(3) Combine the two: use AOF, which loses the least data, as the first choice for restoring data; use RDB for cold backups at different points in time, so that when the AOF file is lost or corrupted, RDB still allows fast recovery.

Cluster

Master-slave replication

The replication function of redis is to support data synchronization between multiple databases. The master database can perform read and write operations, and when the data of the master database changes, it will automatically synchronize the data to the slave database. The slave database is generally read-only, and it will receive the data synchronized from the master database. A master database can have multiple slave databases, and a slave database can only have one master database.

redis-server //start a Redis instance as the master
redis-server --port 6380 --slaveof 127.0.0.1 6379 //start another instance as a slave
slaveof 127.0.0.1 6379
SLAVEOF NO ONE //stop replicating from another database and promote this instance to a master

Synchronization mechanism

  1. Save the master node information.
  2. The master and slave establish a socket connection.
  3. The slave node sends a ping for the first communication, mainly to check the network status.
  4. Authority authentication. If the master node has set the requirepass parameter, password authentication is required. The slave node must configure the masterauth parameter to ensure that the password is the same as that of the master node to pass the verification.
  5. Synchronize the data set. During the first synchronization, the slave database sends the SYNC command to the master database after it starts. After receiving the command, the master database starts saving a snapshot in the background (the RDB persistence process) and caches the write commands received while the snapshot is being saved. When the snapshot is complete, Redis sends the snapshot file and the cached commands to the slave database, which loads the snapshot file and executes the cached commands. This process is called replication initialization.
  6. After the replication initialization is completed, the master database will synchronize the command to the slave database every time it receives a write command, thereby achieving data consistency between the master and slave databases.

Redis uses the psync command to complete master-slave data synchronization in version 2.8 and above. The synchronization process is divided into: full replication and partial replication.

Full replication: generally used in initial replication scenarios. Early versions of Redis supported only full replication: the master node sends all of its data to the slave node at once, which is a big overhead when the data set is large.

Partial replication: Used to deal with data loss scenarios caused by network interruptions in master-slave replication. When the slave node connects to the master node again, if conditions permit, the master node will reissue the lost data to the slave node. Because the reissued data is far smaller than the full amount of data, it can effectively avoid the excessive overhead of full copying.

Read and write separation

The replication function of redis can realize the separation of read and write of the database and improve the load capacity of the server. The main database mainly performs write operations, while the slave database is responsible for read operations. In many scenarios, the frequency of reading the database is greater than writing. When the stand-alone Redis cannot handle a large number of read requests, multiple slave database nodes can be established through the replication function. The master database is responsible for writing operations and the slave database is responsible for reading operations. This one-master, multiple-slave structure is very suitable for scenarios where read more and write less.
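As a sketch of the routing side of this idea, the decision of where to send each command can be modeled without a live Redis deployment. `ReadWriteRouter` below is a hypothetical helper (not part of Jedis): writes go to the master address, reads are spread round-robin across slave addresses.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical read/write router: writes go to the master, reads round-robin over slaves. */
class ReadWriteRouter {
    private final String master;
    private final List<String> slaves;
    private final AtomicInteger next = new AtomicInteger();

    ReadWriteRouter(String master, List<String> slaves) {
        this.master = master;
        this.slaves = slaves;
    }

    /** Pick the address to send a command to. */
    String route(boolean isWrite) {
        if (isWrite || slaves.isEmpty()) {
            return master;   // all writes must hit the master
        }
        // distribute reads evenly across the slave nodes
        int i = Math.floorMod(next.getAndIncrement(), slaves.size());
        return slaves.get(i);
    }
}
```

In a real application the returned address would be used to pick a connection pool; the round-robin counter is the simplest balancing policy and could be swapped for weighted or latency-aware selection.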

Persisting on the slave database

Persistence is time-consuming. To improve performance, you can create a slave database and perform persistence there, while disabling persistence on the master database.

Sentinel

When the master node crashes, you can manually promote the slave to the master and continue to provide services.

  • First, run SLAVEOF NO ONE on a slave database to promote it to a master database and continue serving;
  • Then run SLAVEOF on the remaining slave databases to make them slaves of the new master database and resynchronize the data.

The master and slave nodes can be automatically switched through the sentinel mechanism. The sentinel is an independent process used to monitor whether the redis instance is running normally.

Role

  1. Monitor the status of the redis instance
  2. If the master instance is abnormal, it will automatically switch between master and slave nodes

When the client connects to Redis, it first connects to the sentinel, which tells the client the address of the Redis master node; the client then connects to that master and performs subsequent operations. When the master node goes down, the sentinel detects the failure, re-elects a well-behaved slave node as the new master, and then notifies the other slave servers through a publish-subscribe mechanism so that they switch to the new master.

Timed task

  1. Every 10s, each Sentinel node will send info commands to the master and slave nodes to get the latest topology.
  2. Every 2s, each Sentinel node will obtain the judgment of other Sentinel nodes on the main node and the information of the current Sentinel node, which is used to judge whether the main node is objectively offline and whether there is a new Sentinel node to join.
  3. Every 1s, each Sentinel node will send a ping command to the master node, slave node, and other Sentinel nodes to do a heartbeat check to confirm whether these nodes are reachable.

working principle

  • Each Sentinel sends a PING command to the Master, Slave, and other Sentinel instances it knows once a second.
  • If an instance has not sent a valid reply to the PING command within the specified time, it will be marked as subjectively offline by that Sentinel.
  • If a Master is marked as subjectively offline, all Sentinels that are monitoring this Master must confirm whether the Master has actually entered the subjective offline state at a frequency of once per second.
  • When a sufficient number of Sentinel (greater than or equal to the value specified in the configuration file) confirm that the Master has indeed entered the subjective offline state within the specified time range, the Master will be marked as objectively offline. If there is not enough Sentinel to agree that the Master has been offline, the objective offline status of the Master will be removed. If the Master returns a valid reply to Sentinel's PING command, the subjective offline status of the Master will be removed.
  • The sentinel node will elect a sentinel leader to be responsible for failover.
  • The sentinel leader will select a well-behaved slave node to become the new master node, and then notify other slave nodes to update the master node.
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

import org.junit.Test;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPoolConfig;
import redis.clients.jedis.JedisSentinelPool;

/**
 * Test Redis sentinel mode
 * @author liu
 */
public class TestSentinels {
    @Test
    public void testSentinel() {
        JedisPoolConfig jedisPoolConfig = new JedisPoolConfig();
        jedisPoolConfig.setMaxTotal(10);
        jedisPoolConfig.setMaxIdle(5);
        jedisPoolConfig.setMinIdle(5);
        // Sentinel addresses
        Set<String> sentinels = new HashSet<>(Arrays.asList("192.168.11.128:26379",
                "192.168.11.129:26379", "192.168.11.130:26379"));
        // Create the connection pool (master name, sentinels, pool config, password),
        // closing the pool and the client automatically via try-with-resources
        try (JedisSentinelPool pool = new JedisSentinelPool("mymaster", sentinels, jedisPoolConfig, "123456");
             Jedis jedis = pool.getResource()) {
            // Execute two commands
            jedis.set("mykey", "myvalue");
            String value = jedis.get("mykey");
            System.out.println(value);
        }
    }
}

cluster

The cluster is used to share the write pressure, and the master-slave is used for disaster backup and high availability as well as to share the read pressure.

The problem with master-slave replication is that it cannot fail over automatically and therefore cannot achieve high availability.
Sentinel mode solves that problem, but the write capacity and storage capacity of the master node are still limited by a single machine's configuration.
The cluster mode realizes the distributed storage of Redis, each node stores different content, and solves the problem that the write ability and capacity of the master node are limited by the single-machine configuration.

Hash partition algorithm

Node modulo partitioning. Take a specific field of the data, such as the Redis key or a user ID, and compute the remainder modulo the number of nodes N: hash(key) % N determines which node the data maps to.
The advantage is simplicity. When expanding capacity, the number of nodes is usually doubled, to avoid disrupting all data mappings and triggering a full migration.
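A minimal sketch of node modulo partitioning, using Java's `String.hashCode()` as a stand-in hash function; the `main` method also illustrates why resizing is painful: changing N remaps most keys.

```java
class ModuloPartition {
    /** Map a key to one of n nodes by taking the hash modulo n. */
    static int nodeFor(String key, int n) {
        return Math.floorMod(key.hashCode(), n);   // floorMod avoids negative results
    }

    public static void main(String[] args) {
        String[] keys = {"user:1", "user:2", "user:3", "user:4"};
        for (String k : keys) {
            // when the cluster grows from 4 to 5 nodes, most keys map to a different node
            System.out.println(k + ": " + nodeFor(k, 4) + " -> " + nodeFor(k, 5));
        }
    }
}
```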

Consistent hash partitioning: assign each node in the system a token, generally in the range 0 to 2^32; these tokens form a hash ring. When reading or writing data, to locate the node, first compute the hash value of the key, then walk clockwise and pick the first node whose token is greater than or equal to that hash value.
Compared with node modulo partitioning, the biggest advantage of this method is that adding or removing a node only affects its adjacent nodes on the hash ring and has no effect on the other nodes.
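The clockwise lookup can be sketched with a sorted map of token to node; `TreeMap.tailMap` finds the first token greater than or equal to a hash, wrapping around to the ring's first node when none exists. This is a simplified illustration (one token per node, integer tokens) rather than a production implementation:

```java
import java.util.SortedMap;
import java.util.TreeMap;

/** Sketch of a consistent hash ring: token -> node, lookups walk clockwise. */
class ConsistentHashRing {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addNode(String node, int token) { ring.put(token, node); }

    void removeNode(int token) { ring.remove(token); }

    /** Find the first node clockwise whose token is >= the key's hash. */
    String nodeFor(int keyHash) {
        SortedMap<Integer, String> tail = ring.tailMap(keyHash);
        // wrap around to the start of the ring if no token is >= keyHash
        return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
    }
}
```

Removing a node shows the locality property: only keys between the removed node and its predecessor are remapped, to the next node clockwise.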

Redis Cluster uses virtual slot partitioning. All keys are mapped by a hash function to integer slots 0-16383, computed as slot = CRC16(key) & 16383. Each node is responsible for maintaining a portion of the slots and the key-value data mapped to those slots.
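The slot calculation can be reproduced with the CRC16-CCITT (XMODEM) variant that Redis Cluster uses (polynomial 0x1021, initial value 0). Note that real Redis Cluster additionally honors hash tags ({...} substrings) so related keys can share a slot; that detail is omitted in this sketch:

```java
import java.nio.charset.StandardCharsets;

class ClusterSlot {
    /** CRC16-CCITT (XMODEM) as used by Redis Cluster: poly 0x1021, init 0x0000. */
    static int crc16(byte[] bytes) {
        int crc = 0;
        for (byte b : bytes) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? (crc << 1) ^ 0x1021 : crc << 1;
                crc &= 0xFFFF;   // keep the register at 16 bits
            }
        }
        return crc;
    }

    /** slot = CRC16(key) & 16383 maps every key into one of 16384 slots. */
    static int slotFor(String key) {
        return crc16(key.getBytes(StandardCharsets.UTF_8)) & 16383;
    }
}
```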

Failover

The nodes in the Redis cluster implement node communication through ping/pong messages. The messages can not only propagate node slot information, but also other states such as master-slave status, node failure, etc. Therefore, fault detection is also realized through the message dissemination mechanism, and the main links include: subjective offline (pfail) and objective offline (fail).

Lua script

Redis builds atomic compound commands through Lua scripts: while a Lua script is running, no other script or Redis command is executed, so a combination of commands runs as one atomic operation.

There are two ways to execute Lua scripts in Redis: eval and evalsha.

The eval command uses the built-in Lua interpreter to evaluate Lua scripts.

//the first argument is the Lua script, the second is the number of key-name arguments, and the rest are the key names and additional arguments
> eval "return {KEYS[1],KEYS[2],ARGV[1],ARGV[2]}" 2 key1 key2 first second
1) "key1"
2) "key2"
3) "first"
4) "second"

evalsha

Redis also provides the evalsha command to execute Lua scripts. First, load the Lua script into the Redis server to get the script's SHA1 checksum; the evalsha command then executes the server-cached script identified by that SHA1 checksum.

The script load command can load script content into Redis memory.

redis 127.0.0.1:6379> SCRIPT LOAD "return 'hello moto'"
"232fd51614574cf0867b83d384a5e898cfd24e5a"

redis 127.0.0.1:6379> EVALSHA "232fd51614574cf0867b83d384a5e898cfd24e5a" 0
"hello moto"


Lua script function

1. Lua scripts are executed atomically in Redis, and no other commands will be inserted during execution.

2. Lua scripts can pack multiple commands at once, effectively reducing network overhead.

Application scenario

Limit the frequency of interface access.

Redis maintains a key-value pair for the number of interface accesses, where key is the interface name and value is the number of accesses. Each time the interface is accessed, the following operations are performed:

  • Intercept interface requests through AOP and count them: each time a request comes in, increment the corresponding interface's count by 1 and store it in redis.
  • If it is the first request, set count=1 and set the expiration time. Because the combined set() and expire() operations are not atomic, a Lua script is introduced to make the operation atomic and avoid concurrency problems.
  • If the maximum number of visits is exceeded within the given time window, an exception is thrown.
// Build a Lua script that atomically increments the access count for KEYS[1],
// sets an expiration on the first access, and returns the current count.
private String buildLuaScript() {
    return "local c" +
        "\nc = redis.call('get',KEYS[1])" +
        "\n-- if the limit has already been exceeded, return the count as-is" +
        "\nif c and tonumber(c) > tonumber(ARGV[1]) then" +
        "\nreturn c;" +
        "\nend" +
        "\nc = redis.call('incr',KEYS[1])" +
        "\n-- on the first access, start the time window by setting an expiration" +
        "\nif tonumber(c) == 1 then" +
        "\nredis.call('expire',KEYS[1],ARGV[2])" +
        "\nend" +
        "\nreturn c;";
}

String luaScript = buildLuaScript();
RedisScript<Number> redisScript = new DefaultRedisScript<>(luaScript, Number.class);
Number count = redisTemplate.execute(redisScript, keys, limit.count(), limit.period());

Delete strategy

  1. Passive deletion. When a key is accessed and found to be expired, it is deleted.
  2. Active deletion. Keys are cleaned up periodically. Each cleanup traverses all DBs in turn, randomly sampling 20 keys from each db and deleting the expired ones; if 5 or more of them are expired, the same db is cleaned again, otherwise cleanup moves on to the next db.
  3. Eviction when memory is insufficient. Redis has a maximum memory limit, set through the maxmemory parameter. When used memory exceeds that limit, memory must be released according to the configured eviction strategy. There are generally 6 strategies, with 2 more added in Redis 4.0, falling into three categories:

    • The first category does not evict anything: noeviction. When memory is insufficient, no key is deleted, and write commands directly return an error. (Default configuration)
    • The second category selects keys from the set of all keys and evicts them:

      • allkeys-random randomly selects keys from all keys and evicts them
      • allkeys-lru selects the least recently used keys from all keys and evicts them
      • allkeys-lfu selects the least frequently used keys from all keys and evicts them (a new strategy in Redis 4.0)
    • The third category selects keys from those with an expiration time set and evicts them:

      Keys are selected from the result set with an expires expiration time set, using one of these algorithms:

      • volatile-random randomly selects keys to delete from the result set with an expiration time set
      • volatile-lru selects the least recently used keys from the result set with an expiration time set and evicts them
      • volatile-ttl selects the keys with the shortest remaining time to live from the result set with an expiration time set and deletes them (that is, keys about to expire are deleted first)
      • volatile-lfu selects the least frequently used keys from the result set with an expiration time set and deletes them (a new strategy in Redis 4.0)
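To illustrate the lru family of strategies, here is a minimal sketch of least-recently-used eviction built on Java's LinkedHashMap in access order. Note this is exact LRU for clarity; Redis itself uses an approximate, sampling-based LRU to save memory:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Exact LRU cache sketch; Redis approximates LRU by sampling instead. */
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true);   // accessOrder=true: get() moves an entry to the tail
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;   // evict the least recently used entry when over capacity
    }
}
```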

other

Client

The communication protocol between the Redis client and the server is built on top of the TCP protocol.

The Redis Monitor command is used to print out the commands received by the Redis server in real time for debugging.

redis 127.0.0.1:6379> MONITOR 
OK
1410855382.370791 [0 127.0.0.1:60581] "info"
1410855404.062722 [0 127.0.0.1:60581] "get" "a"

Slow query

Redis provides a slow query statistics function: executing slowlog get {n} returns the latest n slow query commands. By default, commands that take more than 10 milliseconds to execute are recorded in a fixed-length queue. For online instances it is recommended to set the threshold to 1 millisecond, so that commands at the millisecond level can be discovered in time. The slow query queue length defaults to 128 and can be increased appropriately.

A command executed by the Redis client goes through 4 stages: sending the command, queuing, execution, and returning the result. Slow query only measures the execution stage, so the absence of slow queries does not mean that clients have no timeout problems.

Redis provides slowlog-log-slower-than (setting the slow query threshold in microseconds) and slowlog-max-len (slow query queue size) to configure slow query parameters.

Related commands:

slowlog get n //get the slow query log
slowlog len //current length of the slow query log queue
slowlog reset //reset and clear the log list

Slow query solution:

  1. Switch to commands with lower time complexity, for example replacing hgetall with hmget, and disable commands such as keys and sort.
  2. Trim large objects: reduce the data in a large object, or split it into multiple small objects, to prevent a single command from operating on too much data.

pipeline

The redis client executes a command in 4 stages: send command -> queue -> execute -> return result. A pipeline sends requests in batches and receives results in batches, which is faster than executing commands one by one.

The number of commands assembled in one pipeline should not be too large; otherwise the data volume becomes too big, increasing the client's waiting time and possibly causing network congestion. A large batch of commands can be split into several smaller pipelines.
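The splitting step can be sketched as a small helper that partitions a command list into fixed-size chunks, one chunk per pipeline round trip (the chunk size and command type here are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

class PipelineChunks {
    /** Split a large command list into fixed-size chunks, one per pipeline round trip. */
    static <T> List<List<T>> chunk(List<T> commands, int size) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < commands.size(); i += size) {
            // subList gives a view of [i, i+size), clamped to the end of the list
            chunks.add(commands.subList(i, Math.min(i + size, commands.size())));
        }
        return chunks;
    }
}
```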

Comparison of native batch commands (mset, mget) and Pipeline:

  1. Native batch commands are atomic, while pipelines are not: if a pipeline exits abnormally partway through, the commands that have already executed are not rolled back.
  2. A native batch command is a single command operating on multiple keys, while a pipeline can combine multiple different commands.

Data consistency

How to ensure data consistency between the cache and the DB:
Read operation: read the cache first; on a miss, read the DB, put the data into the cache, and finally respond with the data.
Write operation: delete the cache first, then update the DB.
Why delete the cache instead of updating it?

  1. Thread safety. Suppose request A and request B both perform an update at the same time; the order of events may be: (1) thread A updates the cache, (2) thread B updates the cache, (3) thread B updates the database, (4) thread A updates the database. Because of network delays and similar reasons, request B updates the database before request A does, leaving the cache and the database inconsistent.
  2. If the business needs more scenarios for writing databases, but fewer scenarios for reading data, using this solution will result in frequent updates of the cache before the data is read at all, which wastes performance.
  3. If you write a value to the database, it is not directly written to the cache, but to be written to the cache after a series of complex calculations. Then, after each write to the database, the value written to the cache is calculated again, which is undoubtedly a waste of performance.

Deleting the cache first and then updating the DB also has problems: if A deletes the cache but has not yet updated the DB when B requests the data, B misses the cache, reads the old data from the DB, and writes it back to the cache. After A finishes updating the DB, the cache and the DB are inconsistent.

Solution: adopt a delayed double-deletion strategy. After updating the database, wait for a period of time and then delete the cache again, to ensure that dirty cache data written back by concurrent read requests is removed. Evaluate how long the project's read-data business logic takes; the write path's sleep time should be that duration plus a few hundred milliseconds.

public void write(String key, Object data) throws InterruptedException {
    redis.delKey(key);
    db.updateData(data);
    // make sure in-flight read requests have finished, so that the second delete
    // removes any dirty cache data those reads may have written back
    Thread.sleep(1000);
    redis.delKey(key);
}

The second deletion can be treated as asynchronous. Start a thread by yourself and delete it asynchronously. In this way, the write request does not need to sleep for a period of time and increase the throughput.
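A sketch of that asynchronous variant, using a ScheduledExecutorService for the second delete. The `redis`/`db` objects in the snippet above are placeholders from the article, so plain ConcurrentHashMaps stand in for the cache and the DB here:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Delayed double delete where the second delete runs asynchronously. */
class DelayedDoubleDelete {
    final Map<String, Object> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    final Map<String, Object> db = new ConcurrentHashMap<>();    // stand-in for the DB
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    void write(String key, Object data) {
        cache.remove(key);   // first delete
        db.put(key, data);   // update the DB
        // second delete runs later on another thread, so the writer does not sleep
        scheduler.schedule(() -> cache.remove(key), 500, TimeUnit.MILLISECONDS);
    }

    void shutdown() { scheduler.shutdown(); }
}
```

The 500 ms delay is illustrative; as the article notes, it should be derived from how long a read request takes in the actual project.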

When deleting the cache fails, data inconsistency will also occur.

Solution: retry the deletion, for example by putting the key that failed to delete into a message queue and retrying asynchronously until the delete succeeds.


Author: 程序员大彬 (personal website: topjavaer.cn)