Redis internally uses a redisObject structure to represent every key and value. The type field of a redisObject records the data type of the value object: String, List, Hash, Set, or Sorted Set.

When we store object information in day-to-day work, there are generally two approaches: store it as a Hash, or store it as a String. There does not seem to be a single agreed best practice, so which data structure is actually the better choice?

First, a brief review of the Hash and String structures in Redis.

String

The String data structure is a simple key-value type, and the value can be a number as well as text. A String in Redis can carry several kinds of data:

  • String (bytes)
  • Integer
  • Floating point

Redis converts between these three forms automatically according to the scenario and chooses the underlying representation as needed. By default, a String is stored internally as a string referenced by a redisObject. When an operation such as INCR or DECR is applied, the value is treated as a number for the calculation, and the encoding field of the redisObject is int.
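To make the encoding choice more concrete, here is a rough model of how a freshly written String value maps to an encoding. It is a sketch under assumptions, not Redis source code: the function name guessEncoding is hypothetical, and the 44-byte embstr limit assumes Redis 3.2 or later. On a real server, OBJECT ENCODING reports the actual encoding of a key.

```javascript
// Simplified model of the encoding Redis picks for a String value written with SET.
// Assumption: Redis >= 3.2, where short strings (<= 44 bytes) use "embstr".
const INT64_MIN = -(2n ** 63n);
const INT64_MAX = 2n ** 63n - 1n;
const EMBSTR_LIMIT = 44; // bytes; longer values fall back to "raw"

function guessEncoding(value) {
  const text = String(value);
  // Canonical decimal integers that fit in a signed 64-bit range are stored as int.
  if (/^(0|-?[1-9]\d*)$/.test(text)) {
    const n = BigInt(text);
    if (n >= INT64_MIN && n <= INT64_MAX) return "int";
  }
  // Everything else, including floating point text, is kept as a string.
  return Buffer.byteLength(text, "utf8") <= EMBSTR_LIMIT ? "embstr" : "raw";
}

console.log(guessEncoding("35"));            // int
console.log(guessEncoding("3.5"));           // embstr
console.log(guessEncoding("xiaowang"));      // embstr
console.log(guessEncoding("x".repeat(100))); // raw
```

The model only covers values as they are first written; commands that modify a string in place can change its encoding afterwards.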

When storing data, we can use JSON to serialize the user information into a string and then cache that serialized string in Redis.

Figure: String data structure

Because Redis strings are dynamic and can be modified in place, their internal structure resembles Java's ArrayList: extra space is pre-allocated to reduce the number of memory reallocations. As shown in the figure above, the capacity actually allocated for a string is generally larger than its actual length, len.
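The pre-allocation idea can be sketched with a small model. This is an approximation of the growth policy used by Redis's dynamic strings (double the needed size while it is under 1 MB, otherwise add 1 MB); the function name nextCapacity is hypothetical and the code is not taken from Redis itself.

```javascript
// Simplified model of the pre-allocation policy for Redis dynamic strings.
const MAX_PREALLOC = 1024 * 1024; // 1 MB

// Returns the capacity after appending `addLen` bytes to a string that
// currently holds `len` bytes inside a buffer of `alloc` bytes.
function nextCapacity(len, alloc, addLen) {
  const needed = len + addLen;
  if (needed <= alloc) return alloc; // spare room left: no reallocation
  return needed < MAX_PREALLOC
    ? needed * 2                     // small strings: double the needed size
    : needed + MAX_PREALLOC;         // large strings: add 1 MB of headroom
}

console.log(nextCapacity(8, 8, 4));  // 24 (buffer grows, with spare room)
console.log(nextCapacity(8, 24, 4)); // 24 (fits in the existing buffer)
```

The second call shows the benefit: once spare capacity exists, a later append fits without another allocation.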

Suppose the structure we want to store is:

{
  "name": "xiaowang",
  "age": "35"
}

If the user's name is then changed to "xiaoli" and the object is stored in Redis again, Redis does not need to reallocate space. We only need to serialize to JSON when writing and deserialize when reading, which is convenient.
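A minimal sketch of this String approach follows. The helper names (userKey, encodeUser, decodeUser, saveUser, loadUser) are hypothetical, and client is assumed to be any object with promise-based set and get methods, such as an ioredis instance.

```javascript
// Store a whole object under one Redis String key by serializing it to JSON.
// `client` is assumed to expose promise-based set/get, as ioredis does.
const userKey = (id) => `String:user:${id}`;

const encodeUser = (user) => JSON.stringify(user);
const decodeUser = (text) => (text === null ? null : JSON.parse(text));

async function saveUser(client, id, user) {
  await client.set(userKey(id), encodeUser(user));
}

async function loadUser(client, id) {
  // GET returns null when the key does not exist.
  return decodeUser(await client.get(userKey(id)));
}

// The serialization round trip itself needs no server:
const user = { name: "xiaowang", age: "35" };
const renamed = { ...decodeUser(encodeUser(user)), name: "xiaoli" };
console.log(encodeUser(renamed)); // {"name":"xiaoli","age":"35"}
```

Note that changing a single field still means reading, decoding, re-encoding, and writing the whole object.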

Hash

Hashes are widely used in many programming languages, and the same is true in Redis. In Redis, a Hash is often used to cache object information such as user, product, or configuration data, which is why it is also called a dictionary. The Redis dictionary uses a hash table as its underlying implementation. A hash table can contain multiple hash table nodes, and each node stores one key-value pair of the dictionary. In fact, the Redis database itself also uses a hash table at the bottom layer to store its key-value pairs.

A Redis Hash is comparable to Java's HashMap, and its internal structure is the same: an array plus linked lists. Only the rehash approach differs (Redis rehashes incrementally rather than all at once).

Figure: Hash data structure

As mentioned earlier, a String is suitable for storing user information. A Hash can store user information too, but it stores each field separately, so a query can fetch only the fields it needs and save network traffic. However, the values in a Redis Hash can only be strings. That is fine for the example above, but suppose the stored user information becomes:

{
  "name": "xiaowang",
  "age": 25,
  "clothes": {
    "shirt": "gray",
    "pants": "red"
  }
}

How to store the "clothes" attribute then becomes a question of whether to use String or Hash.
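Two common workarounds for nested values in a Hash can be sketched as follows. Both helper names (flattenForHash, jsonEncodeNested) are hypothetical, and the dotted field naming is only one possible convention. The resulting flat objects could be passed to a Hash write such as hmset.

```javascript
// Option 1: flatten nested objects into dotted field names ("clothes.shirt").
function flattenForHash(obj, prefix = "") {
  const fields = {};
  for (const [key, value] of Object.entries(obj)) {
    const name = prefix ? `${prefix}.${key}` : key;
    if (Array.isArray(value)) {
      fields[name] = JSON.stringify(value); // arrays are kept as JSON text
    } else if (value !== null && typeof value === "object") {
      Object.assign(fields, flattenForHash(value, name));
    } else {
      fields[name] = String(value); // Hash values can only be strings
    }
  }
  return fields;
}

// Option 2: keep top-level fields and JSON-encode any nested value.
function jsonEncodeNested(obj) {
  const fields = {};
  for (const [key, value] of Object.entries(obj)) {
    const nested = value !== null && typeof value === "object";
    fields[key] = nested ? JSON.stringify(value) : String(value);
  }
  return fields;
}

const user = { name: "xiaowang", age: 25, clothes: { shirt: "gray", pants: "red" } };
console.log(flattenForHash(user));
console.log(jsonEncodeNested(user));
```

Option 1 keeps every leaf individually readable but loses the original shape and types; option 2 keeps the shape of the nested part but brings back JSON parsing for that field.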

Comparison of the memory occupied by String and Hash

Since both data structures can store structured information, which one is more suitable?

First, we use code to insert 10,000 records, and then use a visualization tool to look at the memory usage.

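// Requires the ioredis package and two Redis instances on ports 6370 and 6371.
const Redis = require("ioredis");
const Redis0 = new Redis({ port: 6370 }); // stores each user as a JSON String
const Redis1 = new Redis({ port: 6371 }); // stores each user as a Hash

const user = {
  name: 'name12345',
  age: 16,
  avatar: 'https://dss3.bdstatic.com/70cFv8Sh_Q1YnxGkpoWK1HF6hhy/it/u=256767015,24101428&fm=26&gp=0.jpg',
  phone: '13111111111',
  email: '1111111@11.email',
  lastLogon: '2021-04-28 10:00:00',
};

async function main() {
  for (let i = 0; i < 10000; i++) {
    // Same object, written once as a serialized String and once as a Hash.
    await Redis0.set(`String:user:${i}`, JSON.stringify(user));
    await Redis1.hmset(`Hash:user:${i}`, user);
  }
}

main().then(
  () => process.exit(0),
  (err) => {
    console.error(err);
    process.exit(1);
  }
);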

Look at Redis0 first:

Let's look at Redis1 again:

There is a small difference in memory usage between the two, but it is not significant.
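If no visualization tool is at hand, the same comparison can be made from the output of the INFO memory command. The sketch below parses that output; parseInfo and usedMemoryBytes are hypothetical helper names, client is assumed to be an ioredis-style client whose info method returns the raw text, and the sample text contains made-up numbers for illustration only, not measurements from this test.

```javascript
// Parse the "key:value" lines returned by the Redis INFO command.
function parseInfo(infoText) {
  const stats = {};
  for (const line of infoText.split(/\r?\n/)) {
    if (!line || line.startsWith("#")) continue; // skip blanks and section headers
    const idx = line.indexOf(":");
    if (idx > 0) stats[line.slice(0, idx)] = line.slice(idx + 1);
  }
  return stats;
}

// Reads used_memory (in bytes) from a connected client.
async function usedMemoryBytes(client) {
  const stats = parseInfo(await client.info("memory"));
  return Number(stats.used_memory);
}

// Illustrative input only; these numbers were not measured.
const sample = "# Memory\r\nused_memory:1048576\r\nused_memory_human:1.00M\r\n";
console.log(Number(parseInfo(sample).used_memory)); // 1048576
```

Calling usedMemoryBytes on each instance before and after the insert loop gives the memory attributable to each storage style.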

Community discussion

Other developers have asked the same question: because the length of the value is not known in advance, it is unclear whether String or Hash is the more efficient way to store it.

Screenshot source: StackOverflow (Redis strings vs Redis hashes to represent JSON: efficiency?)

Here are the main points of the high-quality answers to that question:

Suitable for String storage:

  • Most of the fields need to be accessed on every read
  • The stored structure has multiple levels of nesting

Suitable for Hash storage:

  • In most cases only a small number of fields need to be accessed
  • You always know which fields are available, so that HMGET does not come back without the data you want
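The last point can be illustrated with a small sketch. HMGET returns values in the order the fields were requested and returns null for a field that does not exist, so the caller has to check for gaps. The helper names zipHashReply and getUserFields are hypothetical, and client is assumed to be an ioredis-style client.

```javascript
// Pair requested field names with an HMGET reply and report missing fields.
function zipHashReply(fields, values) {
  const found = {};
  const missing = [];
  fields.forEach((field, i) => {
    if (values[i] === null || values[i] === undefined) missing.push(field);
    else found[field] = values[i];
  });
  return { found, missing };
}

// Fetch only the requested fields of one user Hash.
async function getUserFields(client, id, fields) {
  const values = await client.hmget(`Hash:user:${id}`, ...fields);
  return zipHashReply(fields, values);
}

// "nickname" is not a field of the stored user, so the reply holds null for it.
const reply = zipHashReply(["name", "nickname"], ["name12345", null]);
console.log(reply.found);   // name is returned
console.log(reply.missing); // nickname is reported as missing
```

Only the requested fields travel over the network, which is where the Hash layout saves traffic compared with fetching and parsing a whole JSON string.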

Summary

This article looked at whether to use Hash or String to store object information in Redis. In most cases String storage is recommended: it is much more convenient for objects with multiple levels of nesting, and it takes up less space than Hash. When we need to store a particularly large object and usually only access a few of its fields, Hash is worth considering.

云叔_又拍云