Learning MongoDB: its rich set of indexes

MongoDB indexes and MySQL indexes are basically similar, as are the principles for optimizing them. MySQL indexes can be broadly divided into:

Single-key index / joint (compound) index
Primary key index (clustered index) / non-primary key index (non-clustered index)

In MongoDB there are also some special index types, such as: array index | sparse index | geospatial index | TTL index, etc.

To make the tests below easier, we use a script to insert the following data:

for(var i = 0;i < 100000;i++){
    db.users.insertOne({
        username: "user"+i,
        age: Math.random() * 100,
        sex: i % 2,
        phone: 18468150001+i
    });
}

Single key index

A single-key index means that there is only one field in the index, which is the most basic indexing method.

Create a single-key index on the username field of the collection. MongoDB automatically names this index username_1.

db.users.createIndex({username:1})
'username_1'

After creating the index, look at the plan for a query on username: the stage is IXSCAN, which means that an index scan is used.

db.users.find({username:"user40001"}).explain()
{ 
   queryPlanner: 
   { 
     winningPlan: 
     { 
        ......
        stage: 'FETCH',
        inputStage: 
        { 
           stage: 'IXSCAN',
           keyPattern: { username: 1 },
           indexName: 'username_1',
           ......
        } 
     },
     rejectedPlans: []
   },
   ......
   ok: 1 
}

Among the principles of index optimization, a very important one is that an index should be built on a field with high cardinality. Cardinality is the number of distinct values in a field. For example, if age in users takes the integer values 0-99, the age field has 100 distinct values, so its cardinality is 100. The sex field only ever holds the two values 0 | 1, so its cardinality is 2, which is quite low. In that case the index is not efficient, and it may end up not being used at all.
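The idea of cardinality can be made concrete with a small sketch in plain JavaScript that needs no MongoDB connection. The cardinality helper and the sample documents are hypothetical, and ages are taken to be integers 0-99 purely for illustration:

```javascript
// Count the distinct values of a field across an array of documents.
// `cardinality` is a hypothetical helper, not a MongoDB API.
function cardinality(docs, field) {
  return new Set(docs.map((doc) => doc[field])).size;
}

// Sample data shaped like the users collection, with integer ages 0-99 (an assumption).
const sampleUsers = [];
for (let i = 0; i < 1000; i++) {
  sampleUsers.push({ username: "user" + i, age: i % 100, sex: i % 2 });
}

const ageCardinality = cardinality(sampleUsers, "age");
const sexCardinality = cardinality(sampleUsers, "sex");
const usernameCardinality = cardinality(sampleUsers, "username");

// On average, an equality match on a field returns docs.length / cardinality documents,
// so a low-cardinality field narrows the search very little.
console.log("age:", ageCardinality, "sex:", sexCardinality, "username:", usernameCardinality);
console.log("average documents per sex value:", sampleUsers.length / sexCardinality);
```

Against a real collection, the same number could be obtained in the shell with something like db.users.distinct("sex").length.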

Below we create an index on the sex field. When you look at the execution plan, you will find that a full collection scan (COLLSCAN) is used for the query and the index is not used.

db.users.createIndex({sex:1})
'sex_1'

db.users.find({sex:1}).explain()
{ 
  queryPlanner: 
  { 
     ......
     winningPlan: 
     { 
        stage: 'COLLSCAN',
        filter: { sex: { '$eq': 1 } },
        direction: 'forward' 
     },
     rejectedPlans: [] 
  },
  ......
  ok: 1 
}

Joint index

A joint index is an index that contains multiple fields. Below, we create an index on the age and sex fields:

db.users.createIndex({age:1,sex:1})
'age_1_sex_1'

Then we run a query on these two fields and view the execution plan. The query successfully uses this index:

db.users.find({age:23,sex:1}).explain()
{ 
  queryPlanner: 
  { 
     ......
     winningPlan: 
     { 
        stage: 'FETCH',
        inputStage: 
        { 
           stage: 'IXSCAN',
           keyPattern: { age: 1, sex: 1 },
           indexName: 'age_1_sex_1',
           .......
           indexBounds: { age: [ '[23, 23]' ], sex: [ '[1, 1]' ] } 
        } 
     },
     rejectedPlans: [], 
  },
  ......
  ok: 1 
 }

Array index

An array index is an index created on an array field, also called a multikey index. Below we add an array field to some of the documents in users:

db.users.updateOne({username:"user1"},{$set:{hobby:["唱歌","篮球","rap"]}})
......

Create an array index, then query for the hobby "钓鱼" (fishing) and view the execution plan. isMultiKey: true indicates that the index used is a multikey index.

db.users.createIndex({hobby:1})
'hobby_1'

db.users.find({hobby:{$elemMatch:{$eq:"钓鱼"}}}).explain()
{ 
   queryPlanner: 
   { 
     ......
     winningPlan: 
     { 
        stage: 'FETCH',
        filter: { hobby: { '$elemMatch': { '$eq': '钓鱼' } } },
        inputStage: 
        { 
           stage: 'IXSCAN',
           keyPattern: { hobby: 1 },
           indexName: 'hobby_1',
           isMultiKey: true,
           multiKeyPaths: { hobby: [ 'hobby' ] },
           ......
           indexBounds: { hobby: [ '["钓鱼", "钓鱼"]' ] }
        }
     },
     rejectedPlans: [] 
  },
  ......
  ok: 1 
}

Compared with other indexes, an array index necessarily has many more index entries and a larger size, growing by a multiple of the array length. For example, if the hobby array in each document has an average size of 10, the number of entries in the hobby array index will be 10 times that of a normal index.
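A minimal sketch of why the entry count multiplies, in plain JavaScript. It models an array index as one entry per distinct array element per document, which is a simplification; the multikeyEntries helper and the sample data are hypothetical:

```javascript
// Build the (key, id) entries an array index would hold for one field.
// Simplified model: one entry per distinct array element per document.
function multikeyEntries(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? [...new Set(value)] : [value];
    for (const key of keys) {
      entries.push({ key, id: doc._id });
    }
  }
  return entries;
}

const hobbyDocs = [
  { _id: 1, hobby: ["singing", "basketball", "rap"] },
  { _id: 2, hobby: ["fishing", "table tennis"] },
  { _id: 3, hobby: ["fishing"] },
];

const hobbyEntries = multikeyEntries(hobbyDocs, "hobby");
// A single-key index on these 3 documents would hold 3 entries;
// the array index holds one per element instead.
console.log(hobbyEntries.length, "entries for", hobbyDocs.length, "documents");
```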

Joint array index

A joint array index is a joint index that contains an array field. This kind of index does not support multiple array fields, that is, there can be at most one array field in an index. This is to avoid explosive growth of index entries: if an index contained two array fields of sizes n and m, the number of index entries would be n*m times that of a normal index.
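The n*m growth can be illustrated with a short JavaScript sketch. It assumes, hypothetically, that an index over two array fields were allowed and needed one entry per pair of elements; the cartesianEntries helper and the tags field are made up for the example:

```javascript
// Hypothetical: entries needed if one index covered two array fields,
// one entry per (a, b) pair, i.e. the Cartesian product of the two arrays.
function cartesianEntries(doc, fieldA, fieldB) {
  const entries = [];
  for (const a of doc[fieldA]) {
    for (const b of doc[fieldB]) {
      entries.push([a, b]);
    }
  }
  return entries;
}

// `tags` is an invented second array field, used only for illustration.
const person = {
  hobby: ["singing", "basketball", "rap"],
  tags: ["w", "x", "y", "z"],
};

const pairEntries = cartesianEntries(person, "hobby", "tags");
console.log(pairEntries.length, "entries for a single document"); // n * m = 3 * 4
```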

Geospatial index

Add some geographic information to the existing users collection:

for(var i = 0;i < 100000;i++){
    db.users.updateOne(
    {username:"user"+i},
    {
        $set:{
            location:{
                type: "Point",
                coordinates: [100+Math.random() * 4,40+Math.random() * 3]
            }
        }
    });
}

Create a 2dsphere geospatial index:

db.users.createIndex({location:"2dsphere"})
'location_2dsphere'

// Find people within 500 meters
db.users.find({
  location:{
    $near:{
      $geometry:{type:"Point",coordinates:[102,41.5]},
      $maxDistance:500
    }
  }
})

The type of an indexed geospatial object can take many values, such as Point | LineString | Polygon, etc.
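To get a feel for what $maxDistance: 500 means for Point data, here is a rough sketch of a spherical distance check in plain JavaScript using the haversine formula. The helper names and the Earth radius constant are assumptions for illustration, and MongoDB's own calculation may differ in detail:

```javascript
// Great-circle distance in meters between two [longitude, latitude] pairs (haversine formula).
// EARTH_RADIUS_M is an assumed mean radius, not necessarily the constant MongoDB uses.
const EARTH_RADIUS_M = 6371000;

function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

const center = [102, 41.5];     // same point as in the query above
const nearby = [102, 41.501];   // 0.001 degrees of latitude to the north
const farAway = [102.01, 41.5]; // 0.01 degrees of longitude to the east

const isWithin500m = (point) => haversineMeters(center, point) <= 500;
console.log(isWithin500m(nearby), isWithin500m(farAway));
```

Note that GeoJSON coordinates are written longitude first, then latitude, which is the order used both in the data above and in this sketch.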

TTL index

TTL stands for time to live. A TTL index is mainly used to delete expired data automatically. To use it, the documents need a date-type field, and when creating the TTL index on that field you set expireAfterSeconds, the expiration time in seconds. After the index is created, MongoDB periodically checks the data in the collection. When the following holds:

$$ \text{current time} - \text{TTL index field time} > \text{expireAfterSeconds} $$

MongoDB will automatically delete these documents. This index has the following requirements:

A TTL index can only have one field; there is no joint TTL index
A TTL index cannot be used on capped (fixed-size) collections
The TTL index is traversed entry by entry, and documents that meet the deletion condition are removed with the ordinary delete function, which is not efficient
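The expiry condition above can be sketched as an in-memory check in plain JavaScript. The isExpired and sweep helpers are hypothetical and only mimic one pass of the periodic cleanup; they are not how MongoDB implements it:

```javascript
// A document expires when: now - fieldTime > expireAfterSeconds.
function isExpired(doc, field, expireAfterSeconds, now) {
  const value = doc[field];
  if (!(value instanceof Date)) return false; // documents without a date value never expire
  return (now.getTime() - value.getTime()) / 1000 > expireAfterSeconds;
}

// Simulate one pass of the periodic cleanup over an in-memory "collection":
// walk the documents one by one and keep only those not yet expired.
function sweep(docs, field, expireAfterSeconds, now) {
  return docs.filter((doc) => !isExpired(doc, field, expireAfterSeconds, now));
}

const now = new Date("2021-12-20T00:02:00Z");
const ttlDocs = [
  { username: "user90000", createdDate: new Date("2021-12-20T00:00:00Z") }, // 120 s old
  { username: "user90001", createdDate: new Date("2021-12-20T00:01:30Z") }, // 30 s old
  { username: "user1" }, // no createdDate field at all
];

const survivors = sweep(ttlDocs, "createdDate", 60, now);
console.log(survivors.map((doc) => doc.username));
```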

First add a time field to some of our documents:

for(var i = 90000;i < 100000;i++){
    db.users.updateOne(
    {username:"user"+i},
    {
        $set:{
            createdDate:new Date()
        }
    });
}

Create a TTL index and set the expiration time to 60s. Once 60s have passed and the background cleanup task has run (it runs roughly once a minute), you will find that the data no longer exists.

db.users.createIndex({createdDate:1},{expireAfterSeconds:60})
'createdDate_1'

In addition, you can use the collMod command to change the expiration time of the TTL index:

db.runCommand({
  collMod:"users",
  index:{
    keyPattern:{createdDate:1},
    expireAfterSeconds:120
  }
})

{ expireAfterSeconds_old: 60, expireAfterSeconds_new: 120, ok: 1 }

Conditional index

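A conditional index is also called a partial index: only the documents that meet the condition are indexed.

Below we create the username_1 index only for users older than 50 (if the username_1 index created earlier still exists, it most likely has to be dropped first, for example with db.users.dropIndex("username_1")). Looking at the execution plan, you will find that the isPartial field is true.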

db.users.createIndex({username:1},{partialFilterExpression:{
    age:{$gt:50}
  }})
'username_1'

db.users.find({$and:[{username:"user4"},{age:60}]}).explain()
{ 
  queryPlanner: 
  { 
     ......
     winningPlan: 
     { 
        stage: 'FETCH',
        filter: { age: { '$eq': 60 } },
        inputStage: 
        { 
           stage: 'IXSCAN',
           keyPattern: { username: 1 },
           indexName: 'username_1',
           ......
           isPartial: true,
           ......
         } 
     },
     rejectedPlans: [] 
  },
  ......
  ok: 1 
}
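What "only the documents that meet the condition are indexed" means can be sketched in plain JavaScript. The buildPartialIndex helper and the sample documents are hypothetical; the filter mirrors the age > 50 expression used above:

```javascript
// Build an in-memory map from key to document ids,
// indexing only the documents that pass the filter.
function buildPartialIndex(docs, key, filter) {
  const index = new Map();
  for (const doc of docs) {
    if (!filter(doc)) continue; // documents failing the filter get no index entry
    if (!index.has(doc[key])) index.set(doc[key], []);
    index.get(doc[key]).push(doc._id);
  }
  return index;
}

const partialDocs = [
  { _id: 1, username: "user1", age: 20 },
  { _id: 2, username: "user2", age: 51 },
  { _id: 3, username: "user3", age: 50 }, // 50 is not greater than 50, so not indexed
  { _id: 4, username: "user4", age: 60 },
];

const partialIndex = buildPartialIndex(partialDocs, "username", (doc) => doc.age > 50);
console.log([...partialIndex.keys()]);
```

This also suggests why the query above includes age: 60. A lookup by username alone could miss documents that were never indexed, so the index is only safe to use when the query condition guarantees that every match lies inside the indexed subset.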

Sparse index

A regular index covers the entire collection for a given field. Even if a document does not store this field, the index still contains an entry for that document, with the field treated as null.

A sparse index does not index documents in which the field does not exist. If the field exists but is null, the document is still indexed.

To prepare for a sparse index, let's add an email field to part of the data in users, setting it to null for some of the documents:

for(var i = 5000;i < 10000;i++){
  if(i < 9000){
    db.users.updateOne(
      {username:"user"+i},
      { $set:{email:(120000000+i)+"@qq.email"}}
    )
  }else{
    db.users.updateOne(
      {username:"user"+i},
      { $set:{email:null}}
    )
  }
}

Querying with the {email:null} condition before any index exists, we find that the returned documents include documents that have no email field at all:

db.users.find({email:null})
{ 
  _id: ObjectId("61bdc01ba59136670f6536fd"),
  username: 'user0',
  age: 64.41483801726282,
  sex: 0,
  phone: 18468150001,
  location: 
  { 
    type: 'Point',
    coordinates: [ 101.42490900320335, 42.2576650823515 ] 
  } 
}
......

Then create a sparse index on the email field and query with the {email:null} condition, forcing the index with hint. The query now returns only the documents in which the email field exists and is null.

db.users.createIndex({email:1},{sparse:true});
'email_1'

db.users.find({email:null}).hint({email:1})
{ 
  _id: ObjectId("61bdc12ca59136670f655a25"),
  username: 'user9000',
  age: 94.18397576757012,
  sex: 0,
  phone: 18468159001,
  hobby: [ '钓鱼', '乒乓球' ],
  location: 
  { 
    type: 'Point',
    coordinates: [ 101.25903151863596, 41.38450145025062 ] 
  },
  email: null 
}
......
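The difference between the two results can be reproduced with a simplified in-memory model in plain JavaScript. The buildIndex helper and the three sample documents are hypothetical and only mirror the behavior described above:

```javascript
// Regular index: a document missing the field is indexed under the key null.
// Sparse index: documents missing the field are skipped; an explicit null is still indexed.
function buildIndex(docs, field, { sparse = false } = {}) {
  const entries = [];
  for (const doc of docs) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, field);
    if (!hasField && sparse) continue;
    entries.push({ key: hasField ? doc[field] : null, id: doc._id });
  }
  return entries;
}

const emailDocs = [
  { _id: 1, username: "user0" },                              // no email field
  { _id: 2, username: "user5000", email: "120005000@qq.email" },
  { _id: 3, username: "user9000", email: null },              // email explicitly null
];

// Ids reachable through the index when looking up the key null.
const idsWithNullKey = (entries) =>
  entries.filter((entry) => entry.key === null).map((entry) => entry.id);

const regularNullIds = idsWithNullKey(buildIndex(emailDocs, "email"));
const sparseNullIds = idsWithNullKey(buildIndex(emailDocs, "email", { sparse: true }));
console.log(regularNullIds, sparseNullIds);
```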

Text index

A text index first splits the indexed document fields into words and then searches on those words, but it currently does not support Chinese word segmentation.

Below we insert documents with two text fields and create a joint text index on them:

db.blog.insertMany([
  {title:"hello world",content:"mongodb is the best database"},
  {title:"index",content:"efficient data structure"}
])

// Create the index
db.blog.createIndex({title:"text",content:"text"})
'title_text_content_text'
// Query using the text index
db.blog.find({$text:{$search:"hello data"}})
{ 
  _id: ObjectId("61c092268c4037d17827d977"),
  title: 'index',
  content: 'efficient data structure' 
},
{ 
  _id: ObjectId("61c092268c4037d17827d976"),
  title: 'hello world',
  content: 'mongodb is the best database' 
}
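Both documents come back because the search string is itself split into words, and a document matches if it contains any of them. A very rough JavaScript model of that behavior follows; the tokenize and textSearch helpers are hypothetical, and real text indexes also apply stemming and stop-word removal, which this sketch leaves out:

```javascript
// Very rough model of text matching: split into lowercase words,
// and match a document if ANY search term appears in ANY indexed field.
const tokenize = (text) => text.toLowerCase().split(/\W+/).filter(Boolean);

function textSearch(docs, fields, search) {
  const terms = new Set(tokenize(search));
  return docs.filter((doc) =>
    fields.some((field) =>
      tokenize(doc[field] || "").some((word) => terms.has(word))
    )
  );
}

const posts = [
  { title: "hello world", content: "mongodb is the best database" },
  { title: "index", content: "efficient data structure" },
];

const hits = textSearch(posts, ["title", "content"], "hello data");
const noHits = textSearch(posts, ["title", "content"], "redis");
console.log(hits.map((post) => post.title), noHits.length);
```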

Unique index

A unique index means that no duplicate values can appear in the field on which the index is created. In addition to the single-field unique index, there are joint unique indexes and array unique indexes (that is, the arrays of different documents cannot share any element).

// Create a unique index on the title field
db.blog.createIndex({title:1},{unique:true})
'title_1'
// Insert a title value that already exists
db.blog.insertOne({title:"hello world",content:"mongodb is the best database"})
MongoServerError: E11000 duplicate key error collection: mock.blog index: title_1 dup key: { : "hello world" }
// View the execution plan; isUnique is true
db.blog.find({"title":"index"}).explain()
{ 
  queryPlanner: 
  { 
     ......
     winningPlan: 
     { 
        stage: 'FETCH',
        inputStage: 
        { 
           stage: 'IXSCAN',
           keyPattern: { title: 1 },
           indexName: 'title_1',
           isMultiKey: false,
           multiKeyPaths: { title: [] },
           isUnique: true,
           ......
         } 
     },
     rejectedPlans: [] 
  },
  .......
  ok: 1 
}
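The uniqueness rule, including the "no shared element" rule for array fields, can be modeled with a small in-memory class in plain JavaScript. UniqueIndex and tryInsert are hypothetical names, and tags is an invented array field used only for illustration:

```javascript
// Minimal model of a unique index: each key may be used by only one document.
// For an array field, every element is a key, so two documents cannot share an element.
class UniqueIndex {
  constructor(field) {
    this.field = field;
    this.keys = new Set();
  }

  insert(doc) {
    const value = doc[this.field];
    const docKeys = Array.isArray(value) ? [...new Set(value)] : [value];
    // Check every key first so a rejected document leaves the index unchanged.
    for (const key of docKeys) {
      if (this.keys.has(key)) {
        throw new Error("duplicate key: " + JSON.stringify(key));
      }
    }
    for (const key of docKeys) this.keys.add(key);
  }
}

// Returns true if the insert succeeded, false if it was rejected as a duplicate.
function tryInsert(index, doc) {
  try {
    index.insert(doc);
    return true;
  } catch (err) {
    return false;
  }
}

const titleIndex = new UniqueIndex("title");
const firstTitle = tryInsert(titleIndex, { title: "hello world" });
const secondTitle = tryInsert(titleIndex, { title: "hello world" });

const tagsIndex = new UniqueIndex("tags");
const firstTags = tryInsert(tagsIndex, { tags: ["a", "b"] });
const overlapTags = tryInsert(tagsIndex, { tags: ["b", "c"] }); // shares "b"
console.log(firstTitle, secondTitle, firstTags, overlapTags);
```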

eacape
