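1. doc vs. params._source

Painless scripts sometimes use doc and sometimes use params._source, which is easy to get confused about.
Both can be used directly in a script;
they just work differently:
doc['field'] returns a wrapper object, so you need .value to get at the actual value; params._source returns the raw _source data directly, with no .value needed.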

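However, if you need to write to the original _source content (for example in _update_by_query),

you must go through the source, via ctx._source (e.g. ctx._source['field']='value').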
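
As a minimal sketch of that write path, the request below bumps age by one on every document in the test_test001 index used later in this post. The parameter name num is just an illustrative choice, and note that running this really does modify the stored documents.

```
POST test_test001/_update_by_query
{
  "query": {
    "match_all": {}
  },
  "script": {
    "lang": "painless",
    "source": "ctx._source['age'] = ctx._source['age'] + params.num",
    "params": {
      "num": 1
    }
  }
}
```

Update scripts read and write through ctx._source; doc values are an index-time structure and cannot be assigned to from a script.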

So what exactly do doc[xxx] and params._source give us access to?

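1.1 What doc_values and the source data actually are

Key points:

1.1.1 doc_values

In a Painless script, doc['field_name'] accesses the field's doc_values.

What are doc_values?

doc_values are a field's "forward index": a per-document, column-oriented structure, as opposed to the inverted index. They are built at index time and are enabled by default for the field types that support them. The exception is text: text fields are analyzed into tokens and therefore have no doc_values, but they usually come with a keyword multi-field that does.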
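
To make this concrete, here is a sketch of a mapping in which name is a text field with a keyword multi-field. The index name test_mapping_sketch is hypothetical, and the mapping only resembles what dynamic mapping typically produces for a string; it is not the actual mapping of the index used below.

```
PUT test_mapping_sketch
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      },
      "age": {
        "type": "long"
      }
    }
  }
}
```

With a mapping like this, doc['age'] and doc['name.keyword'] are backed by doc_values, while doc['name'] is not.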

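1.1.2 Source data

params._source accesses the original _source data.

Summary, which should now be easy to follow:

To read a field's value, you can use doc['xxx'], which reads from the forward index (doc_values);
to access the original source data, use params._source.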
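
One practical difference between the two read paths is how a missing value shows up. The sketch below, assuming some documents might lack an age field, guards both styles; the field names ageFromDocValues and ageFromSource and the fallback -1 are illustrative choices.

```
GET test_test001/_search
{
  "script_fields": {
    "ageFromDocValues": {
      "script": {
        "lang": "painless",
        "source": "doc['age'].size() == 0 ? -1 : doc['age'].value"
      }
    },
    "ageFromSource": {
      "script": {
        "lang": "painless",
        "source": "params._source.age != null ? params._source.age : -1"
      }
    }
  }
}
```

On the doc side, calling .value on a document with no value for the field throws an error, hence the size() check; on the source side, an absent key simply comes back as null.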

2. Example: scripts in script_fields

2.1 Data setup and query syntax in es

2.1.1 Data in the test_test001 index:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test_test001",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "zhangsan4",
          "birth": "0",
          "age": 20
        }
      },
      {
        "_index": "test_test001",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "zhangsan2",
          "birth": 1763308800000,
          "age": 23
        }
      }
    ]
  }
}

2.1.2 Using script_fields in a query

GET test_test001/_search
{
  "_source": ["name", "age", "birth"],  // source fields to return
  "query": {
    "match_all": {}
  },
  "script_fields": {
    "nextYearAge": {
      "script": {
        "lang": "painless",
        "source": "doc['age'].value + params.num",
        "params": {
          "num": 1
        }
      }
    },
    "nextYearAge1": {
      "script": {
        "lang": "painless",
        "source": "doc.age.value + params.num",
        "params": {
          "num": 1
        }
      }
    },
    "nextYearAge2": {
      "script": "params._source.age + 1 "
    },
    "nextYearAge3": {
      "script": "params._source['age'] + 1 "
    },
    "nameLength": {
      "script": {
        "lang": "painless",
        "source": "doc['name.keyword'].value.length()"
        // error: "source": "doc['name'].value.length()"
      }
    },
    "nameLength2": {
      "script": "params._source.name.length()"
    }
  }
}

All six variants work:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "test_test001",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "zhangsan4",
          "birth": "0",
          "age": 20
        },
        "fields": {
          "nameLength": [
            9
          ],
          "nextYearAge": [
            21
          ],
          "nameLength2": [
            9
          ],
          "nextYearAge1": [
            21
          ],
          "nextYearAge2": [
            21
          ],
          "nextYearAge3": [
            21
          ]
        }
      },
      {
        "_index": "test_test001",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "zhangsan2",
          "birth": 1763308800000,
          "age": 23
        },
        "fields": {
          "nameLength": [
            9
          ],
          "nextYearAge": [
            24
          ],
          "nameLength2": [
            9
          ],
          "nextYearAge1": [
            24
          ],
          "nextYearAge2": [
            24
          ],
          "nextYearAge3": [
            24
          ]
        }
      }
    ]
  }
}

2.2 Field access in script_fields

2.2.1 Usage 1: inside script->source:

  • case1: doc['fieldname'].value

    "nextYearAge": {
      "script": {
        "lang": "painless",
        "source": "doc['age'].value + params.num",
        "params": {
          "num": 1
        }
      }
    }
  • case2: doc.fieldname.value

    "nextYearAge1": {
      "script": {
        "lang": "painless",
        "source": "doc.age.value + params.num",
        "params": {
          "num": 1
        }
      }
    }
  • case3: the parameter can of course also be accessed with brackets:

    "source": "doc.age.value + params['num']"

2.2.2 Usage 2: directly in script:

  • case4: params._source.fieldname

    "nextYearAge2": {
      "script": "params._source.age + 1 "
    }
  • case5: params._source['fieldname']

    "nextYearAge3": {
      "script": "params._source['age'] + 1 "
    }

3. doc_values

3.1 text fields do not support doc_values

Because text fields are analyzed into tokens, they are not suitable for sorting or aggregations.

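3.2 doc_values and keyword multi-fields

Note: by default, text fields cannot be used for sorting, aggregations, or doc[] access in scripts, because text fields have no doc_values. However, a text field can have a keyword multi-field, and that sub-field does have doc_values enabled.

As in the script_fields above: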

"nameLength": {
  "script": {
    "lang": "painless",
    "source": "doc['name.keyword'].value.length()"
    // error: "source": "doc['name'].value.length()"
  }
}
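
The same sub-field also covers the sorting and aggregation cases mentioned in the note. A minimal sketch, assuming name has the name.keyword multi-field used above (the aggregation name names is an arbitrary label):

```
GET test_test001/_search
{
  "sort": [
    {
      "name.keyword": "asc"
    }
  ],
  "aggs": {
    "names": {
      "terms": {
        "field": "name.keyword"
      }
    }
  }
}
```

Pointing either clause at plain name instead would normally be rejected, for the same reason doc['name'] fails in a script.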

丰木