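1. Problem description

There is an index named index_a whose documents have a single field, title.
Using index_a as the source, keep title as-is;
add a len field holding the length of title;
add a split_title field holding the array obtained by splitting title on spaces.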

2. Setup

PUT /index_a/_doc/1
{
  "title": "Thinking in java 4th"
}

3. Create the ingest pipeline

Create a pipeline named pipeline_a:
PUT _ingest/pipeline/pipeline_a
{
  "processors": [
    {
      "script": {
        "source": "ctx.len=ctx.title.length();"
      }
    },
    {
      "set": {
        "field": "split_title",
        "value": ""
      }
    },
    {
      "split": {
        "field": "title",
        "separator": " ",
        "target_field": "split_title"
      }
    }
  ]
}

This pipeline uses three processors, which run in the order listed: script (3.1), set (3.2), and split (3.3).

3.1 script

The script processor adds a len field to each document, set to the length of the title field:
ctx.len=ctx.title.length();

3.2 set

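The set processor adds a split_title field to each document, initialized to an empty string. This step is not strictly required, because the split processor writes its result to target_field whether or not that field already exists.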

3.3 split

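The split processor splits the title field on the separator " " (the separator is interpreted as a regular expression) and writes the resulting array to split_title. Because target_field is set, title itself is left unchanged.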

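Processors run one after another, in the order listed, and each one sees the document as modified by the processors before it. The final document therefore contains title, len, and split_title, as required.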
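
To make the sequential behavior concrete, here is a minimal Python sketch that imitates the three processors on a plain dict. It is only an illustration, not how Elasticsearch implements ingest pipelines: the function names (script_processor, set_processor, split_processor, run_pipeline) are hypothetical, Python's len() counts code points whereas Painless length() counts UTF-16 code units (identical for ASCII titles like the one here), and dropping trailing empty strings is my approximation of the split processor's default behavior.

```python
import re


def script_processor(doc):
    # Imitates: ctx.len = ctx.title.length();
    doc["len"] = len(doc["title"])
    return doc


def set_processor(doc, field, value):
    # Imitates the set processor: assign a fixed value to a field.
    doc[field] = value
    return doc


def split_processor(doc, field, separator, target_field):
    # Imitates the split processor: the separator is treated as a regex.
    parts = re.split(separator, doc[field])
    # Assumption: drop trailing empty strings, approximating the default.
    while parts and parts[-1] == "":
        parts.pop()
    doc[target_field] = parts
    return doc


def run_pipeline(doc):
    # Each step receives the document as modified by the previous step.
    steps = [
        script_processor,
        lambda d: set_processor(d, "split_title", ""),
        lambda d: split_processor(d, "title", " ", "split_title"),
    ]
    doc = dict(doc)  # work on a copy so the source document is untouched
    for step in steps:
        doc = step(doc)
    return doc


result = run_pipeline({"title": "Thinking in java 4th"})
print(result)
```

Removing the set step from the steps list gives the same result in this sketch, since the split step overwrites split_title anyway.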

4. Reindex through the pipeline

POST _reindex
{
  "source": {
    "index": "index_a"
  },
  "dest": {
    "index": "index_b",
    "op_type": "create",
    "pipeline": "pipeline_a"
  }
}
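
The "op_type": "create" setting means reindex only creates documents that are missing from index_b; a document whose _id already exists there is reported as a version conflict instead of being overwritten. The following Python sketch imitates that rule with plain dicts standing in for indices. The names reindex_create and add_fields are hypothetical, and the sketch keeps going after a conflict, which corresponds to adding "conflicts": "proceed" to the request (by default a conflict aborts the reindex).

```python
def add_fields(doc):
    # Stand-in for pipeline_a: add len and split_title, keep title.
    out = dict(doc)
    out["len"] = len(out["title"])
    out["split_title"] = out["title"].split(" ")
    return out


def reindex_create(source, dest, pipeline):
    # Copy every source document into dest through the pipeline,
    # but never overwrite an _id that already exists in dest.
    created = 0
    conflicts = []
    for doc_id, doc in source.items():
        if doc_id in dest:
            conflicts.append(doc_id)  # would be a version conflict
            continue
        dest[doc_id] = pipeline(doc)
        created += 1
    return created, conflicts


index_a = {"1": {"title": "Thinking in java 4th"}}
index_b = {}

first_run = reindex_create(index_a, index_b, add_fields)
second_run = reindex_create(index_a, index_b, add_fields)

print(first_run)   # document 1 is created
print(second_run)  # document 1 already exists, so it is skipped
```

Running the reindex a second time therefore leaves index_b unchanged in this sketch.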

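5. Verification

GET /index_b/_search

The response shows the reindexed document with title preserved and the two new fields added: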
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "index_b",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "len" : 20,
          "split_title" : [
            "Thinking",
            "in",
            "java",
            "4th"
          ],
          "title" : "Thinking in java 4th"
        }
      }
    ]
  }
}

丰木