
Preface

Synonym file format

  • One-way synonyms (the terms on the left are rewritten to the term on the right)

    ipod, i-pod, i pod => ipod
  • Two-way synonyms (all listed terms are treated as equivalent)

    马铃薯, 土豆, potato
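
To make the difference between the two rule types concrete, here is a minimal Python sketch that turns rule lines into a term-to-terms mapping. It is a simplified model only: `parse_synonym_rules` is a hypothetical helper, not part of Elasticsearch, and it ignores the analysis chain that the real synonym filter applies to each rule.

```python
from collections import defaultdict


def parse_synonym_rules(lines):
    """Map each input term to the terms it is rewritten to (simplified model)."""
    mapping = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if "=>" in line:
            # One-way rule: every term on the left maps to the right-hand side.
            left, right = line.split("=>", 1)
            sources = [t.strip() for t in left.split(",") if t.strip()]
            targets = [t.strip() for t in right.split(",") if t.strip()]
        else:
            # Two-way rule: every term maps to the whole group.
            sources = [t.strip() for t in line.split(",") if t.strip()]
            targets = sources
        for source in sources:
            for target in targets:
                if target not in mapping[source]:
                    mapping[source].append(target)
    return dict(mapping)


rules = parse_synonym_rules([
    "ipod, i-pod, i pod => ipod",
    "马铃薯, 土豆, potato",
])
print(rules["i-pod"])  # ['ipod']
print(rules["土豆"])   # ['马铃薯', '土豆', 'potato']
```

With the one-way rule, "i-pod" only ever becomes "ipod"; with the two-way rule, any of the three terms expands to all three.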

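Test steps

Add the synonym file

  • In the Elasticsearch config directory, create an analysis directory and add the synonym file synonym.txt to it, for example /etc/elasticsearch/analysis/synonym.txt
  • When synonyms are applied at search time, changing the file requires neither restarting Elasticsearch nor reindexing; you only need to reload the search analyzers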

    POST my_index/_reload_search_analyzers
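
If the reload has to be triggered from a script rather than from the console, the same call can be sketched in Python with only the standard library. The base URL `http://localhost:9200` and the absence of authentication are assumptions, and `build_reload_request` is a hypothetical helper name; the request is built but not sent here.

```python
import urllib.request


def build_reload_request(index, base_url="http://localhost:9200"):
    """Build (but do not send) the POST request that reloads search analyzers."""
    # base_url is an assumption: a local, unsecured cluster.
    url = f"{base_url}/{index}/_reload_search_analyzers"
    return urllib.request.Request(url, method="POST")


req = build_reload_request("my_index")
print(req.get_method(), req.full_url)

# To actually send it against a running cluster:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```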

Create the index

PUT my_index
{
  "settings": {
    "analysis": {
      "filter": {
        "word_syn": {
          "type": "synonym_graph",
          "synonyms_path": "analysis/synonym.txt",
          "updateable": true   # allow hot reloading of the synonym file
        }
      },
      "analyzer": {
        "ik_smart_syn": {
          "filter": [          # token filter
            "stemmer",
            "word_syn"
          ],
          "type": "custom",
          "tokenizer": "ik_smart"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      },
      "author": {
        "type": "keyword"
      }
    }
  }
}
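
According to the Elasticsearch documentation, a filter marked "updateable" may only be used in search analyzers, which is why ik_smart_syn is not set as the index-time analyzer of any field here. The following Python sketch checks an index definition for that constraint before it is sent. The helper names are hypothetical, and the check is simplified: it only looks at top-level fields and custom analyzers.

```python
import copy


def updateable_analyzers(settings):
    """Return names of custom analyzers that use an updateable token filter."""
    analysis = settings.get("analysis", {})
    hot_filters = {
        name for name, cfg in analysis.get("filter", {}).items()
        if cfg.get("updateable")
    }
    return {
        name for name, cfg in analysis.get("analyzer", {}).items()
        if hot_filters & set(cfg.get("filter", []))
    }


def index_time_violations(body):
    """List top-level fields whose index-time analyzer is updateable."""
    risky = updateable_analyzers(body.get("settings", {}))
    props = body.get("mappings", {}).get("properties", {})
    return sorted(
        field for field, cfg in props.items() if cfg.get("analyzer") in risky
    )


# Same definition as the PUT my_index request, written as a Python dict.
index_body = {
    "settings": {"analysis": {
        "filter": {"word_syn": {
            "type": "synonym_graph",
            "synonyms_path": "analysis/synonym.txt",
            "updateable": True,
        }},
        "analyzer": {"ik_smart_syn": {
            "type": "custom",
            "tokenizer": "ik_smart",
            "filter": ["stemmer", "word_syn"],
        }},
    }},
    "mappings": {"properties": {
        "title": {"type": "text", "analyzer": "ik_max_word",
                  "search_analyzer": "ik_smart"},
        "author": {"type": "keyword"},
    }},
}

print(index_time_violations(index_body))  # []

# A deliberately wrong variant: the updateable analyzer used at index time.
bad_body = copy.deepcopy(index_body)
bad_body["mappings"]["properties"]["title"]["analyzer"] = "ik_smart_syn"
print(index_time_violations(bad_body))  # ['title']
```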

Test the analyzer directly

  • Request

    GET my_index/_analyze
    {
      "analyzer": "ik_smart_syn",
      "text": "马铃薯"
    }
  • Output

    {
      "tokens" : [
        {
          "token" : "马铃薯",
          "start_offset" : 0,
          "end_offset" : 3,
          "type" : "CN_WORD",
          "position" : 0
        },
        {
          "token" : "土豆",
          "start_offset" : 0,
          "end_offset" : 3,
          "type" : "SYNONYM",
          "position" : 0
        },
        {
          "token" : "potato",
          "start_offset" : 0,
          "end_offset" : 3,
          "type" : "SYNONYM",
          "position" : 0
        }
      ]
    }
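
When this check is automated, the tokens added by the synonym filter can be picked out by their "type". The Python sketch below does so for an abbreviated copy of the response above (offsets and positions omitted); `synonym_tokens` is a hypothetical helper name.

```python
import json


def synonym_tokens(analyze_response):
    """Return the tokens that the synonym filter added to the stream."""
    return [
        token["token"]
        for token in analyze_response["tokens"]
        if token["type"] == "SYNONYM"
    ]


# Abbreviated copy of the _analyze response for "马铃薯".
response = json.loads("""
{
  "tokens": [
    {"token": "马铃薯", "type": "CN_WORD"},
    {"token": "土豆", "type": "SYNONYM"},
    {"token": "potato", "type": "SYNONYM"}
  ]
}
""")

print(synonym_tokens(response))  # ['土豆', 'potato']
```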

Add test data

  • Index a document

    POST my_index/_doc/1
    {
      "title": "马铃薯",
      "author": "土豆"
    }
  • Inspect how a given field of a given document was tokenized

    GET my_index/_termvectors/1?fields=title

Search test

  • Query

    GET my_index/_search
    {
      "query": {
        "query_string": {
          "analyzer": "ik_smart_syn",
          "query": "title:potato AND author:potato"
        }
      }
    }
  • Result

    {
      "took" : 38,
      "timed_out" : false,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "skipped" : 0,
        "failed" : 0
      },
      "hits" : {
        "total" : {
          "value" : 1,
          "relation" : "eq"
        },
        "max_score" : 0.5753642,
        "hits" : [
          {
            "_index" : "my_index",
            "_type" : "_doc",
            "_id" : "1",
            "_score" : 0.5753642,
            "_source" : {
              "title" : "马铃薯",
              "author" : "土豆"
            }
          }
        ]
      }
    }

Related documents

This article is from qbit snap
