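Elasticsearch Search Suggestions and Context Hints
Search Suggestions
Search suggestions are implemented through the Suggester API.
The idea is to break the input text into tokens, then look up similar terms in the term dictionary and return them.
Elasticsearch provides four kinds of suggesters for different scenarios.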
- Term Suggester
- Phrase Suggester
- Completion Suggester
- Context Suggester
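The tokenize-then-look-up principle described above can be sketched in a few lines of Python. This is only a toy illustration: the `term_dictionary` and its frequencies are hypothetical, plain Levenshtein distance stands in for whatever similarity measure the engine actually uses, and the names `edit_distance` and `suggest_terms` are made up for this example.

```python
from __future__ import annotations


def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def suggest_terms(text: str, dictionary: dict[str, int],
                  max_edits: int = 2) -> dict[str, list[str]]:
    """For each token, list dictionary terms within max_edits, closest first."""
    suggestions = {}
    for token in text.lower().split():
        candidates = [
            (edit_distance(token, term), -freq, term)
            for term, freq in dictionary.items()
            if term != token
        ]
        close = sorted(c for c in candidates if c[0] <= max_edits)
        suggestions[token] = [term for _, _, term in close]
    return suggestions


# Hypothetical term dictionary: term -> document frequency.
term_dictionary = {"elasticsearch": 3, "elastic": 1, "lucene": 2, "rocks": 2}
print(suggest_terms("elasticserch", term_dictionary))
```

Ties on edit distance are broken by frequency here, which is a design choice of this sketch rather than a statement about how Elasticsearch ranks candidates.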
Term Suggester
This works like the Google search engine: I typed the misspelled word elasticserch, yet the engine helpfully offered a search suggestion.
Implementing this in Elasticsearch is simple.
- Create an index and write some documents
POST articles/_bulk
{ "index" : { } }
{ "body": "lucene is very cool"}
{ "index" : { } }
{ "body": "Elasticsearch builds on top of lucene"}
{ "index" : { } }
{ "body": "Elasticsearch rocks"}
{ "index" : { } }
{ "body": "elastic is the company behind ELK stack"}
{ "index" : { } }
{ "body": "Elk stack rocks"}
{ "index" : {} }
{ "body": "elasticsearch is rock solid"}
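- Search the documents and call the suggest API.
The suggest_mode parameter accepts three values:
  - missing: offer suggestions only for terms that do not already exist in the index
  - popular: suggest only terms that occur more frequently than the input term
  - always: offer suggestions whether or not the term exists
POST /articles/_search
{
  "size": 1,
  "query": {
    "match": {
      "body": "elasticserch"
    }
  },
  "suggest": {
    "term-suggestion": {
      "text": "elasticserch",
      "term": {
        "suggest_mode": "missing",
        "field": "body"
      }
    }
  }
}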
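To make the three suggest modes concrete, here is a small in-memory sketch of how each mode decides whether to keep candidates. The function `filter_by_mode` is hypothetical, and the document frequencies are counted by hand from the six sample documents, assuming the default analyzer does not stem "rocks" to "rock".

```python
def filter_by_mode(token, candidates, doc_freq, mode="missing"):
    """Mimic how suggest_mode decides which candidates to keep."""
    token_freq = doc_freq.get(token, 0)
    if mode == "missing":
        # Only suggest when the input term is absent from the index.
        return [] if token_freq > 0 else list(candidates)
    if mode == "popular":
        # Only keep candidates that appear in more documents than the input.
        return [c for c in candidates if doc_freq.get(c, 0) > token_freq]
    if mode == "always":
        return list(candidates)
    raise ValueError(f"unknown suggest_mode: {mode}")


# Assumed document frequencies for the sample index.
doc_freq = {"rock": 1, "rocks": 2, "elasticsearch": 3}

for mode in ("missing", "popular", "always"):
    print(mode, filter_by_mode("rock", ["rocks"], doc_freq, mode))
```

With these numbers, "rock" gets no suggestion in missing mode because it exists in the index, while popular and always both offer "rocks".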
- Response
{
  "took" : 6,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "suggest" : {
    "term-suggestion" : [
      {
        "text" : "elasticserch",
        "offset" : 0,
        "length" : 12,
        "options" : [
          {
            "text" : "elasticsearch",
            "score" : 0.9166667,
            "freq" : 3
          }
        ]
      }
    ]
  }
}
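On the client side, the interesting part of this response is the `suggest` section. A minimal sketch of pulling the suggested texts out of it, assuming the response has already been parsed into a Python dict (the helper name `extract_suggestions` is made up, and the dict below is a trimmed copy of the response shown above):

```python
def extract_suggestions(response: dict, name: str) -> dict:
    """Map each input token to its suggested texts, highest score first."""
    result = {}
    for entry in response.get("suggest", {}).get(name, []):
        options = sorted(entry["options"], key=lambda o: o["score"], reverse=True)
        result[entry["text"]] = [o["text"] for o in options]
    return result


# Trimmed copy of the term suggester response.
response = {
    "hits": {"total": {"value": 0, "relation": "eq"}, "hits": []},
    "suggest": {
        "term-suggestion": [
            {
                "text": "elasticserch",
                "offset": 0,
                "length": 12,
                "options": [
                    {"text": "elasticsearch", "score": 0.9166667, "freq": 3}
                ],
            }
        ]
    },
}

print(extract_suggestions(response, "term-suggestion"))
```

A "did you mean" feature would typically show the first suggestion for each token when the query itself returns no hits, as is the case here.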
Phrase Suggester
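The Phrase Suggester adds extra logic on top of the Term Suggester.
Some of its parameters:
- max_errors: the maximum number of terms that may be misspelled
- confidence: a threshold factor applied to the score of the input phrase; only candidates that score above the threshold are returned, which limits the number of results. Defaults to 1
POST /articles/_search
{
  "suggest": {
    "my-suggestion": {
      "text": "lucne and elasticsear rock hello world ",
      "phrase": {
        "field": "body",
        "max_errors": 2,
        "confidence": 2,
        "direct_generator": [
          {
            "field": "body",
            "suggest_mode": "missing"
          }
        ],
        "highlight": {
          "pre_tag": "<em>",
          "post_tag": "</em>"
        }
      }
    }
  }
}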
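If the request is assembled in application code rather than typed into a console, a small builder keeps the nesting manageable. The sketch below only constructs the same JSON body as the request above; `build_phrase_suggest` and its parameter names are hypothetical, and sending the body to a cluster is left out.

```python
import json


def build_phrase_suggest(text, field, max_errors, confidence,
                         name="my-suggestion"):
    """Build a phrase suggester request body as a plain dict."""
    return {
        "suggest": {
            name: {
                "text": text,
                "phrase": {
                    "field": field,
                    "max_errors": max_errors,
                    "confidence": confidence,
                    # One candidate generator on the same field.
                    "direct_generator": [
                        {"field": field, "suggest_mode": "missing"}
                    ],
                    # Wrap corrected terms so the UI can emphasize them.
                    "highlight": {"pre_tag": "<em>", "post_tag": "</em>"},
                },
            }
        }
    }


body = build_phrase_suggest("lucne and elasticsear rock hello world ",
                            "body", max_errors=2, confidence=2)
print(json.dumps(body, indent=2))
```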
Completion Suggester
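This is the autocomplete feature: each time the user types a character, a query must be sent to the backend immediately to find matches.
This places strict demands on performance.
Elasticsearch encodes the analyzed data into an FST (finite state transducer) stored alongside the index. The whole FST is loaded into memory, so lookups are very fast.
An FST supports prefix lookups only.
The result resembles the suggestion box of a search engine such as Baidu.
Implementing this in Elasticsearch is also simple.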
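The prefix-only restriction is easy to see with a much simpler stand-in. The sketch below uses a sorted list and binary search, not an FST, so it only illustrates the lookup behavior. The entry "python入门" is a hypothetical addition to the sample titles, and `prefix_lookup` is a made-up helper.

```python
import bisect


def prefix_lookup(sorted_titles, prefix, size=5):
    """Return up to `size` titles starting with `prefix` from a sorted list."""
    # All strings sharing a prefix sit next to each other in sorted order,
    # starting at the position where the prefix itself would be inserted.
    start = bisect.bisect_left(sorted_titles, prefix)
    matches = []
    for title in sorted_titles[start:]:
        if not title.startswith(prefix):
            break
        matches.append(title)
        if len(matches) == size:
            break
    return matches


titles = sorted([
    "php是什么",
    "php是世界上最好的语言",
    "php货币",
    "php面试题2019",
    "python入门",  # hypothetical extra entry
])

print(prefix_lookup(titles, "php"))
print(prefix_lookup(titles, "面试"))  # text in the middle of a title is not a prefix
```

Searching for "面试" finds nothing even though one title contains it, which is the same limitation the Completion Suggester has.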
- Create the index
PUT titles
{
  "mappings": {
    "properties": {
      "title_completion": {
        "type": "completion"
      }
    }
  }
}
- Write the documents
POST titles/_bulk
{ "index" : { } }
{ "title_completion": "php是什么"}
{ "index" : { } }
{ "title_completion": "php是世界上最好的语言"}
{ "index" : { } }
{ "title_completion": "php货币"}
{ "index" : { } }
{ "title_completion": "php面试题2019"}
- Search the data
POST titles/_search?pretty
{
  "size": 0,
  "suggest": {
    "article-suggester": {
      "prefix": "php",
      "completion": {
        "field": "title_completion"
      }
    }
  }
}
- Response
{
  "took" : 173,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "suggest" : {
    "article-suggester" : [
      {
        "text" : "php",
        "offset" : 0,
        "length" : 3,
        "options" : [
          {
            "text" : "php是世界上最好的语言",
            "_index" : "titles",
            "_type" : "_doc",
            "_id" : "pv8V8WwBISxFcLcZfDXl",
            "_score" : 1.0,
            "_source" : {
              "title_completion" : "php是世界上最好的语言"
            }
          },
          {
            "text" : "php是什么",
            "_index" : "titles",
            "_type" : "_doc",
            "_id" : "pf8V8WwBISxFcLcZfDXl",
            "_score" : 1.0,
            "_source" : {
              "title_completion" : "php是什么"
            }
          },
          {
            "text" : "php货币",
            "_index" : "titles",
            "_type" : "_doc",
            "_id" : "p_8V8WwBISxFcLcZfDXl",
            "_score" : 1.0,
            "_source" : {
              "title_completion" : "php货币"
            }
          },
          {
            "text" : "php面试题2019",
            "_index" : "titles",
            "_type" : "_doc",
            "_id" : "qP8V8WwBISxFcLcZfDXl",
            "_score" : 1.0,
            "_source" : {
              "title_completion" : "php面试题2019"
            }
          }
        ]
      }
    ]
  }
}
Context Suggester
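The Context Suggester extends the Completion Suggester by taking context information into account.
For example:
In an electronics store, you type 苹果 ("apple") and expect to find Apple laptops...
In a fruit store, you type 苹果 ("apple") and expect to find red apples, green apples...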
- Create the index with a custom mapping
PUT comments
{
  "mappings": {
    "properties": {
      "comment_autocomplete": {
        "type": "completion",
        "contexts": [
          {
            "type": "category",
            "name": "comment_category"
          }
        ]
      }
    }
  }
}
- Add context information to each document
POST comments/_doc
{
  "comment": "苹果电脑",
  "comment_autocomplete": {
    "input": ["苹果电脑"],
    "contexts": {
      "comment_category": "电器商城"
    }
  }
}

POST comments/_doc
{
  "comment": "红红的冰糖心苹果",
  "comment_autocomplete": {
    "input": ["苹果"],
    "contexts": {
      "comment_category": "水果商城"
    }
  }
}
- Run a suggestion query that includes the context
POST comments/_search
{
  "suggest": {
    "MY_SUGGESTION": {
      "prefix": "苹",
      "completion": {
        "field": "comment_autocomplete",
        "contexts": {
          "comment_category": "电器商城"
        }
      }
    }
  }
}
- Response
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "suggest" : {
    "MY_SUGGESTION" : [
      {
        "text" : "苹",
        "offset" : 0,
        "length" : 1,
        "options" : [
          {
            "text" : "苹果",
            "_index" : "comments",
            "_type" : "_doc",
            "_id" : "qf_s9WwBISxFcLcZszWh",
            "_score" : 1.0,
            "_source" : {
              "comment" : "苹果电脑",
              "comment_autocomplete" : {
                "input" : [
                  "苹果电脑"
                ],
                "contexts" : {
                  "comment_category" : "电器商城"
                }
              }
            },
            "contexts" : {
              "comment_category" : [
                "电器商城"
              ]
            }
          }
        ]
      }
    ]
  }
}
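The effect of the category context can be mimicked in memory: the prefix match is the same as for plain completion, but only entries tagged with the requested category are considered. This is a rough sketch of the behavior, not of how Elasticsearch stores contexts internally; `context_suggest` and the flat `entries` list are hypothetical, with the data taken from the two documents indexed above.

```python
def context_suggest(entries, prefix, category):
    """Prefix completion restricted to one category context."""
    return [
        entry["input"]
        for entry in entries
        if entry["category"] == category and entry["input"].startswith(prefix)
    ]


# Flattened view of the two sample documents.
entries = [
    {"input": "苹果电脑", "category": "电器商城"},
    {"input": "苹果", "category": "水果商城"},
]

print(context_suggest(entries, "苹", "电器商城"))
print(context_suggest(entries, "苹", "水果商城"))
```

The same prefix yields different completions depending on which store the user is browsing, which is the scenario described at the start of this section.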