Version information:
docker for Windows : 18.03.1-ce-win65 (17513)
springBoot : 2.2.2.RELEASE
springDataElasticSearch : 3.2.3
elasticSearch Image : 6.8.5
elasticSearch-analysis-ik : 6.8.5
mySql : 5.6.40-log
JDK : 1.8
gradle : 6.0.1
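Project introduction:
Why learn Elasticsearch? Because it is fast, because it can provide good Chinese word segmentation, because it is distributed, and because Spring Boot already integrates it. The real trigger was a recent project in which we integrated roughly one million JD.com product records, after which some existing queries took more than ten seconds to load. I had to optimize the SQL again (splitting joins, adding composite indexes, making requests asynchronous), which made me review how indexes work and made me want to understand the concrete differences between a search engine and MySQL's full-text index. Here I use Docker + Elasticsearch + Spring Boot to get a first look at Elasticsearch.
Installing Elasticsearch with Docker
Running docker pull elasticsearch reported that there is no latest tag, so I picked 6.8.5 from Docker Hub for testing. This version is fairly stable and fairly recent.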
- docker pull elasticsearch:6.8.5
- docker images
- docker run -p 9200:9200 -p 9300:9300 elasticsearch:6.8.5
- docker ps
- Open in a browser: http://localhost:9200
- curl -i -XGET 'http://localhost:9200/_analyze?pretty' -H "Content-Type:application/json" -d '{"text":"我爱中国"}'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 578
{
"tokens" : [
{
"token" : "我",
"start_offset" : 0,
"end_offset" : 1,
"type" : "<IDEOGRAPHIC>",
"position" : 0
},
{
"token" : "爱",
"start_offset" : 1,
"end_offset" : 2,
"type" : "<IDEOGRAPHIC>",
"position" : 1
},
{
"token" : "中",
"start_offset" : 2,
"end_offset" : 3,
"type" : "<IDEOGRAPHIC>",
"position" : 2
},
{
"token" : "国",
"start_offset" : 3,
"end_offset" : 4,
"type" : "<IDEOGRAPHIC>",
"position" : 3
}
]
}
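The segmentation is poor: the default analyzer splits the text into single characters, much as someone who cannot read Chinese would.
Enter the container and install the IK analyzer:
- docker exec -it <container id> /bin/bash
- Inside the Elasticsearch container, go to the plugins directory: cd plugins/
- Download the plugin: wget https://github.com/medcl/elas...
- Unzip it into its own directory under plugins/: unzip elasticsearch-analysis-ik-6.8.5.zip -d ik
- Exit the container: exit
- Restart the container: docker stop <container id>, then docker start <container id>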
- curl -i -XGET 'http://localhost:9200/_analyze?pretty' -H "Content-Type:application/json" -d '{"text":"我爱中国","analyzer":"ik_smart"}'
HTTP/1.1 200 OK
content-type: application/json; charset=UTF-8
content-length: 424
{
"tokens" : [
{
"token" : "我",
"start_offset" : 0,
"end_offset" : 1,
"type" : "CN_CHAR",
"position" : 0
},
{
"token" : "爱",
"start_offset" : 1,
"end_offset" : 2,
"type" : "CN_CHAR",
"position" : 1
},
{
"token" : "中国",
"start_offset" : 2,
"end_offset" : 4,
"type" : "CN_WORD",
"position" : 2
}
]
}
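The same check can also be run from Java. The following is a minimal sketch, assuming Elasticsearch is reachable at http://localhost:9200 as in the commands above; the class `AnalyzeClient` and its method names are hypothetical and only serve as an illustration. It sends the request as POST, which `_analyze` also accepts, because `HttpURLConnection` does not send a request body with GET.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any library.
final class AnalyzeClient {

    // Builds the _analyze request body. A null analyzer omits the "analyzer" key.
    static String buildBody(String text, String analyzer) {
        StringBuilder sb = new StringBuilder("{\"text\":\"").append(escape(text)).append('"');
        if (analyzer != null) {
            sb.append(",\"analyzer\":\"").append(escape(analyzer)).append('"');
        }
        return sb.append('}').toString();
    }

    // Minimal JSON string escaping: handles backslash and double quote only.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Posts the body to <baseUrl>/_analyze and returns the raw JSON response.
    static String analyze(String baseUrl, String text, String analyzer) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(baseUrl + "/_analyze?pretty").openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(buildBody(text, analyzer).getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buf.write(chunk, 0, n);
            }
            return new String(buf.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        try {
            // Assumed address: the port published by the docker run command.
            System.out.println(analyze("http://localhost:9200", "我爱中国", "ik_smart"));
        } catch (IOException e) {
            System.out.println("Elasticsearch not reachable: " + e.getMessage());
        }
    }
}
```

Passing `null` as the analyzer leaves the choice to Elasticsearch, which should reproduce the character-by-character result from the first request.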
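Integrating with Spring Boot
There are plenty of integration guides online, so I will mention only one point: to use the IK analyzer here, I do not rely on annotations such as @Field but write the mapping myself in a JSON file: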
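import lombok.Getter;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Mapping;

// Index "article", type "article"; the field mapping is loaded from the JSON file on the classpath.
@Getter
@Mapping(mappingPath = "es_article_mapping.json")
@Document(indexName = "article", type = "article")
public class ArticleEsEntity {
    @Id
    private String id;
    private String title;
    private String content;
    private long createTime;

    public ArticleEsEntity(String title, String content) {
        // Time-based id, good enough for this test
        this.id = String.valueOf(System.nanoTime());
        this.title = title;
        this.content = content;
        this.createTime = System.currentTimeMillis();
    }
}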
{
  "article": {
    "properties": {
      "id": {
        "type": "text"
      },
      "create_time": {
        "type": "long"
      },
      "content": {
        "type": "text",
        "analyzer": "ik_smart",
        "search_analyzer": "ik_smart",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 10000
          }
        }
      },
      "title": {
        "type": "text",
        "analyzer": "ik_smart",
        "search_analyzer": "ik_smart",
        "fields": {
          "keyword": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
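One thing worth checking with a hand-written mapping is that its property names match the field names the entity is serialized with. The sketch below assumes the default serialization, where JSON field names equal the Java field names; under that assumption a key such as `create_time` would not cover a Java field named `createTime`, and Elasticsearch would map that field dynamically instead. `MappingFieldCheck` and the stripped-down `Article` class are hypothetical stand-ins, with annotations left out so the example has no dependencies.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of Spring Data Elasticsearch.
final class MappingFieldCheck {

    // Stand-in for the entity: same field names, no annotations.
    static class Article {
        String id;
        String title;
        String content;
        long createTime;
    }

    // Returns the entity field names that have no same-named property in the mapping.
    static Set<String> unmappedFields(Class<?> entity, Set<String> mappingProperties) {
        Set<String> missing = new TreeSet<>();
        for (Field f : entity.getDeclaredFields()) {
            // Skip static and compiler-generated fields; they are not document fields.
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue;
            }
            if (!mappingProperties.contains(f.getName())) {
                missing.add(f.getName());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Property names as they appear in the mapping file.
        Set<String> props =
                new LinkedHashSet<>(Arrays.asList("id", "create_time", "content", "title"));
        System.out.println(unmappedFields(Article.class, props)); // prints [createTime]
    }
}
```

If the check reports a field, either rename the mapping key or configure the serializer so both sides agree.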
Finally, a quick test:
MySQL and Elasticsearch both hold the same 120,000+ records. Each line shows the query, the elapsed time, and the number of matching records.
- SELECT * FROM article WHERE title LIKE '%spring%' OR content LIKE '%spring%'; 12.81s --- 9810
- SELECT * FROM article WHERE MATCH(title,content) AGAINST ('spring'); 4.296s --- 9810
- curl 'http://127.0.0.1:9200/article/article/_search' -H "Content-Type:application/json" -d '{"query":{"bool":{"should":[{"match":{"title":"spring"}},{"match":{"content":"spring"}}]}}}' 125ms --- 9810
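The Elasticsearch request can also be issued and timed from Java. This is a rough sketch, assuming the `article` index is served at http://127.0.0.1:9200; `ArticleSearch` and its methods are hypothetical names. The measured time includes HTTP overhead, so it is not directly comparable to the figure Elasticsearch reports itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of any library.
final class ArticleSearch {

    // Builds the bool/should query that matches the keyword in title or content.
    static String buildQuery(String keyword) {
        // Minimal JSON escaping: backslash and double quote only.
        String k = keyword.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"query\":{\"bool\":{\"should\":["
                + "{\"match\":{\"title\":\"" + k + "\"}},"
                + "{\"match\":{\"content\":\"" + k + "\"}}]}}}";
    }

    // Sends the search request and returns the wall-clock time in milliseconds.
    static long timedSearchMillis(String baseUrl, String keyword) throws IOException {
        long start = System.nanoTime();
        HttpURLConnection conn =
                (HttpURLConnection) new URL(baseUrl + "/article/article/_search").openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(buildQuery(keyword).getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            byte[] chunk = new byte[4096];
            // Drain the response so the full round trip is measured.
            while (in.read(chunk) != -1) {
                // discard
            }
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        try {
            System.out.println(timedSearchMillis("http://127.0.0.1:9200", "spring") + " ms");
        } catch (IOException e) {
            System.out.println("Elasticsearch not reachable: " + e.getMessage());
        }
    }
}
```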
Also: MySQL's FULLTEXT index does not handle Chinese word segmentation well.
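One way to picture why the search engine answers so much faster than `LIKE '%spring%'` is the inverted index: instead of scanning every row, a lookup goes from a term straight to the documents that contain it. The toy sketch below only illustrates the idea and is not how Lucene or Elasticsearch is implemented; `InvertedIndexSketch` is a hypothetical name. It tokenizes on non-word characters, which works for space-separated English but would discard Chinese text entirely, and that gap is what an analyzer such as IK fills.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index for illustration only.
final class InvertedIndexSketch {

    // term -> sorted ids of the documents containing that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Splits the text into lower-cased terms and records the document id for each term.
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // A single map lookup replaces the full scan that LIKE '%term%' needs.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Spring Boot and Elasticsearch");
        index.add(2, "MySQL full-text index");
        index.add(3, "Getting started with spring data");
        System.out.println(index.search("spring")); // prints [1, 3]
    }
}
```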