问题
最近一次上线后 线上突然多了好多ES查询超时异常
2020-02-01 11:40:43.851 ERROR 2852 --- [o-8022-exec-438] c.c.p.common.dao.SearchDao : Search person by id failed
java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-895 [ACTIVE]
at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:789) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:225) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:212) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1433) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1403) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1373) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
at org.elasticsearch.client.RestHighLevelClient.get(RestHighLevelClient.java:699) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
Caused by: java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-895 [ACTIVE]
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:387) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
觉得很奇怪 因为就是简单的query by id 怎么也会超时呢?
curl 'http://localhost:9201/person/_doc/301024026?pretty'
原因
当超时异常发生的时候 执行hot_threads
查询 得知原来底层执行了refresh操作
curl 'http://localhost:9201/_nodes/hot_threads?pretty'
101.4% (507.1ms out of 500ms) cpu usage by thread 'elasticsearch[node-1][get][T#1]'
8/10 snapshots sharing following 32 elements
app//org.apache.lucene.index.SingletonSortedNumericDocValues.nextDoc(SingletonSortedNumericDocValues.java:53)
app//org.apache.lucene.codecs.lucene80.Lucene80DocValuesConsumer.writeValues(Lucene80DocValuesConsumer.java:161)
app//org.apache.lucene.codecs.lucene80.Lucene80DocValuesConsumer.addNumericField(Lucene80DocValuesConsumer.java:111)
app//org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsWriter.addNumericField(PerFieldDocValuesFormat.java:109)
app//org.apache.lucene.index.ReadersAndUpdates.handleDVUpdates(ReadersAndUpdates.java:368)
app//org.apache.lucene.index.ReadersAndUpdates.writeFieldUpdates(ReadersAndUpdates.java:570)
app//org.apache.lucene.index.ReaderPool.writeAllDocValuesUpdates(ReaderPool.java:228)
app//org.apache.lucene.index.IndexWriter.writeReaderPool(IndexWriter.java:3308)
app//org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:520)
app//org.apache.lucene.index.StandardDirectoryReader.doOpenFromWriter(StandardDirectoryReader.java:294)
app//org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:269)
app//org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:259)
app//org.apache.lucene.index.FilterDirectoryReader.doOpenIfChanged(FilterDirectoryReader.java:112)
app//org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:140)
app//org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:156)
app//org.apache.lucene.search.SearcherManager.refreshIfNeeded(SearcherManager.java:58)
app//org.apache.lucene.search.ReferenceManager.doMaybeRefresh(ReferenceManager.java:176)
app//org.apache.lucene.search.ReferenceManager.maybeRefreshBlocking(ReferenceManager.java:253)
app//org.elasticsearch.index.engine.InternalEngine.refresh(InternalEngine.java:1548)
app//org.elasticsearch.index.engine.InternalEngine.get(InternalEngine.java:652)
app//org.elasticsearch.index.shard.IndexShard.get(IndexShard.java:916)
app//org.elasticsearch.index.get.ShardGetService.innerGet(ShardGetService.java:169)
app//org.elasticsearch.index.get.ShardGetService.get(ShardGetService.java:93)
app//org.elasticsearch.index.get.ShardGetService.get(ShardGetService.java:84)
app//org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:106)
app//org.elasticsearch.action.get.TransportGetAction.shardOperation(TransportGetAction.java:45)
查看官网文档 得知如果该文档有更新 且尚未执行refresh 此时查询该文档会显式执行refresh操作
By default, the get API is realtime, and is not affected by the refresh rate of the index (when data will become visible for search). If a document has been updated but is not yet refreshed, the get API will issue a refresh call in-place to make the document visible. This will also make other documents changed since the last refresh visible. In order to disable realtime GET, one can set therealtime
parameter tofalse
.
https://www.elastic.co/guide/...
这就能解释的通了 因为进入人物详情页面后 会对人物索引的viewCount加一 即文档有更新 默认30秒后才会执行refresh操作("refresh_interval" : "30s"
) 故后续查询该文档的时候 会显式执行refresh操作 导致get by id 超时 尤其是有大量文档更新的情况下
解决
禁止relatime即可
GetRequest getRequest = new GetRequest(personIndex, id.toString());
getRequest.realtime(false);
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。