Elasticsearch中将Doc根据A字段排序获得第一个Doc的B字段值的方法

注：本文基于Elasticsearch 6.1.2编写

最近遇到这样一个需求，要通过Elasticsearch将Doc根据A字段降序，然后获得B字段的值，最终根据B字段的值再去做Pipeline Aggregation。

先尝试了Max Aggregation，但是Max Aggregation只能获得A字段的最大值。

然后尝试了Top Hits Aggregation，但是Top Hits Aggregation的结果无法被Pipeline Aggregation使用。

最终尝试Scripted Metric Aggregation成功。下面举例说明

比如现在我们有一堆股票价格数据，我们现在需要获得股票每天的收盘价比前一天的差值（Delta）。下面先倒入一段股票数据，date字段代表时间戳，price字段代表当时的价格：

POST /_bulk

{"index":{"_index":"stock-price","_type":"data"}}
{"date":"2018-01-01T10:00:00","price":10}
{"index":{"_index":"stock-price","_type":"data"}}
{"date":"2018-01-01T10:30:00","price":15}
{"index":{"_index":"stock-price","_type":"data"}}
{"date":"2018-01-02T10:00:00","price":20}
{"index":{"_index":"stock-price","_type":"data"}}
{"date":"2018-01-02T10:30:00","price":19}
{"index":{"_index":"stock-price","_type":"data"}}
{"date":"2018-01-03T10:00:00","price":30}
{"index":{"_index":"stock-price","_type":"data"}}
{"date":"2018-01-03T10:30:00","price":35}
{"index":{"_index":"stock-price","_type":"data"}}
{"date":"2018-01-04T10:00:00","price":40}
{"index":{"_index":"stock-price","_type":"data"}}
{"date":"2018-01-04T10:30:00","price":20}
{"index":{"_index":"stock-price","_type":"data"}}
{"date":"2018-01-05T10:00:00","price":10}

先分解一下看这个查询如何实现：

把股票数据按照“天”分bucket，这个会用到Date Histogram Aggregation
获得每个bucket里的最后一次的价格数据，这个会用到Scripted Metric Aggregation
最后根据算每个bucket的差值，这个会用到Serial Differencing Aggregation

下面是查询代码：

GET /stock-price/_search

{
  "size": 0,
  "aggs": {
    "minute_histo": {
      "date_histogram": {
        "field": "date",
        "interval": "day"
      },
      "aggs": {
        "latest_price": {
          "scripted_metric": {
            "init_script": "params._agg.tmp_rs = ['latest_date': 0, 'latest_price' : -1];",
            "map_script": "def tmp_rs = params._agg.tmp_rs; boolean newer = doc['date'].value.millis > tmp_rs['latest_date']; if (newer) { tmp_rs['latest_date'] = doc['date'].value.millis; tmp_rs['latest_price'] = doc.price.value; }",
            "combine_script": "return params._agg.tmp_rs;",
            "reduce_script": "long rs_date = 0; long rs_price = -1; for (a in params._aggs) {  if (a == null) { continue; } boolean newer = a['latest_date'] > rs_date;   if (newer) {     rs_date = a['latest_date'];     rs_price = a['latest_price'];   } } return rs_price;"
          }
        },
        "delta_price": {
          "serial_diff": {
            "buckets_path": "latest_price.value",
            "lag": 1
          }
        }
      }
    }
  }
}

最后得到的结果是：

{
  ...
  "aggregations": {
    "minute_histo": {
      "buckets": [
        {
          "key_as_string": "2018-01-01T00:00:00.000Z",
          "key": 1514764800000,
          "doc_count": 2,
          "latest_price": {
            "value": 15
          }
        },
        {
          "key_as_string": "2018-01-02T00:00:00.000Z",
          "key": 1514851200000,
          "doc_count": 2,
          "latest_price": {
            "value": 19
          },
          "delta_price": {
            "value": 4.0
          }
        },
        {
          "key_as_string": "2018-01-03T00:00:00.000Z",
          "key": 1514937600000,
          "doc_count": 2,
          "latest_price": {
            "value": 35
          },
          "delta_price": {
            "value": 16.0
          }
        },
        {
          "key_as_string": "2018-01-04T00:00:00.000Z",
          "key": 1515024000000,
          "doc_count": 2,
          "latest_price": {
            "value": 20
          },
          "delta_price": {
            "value": -15.0
          }
        },
        {
          "key_as_string": "2018-01-05T00:00:00.000Z",
          "key": 1515110400000,
          "doc_count": 1,
          "latest_price": {
            "value": 10
          },
          "delta_price": {
            "value": -10.0
          }
        }
      ]
    }
  }
}

Elasticsearch中将Doc根据A字段排序获得第一个Doc的B字段值的方法

chanjarster

引用和评论

记一次K8S VXLAN Overlay网络8472端口冲突问题的排查

试试 Elasticsearch 的 unsigned_long（qbit）

换掉ES！SpringBoot + Meilisearch实现商品搜索，太方便了！

超越Elasticsearch！号称下一代搜索引擎，性能炸裂！

优秀！一款基于 SpringBoot + Vue 开发的网盘系统！

day01-基本查询

ElasticSearch 可观测性最佳实践