Introduction: Alibaba Cloud ES has released the TimeStream time series enhancement for its full observation engine. Built on fully managed, cloud-native ELK, the TimeStream plug-in delivers high-performance, low-cost storage, query, and analysis of time series data. This article covers TimeStream's applicable scenarios, functional advantages, performance test results, and practical cases.

The full observation capability of Elasticsearch


As enterprise IT topologies grow more complex, system architecture has moved from monoliths to distributed systems and then to microservices, and deployment has moved from physical servers to virtualization and then to containers. Once infrastructure migrates to the cloud, development also moves from the traditional waterfall model to DevOps, which combines development with operations and maintenance. The many data sources along a complex system link produce different data types, and collecting, processing, storing, and maintaining massive unstructured data in a unified way is extremely costly. Beyond traditional SRE operations scenarios, enterprises have also built applications for real-time analysis, security auditing, user behavior, operations growth, and transaction records. For a given business component or system, the data produced by these different solutions is hard to connect, so its value cannot be fully realized.

As a result, enterprises are paying more and more attention to building system observability, and they urgently need to store, monitor, retrieve, and analyze all types of data on a unified platform. The industry recognizes logs, metrics, and traces as the three pillars of full observation. In operations and maintenance scenarios, a unified observation system helps personnel understand the system's operating status before an incident, quickly locate the fault during an incident, and perform root cause analysis after an incident, which improves availability, reduces costs, and increases efficiency. However, as full observation technology evolves, observing logs and time series data across clouds and business systems is only part of the challenge. The many single-purpose tools that serve log, time series, and other data scenarios are hard to connect, and the platform is expensive both to buy and to maintain.

Observability is one of Elastic's three core solutions. Building on the full observation capability of Elasticsearch, it collects logs, metrics, uptime data, and application trace data in a unified manner, stores them all in Elasticsearch for unified processing and analysis, and visualizes them in Kibana. This unifies the technology stack for observability scenarios, so the SRE team does not need to assemble an observability platform from multiple technical components.


In the full observation scenario, Alibaba Cloud Elasticsearch continuously optimizes the write performance and storage cost of massive log data, based on the capabilities of its cloud-native serverless log engine. Storing and processing metric time series data, however, often runs into the following problems:

[Figure: problems commonly faced when storing and processing metric time series data]

What is TimeStream?

TimeStream is a time series engine developed by the Alibaba Cloud Elasticsearch team that incorporates the characteristics of the Elastic community's time series products. On top of fully managed, cloud-native ELK, the TimeStream time series enhancement plug-in provides high-performance, low-cost storage, query, and analysis of time series data.

Advantages of Alibaba Cloud ES TimeStream

As the core technology for Alibaba Cloud ES time series scenarios, deeply integrated with the Alibaba kernel, TimeStream greatly improves the cost, performance, and ease of use of those scenarios:

  • Data management efficiency improvement: TimeStream provides a time series data model with create, read, update, and delete operations, and integrates the best-practice templates for Elasticsearch in time series scenarios. This greatly lowers the barrier to managing time series metric data in Elasticsearch.
  • Improved query experience: Supports querying Elasticsearch data with PromQL, connects seamlessly to Prometheus+Grafana, and supports DownSample queries and DataStream time partitioning.
  • Storage cost optimization: Through data compression optimization and metadata storage optimization, a TimeStream index uses more than 80% less storage than an ordinary open-source Elasticsearch index.
  • Improved read and write performance: A TimeStream index writes nearly 40% faster than an ordinary open-source Elasticsearch index. For common query and analysis of time series data, performance improves 5-fold over open-source Elasticsearch.
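
To make the DownSample idea concrete, here is a minimal sketch of what downsampling means in general: raw points are grouped into fixed-width time buckets and each bucket is reduced to one value. This is plain Python for illustration only, not TimeStream's implementation; the function name, the averaging rule, and the sample data are all assumptions.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, interval_ms):
    """Average raw (timestamp_ms, value) points into fixed-width buckets.

    Each bucket is labeled with its start time, i.e. the timestamp
    rounded down to a multiple of interval_ms.
    """
    buckets = defaultdict(list)
    for timestamp_ms, value in points:
        bucket_start = timestamp_ms - (timestamp_ms % interval_ms)
        buckets[bucket_start].append(value)
    # Return one averaged point per bucket, ordered by time.
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Illustrative samples: five raw points collapsed into one-minute buckets.
raw = [(0, 10.0), (15_000, 20.0), (30_000, 30.0), (60_000, 50.0), (90_000, 70.0)]
per_minute = downsample(raw, interval_ms=60_000)
```

A query over the downsampled series touches far fewer points than one over the raw series, which is why a coarser interval trades precision for speed.
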

Comparison with open source

In time series scenarios, the scenario configuration, storage, and query behavior of Elasticsearch with and without the TimeStream plug-in compare as follows:

| Comparison item | Using TimeStream | Not using TimeStream |
| --- | --- | --- |
| Scenario configuration | The TimeStream engine natively supports the time series data model: it automatically generates \_tsid, optimizes index sorting, and so on. | Users must apply many metric-scenario best practices themselves, such as generating a timeline ID field, configuring index sorting on the timeline ID and time, and routing by the timeline ID. |
| Storage | The ali-codec plug-in can generate \_source from doc\_values; \_id does not have to be stored; ali-codec optimizes compression for time series scenarios. | In time series scenarios, metadata fields such as \_id and \_source occupy 70%+ of storage capacity; doc values compress the double type poorly, so although time series data is highly similar, double data is barely compressed. |
| Query statement | Supports PromQL queries as well as query DSL. | Query DSL must be specially built to query metric data. |
| Downsampling | Downsampling is supported by simply configuring a time interval. | Downsampling must be done on the user side. |
| Time partitioning | Partitioned by the actual time of the data, so data for a time range lands in a specific index. | Partitioned in write order, so data for a time range may be spread across many indexes. |

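
The manual work listed for the case without TimeStream can be sketched as follows. This is an illustration under stated assumptions, not code from the article: the field name `tsid`, the SHA-1 hashing scheme, and the document layout are hypothetical. The `index.sort.field` and `index.sort.order` settings and the `routing` key in bulk action metadata are standard Elasticsearch features.

```python
import hashlib
import json


def timeline_id(labels):
    """Derive a stable ID from a series' labels (hypothetical scheme).

    Labels are serialized with sorted keys so that the same label set
    always hashes to the same ID regardless of dict ordering.
    """
    canonical = json.dumps(labels, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()[:16]


# Index settings that sort documents within each segment by timeline ID,
# then by time. "tsid" is a field name chosen for this sketch.
index_body = {
    "settings": {
        "index.sort.field": ["tsid", "@timestamp"],
        "index.sort.order": ["asc", "desc"],
    },
    "mappings": {
        "properties": {
            "tsid": {"type": "keyword"},
            "@timestamp": {"type": "date"},
        }
    },
}


def bulk_action(index, labels, timestamp_ms, metrics):
    """Build the two NDJSON lines of one _bulk index action."""
    tsid = timeline_id(labels)
    # Routing by timeline ID keeps each series on a single shard.
    action = {"index": {"_index": index, "routing": tsid}}
    doc = {"tsid": tsid, "@timestamp": timestamp_ms,
           "labels": labels, "metrics": metrics}
    return json.dumps(action) + "\n" + json.dumps(doc) + "\n"
```

Every writer has to apply the same ID scheme and routing for this to work, which is the configuration burden the left-hand column says TimeStream removes.
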
### Performance comparison

The benchmark results show that, with TimeStream, Alibaba Cloud Elasticsearch significantly improves time series read and write performance, and its core performance is on par with traditional open-source time series products.

In terms of storage capacity, a TimeStream index uses more than 80% less storage than an ordinary open-source Elasticsearch index. TimeStream also supports not storing \_id. Compared with an ordinary index that stores \_id under the same conditions, storage drops by more than 90%, which is on par with open-source time series databases.

In terms of write performance, a TimeStream index improves write TPS by nearly 40% over an ordinary open-source Elasticsearch index.

In terms of query performance, for simple queries at a concurrency of one, Alibaba Cloud ES is close to the open-source time series product; for complex queries at a concurrency of one, Alibaba Cloud ES TimeStream performs better. Under multiple concurrency, Alibaba Cloud ES TimeStream performs better for both simple and complex queries.

### Practical Cases

#### Case A: Quick Start for managing Elasticsearch time series data with TimeStream

STEP1 Purchase and use

TimeStream currently supports Alibaba Cloud ES 7.16 instances (kernel version 1.7.0 and above).

Check the system default plug-in list to confirm that the Aliyun-TimeStream plug-in is installed and provides the latest TimeStream functions.

STEP2 Create a TimeStream time series index

In the Kibana console, create a time series index through the create interface of time\_stream. The command and returned result are as follows.

[Figure: create command and its returned result]

STEP3 Write data

Use the bulk and index interfaces to write data. Data must be written according to the time series model (the model fields can be modified).

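
As a rough illustration of a bulk write, the sketch below serializes a few samples into the NDJSON body that the standard Elasticsearch `_bulk` API expects. The `labels` / `metrics` / `@timestamp` document layout, the label and metric names, and the use of the `create` action are assumptions made for this example; they are not taken from the article, so check them against the model of your own index.

```python
import json


def to_bulk_body(samples):
    """Serialize samples into an NDJSON body for the Elasticsearch _bulk API.

    Each sample is (timestamp_ms, labels, metrics). The labels/metrics/
    @timestamp layout is an assumed time series model; adjust the field
    names to match your index.
    """
    lines = []
    for timestamp_ms, labels, metrics in samples:
        # Assumption: the index behaves like a data stream, and data
        # streams only accept the "create" operation in _bulk requests.
        lines.append(json.dumps({"create": {}}))
        lines.append(json.dumps({
            "@timestamp": timestamp_ms,
            "labels": labels,
            "metrics": metrics,
        }))
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Two illustrative samples of one series, ten seconds apart.
body = to_bulk_body([
    (1630465208722, {"host": "host-1", "namespace": "ns-1"}, {"cpu.idle": 79.67}),
    (1630465218722, {"host": "host-1", "namespace": "ns-1"}, {"cpu.idle": 80.12}),
])
# The body would be sent as: POST test_stream/_bulk
```

Note that no timeline ID or routing value appears in the request: under this sketch's assumptions, the client only supplies labels, metrics, and a timestamp.
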

[Figure: write commands and their returned results]

STEP4 Query data

Use the search interface to query data, and use the cat indices interface to view the index details of test\_stream. The commands and returned results are as follows:

[Figure: query commands and their returned results]

STEP5 Use the DownSample function

When creating an index through the create interface of time\_stream, you can specify DownSample rules directly and set the downsampling precision by configuring the interval. An example is as follows:

[Figure: DownSample configuration example]

#### Case B: Using Alibaba Cloud ES TimeStream with Prometheus+Grafana to achieve observability

Alibaba Cloud Elasticsearch connects seamlessly to Prometheus+Grafana and supports the Prometheus Query APIs, so a TimeStream index can be used directly as a Prometheus data source in Grafana. This improves the performance of time series storage, query, and analysis while saving costs.

node\_exporter collects hardware and kernel metrics and exposes them for Prometheus to read. Prometheus then writes the data to an Alibaba Cloud ES TimeStream index through remote write, and Grafana is configured for visual analysis.

The following figure shows an example of configuring the Prometheus data source in Grafana, where PromQL queries Alibaba Cloud ES data through the Prometheus data source for access and visualization.

[Figure: Prometheus data source configuration in Grafana]

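
Because the article states that the Prometheus Query APIs are supported, a PromQL range query could plausibly be issued the same way as against Prometheus itself. The sketch below only builds the request URL. The `/api/v1/query_range` path and its `query`, `start`, `end`, and `step` parameters come from the standard Prometheus HTTP API, while the base URL is hypothetical and must be replaced with the Prometheus-compatible endpoint of your own instance.

```python
from urllib.parse import urlencode


def range_query_url(base_url, promql, start_s, end_s, step_s):
    """Build a Prometheus HTTP API range-query URL.

    /api/v1/query_range with query/start/end/step is the standard
    Prometheus query API; base_url is whatever prefix the data source
    exposes it under.
    """
    params = urlencode({
        "query": promql,
        "start": start_s,
        "end": end_s,
        "step": step_s,
    })
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


# Hypothetical base URL; substitute the Prometheus-compatible endpoint
# of your own instance.
url = range_query_url(
    "http://es.example.com:9200/prom",
    "avg(node_load1)",  # node_load1 is a metric exported by node_exporter
    start_s=1700000000,
    end_s=1700003600,
    step_s=60,
)
```

Grafana's Prometheus data source issues requests of this shape on the user's behalf, so in practice only the base URL needs to be configured.
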
### Related Documentation

  • Introduction to TimeStream Time Series Enhancement Engine - Retrieval Analysis Service Elasticsearch Edition - Alibaba Cloud
  • Using Aliyun-TimeStream Plugin - Retrieval Analysis Service Elasticsearch Edition - Alibaba Cloud
  • TimeStream integrates Prometheus interface - Retrieval Analysis Service Elasticsearch Edition - Alibaba Cloud
  • TimeStream Management Elasticsearch Time Series Data Quick Start - Retrieval Analysis Service Elasticsearch Edition - Alibaba Cloud
  • Connecting Prometheus+Grafana based on TimeStream to achieve observability - Retrieval Analysis Service Elasticsearch Edition - Alibaba Cloud

> Copyright notice: The content of this article is contributed by real-name registered users of Alibaba Cloud, and the copyright belongs to the original author. For the specific rules, see the "Alibaba Cloud Developer Community User Service Agreement" and the "Alibaba Cloud Developer Community Intellectual Property Protection Guidelines". If you find content in this community that you suspect is plagiarized, fill out the infringement complaint form to report it. Once verified, the community will immediately delete the infringing content.

Alibaba Cloud Developer (阿里云开发者)

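Alibaba's official technology account, sharing technical innovation, hands-on experience, and engineers' reflections on growth from across the Alibaba economy.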