Author: Liu An

A member of the Aikesheng testing team, mainly responsible for testing the open source DTLE project. Skilled in Python automated test development, and recently absorbed in Linux performance analysis and tuning.

Source of this article: original submission

* Produced by the Aikesheng open source community. Original content may not be used without authorization; to reprint, please contact the editor and credit the source.


Background

Although the DTLE documentation describes each of its metrics, setting up the monitoring itself can still be difficult for readers who are not familiar with configuring Prometheus and Grafana. This article uses DTLE 3.21.07.0 to build a DTLE monitoring system step by step.

1. Build the DTLE runtime environment

  • Configure a two-node DTLE cluster for the demonstration. The two nodes used in this article are dtle-src-1 (10.186.63.20) and dtle-dest-1 (10.186.63.76).

When modifying the DTLE configuration file, pay attention to the following two points:

  1. Turn on DTLE monitoring by making sure that publish_metrics is set to true
  2. Turn on Nomad monitoring by making sure that the telemetry block is configured with prometheus_metrics = true

The configuration of dtle-src-1 is shown here as an example. For the complete configuration, refer to the node configuration documentation:

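# In DTLE 3.21.07.0, nomad was upgraded to 1.1.2; the following block must be added so that nomad exposes monitoring data
# Earlier versions of DTLE do not need this block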
telemetry {
  prometheus_metrics         = true
  collection_interval        = "15s"
}

plugin "dtle" {
  config {
    data_dir = "/opt/dtle/var/lib/nomad"
    nats_bind = "10.186.63.20:8193"
    nats_advertise = "10.186.63.20:8193"
    # Repeat the consul address above.
    consul = "10.186.63.76:8500"

    # By default, API compatibility layer is disabled.
    api_addr = "10.186.63.20:8190"   # for compatibility API
    nomad_addr = "10.186.63.20:4646" # compatibility API need to access a nomad server

    publish_metrics = true
    stats_collection_interval = 15
  }
}
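
Before moving on, it can help to confirm that both metrics endpoints respond. The following is a minimal sketch, not part of DTLE itself: it assumes the Nomad metrics are served at /v1/metrics?format=prometheus on port 4646 and the DTLE metrics at /metrics on the api_addr port 8190, which are the same paths and ports used in the Prometheus configuration later in this article. The function names are hypothetical.

```python
import sys
import urllib.request


def metrics_urls(host, nomad_port=4646, dtle_api_port=8190):
    """Return the Nomad and DTLE metrics URLs for one node.

    Ports and paths are assumptions taken from the configuration
    shown in this article; adjust them to your own deployment.
    """
    return {
        "nomad": f"http://{host}:{nomad_port}/v1/metrics?format=prometheus",
        "dtle": f"http://{host}:{dtle_api_port}/metrics",
    }


def endpoint_ok(url, timeout=5):
    """Return True if the URL answers with HTTP 200 and a non-empty body."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and bool(resp.read(1))
    except OSError:
        # Covers connection errors, timeouts, and HTTP error responses.
        return False


if __name__ == "__main__":
    # Usage: python check_metrics.py 10.186.63.20 10.186.63.76
    for host in sys.argv[1:]:
        for kind, url in metrics_urls(host).items():
            status = "ok" if endpoint_ok(url) else "FAILED"
            print(f"{host} {kind}: {status} ({url})")
```

If an endpoint reports FAILED, the two settings listed above (publish_metrics and the telemetry block) are the first things to re-check.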

  • Add two jobs to simulate data transfer between two MySQL instances

2. Deploy Prometheus

  • Prepare a Prometheus configuration file that scrapes both the Nomad and the DTLE metrics
  • For the DTLE targets, it is recommended to set the value of labels: instance to the hostname of the DTLE server
shell> cat /path/to/prometheus.yml
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

scrape_configs:
  - job_name: 'nomad'
    scrape_interval: 15s
    metrics_path: '/v1/metrics'
    params:
      format: ['prometheus']
    static_configs:
      - targets: ['10.186.63.20:4646']
        labels:
          instance: nomad-src-1
      - targets: ['10.186.63.76:4646']
        labels:
          instance: nomad-dest-1

  - job_name: 'dtle'
    scrape_interval: 15s
    metrics_path: '/metrics'
    static_configs:
      - targets: ['10.186.63.20:8190']
        labels:
          instance: dtle-src-1
      - targets: ['10.186.63.76:8190']
        labels:
          instance: dtle-dest-1

  • Deploy the Prometheus service with Docker
  shell> docker run -itd -p 9090:9090 --name=prometheus --hostname=prometheus --restart=always -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus
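
Once the container is running, the scrape targets can be checked without opening the web UI. The sketch below is one possible way to do that, assuming Prometheus is reachable at the address you pass on the command line (for example http://localhost:9090, matching the -p 9090:9090 mapping above). It reads the standard /api/v1/targets endpoint of the Prometheus HTTP API; the helper names are hypothetical.

```python
import json
import sys
import urllib.request


def target_health(payload):
    """Map (job, instance) to the health string of each active target."""
    result = {}
    for target in payload["data"]["activeTargets"]:
        labels = target["labels"]
        key = (labels.get("job", ""), labels.get("instance", ""))
        result[key] = target["health"]
    return result


def fetch_targets(base_url, timeout=5):
    """Fetch the /api/v1/targets payload from a running Prometheus server."""
    url = f"{base_url}/api/v1/targets"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Usage: python check_targets.py http://localhost:9090
    for base_url in sys.argv[1:2]:
        health = target_health(fetch_targets(base_url))
        for (job, instance), state in sorted(health.items()):
            print(f"{job}/{instance}: {state}")
```

With the configuration above, four entries are expected (two under the nomad job and two under the dtle job), each reporting "up" when scraping works.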

3. Deploy Grafana

  • Deploy the Grafana service with Docker
  shell> docker run -d --name=grafana -p 3000:3000 grafana/grafana

  • Choose to add Prometheus as a data source

  • Enter the access address of Prometheus in the URL field and click the "Save & test" button

  • Add a panel

  • Configure the panel, taking a CPU usage monitor as an example
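
The data source step above can also be scripted instead of clicked through. The following is a rough sketch using the Grafana HTTP API endpoint POST /api/datasources. It assumes a freshly started grafana/grafana container that still uses the default admin/admin credentials, and both URLs are supplied by you on the command line. The function names are hypothetical, and the Prometheus URL must be one that the Grafana container itself can reach.

```python
import base64
import json
import sys
import urllib.request


def datasource_payload(prometheus_url, name="Prometheus"):
    """Build the JSON body that describes a Prometheus data source."""
    return {
        "name": name,
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend performs the queries
        "isDefault": True,
    }


def build_request(grafana_url, prometheus_url, user="admin", password="admin"):
    """Build the POST request; the default credentials are an assumption."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(datasource_payload(prometheus_url)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python add_datasource.py <grafana-url> <prometheus-url>
    request = build_request(sys.argv[1], sys.argv[2])
    with urllib.request.urlopen(request, timeout=5) as resp:
        print(resp.status, resp.read().decode())
```

Scripting this step is mainly useful when the Grafana container is recreated often, since the container started above has no persistent volume.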

4. Commonly used metrics

All Nomad metrics: https://www.nomadproject.io/docs/operations/metrics

All DTLE metrics: https://actiontech.github.io/dtle-docs-cn/3/3.4_metrics.html
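
To see which of these metrics a node actually exposes, the metric names can be pulled straight from an endpoint's output. This is a small sketch that assumes the endpoint returns the usual Prometheus text exposition format with "# TYPE <name> <type>" lines; the function name and the sample metric names in the comments are hypothetical, not real DTLE or Nomad metrics.

```python
import sys
import urllib.request


def metric_names(exposition_text, prefix=""):
    """Return the sorted metric names declared on '# TYPE' lines.

    Only names starting with `prefix` are kept, e.g. a made-up
    prefix such as "example_" would keep "example_requests_total".
    """
    names = set()
    for line in exposition_text.splitlines():
        parts = line.split()
        is_type_line = len(parts) >= 4 and parts[0] == "#" and parts[1] == "TYPE"
        if is_type_line and parts[2].startswith(prefix):
            names.add(parts[2])
    return sorted(names)


if __name__ == "__main__":
    # Usage: python list_metrics.py <metrics-url> [prefix]
    if len(sys.argv) >= 2:
        prefix = sys.argv[2] if len(sys.argv) > 2 else ""
        with urllib.request.urlopen(sys.argv[1], timeout=5) as resp:
            text = resp.read().decode("utf-8", errors="replace")
        for name in metric_names(text, prefix):
            print(name)
```

The printed names can then be looked up in the two documentation pages above and used in panel queries.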

5. Finally, create multiple panels and display them together

