Data Transmission | How to build a DTLE monitoring system

Author: Liu An
A member of the Aikesheng testing team, mainly responsible for testing the DTLE open source project; skilled in Python automated test development, and recently fascinated by Linux performance analysis and optimization.
Source of this article: original submission
* Produced by the Aikesheng open source community. Original content may not be used without authorization; for reprints, please contact the editor and credit the source.

Background

The DTLE documentation describes each monitoring item, but setting up the monitoring stack can still be difficult for readers who are not familiar with configuring Prometheus and Grafana. This article uses DTLE 3.21.07.0 to build a DTLE monitoring system step by step.

1. Build DTLE operating environment

Configure a two-node DTLE cluster for the demonstration. The two nodes used in the examples below are dtle-src-1 (10.186.63.20) and dtle-dest-1 (10.186.63.76).

When modifying the DTLE configuration file, pay attention to the following two points:

Turn on DTLE monitoring by making sure that publish_metrics is set to true
Turn on Nomad monitoring by making sure that the telemetry block is configured

The configuration of dtle-src-1 is shown here as an example (for the details of each option, refer to the node configuration documentation):

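# In DTLE 3.21.07.0, nomad was upgraded to 1.1.2; the following block must be added so that nomad exposes monitoring data
# Earlier versions of DTLE do not need this block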
telemetry {
  prometheus_metrics         = true
  collection_interval        = "15s"
}

plugin "dtle" {
  config {
    data_dir = "/opt/dtle/var/lib/nomad"
    nats_bind = "10.186.63.20:8193"
    nats_advertise = "10.186.63.20:8193"
    # Repeat the consul address above.
    consul = "10.186.63.76:8500"

    # By default, API compatibility layer is disabled.
    api_addr = "10.186.63.20:8190"   # for compatibility API
    nomad_addr = "10.186.63.20:4646" # compatibility API needs to access a nomad server

    publish_metrics = true
    stats_collection_interval = 15
  }
}
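
Once the nodes are restarted with this configuration, it can be useful to confirm that both endpoints actually return metrics before moving on. The following is a minimal sketch, not part of DTLE itself: it parses Prometheus text-format output and lists the metric names it finds. The helper names and the sample text are hypothetical; the URLs in the comments reuse the addresses and paths that appear in this article.

```python
import urllib.request


def metric_names(exposition_text):
    """Return the set of metric names found in Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # A sample line looks like: name{label="v"} 1.0   or   name 1.0
        name = line.split("{", 1)[0].split(" ", 1)[0]
        names.add(name)
    return names


def fetch_metric_names(url, timeout=5):
    """Download one metrics endpoint and return the metric names it exposes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return metric_names(resp.read().decode("utf-8"))


# Illustrative sample only; these are NOT real DTLE or nomad metric names.
SAMPLE = """\
# HELP example_rows_total Hypothetical counter.
# TYPE example_rows_total counter
example_rows_total{task="src"} 42
example_uptime_seconds 3.5
"""

if __name__ == "__main__":
    print(sorted(metric_names(SAMPLE)))
    # Against the node configured above (addresses taken from the config):
    # fetch_metric_names("http://10.186.63.20:8190/metrics")
    # fetch_metric_names("http://10.186.63.20:4646/v1/metrics?format=prometheus")
```

An empty result from either endpoint would suggest that publish_metrics or the telemetry block has not taken effect on that node.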

Add two jobs to simulate data transfer between two MySQL instances

2. Deploy Prometheus

Prepare a Prometheus configuration file that scrapes both Nomad and DTLE metrics
For the DTLE scrape job, it is recommended to set the instance label to the hostname of the DTLE server

shell> cat /path/to/prometheus.yml
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.

scrape_configs:
  - job_name: 'nomad'
    scrape_interval: 15s
    metrics_path: '/v1/metrics'
    params:
      format: ['prometheus']
    static_configs:
      - targets: ['10.186.63.20:4646']
        labels:
          instance: nomad-src-1
      - targets: ['10.186.63.76:4646']
        labels:
          instance: nomad-dest-1

  - job_name: 'dtle'
    scrape_interval: 15s
    metrics_path: '/metrics'
    static_configs:
      - targets: ['10.186.63.20:8190']
        labels:
          instance: dtle-src-1
      - targets: ['10.186.63.76:8190']
        labels:
          instance: dtle-dest-1

Deploy the Prometheus service with docker

  shell> docker run -itd -p 9090:9090 --name=prometheus --hostname=prometheus --restart=always -v /path/to/prometheus.yml:/etc/prometheus/prometheus.yml prom/prometheus

Visit the Prometheus page http://${prometheus_server_ip}:9090/targets to verify that the configuration has taken effect
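
The same check can be scripted instead of being read off the web page. The sketch below assumes the Prometheus HTTP API endpoint /api/v1/targets, which returns the active targets with a health field; the function names and the sample payload are made up for illustration.

```python
import json
import urllib.request


def unhealthy_targets(payload):
    """List (job, instance, lastError) for every active target that is not up."""
    problems = []
    for target in payload["data"]["activeTargets"]:
        if target.get("health") != "up":
            labels = target.get("labels", {})
            problems.append(
                (labels.get("job"), labels.get("instance"), target.get("lastError", ""))
            )
    return problems


def fetch_targets(base_url, timeout=5):
    """Fetch the target list from a Prometheus server, e.g. http://host:9090."""
    url = base_url.rstrip("/") + "/api/v1/targets"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# Hand-written sample shaped like an /api/v1/targets response (illustrative only).
SAMPLE_TARGETS = {
    "status": "success",
    "data": {
        "activeTargets": [
            {"labels": {"job": "dtle", "instance": "dtle-src-1"},
             "health": "up", "lastError": ""},
            {"labels": {"job": "nomad", "instance": "nomad-dest-1"},
             "health": "down", "lastError": "connection refused"},
        ]
    },
}

if __name__ == "__main__":
    for job, instance, error in unhealthy_targets(SAMPLE_TARGETS):
        print(f"{job}/{instance} is not up: {error}")
```

With a live server, `unhealthy_targets(fetch_targets("http://<prometheus_server_ip>:9090"))` should return an empty list when all four targets from prometheus.yml are being scraped successfully.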

3. Deploy Grafana

Deploy the Grafana service with docker

  shell> docker run -d --name=grafana -p 3000:3000 grafana/grafana

Visit the Grafana page http://${grafana_server_ip}:3000 and log in with the default account admin/admin
Add a data source

Choose Prometheus as the data source type

Enter the access address of Prometheus in the URL field and click the "Save & Test" button
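
For repeatable setups, the data source can also be created through Grafana's HTTP API rather than the web page. The following is a rough sketch that assumes the POST /api/datasources endpoint with basic authentication and the default admin/admin account mentioned above (Grafana asks you to change that password after the first login, so adjust accordingly). The function name and the example hostnames are hypothetical.

```python
import base64
import json
import urllib.request


def build_datasource_request(grafana_url, prometheus_url, user="admin", password="admin"):
    """Build (but do not send) the request that registers Prometheus in Grafana."""
    body = {
        "name": "Prometheus",
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend performs the queries
        "isDefault": True,
    }
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + token,
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical hostnames; substitute your own server addresses.
    req = build_datasource_request(
        "http://grafana.example:3000", "http://prometheus.example:9090"
    )
    print(req.get_method(), req.full_url)
    # To actually create the data source:
    # with urllib.request.urlopen(req, timeout=5) as resp:
    #     print(resp.status)
```

Building the request separately from sending it keeps the sketch easy to inspect before anything is changed on the Grafana server.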

Add a panel

As an example, configure a panel that monitors CPU usage
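
Before putting a query into a panel, it can help to try it against Prometheus directly. The sketch below assumes that Nomad exposes a host CPU gauge named nomad_client_host_cpu_total; that name is an assumption and should be checked against the output of /v1/metrics on your own nodes. It uses the Prometheus instant query endpoint /api/v1/query; the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed metric name; verify it against your own nomad /v1/metrics output.
CPU_QUERY = "avg by (instance) (nomad_client_host_cpu_total)"


def query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    return (
        base_url.rstrip("/")
        + "/api/v1/query?"
        + urllib.parse.urlencode({"query": promql})
    )


def vector_to_dict(payload, label="instance"):
    """Map the chosen label to the float sample value of each returned series."""
    return {
        series["metric"].get(label, ""): float(series["value"][1])
        for series in payload["data"]["result"]
    }


def run_query(base_url, promql, timeout=5):
    """Run an instant query and return {label value: sample value}."""
    with urllib.request.urlopen(query_url(base_url, promql), timeout=timeout) as resp:
        return vector_to_dict(json.load(resp))


if __name__ == "__main__":
    print(query_url("http://prometheus.example:9090", CPU_QUERY))
```

If the query returns one value per instance label defined in prometheus.yml, the same expression can be pasted into the panel's query editor.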

4. Commonly used monitoring items

All Nomad monitoring items: https://www.nomadproject.io/docs/operations/metrics

All DTLE monitoring items: https://actiontech.github.io/dtle-docs-cn/3/3.4_metrics.html

5. Finally, create multiple panels to display at the same time
