
When our company runs many services across test and production environments, being able to check their logs becomes a hard requirement. Should we collect the logs from all environments in one place and simply serve them through Nginx, or should we adopt a dedicated log-collection stack such as ELK? That became the question!


As an integrated solution, Graylog uses Elasticsearch for storage and MongoDB for its configuration data, and it comes with built-in throttling. At the same time, its query interface is simple, easy to use, and easy to extend. Graylog therefore became the best choice for us and saved us a lot of trouble.


Filebeat tool introduction

Filebeat log file shipping service

Filebeat is a log-file shipping tool. After you install the client on a server, Filebeat automatically monitors the given log directories or the specified log files, tails them, keeps reading new content, and forwards it to Elasticsearch, Logstash, or Graylog.

Filebeat workflow introduction

When you install and enable the Filebeat program, it starts one or more prospectors that watch the log directories or files you specify.

For each log file found by a prospector, Filebeat starts a harvester.

Each harvester reads the new content of a single log file and sends the new log data to the spooler, which aggregates these events.

Finally, Filebeat sends the aggregated data to the address you specify (here we send it to the Graylog service).

Filebeat workflow at a glance

We do not use the Logstash service here, mainly because Filebeat is more lightweight than Logstash.

When the machines we collect from are not particularly powerful and the collection requirements are not very complex, Filebeat is still the recommended choice for collecting logs.

In daily use, Filebeat offers a variety of installation and deployment options and runs very stably.


Filebeat configuration file

The core of using the Filebeat tool is writing its configuration file correctly!

Filebeat is controlled mainly through its configuration file. When installed from an rpm or deb package, the configuration file is stored at /etc/filebeat/filebeat.yml by default. On macOS or Windows, look for the corresponding file inside the extracted archive.

The main configuration file of the Filebeat tool is shown below, and the meaning of each field is explained in the comments, so I will not go into detail here. Note that we configure Filebeat to read all of its input sources from the yml files defined in the inputs.d directory.

This way we can define a separate configuration file for each of the different services (test, production, and so on) and adapt them to how the physical machines are actually deployed.

# Configure the log input sources
# Here we place them in the yml files under the inputs.d directory
filebeat.config.inputs:
  enabled: true
  path: ${path.config}/inputs.d/*.yml
  # Enable this option if the collected logs are in JSON format
  # json.keys_under_root: true

# Configure the modules Filebeat should load
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 1

# Configure the address the log data is sent to
output.logstash:
  hosts: ["11.22.33.44:5500"]

# output.file:
#   enable: true

processors:
  - add_host_metadata: ~
  - rename:
      fields:
        - from: "log"
          to: "message"
  - add_fields:
      target: ""
      fields:
        # The token prevents services without authentication from sending data to the Graylog service
        token: "0uxxxxaM-1111-2222-3333-VQZJxxxxxwgX "

The following is the content of a simple yml configuration file under the inputs.d directory. Its job is to collect the log data of a single service independently and to attach that service's own data tags.

# Type of data to collect
- type: log
  enabled: true
  # Paths of the log files
  paths:
    - /var/log/supervisor/app_escape_worker-stderr.log
    - /var/log/supervisor/app_escape_prod-stderr.log
  symlinks: true
  # Only collect lines containing these keywords
  include_lines: ["WARNING", "ERROR"]
  # Attach data tags
  tags: ["app", "escape", "test"]
  # Prevent stack traces from being split into separate events
  multiline.pattern: '^\[?[0-9]...{3}'
  multiline.negate: true
  multiline.match: after

# Add another type block for each additional log you need to collect
- type: log
  enabled: true
  ......

Note that for different kinds of logs, Filebeat also provides dedicated modules, each with its own configuration options, for common services such as PostgreSQL, Redis, iptables, and Nginx.

# iptables
- module: iptables
  log:
    enabled: true
    var.paths: ["/var/log/iptables.log"]
    var.input: "file"

# postgres
- module: postgresql
  log:
    enabled: true
    var.paths: ["/path/to/log/postgres/*.log*"]

# nginx
- module: nginx
  access:
    enabled: true
    var.paths: ["/path/to/log/nginx/access.log*"]
  error:
    enabled: true
    var.paths: ["/path/to/log/nginx/error.log*"]

Graylog service introduction

Graylog log monitoring system

Graylog is an open-source tool for log aggregation, analysis, auditing, display, and alerting. Functionally it is similar to ELK, but much simpler than ELK.

Thanks to being more concise, more efficient, and simpler to deploy and use, it quickly won many users over. Admittedly, its extensibility is not as good as ELK's, but there is a commercial version to choose from.

Graylog workflow introduction

The simplest Graylog deployment is a single-node setup; a more complex one is cluster mode. The architecture diagrams are shown below. As you can see, it consists of three components: Elasticsearch, MongoDB, and Graylog.

Among them, Elasticsearch is used to persistently store and retrieve the log data (IO-intensive), MongoDB is used to store Graylog's configuration, and Graylog provides the web interface and external APIs (CPU-intensive).

Minimal single-node deployment (architecture diagram)

Optimal cluster deployment (architecture diagram)

Graylog component function

The key to configuring the Graylog service is understanding what each component does and how it works!

Simply put, an Input represents a source of log data. For logs from different sources, Extractors can be used to convert log fields, for example turning an Nginx status code into the corresponding English description.

Then the logs are grouped into separate Streams by their different tags, and the log data of each Stream is written into the specified Index Set for persistent storage.

Graylog collects logs through Input, and each Input is individually configured with Extractors for field conversion.

The basic unit of log search in Graylog is Stream. Each Stream can have its own separate Elastic Index Set or share an Index Set.

Extractors are configured under System/Inputs. A very convenient feature of Graylog is that you can load a sample log message and configure the Extractor against this real example, seeing the result directly.

The built-in Extractors can handle most field extraction and conversion tasks, but there are some limitations, and these limitations need to be considered when the application writes its logs. An Input can be configured with multiple Extractors, which are executed in sequence.

The system has a default Stream, and all logs are saved to it by default, unless a log matches some other Stream and that Stream is configured not to also save its logs to the default Stream.

You can create more Streams through the Streams menu. A newly created Stream is in a paused state and needs to be started manually after its configuration is complete.

A Stream matches logs by its configured conditions; logs that meet the conditions are tagged with the stream ID field and saved into the corresponding Elasticsearch Index Set.

Index Set is created through the menu System/Indices. The performance, reliability and expiration strategy of log storage are all configured through Index Set.

Performance and reliability are some parameters for configuring Elastic Index. The main parameters include Shards and Replicas.

In addition to the log processing process mentioned above, Graylog also provides a Pipeline script to achieve a more flexible log processing solution.

I will not go into detail here, and will only show how to use Pipelines to filter out unwanted logs. The following Pipeline Rule discards all logs with a level greater than 6:

rule "discard debug messages"
when
  to_long($message.level) > 6
then
  drop_message();
end

From data collection (Input), field parsing (Extractor), and routing into Streams, to cleanup with Pipelines, everything is handled in one pass, with no need for secondary processing elsewhere.
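
Pipeline rules can also take over the kind of field conversion described above for Extractors, for example turning an Nginx status code into a readable label. Below is a minimal sketch of such a rule; it assumes the access log has already been parsed into a field named http_response_code (a placeholder name, adjust it to your actual field):

rule "nginx status code to text"
when
  has_field("http_response_code") && to_long($message.http_response_code) == 404
then
  set_field("http_response_text", "Not Found");
end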

Sidecar is a lightweight collector manager: it is managed centrally through Graylog and supports both Linux and Windows.

The Sidecar daemon periodically calls Graylog's REST API to fetch the configuration for the tags defined in its configuration file. When Sidecar runs for the first time, it pulls the configuration for those tags from the Graylog server and synchronizes it locally.

Currently Sidecar supports NXLog, Filebeat, and Winlogbeat. They are all configured centrally through Graylog's web interface, and output types such as Beats, CEF, GELF, JSON API, and NetFlow are supported.

The most powerful part is that the configuration file can specify which Graylog cluster Sidecar sends its logs to, and the logs can be load-balanced across multiple Inputs of the Graylog cluster, so that Graylog can cope with a huge volume of logs.
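
For reference, a minimal sketch of what Sidecar's own configuration file (sidecar.yml) might look like is shown below; the server address and API token are placeholders, and the exact keys can differ between Sidecar versions:

# /etc/graylog/sidecar/sidecar.yml
server_url: "http://11.22.33.44:9000/api/"
# API token created for the sidecar user in the Graylog web interface (placeholder)
server_api_token: "<sidecar-api-token>"
node_name: "app-server-01"
update_interval: 10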

rule "discard debug messages"
when
  to_long($message.level) > 6
then
  drop_message();
end

After the logs are centrally saved to Graylog, you can easily search for them. However, sometimes the data still needs to be processed further.

There are two main ways: accessing the data stored in Elasticsearch directly, or forwarding it to other services through Graylog's Output.
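
For the first option, a minimal sketch of querying Elasticsearch directly is shown below. It assumes the default graylog index prefix and that port 9200 of the Elasticsearch node is reachable (it is not published in the docker-compose file later in this article):

# Query the Graylog indices in Elasticsearch directly for ERROR messages
$ curl -s 'http://11.22.33.44:9200/graylog_*/_search?q=message:ERROR&size=5&pretty'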

Service installation and deployment

This part mainly covers the installation steps and precautions for deploying Filebeat + Graylog!


Deploy the Filebeat tool

Officially, a variety of deployment methods are provided, including installation from rpm and deb packages, building from source, and running with Docker or Kubernetes.

We can install it according to our actual needs:

# Ubuntu(deb)
$ curl -L -O https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-7.8.1-amd64.deb
$ sudo dpkg -i filebeat-7.8.1-amd64.deb
$ sudo systemctl enable filebeat
$ sudo service filebeat start

# Start with Docker
docker run -d --name=filebeat --user=root \
  --volume="./filebeat.docker.yml:/usr/share/filebeat/filebeat.yml:ro" \
  --volume="/var/lib/docker/containers:/var/lib/docker/containers:ro" \
  --volume="/var/run/docker.sock:/var/run/docker.sock:ro" \
  docker.elastic.co/beats/filebeat:7.8.1 filebeat -e -strict.perms=false \
  -E output.elasticsearch.hosts=["elasticsearch:9200"]
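
After installation and configuration, Filebeat's own test subcommands are a quick way to confirm that the configuration parses correctly and that the configured output address is reachable:

# Verify that the configuration file parses correctly
$ sudo filebeat test config

# Verify connectivity to the configured output
$ sudo filebeat test output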

Deploy the Graylog service

Here we mainly describe deploying the service with Docker containers. If you need to deploy it another way, please check the installation steps in the corresponding chapter of the official documentation.

Before deploying the service, we need to generate the password_secret, the root password hash, and other related information required by the Graylog service. They are generated as follows:

# Generate the password_secret (at least 16 characters)
$ sudo apt install -y pwgen
$ pwgen -N 1 -s 16
zscMb65...FxR9ag

# Generate the hash of the password that will be used to log in to the web interface
$ echo -n "Enter Password: " && head -1 </dev/stdin | tr -d '\n' | sha256sum | cut -d" " -f1
Enter Password: zscMb65...FxR9ag
77e29e0f...557515f

After generating the required password information, save the following yml content to a docker-compose.yml file and start the service with the docker-compose command to complete the deployment.

After that, you can open the home page by visiting port 9000 of the corresponding server address in a browser.

version: "3"

services:
  mongo:
    restart: on-failure
    container_name: graylog_mongo
    image: "mongo:3"
    volumes:
      - "./mongodb:/data/db"
    networks:
      - graylog_network

  elasticsearch:
    restart: on-failure
    container_name: graylog_es
    image: "elasticsearch:6.8.5"
    volumes:
      - "./es_data:/usr/share/elasticsearch/data"
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      - "ES_JAVA_OPTS=-Xms512m -Xmx5120m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    deploy:
      resources:
        limits:
          memory: 12g
    networks:
      - graylog_network

  graylog:
    restart: on-failure
    container_name: graylog_web
    image: "graylog/graylog:3.3"
    ports:
      - 9000:9000 # Port of the web interface
      - 5044:5044 # Port for the Beats (Filebeat) input
      - 12201:12201 # GELF TCP
      - 12201:12201/udp # GELF UDP
      - 1514:1514 # Syslog TCP
      - 1514:1514/udp # Syslog UDP
    volumes:
      - "./graylog_journal:/usr/share/graylog/data/journal"
    environment:
      - GRAYLOG_PASSWORD_SECRET=zscMb65...FxR9ag
      - GRAYLOG_ROOT_PASSWORD_SHA2=77e29e0f...557515f
      - GRAYLOG_HTTP_EXTERNAL_URI=http://11.22.33.44:9000/
      - GRAYLOG_TIMEZONE=Asia/Shanghai
      - GRAYLOG_ROOT_TIMEZONE=Asia/Shanghai
    networks:
      - graylog_network
    depends_on:
      - mongo
      - elasticsearch

networks:
  graylog_network:
    driver: bridge
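
Assuming the file above is saved as docker-compose.yml in the current directory, the stack can be started and watched as follows; the web interface then listens on port 9000, with the username admin and the password whose SHA-256 hash was configured above:

# Start the three containers in the background
$ docker-compose up -d

# Follow the Graylog logs until the web interface reports it is up
$ docker-compose logs -f graylog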

It should be noted that the GELF (Graylog Extended Log Format) input can accept structured events and supports compression and chunking. Conveniently, the log-driver mechanism of the Docker service natively supports GELF.

We only need to create the corresponding Input under Graylog's System/Inputs and specify the log-driver when starting a container, and then everything the container writes to its output is sent to Graylog.

# [docker] Start a container, specifying the GELF address and log driver
docker run --rm=true \
    --log-driver=gelf \
    --log-opt gelf-address=udp://11.22.33.44:12201 \
    --log-opt tag=myapp \
    myapp:0.0.1

# [docker-compose] The equivalent configuration
version: "3"
services:
  redis:
    restart: always
    image: redis
    container_name: "redis"
    logging:
      driver: gelf
      options:
        gelf-address: udp://11.22.33.44:12201
        tag: "redis"
  ......

Graylog interface features

This part mainly introduces the functions and corresponding features of the Graylog web interface!

Author: Escape
Source: escapelife.site/posts/38c81b25.html

