Technology stack
OS: Ubuntu 20.04 LTS
Docker: 20.10.12
docker-compose: 1.25.0
Elasticsearch: 7.16.3
Logstash: 7.16.3
Kafka: 2.13-2.8.1
Python: 3.8.2
kafka-python: 2.0.2
Build Logstash with Docker
official documentation
Configuration steps
docker pull docker.elastic.co/logstash/logstash:7.16.3
- Logstash configuration file
/home/qbit/logstash/settings/logstash.yml
http.host: "0.0.0.0"
xpack.monitoring.elasticsearch.hosts: [ "http://192.168.1.46:9200" ]
- Pipeline configuration file
/home/qbit/logstash/pipeline/es-pipeline.conf (mounted into the container as /usr/share/logstash/pipeline/es-pipeline.conf)
input {
  kafka {
    codec => json
    bootstrap_servers => "192.168.1.46:9092"
    topics => ["coder_topic"]
  }
}
filter {
  mutate {
    add_field => { "timestamp" => "%{@timestamp}" }
    remove_field => ["@version"]
  }
  date {
    match => [ "timestamp", "ISO8601" ]  # parsing @timestamp directly here would fail
    target => "time0"
  }
  ruby {
    code => "
      time1 = event.get('@timestamp').time.getlocal('+08:00').strftime('%Y-%m-%dT%H:%M:%S+08')
      time2 = Time.parse(event.get('timestamp')).getlocal('+08:00').strftime('%Y-%m-%dT%H:%M:%S+08')
      time3 = Time.now.getlocal('+08:00').strftime('%Y-%m-%dT%H:%M:%S+08')
      event.set('time1', time1)
      event.set('time2', time2)
      event.set('time3', time3)
    "
  }
}
output {
  stdout {
    codec => json_lines
  }
  elasticsearch {
    hosts => ["192.168.1.46:9200"]
    index => "coder_index"
    document_id => "%{id}"
  }
}
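The three timestamps the ruby filter emits are all the same instant rendered in UTC+08:00. As a minimal sketch of that conversion in Python (the function name `to_cst` is hypothetical; it assumes an ISO8601 UTC input in the same shape as the event's `@timestamp`):

```python
from datetime import datetime, timezone, timedelta

CST = timezone(timedelta(hours=8))  # UTC+08:00, as used in the ruby filter

def to_cst(iso_utc: str) -> str:
    """Convert an ISO8601 UTC timestamp (e.g. a Logstash @timestamp)
    to the '+08' local form the ruby filter produces."""
    dt = datetime.strptime(iso_utc, "%Y-%m-%dT%H:%M:%S.%fZ")
    dt = dt.replace(tzinfo=timezone.utc).astimezone(CST)
    return dt.strftime("%Y-%m-%dT%H:%M:%S+08")

print(to_cst("2022-01-28T01:03:40.733Z"))  # 2022-01-28T09:03:40+08
```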
docker run --rm -it --name logstash \
-v /home/qbit/logstash/pipeline/:/usr/share/logstash/pipeline/ \
-v /home/qbit/logstash/settings/logstash.yml:/usr/share/logstash/config/logstash.yml \
docker.elastic.co/logstash/logstash:7.16.3
Send messages with Python
# encoding: utf-8
# author: qbit
# date: 2022-01-28
# summary: send messages to Kafka
import json

from kafka import KafkaProducer

def producer():
    producer = KafkaProducer(
        bootstrap_servers="192.168.1.46:9092",
        key_serializer=lambda k: json.dumps(k).encode('utf8'),
        value_serializer=lambda v: json.dumps(v).encode('utf8'),
    )
    id = 'qbit'
    dic = {'id': f"{id}", 'age': '23'}
    producer.send(topic="coder_topic", key=id, value=dic)
    producer.flush()  # send() is asynchronous; flush before the process exits
    print(f"send key: {id}, value: {dic}")

if __name__ == "__main__":
    producer()
# python3 producer.py
send key: qbit, value: {'id': 'qbit', 'age': '23'}
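To confirm the message actually landed in `coder_topic` before Logstash consumes it, a hedged verification sketch with kafka-python (the `decode` helper mirrors the producer's JSON serializers; `consume()` is illustrative and requires a reachable broker):

```python
import json

def decode(raw: bytes):
    """Inverse of the producer's json.dumps(...).encode('utf8') serializers."""
    return json.loads(raw.decode("utf8"))

def consume():
    # Hypothetical check: requires kafka-python and the broker at 192.168.1.46:9092.
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(
        "coder_topic",
        bootstrap_servers="192.168.1.46:9092",
        auto_offset_reset="earliest",  # read from the beginning of the topic
        key_deserializer=decode,
        value_deserializer=decode,
    )
    for msg in consumer:
        print(f"recv key: {msg.key}, value: {msg.value}")
```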
View data in ES with Kibana
GET coder_index/_search
{
  "_index": "coder_index",
  "_type": "_doc",
  "_id": "qbit",
  "_score": 1.0,
  "_source": {
    "id": "qbit",
    "age": "23",
    "@timestamp": "2022-01-28T01:03:40.733Z",  // Logstash event timestamp
    "timestamp": "2022-01-28T01:03:40.733Z",
    "time0": "2022-01-28T01:03:40.733Z",
    "time1": "2022-01-28T09:03:40+08",
    "time2": "2022-01-28T09:03:40+08",
    "time3": "2022-01-28T09:03:40+08"  // timestamps generated by the ruby code in the filter
  }
}
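Because the pipeline sets `document_id => "%{id}"`, the same document can also be fetched by id outside Kibana. A minimal sketch using only the Python standard library (`doc_url` and `get_doc` are hypothetical helper names; `get_doc` assumes a reachable Elasticsearch node):

```python
import json
from urllib.request import urlopen

def doc_url(host: str, index: str, doc_id: str) -> str:
    """Build the Elasticsearch REST URL for fetching one document by id."""
    return f"http://{host}/{index}/_doc/{doc_id}"

def get_doc(host: str, index: str, doc_id: str) -> dict:
    # Network call: requires Elasticsearch to be up at the given host.
    with urlopen(doc_url(host, index, doc_id)) as resp:
        return json.loads(resp.read())

# Example: get_doc("192.168.1.46:9200", "coder_index", "qbit")["_source"]
```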
Write messages to AWS S3
- Pipeline output configuration
output {
  stdout {
    codec => json_lines
  }
  elasticsearch {
    hosts => ["192.168.1.46:9200"]
    index => "coder_index"
    document_id => "%{id}"
  }
  s3 {
    id => "kafka_logstash_s3"
    access_key_id => "your_access_key_id"
    secret_access_key => "your_secret_access_key"
    region => "cn-northwest-1"
    bucket => "my_bucket"
    prefix => "logstash/%{+YYYY-MM-dd}"
    time_file => 1  # unit: minutes
    codec => "json_lines"
  }
}
- Output file name format; note that the timestamps embedded in the file names are in UTC
logstash/2022-01-28/ls.s3.9be50c52-8f29-437c-84c5-76911ca4d9c5.2022-01-28T08.21.part0.txt
logstash/2022-01-28/ls.s3.ef85ef47-8caf-43a8-b720-0abe9fc6a5ae.2022-01-28T08.22.part1.txt
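Since those embedded timestamps are UTC, a small sketch for recovering the local (+08:00) time of a part file from its name (the function name is hypothetical and assumes the `ls.s3.<uuid>.YYYY-MM-DDTHH.MM.partN.txt` pattern shown above, with no extra dots in the prefix):

```python
from datetime import datetime, timezone, timedelta

def s3_part_time_local(filename: str) -> str:
    """Extract the UTC timestamp embedded in an s3 output part file name
    and render it in UTC+08:00."""
    parts = filename.split(".")
    stamp = ".".join(parts[3:5])  # e.g. '2022-01-28T08.21'
    dt = datetime.strptime(stamp, "%Y-%m-%dT%H.%M").replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone(timedelta(hours=8))).strftime("%Y-%m-%d %H:%M +08:00")

print(s3_part_time_local(
    "ls.s3.9be50c52-8f29-437c-84c5-76911ca4d9c5.2022-01-28T08.21.part0.txt"
))  # 2022-01-28 16:21 +08:00
```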
This article is from qbit snap