
Introduction

Grafana Loki is a horizontally scalable, highly available, multi-tenant log aggregation system that covers log collection, storage, visualization, and alerting.
Unlike other logging systems, Loki is built around the idea of indexing only the labels of logs while leaving the original log messages unindexed. This makes Loki cheaper to operate and more efficient.

Loki is a particularly good fit for storing Kubernetes Pod logs: metadata such as Pod labels is automatically scraped and indexed.

Comparison with EFK

The EFK stack (Elasticsearch, Fluentd, Kibana) is used to collect, visualize, and query logs from a variety of sources. Data in Elasticsearch is stored on disk as unstructured JSON objects. Both the keys of every object and the contents of every key are indexed. The data can then be queried with a JSON query DSL or with the Lucene query language.

By contrast, Loki can store data on local disk in single-binary mode, but in its horizontally scalable mode the data lives in a cloud storage system such as S3, GCS, or Cassandra. Logs are stored in plaintext, tagged with a set of label names and values, and only the label pairs are indexed. This trade-off makes Loki cheaper to operate than a full-index system and allows developers to log aggressively from their applications. Logs in Loki are queried with LogQL. Because of this design trade-off, however, a LogQL query that filters on content (i.e. text within the log lines) has to load every chunk in the search window that matches the labels defined in the query.
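
For a sense of what this means in practice, here is a hypothetical logcli invocation (the address and label names are illustrative): the label matcher selects the streams first, and only the chunks of those streams are then scanned for the filter text.

logcli --addr=http://loki.example.com:3100 query '{namespace="prod", app="nginx"} |= "error"'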

Moreover, as we know, metrics and alerts can only surface predefined problems; unknown problems still have to be dug out of the logs. Keeping logs and metrics in two separate systems makes troubleshooting harder. We needed our logging and metrics systems to be linked, and Loki, which draws its inspiration from Prometheus, solves exactly that.

Loki Architecture

Loki's overall architecture is as follows:

Next, let's walk through the core components:

  • Distributor -- The distributor service handles incoming write streams from clients; it is the first stop on the write path for log data. Once the distributor receives a set of streams, it validates each one for correctness and checks it against the configured per-tenant (or global) limits. Valid chunks are then split into batches and sent to multiple ingesters in parallel.
The distributor uses consistent hashing together with a configurable replication factor to decide which instances of the ingester service should receive a given stream. A stream is a set of logs associated with a tenant and a unique label set. The stream is hashed using the tenant ID and the label set, and the hash is used to look up the instances to send it to.
  • Ingester -- The ingester service is responsible for writing log data to the long-term storage backend (DynamoDB, S3, Cassandra, etc.) on the write path, and for returning log data for in-memory queries on the read path.
The ingester contains a lifecycler, which manages the ingester's lifecycle in the hash ring. Each ingester is in one of the following states: PENDING, JOINING, ACTIVE, LEAVING, or UNHEALTHY.
  • Query frontend -- The query frontend is an optional service that exposes the querier's API endpoints and can be used to accelerate the read path. When the query frontend is in place, incoming query requests should be directed to it rather than to the queriers. The querier service is still required in the cluster to execute the actual queries.
    Internally, the query frontend performs some query adjustments and holds queries in an internal queue. In this setup, queriers act as workers that pull jobs from the queue, execute them, and return the results to the query frontend for aggregation. Queriers need to be configured with the query frontend's address (via the -querier.frontend-address CLI flag) in order to connect to it.
    The query frontend is stateless. However, because of how the internal queue works, it is recommended to run a few query frontend replicas to reap the benefits of fair scheduling. Two replicas should suffice in most cases.
  • Querier -- The querier handles queries written in the LogQL query language, fetching logs both from the ingesters and from long-term storage.
    The querier queries all in-memory data first, then falls back to running the same query against the backend store. Because of the replication factor, the querier may receive duplicate data. To resolve this, it internally deduplicates data that has the same nanosecond timestamp, label set, and log message.
  • Chunk Store -- The chunk store is Loki's long-term data store, designed to support interactive queries and sustained writes without requiring background maintenance tasks. It consists of:

an index for the chunks, which can be backed by DynamoDB, Bigtable, or Cassandra;

a key-value (KV) store for the chunk data itself, which can be DynamoDB, Bigtable, Cassandra, or an object store such as S3 or GCS.

There are of course further components: the ruler, which handles log-based alerting, and the table-manager, which creates periodic tables ahead of their time range and deletes them once their data falls outside the retention period.

Deployment

Loki's microservices deployment mode involves quite a few components. We deploy it on Kubernetes in production; sensitive information has been removed from the manifests below.

We chose S3 for chunk storage and Cassandra for the index.

1: Create the S3 bucket, then add the access key and secret key to the configuration file below. Setting up the Cassandra cluster is out of scope for this article.
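
A minimal sketch of the bucket creation, assuming the AWS CLI is configured (the bucket name is a placeholder; the region matches storage_config below):

aws s3 mb s3://my-loki-chunks --region ap-southeast-1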

2: Deploy Loki's configuration file:

apiVersion: v1
data:
  config.yaml: |
    chunk_store_config:
        chunk_cache_config:
            memcached:
                batch_size: 100
                parallelism: 100
            memcached_client:
                consistent_hash: true
                host: memcached.loki.svc.cluster.local
                service: memcached-client
        max_look_back_period: 0
        write_dedupe_cache_config:
            memcached:
                batch_size: 100
                parallelism: 100
            memcached_client:
                consistent_hash: true
                host: memcached-index-writes.loki.svc.cluster.local
                service: memcached-client
    auth_enabled: false
    distributor:
        ring:
            kvstore:
                store: memberlist
    frontend:
        compress_responses: true
        log_queries_longer_than: 5s
        max_outstanding_per_tenant: 200
    frontend_worker:
        frontend_address: query-frontend.loki.svc.cluster.local:9095
        grpc_client_config:
            max_send_msg_size: 1.048576e+08
        parallelism: 2
    ingester:
        chunk_block_size: 262144
        chunk_idle_period: 15m
        lifecycler:
            heartbeat_period: 5s
            interface_names:
              - eth0
            join_after: 30s
            num_tokens: 512
            ring:
                kvstore:
                    store: memberlist
                replication_factor: 3
        max_transfer_retries: 0
    ingester_client:
        grpc_client_config:
            max_recv_msg_size: 6.7108864e+07
        remote_timeout: 1s
    limits_config:
        enforce_metric_name: false
        ingestion_burst_size_mb: 20
        ingestion_rate_mb: 10
        ingestion_rate_strategy: global
        max_cache_freshness_per_query: 10m
        max_global_streams_per_user: 10000
        max_query_length: 12000h
        max_query_parallelism: 16
        max_streams_per_user: 0
        reject_old_samples: true
        reject_old_samples_max_age: 168h
    querier:
        query_ingesters_within: 2h
    query_range:
        align_queries_with_step: true
        cache_results: true
        max_retries: 5
        results_cache:
            cache:
                memcached_client:
                    consistent_hash: true
                    host: memcached-frontend.loki.svc.cluster.local
                    max_idle_conns: 16
                    service: memcached-client
                    timeout: 500ms
                    update_interval: 1m
        split_queries_by_interval: 30m
    ruler: {}
    schema_config:
        configs:
          - from: "2020-05-15"
            index:
                period: 168h
                prefix: cassandra_table
            object_store: s3
            schema: v11
            store: cassandra
    server:
        graceful_shutdown_timeout: 5s
        grpc_server_max_concurrent_streams: 1000
        grpc_server_max_recv_msg_size: 1.048576e+08
        grpc_server_max_send_msg_size: 1.048576e+08
        http_listen_port: 3100
        http_server_idle_timeout: 120s
        http_server_write_timeout: 1m
    storage_config:
        cassandra:
            username: loki-superuser
            password: xxx
            addresses: loki-dc1-all-pods-service.cass-operator.svc.cluster.local
            auth: true
            keyspace: lokiindex
        aws:
            bucketnames: xx
            endpoint: s3.amazonaws.com
            region: ap-southeast-1
            access_key_id: xx
            secret_access_key: xx
            s3forcepathstyle: false
        index_queries_cache_config:
            memcached:
                batch_size: 100
                parallelism: 100
            memcached_client:
                consistent_hash: true
                host: memcached-index-queries.loki.svc.cluster.local
                service: memcached-client
    memberlist:
        abort_if_cluster_join_fails: false
        bind_port: 7946
        join_members:
        - loki-gossip-ring.loki.svc.cluster.local:7946
        max_join_backoff: 1m
        max_join_retries: 10
        min_join_backoff: 1s
    table_manager:
        creation_grace_period: 3h
        poll_interval: 10m
        retention_deletes_enabled: true
        retention_period: 168h
kind: ConfigMap
metadata:
  name: loki
  namespace: loki
---
apiVersion: v1
data:
  overrides.yaml: |
    overrides: {}
kind: ConfigMap
metadata:
  name: overrides
  namespace: loki
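
Assuming the two ConfigMaps above are saved as loki-config.yaml (the filename is arbitrary), create the namespace and apply them:

kubectl create namespace loki
kubectl apply -f loki-config.yaml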

3: Deploy the four memcached instances Loki depends on:

memcached-frontend.yaml is as follows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: memcached-frontend
  name: memcached-frontend
  namespace: loki
spec:
  replicas: 3
  selector:
    matchLabels:
      app: memcached-frontend
  serviceName: memcached-frontend
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9150"   
      labels:
        app: memcached-frontend
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: memcached-frontend
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -m 1024
        - -I 5m
        - -c 1024
        - -v
        image: memcached:1.5.17-alpine
        imagePullPolicy: IfNotPresent
        name: memcached
        ports:
        - containerPort: 11211
          name: client
        resources:
          limits:
            cpu: "3"
            memory: 1536Mi
          requests:
            cpu: 500m
            memory: 1329Mi
      - args:
        - --memcached.address=localhost:11211
        - --web.listen-address=0.0.0.0:9150
        image: prom/memcached-exporter:v0.6.0
        imagePullPolicy: IfNotPresent
        name: exporter
        ports:
        - containerPort: 9150
          name: http-metrics
  updateStrategy:
    type: RollingUpdate

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: memcached-frontend
  name: memcached-frontend
  namespace: loki
spec:
  ports:
  - name: memcached-client
    port: 11211
    targetPort: 11211
  selector:
    app: memcached-frontend

memcached-index-queries.yaml is as follows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: memcached-index-queries
  name: memcached-index-queries
  namespace: loki
spec:
  replicas: 3
  selector:
    matchLabels:
      app: memcached-index-queries
  serviceName: memcached-index-queries
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9150"   
      labels:
        app: memcached-index-queries
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: memcached-index-queries
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -m 1024
        - -I 5m
        - -c 1024
        - -v
        image: memcached:1.5.17-alpine
        imagePullPolicy: IfNotPresent
        name: memcached
        ports:
        - containerPort: 11211
          name: client
        resources:
          limits:
            cpu: "3"
            memory: 1536Mi
          requests:
            cpu: 500m
            memory: 1329Mi
      - args:
        - --memcached.address=localhost:11211
        - --web.listen-address=0.0.0.0:9150
        image: prom/memcached-exporter:v0.6.0
        imagePullPolicy: IfNotPresent
        name: exporter
        ports:
        - containerPort: 9150
          name: http-metrics
  updateStrategy:
    type: RollingUpdate

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: memcached-index-queries
  name: memcached-index-queries
  namespace: loki
spec:
  clusterIP: None
  ports:
  - name: memcached-client
    port: 11211
    targetPort: 11211
  selector:
    app: memcached-index-queries 

memcached-index-writes.yaml is as follows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: memcached-index-writes
  name: memcached-index-writes
  namespace: loki
spec:
  replicas: 3
  selector:
    matchLabels:
      app: memcached-index-writes
  serviceName: memcached-index-writes
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9150"   
      labels:
        app: memcached-index-writes
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: memcached-index-writes
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -m 1024
        - -I 1m
        - -c 1024
        - -v
        image: memcached:1.5.17-alpine
        imagePullPolicy: IfNotPresent
        name: memcached
        ports:
        - containerPort: 11211
          name: client
        resources:
          limits:
            cpu: "3"
            memory: 1536Mi
          requests:
            cpu: 500m
            memory: 1329Mi
      - args:
        - --memcached.address=localhost:11211
        - --web.listen-address=0.0.0.0:9150
        image: prom/memcached-exporter:v0.6.0
        imagePullPolicy: IfNotPresent
        name: exporter
        ports:
        - containerPort: 9150
          name: http-metrics
  updateStrategy:
    type: RollingUpdate

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: memcached-index-writes
  name: memcached-index-writes
  namespace: loki
spec:
  clusterIP: None
  ports:
  - name: memcached-client
    port: 11211
    targetPort: 11211
  selector:
    app: memcached-index-writes

memcached.yaml is as follows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: memcached
  name: memcached
  namespace: loki
spec:
  replicas: 3
  selector:
    matchLabels:
      app: memcached
  serviceName: memcached
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9150"   
      labels:
        app: memcached
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: memcached
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -m 4096
        - -I 2m
        - -c 1024
        - -v
        image: memcached:1.5.17-alpine
        imagePullPolicy: IfNotPresent
        name: memcached
        ports:
        - containerPort: 11211
          name: client
        resources:
          limits:
            cpu: "3"
            memory: 6Gi
          requests:
            cpu: 500m
            memory: 5016Mi
      - args:
        - --memcached.address=localhost:11211
        - --web.listen-address=0.0.0.0:9150
        image: prom/memcached-exporter:v0.6.0
        imagePullPolicy: IfNotPresent
        name: exporter
        ports:
        - containerPort: 9150
          name: http-metrics
  updateStrategy:
    type: RollingUpdate

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: memcached
  name: memcached
  namespace: loki
spec:
  clusterIP: None
  ports:
  - name: memcached-client
    port: 11211
    targetPort: 11211
  selector:
    app: memcached

For the roles these four memcached instances play, cross-reference Loki's configuration file with the architecture diagram.
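
Assuming the four manifests are saved under the filenames shown above, they can be applied and verified in one pass:

kubectl apply -f memcached.yaml -f memcached-frontend.yaml \
  -f memcached-index-queries.yaml -f memcached-index-writes.yaml
kubectl get statefulsets -n loki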

4: In Loki, the ring is a space divided into smaller segments by tokens. Each segment belongs to a single ingester and is used to shard series/logs across multiple ingesters. Besides its tokens, each instance also carries its ID, its address, and a regularly updated last-heartbeat timestamp. This lets the other components (distributors and queriers) discover which ingesters are available and healthy.

Consul, etcd, and memberlist (gossip) are all supported as ring backends. To avoid unnecessary dependencies, we chose memberlist.

This requires deploying a Service:

gossip_ring.yaml is as follows:

apiVersion: v1
kind: Service
metadata:
  labels:
    name: loki-gossip-ring
  name: loki-gossip-ring
  namespace: loki
spec:
  ports:
  - name: gossip-ring
    port: 7946
    targetPort: 7946
    protocol: TCP
  selector:
    gossip_ring_member: 'true'
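
Once components carrying the gossip_ring_member label come up, this Service should list their pod IPs as endpoints, which is a quick way to confirm the gossip ring is forming:

kubectl apply -f gossip_ring.yaml
kubectl get endpoints loki-gossip-ring -n loki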

5: Deploy the distributor

apiVersion: apps/v1
kind: Deployment
metadata:
  name: distributor
  namespace: loki
  labels:
    app: distributor
spec:
  minReadySeconds: 10
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
        app: distributor
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "3100"
        prometheus.io/scrape: "true"
      labels:
        app: distributor
        gossip_ring_member: 'true'
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: distributor
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -config.file=/etc/loki/config/config.yaml
        - -limits.per-user-override-config=/etc/loki/overrides/overrides.yaml
        - -target=distributor
        image: grafana/loki:1.6.1
        imagePullPolicy: IfNotPresent
        name: distributor
        ports:
        - containerPort: 3100
          name: http-metrics
        - containerPort: 9095
          name: grpc
        - containerPort: 7946
          name: gossip-ring       
        readinessProbe:
          httpGet:
            path: /ready
            port: 3100
          initialDelaySeconds: 15
          timeoutSeconds: 1
        resources:
          limits:
            cpu: "1"
            memory: 1Gi
          requests:
            cpu: 500m
            memory: 500Mi
        volumeMounts:
        - mountPath: /etc/loki/config
          name: loki
        - mountPath: /etc/loki/overrides
          name: overrides
      volumes:
      - configMap:
          name: loki
        name: loki
      - configMap:
          name: overrides
        name: overrides

---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: distributor
  name: distributor
  namespace: loki
spec:
  ports:
  - name: distributor-http-metrics
    port: 3100
    targetPort: 3100
  - name: distributor-grpc
    port: 9095
    targetPort: 9095
  selector:
    app: distributor
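
Assuming the manifest is saved as distributor.yaml, apply it and smoke-check the write path via the same /ready endpoint the readiness probe uses:

kubectl apply -f distributor.yaml
kubectl port-forward -n loki svc/distributor 3100:3100 &
curl http://localhost:3100/ready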

6: Deploy the ingester

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ingester
  namespace: loki
  labels:
    app: ingester
spec:
  updateStrategy:
    type: RollingUpdate
  replicas: 3
  serviceName: ingester
  selector:
    matchLabels:
      app: ingester
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "3100"
        prometheus.io/scrape: "true"
      labels:
        app: ingester
        gossip_ring_member: 'true'
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      securityContext:
        fsGroup: 10001
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: ingester
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -config.file=/etc/loki/config/config.yaml
        - -limits.per-user-override-config=/etc/loki/overrides/overrides.yaml
        - -target=ingester
        image: grafana/loki:1.6.1
        imagePullPolicy: IfNotPresent
        name: ingester
        ports:
        - containerPort: 3100
          name: http-metrics
        - containerPort: 9095
          name: grpc
        - containerPort: 7946
          name: gossip-ring 
        readinessProbe:
          httpGet:
            path: /ready
            port: 3100
          initialDelaySeconds: 15
          timeoutSeconds: 1
        resources:
          limits:
            cpu: "2"
            memory: 10Gi
          requests:
            cpu: "1"
            memory: 5Gi
        volumeMounts:
        - mountPath: /etc/loki/config
          name: loki
        - mountPath: /etc/loki/overrides
          name: overrides
        - mountPath: /data
          name: ingester-data
      terminationGracePeriodSeconds: 4800
      volumes:
      - configMap:
          name: loki
        name: loki
      - configMap:
          name: overrides
        name: overrides
  volumeClaimTemplates:
  - metadata:
      labels:
        app: ingester
      name: ingester-data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
      storageClassName: gp2
---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: ingester
  name: ingester
  namespace: loki
spec:
  ports:
  - name: ingester-http-metrics
    port: 3100
    targetPort: 3100
  - name: ingester-grpc
    port: 9095
    targetPort: 9095
  selector:
    app: ingester
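
Once the ingesters have joined, ring membership and per-instance state (ACTIVE, LEAVING, etc.) can be inspected on the distributor's ring page; a sketch, assuming the port-forward from the previous step is still running:

curl http://localhost:3100/ring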

7: Deploy the query-frontend

apiVersion: apps/v1
kind: Deployment
metadata:
  name: query-frontend
  namespace: loki
  labels:
    app: query-frontend
spec:
  minReadySeconds: 10
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: query-frontend
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "3100"
        prometheus.io/scrape: "true"
      labels:
        app: query-frontend
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: query-frontend
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -config.file=/etc/loki/config/config.yaml
        - -limits.per-user-override-config=/etc/loki/overrides/overrides.yaml
        - -log.level=debug
        - -target=query-frontend
        image: grafana/loki:master-92ace83
        imagePullPolicy: IfNotPresent
        name: query-frontend
        ports:
        - containerPort: 3100
          name: http-metrics
        - containerPort: 9095
          name: grpc
        readinessProbe:
          httpGet:
            path: /ready
            port: 3100
          initialDelaySeconds: 15
          timeoutSeconds: 1
        resources:
          limits:
            memory: 1200Mi
          requests:
            cpu: "2"
            memory: 600Mi
        volumeMounts:
        - mountPath: /etc/loki/config
          name: loki
        - mountPath: /etc/loki/overrides
          name: overrides
      volumes:
      - configMap:
          name: loki
        name: loki
      - configMap:
          name: overrides
        name: overrides

---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: query-frontend
  name: query-frontend
  namespace: loki
spec:
  clusterIP: None
  publishNotReadyAddresses: true
  ports:
  - name: query-frontend-http-metrics
    port: 3100
    targetPort: 3100
  - name: query-frontend-grpc
    port: 9095
    targetPort: 9095
  selector:
    app: query-frontend
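
Note that frontend_worker.frontend_address in the ConfigMap points at this headless Service (query-frontend.loki.svc.cluster.local:9095); queriers resolve it in order to connect as workers. A quick resolution check (the busybox pod is just a throwaway example):

kubectl run -it --rm dns-check --image=busybox --restart=Never -n loki \
  -- nslookup query-frontend.loki.svc.cluster.local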

8: Deploy the querier

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    name: querier
  name: querier
  namespace: loki
spec:
  updateStrategy:
    type: RollingUpdate
  replicas: 3
  serviceName: querier
  selector:
    matchLabels:
      app: querier
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "3100"
        prometheus.io/scrape: "true"
      labels:
        app: querier
        gossip_ring_member: 'true'
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      securityContext:
        fsGroup: 10001
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: querier
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -config.file=/etc/loki/config/config.yaml
        - -limits.per-user-override-config=/etc/loki/overrides/overrides.yaml
        - -target=querier
        image: grafana/loki:1.6.1
        imagePullPolicy: IfNotPresent
        name: querier
        ports:
        - containerPort: 3100
          name: http-metrics
        - containerPort: 9095
          name: grpc
        - containerPort: 7946
          name: gossip-ring 
        readinessProbe:
          httpGet:
            path: /ready
            port: 3100
          initialDelaySeconds: 15
          timeoutSeconds: 1
        resources:
          requests:
            cpu: "4"
            memory: 2Gi
        volumeMounts:
        - mountPath: /etc/loki/config
          name: loki
        - mountPath: /etc/loki/overrides
          name: overrides
        - mountPath: /data
          name: querier-data
      volumes:
      - configMap:
          name: loki
        name: loki
      - configMap:
          name: overrides
        name: overrides
  volumeClaimTemplates:
  - metadata:
      labels:
        app: querier
      name: querier-data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
      storageClassName: gp2

---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: querier
  name: querier
  namespace: loki
spec:
  ports:
  - name: querier-http-metrics
    port: 3100
    targetPort: 3100
  - name: querier-grpc
    port: 9095
    targetPort: 9095
  selector:
    app: querier
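
With the queriers up, the read path can be sanity-checked against Loki's HTTP API, for example via the labels endpoint (the local port is arbitrary):

kubectl port-forward -n loki svc/querier 3101:3100 &
curl http://localhost:3101/loki/api/v1/labels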

9: Deploy the table-manager

apiVersion: apps/v1
kind: Deployment
metadata:
  name: table-manager
  namespace: loki
  labels:
    app: table-manager
spec:
  minReadySeconds: 10
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: table-manager
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "3100"
        prometheus.io/scrape: "true"
      labels:
        app: table-manager
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      containers:
      - args:
        - -bigtable.backoff-on-ratelimits=true
        - -bigtable.grpc-client-rate-limit=5
        - -bigtable.grpc-client-rate-limit-burst=5
        - -bigtable.table-cache.enabled=true
        - -config.file=/etc/loki/config/config.yaml
        - -limits.per-user-override-config=/etc/loki/overrides/overrides.yaml
        - -target=table-manager
        image: grafana/loki:1.6.1
        imagePullPolicy: IfNotPresent
        name: table-manager
        ports:
        - containerPort: 3100
          name: http-metrics
        - containerPort: 9095
          name: grpc
        readinessProbe:
          httpGet:
            path: /ready
            port: 3100
          initialDelaySeconds: 15
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 200m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - mountPath: /etc/loki/config
          name: loki
      volumes:
      - configMap:
          name: loki
        name: loki

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: table-manager
  name: table-manager
  namespace: loki
spec:
  ports:
  - name: table-manager-grpc
    port: 9095
    targetPort: 9095
  selector:
    app: table-manager
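
After the table-manager's first poll, the weekly index tables (prefix cassandra_table, 168h period, per schema_config) should appear in the lokiindex keyspace. A sketch of the check, assuming cqlsh is available on a pod in the Cassandra cluster:

kubectl exec -it -n cass-operator <cassandra-pod> -- cqlsh -u loki-superuser -p <password> \
  -e "SELECT table_name FROM system_schema.tables WHERE keyspace_name='lokiindex';"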

Once deployment is complete, check that all the pods are running:

kubectl get pods -n loki
NAME                              READY   STATUS    RESTARTS   AGE
distributor-84747955fb-hhtzl      1/1     Running   0          8d
distributor-84747955fb-pq9wn      1/1     Running   0          8d
distributor-84747955fb-w66hp      1/1     Running   0          8d
ingester-0                        1/1     Running   0          8d
ingester-1                        1/1     Running   0          8d
ingester-2                        1/1     Running   0          8d
memcached-0                       2/2     Running   0          3d2h
memcached-1                       2/2     Running   0          3d2h
memcached-2                       2/2     Running   0          3d2h
memcached-frontend-0              2/2     Running   0          3d2h
memcached-frontend-1              2/2     Running   0          3d2h
memcached-frontend-2              2/2     Running   0          3d2h
memcached-index-queries-0         2/2     Running   0          3d2h
memcached-index-queries-1         2/2     Running   0          3d2h
memcached-index-queries-2         2/2     Running   0          3d2h
memcached-index-writes-0          2/2     Running   0          3d3h
memcached-index-writes-1          2/2     Running   0          3d3h
memcached-index-writes-2          2/2     Running   0          3d3h
querier-0                         1/1     Running   0          3d3h
querier-1                         1/1     Running   0          3d3h
querier-2                         1/1     Running   0          3d3h
query-frontend-6c8ffc8667-qj5zq   1/1     Running   0          8d
table-manager-c4fdf6475-zjzqg     1/1     Running   0          8d

With that, the deployment is done. This article does not cover promtail; quite a few agents now support Loki as an output, Fluent Bit among them.
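
Before wiring up an agent, the end-to-end write path can also be exercised by hand against the push API (a sketch; the label and message are arbitrary, and the distributor port-forward from earlier is assumed):

curl -s -X POST http://localhost:3100/loki/api/v1/push \
  -H 'Content-Type: application/json' \
  -d '{"streams":[{"stream":{"app":"smoke-test"},"values":[["'"$(date +%s%N)"'","hello loki"]]}]}'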

Visualization

Grafana supports querying Loki directly through its Explore view.

Queries are built from combinations of labels.
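
For example, entering a selector like the following in Explore (the label values are illustrative) returns the matching streams, optionally narrowed by a text filter:

{namespace="loki", app="distributor"} |= "error"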

Summary

If full-text indexing is not a hard requirement, Loki is a solid choice for a Kubernetes logging system.

