Deploying a production Vault cluster on k8s

In a previous article we deployed a production Consul cluster on k8s. Today I continue by deploying a production Vault cluster on k8s.

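Vault can run in high availability (HA) mode, which protects against outages by running multiple Vault servers. Vault is usually bound by the IO limits of its storage backend rather than by compute requirements. Some storage backends, such as Consul, provide the extra coordination functions that let Vault run in an HA configuration, while others provide a more robust backup and restore process.

When running in HA mode, a Vault server has two additional states: standby and active. Within a Vault cluster only one instance is active and handles all requests (reads and writes); every standby node redirects requests to the active node.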
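
To see which state a node is in, one option is Vault's /v1/sys/health endpoint, whose default HTTP status codes differ per state (200 active, 429 standby, 501 not initialized, 503 sealed). The sketch below only maps a status code to a label; the function name is hypothetical and the curl call is shown as a comment because it needs a running Vault.

```shell
#!/bin/sh
# Map the HTTP status code of Vault's /v1/sys/health endpoint to a node state.
# The codes are Vault's defaults; vault_state_from_health_code is a
# hypothetical helper name used only for this illustration.
vault_state_from_health_code() {
  case "$1" in
    200) echo "active" ;;
    429) echo "standby" ;;
    501) echo "uninitialized" ;;
    503) echo "sealed" ;;
    *)   echo "unknown" ;;
  esac
}

# Against a local node (the address matches VAULT_ADDR used later on):
#   code=$(curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:8200/v1/sys/health)
#   vault_state_from_health_code "$code"
vault_state_from_health_code 429
```

A standby node answering 429 is healthy; it simply is not the node serving requests.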

Deployment

We reuse the Consul cluster deployed in the previous article.

The Vault configuration file server.hcl looks like this:

listener "tcp" {
  address          = "0.0.0.0:8200"
  cluster_address  = "POD_IP:8201"
  tls_disable      = "true"
}

storage "consul" {
  address = "127.0.0.1:8500"
  path    = "vault/"
}

api_addr = "http://POD_IP:8200"
cluster_addr = "https://POD_IP:8201"

Next we create the configmap:

kubectl create configmap vault --from-file=server.hcl

Note the POD_IP placeholder in the configuration file. When the container starts, sed replaces it with the pod's real IP.
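
The substitution can be tried outside the cluster. This is a minimal sketch using the same sed expression as the container command; the helper name render_config and the address 10.0.0.5 are made up for the example.

```shell
#!/bin/sh
# Replace every POD_IP placeholder read from stdin with the given address.
# render_config is a hypothetical helper, not part of Vault or kubectl.
render_config() {
  pod_ip=${1:?pod IP required}
  sed -E "s/POD_IP/${pod_ip}/g"
}

# 10.0.0.5 is an example address standing in for status.podIP.
printf 'api_addr = "http://POD_IP:8200"\ncluster_addr = "https://POD_IP:8201"\n' \
  | render_config 10.0.0.5
```

Rendering into /tmp rather than editing in place matters because a configmap mount is read-only.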

We deploy a two-node vault cluster as a StatefulSet. The consul client agent runs as a sidecar, in the same Pod as vault.

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: vault
  labels:
    app: vault
spec:
  serviceName: vault
  podManagementPolicy: Parallel
  replicas: 2
  updateStrategy:
    type: OnDelete
  selector:
    matchLabels:
      app: vault
  template:
    metadata:
      labels:
        app: vault
    spec:
      affinity:
        # A single podAntiAffinity block with two terms: a duplicated
        # podAntiAffinity key would make one of the rules disappear.
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            # Do not share a node with a consul pod
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - consul
              topologyKey: kubernetes.io/hostname
            # Do not share a node with another vault pod
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - vault
              topologyKey: kubernetes.io/hostname
      containers:
      - name: vault
        command:
          - "/bin/sh"
          - "-ec"
        args:
        - |
            sed -E "s/POD_IP/${POD_IP?}/g" /vault/config/server.hcl > /tmp/server.hcl;
            /usr/local/bin/docker-entrypoint.sh vault server -config=/tmp/server.hcl
        image: "vault:1.4.2"
        imagePullPolicy: IfNotPresent
        securityContext:
          capabilities:
            add:
              - IPC_LOCK
        env:
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
          - name: VAULT_ADDR
            value: "http://127.0.0.1:8200"
          - name: VAULT_API_ADDR
            value: "http://$(POD_IP):8200"
          - name: SKIP_CHOWN
            value: "true"
        volumeMounts:
          - name: vault-config
            mountPath: /vault/config/server.hcl
            subPath: server.hcl
        ports:
        - containerPort: 8200
          name: vault-port
          protocol: TCP
        - containerPort: 8201
          name: cluster-port
          protocol: TCP
        readinessProbe:
          # Check status; unsealed vault servers return 0
          # The exit code reflects the seal status:
          #   0 - unsealed
          #   1 - error
          #   2 - sealed
          exec:
            command: ["/bin/sh", "-ec", "vault status -tls-skip-verify"]
          failureThreshold: 2
          initialDelaySeconds: 5
          periodSeconds: 3
          successThreshold: 1
          timeoutSeconds: 5
        lifecycle:
          # Vault container doesn't receive SIGTERM from Kubernetes
          # and after the grace period ends, Kube sends SIGKILL.  This
          # causes issues with graceful shutdowns such as deregistering itself
          # from Consul (zombie services).
          preStop:
            exec:
              command: [
                "/bin/sh", "-c",
                # Adding a sleep here to give the pod eviction a
                # chance to propagate, so requests will not be made
                # to this pod while it's terminating
                "sleep 5 && kill -SIGTERM $(pidof vault)",
              ]
      - name: consul-client
        image: consul:1.7.4
        env:
          - name: GOSSIP_ENCRYPTION_KEY
            valueFrom:
              secretKeyRef:
                name: consul
                key: gossip-encryption-key
          - name: POD_IP
            valueFrom:
              fieldRef:
                fieldPath: status.podIP 
        args:
          - "agent"
          - "-advertise=$(POD_IP)"
          - "-config-file=/etc/consul/config/client.json"
          - "-encrypt=$(GOSSIP_ENCRYPTION_KEY)"
        volumeMounts:
            - name: consul-config
              mountPath: /etc/consul/config
            - name: consul-tls
              mountPath: /etc/tls
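        lifecycle:
            preStop:
              exec:
                command:
                - /bin/sh
                - -c
                # Assumed completion: the original listing stops after "-c".
                # "consul leave" is the usual graceful-exit command for an agent.
                - consul leave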
      volumes:
        - name: vault-config
          configMap:
            defaultMode: 420
            name: vault
        - name: consul-config
          configMap:
            defaultMode: 420
            name: consul-client
        - name: consul-tls
          secret:
            secretName: consul
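If the pod network of your k8s cluster is flat, meaning pods and hosts in the VPC can reach each other, the configuration above is enough. Otherwise you need to set hostNetwork: true on the pod.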
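
The readiness probe in the StatefulSet relies on the exit code of vault status, as listed in its comments. The sketch below spells that mapping out; the helper name is hypothetical, and the real vault call is left as a comment because it needs a running server.

```shell
#!/bin/sh
# Translate the exit code of `vault status` into the meaning listed in the
# probe comments. describe_status_exit is a hypothetical helper name.
describe_status_exit() {
  case "$1" in
    0) echo "unsealed" ;;
    1) echo "error" ;;
    2) echo "sealed" ;;
    *) echo "unexpected" ;;
  esac
}

# Inside the vault container one could run:
#   vault status >/dev/null 2>&1; describe_status_exit "$?"
describe_status_exit 2
```

Since only exit code 0 counts as success for an exec probe, a sealed vault container is reported as not ready.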

Check the deployment:

kubectl get pods  -l app=vault
NAME                     READY   STATUS    RESTARTS   AGE
vault-0   2/2     Running   0          3m3s
vault-1   2/2     Running   0          3m3s

For completeness, here is the configuration file of the consul client agent:

    {
        "bind_addr": "0.0.0.0",
        "client_addr": "0.0.0.0",
        "ca_file": "/etc/tls/ca.pem",
        "cert_file": "/etc/tls/consul.pem",
        "key_file": "/etc/tls/consul-key.pem",
        "data_dir": "/consul/data",
        "datacenter": "dc1",
        "domain": "cluster.consul",
        "server": false,
        "verify_incoming": true,
        "verify_outgoing": true,
        "verify_server_hostname": true,
        "retry_join": [
            "prod.discovery-01.xx.sg2.consul", 
            "prod.discovery-02.xx.sg2.consul", 
            "prod.discovery-03.xx.sg2.consul"
        ]
    }

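prod.discovery-01.xx.sg2.consul and its two siblings are our private domain names; they resolve to the three consul instances deployed earlier.

Now we need to initialize Vault and unseal each Vault instance.

First exec into one of the vault instances: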

kubectl exec -it vault-0 -c vault -- sh

Run:

vault operator init

Unseal Key 1: 4uyvFnGT8WxM7OXXvFJh0ich8W/4yDh27MBBj
Unseal Key 2: RzbrhGbV4hA+MlxkzwtPRP7aGXA3UaK95+5eb
Unseal Key 3: hBIv4GiVkMvrWMDnxoW7m4MAYZqgX/xvwF1KS
Unseal Key 4: +KyBJREqU+1p4qao1red/i7EX0ASmzWP2Ch79
Unseal Key 5: 8v0Q3ZHvMi7QwsJxmH3ay8h7KrJAE3ESgh+qK

Initial Root Token: s.mbHbP3WOWGEpaCT8zaoVl

Vault initialized with 5 key shares and a key threshold of 3. Please securely
distribute the key shares printed above. When the Vault is re-sealed,
restarted, or stopped, you must supply at least 3 of these keys to unseal it
before it can start servicing requests.

Vault does not store the generated master key. Without at least 3 key to
reconstruct the master key, Vault will remain permanently sealed!

It is possible to generate new unseal keys, provided you have a quorum of
existing unseal keys shares. See "vault operator rekey" for more information.

Next, run unseal three times, each time with a different Unseal Key from the output above:

vault operator unseal <unseal_key_1>

Key                Value
---                -----
Seal Type          shamir
Initialized        true
Sealed             true
Total Shares       5
Threshold          3
Unseal Progress    1/3
Unseal Nonce       3b5933b9-4120-5dcb-40df-afc8ab9e6563
Version            1.4.2
HA Enabled         true


vault operator unseal <unseal_key_2>

Key                Value
---                -----
Seal Type          shamir
Initialized        true
Sealed             true
Total Shares       5
Threshold          3
Unseal Progress    2/3
Unseal Nonce       3b5933b9-4120-5dcb-40df-afc8ab9e6563
Version            1.4.2
HA Enabled         true


vault operator unseal <unseal_key_3>

Key                    Value
---                    -----
Seal Type              shamir
Initialized            true
Sealed                 false
Total Shares           5
Threshold              3
Version                1.4.2
Cluster Name           vault-cluster-b9554129
Cluster ID             e6cedfdd-07d2-520a-9a7c-c4e857803c7e
HA Enabled             true
HA Cluster             n/a
HA Mode                standby
Active Node Address    <none>

Now check the status:

vault status
Key             Value
---             -----
Seal Type       shamir
Initialized     true
Sealed          false
Total Shares    5
Threshold       3
Version         1.4.2
Cluster Name    vault-cluster-b9554129
Cluster ID      e6cedfdd-07d2-520a-9a7c-c4e857803c7e
HA Enabled      true
HA Cluster      https://10.xx.xx.229:8201
HA Mode         active

Next, move on to the other instance and unseal it three times with the same keys.
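
The full set of unseal commands can be listed with a short loop. This sketch only prints the commands and does not run them. It assumes the StatefulSet pod names vault-0, vault-1 and so on, and it leaves the key off the command line, so vault operator unseal prompts for it and the key stays out of the shell history. The function name is hypothetical.

```shell
#!/bin/sh
# Print (not run) the kubectl commands needed to unseal every pod.
# $1 = number of replicas, $2 = key threshold.
# unseal_commands is a hypothetical helper name.
unseal_commands() {
  replicas=$1
  threshold=$2
  i=0
  while [ "$i" -lt "$replicas" ]; do
    k=1
    while [ "$k" -le "$threshold" ]; do
      # Without a key argument, vault prompts for the key interactively.
      echo "kubectl exec -it vault-$i -c vault -- vault operator unseal"
      k=$((k + 1))
    done
    i=$((i + 1))
  done
}

# Two pods, threshold of 3, as in this article.
unseal_commands 2 3
```

Every instance has to be unsealed again after each restart, so this list is worth keeping in a runbook.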

Then check its status:

vault status
Key                    Value
---                    -----
Seal Type              shamir
Initialized            true
Sealed                 false
Total Shares           5
Threshold              3
Version                1.4.2
Cluster Name           vault-cluster-b9554129
Cluster ID             e6cedfdd-07d2-520a-9a7c-c4e857803c7e
HA Enabled             true
HA Cluster             https://10.xx.3.229:8201
HA Mode                standby
Active Node Address    http://10.xx.3.229:8200

Finally, create the svc:

apiVersion: v1
kind: Service
metadata:
  name: vault
  labels:
    app: vault
spec:
  type: ClusterIP
  ports:
    - port: 8200
      targetPort: 8200
      protocol: TCP
      name: vault
  selector:
    app: vault
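
Clients inside the cluster can then reach Vault through the Service name. The sketch below builds that address; it assumes the default cluster domain cluster.local and takes the namespace as a parameter, since the article does not name one. The scheme is http because tls_disable is set in server.hcl.

```shell
#!/bin/sh
# Build the in-cluster URL of the vault Service.
# Assumes the default cluster domain cluster.local.
# vault_service_addr is a hypothetical helper name.
vault_service_addr() {
  namespace=${1:-default}
  echo "http://vault.${namespace}.svc.cluster.local:8200"
}

# A client pod could then run: export VAULT_ADDR="$(vault_service_addr default)"
vault_service_addr default
```

The Service selects every vault pod, so a request may land on a standby node, which redirects it to the active node's api_addr.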

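Summary

  • Highly available deployments need anti-affinity settings. Here we set anti-affinity between the vault pods, and between vault pods and consul pods.
  • Because the process running as PID 1 is sh, we have to implement graceful shutdown ourselves with a preStop hook.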
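
The PID 1 point can be illustrated with plain sh. One common alternative to a preStop hook, not used in the deployment above, is to start the final command with exec, so the shell is replaced by the server process instead of staying around as its parent. The sketch below only shows that exec keeps the PID; whether it is enough for Vault also depends on how the image's entrypoint script launches the binary, which is an assumption left unchecked here.

```shell
#!/bin/sh
# Show that `exec` replaces the shell instead of forking a child: the PID
# printed before exec and the PID printed by the exec'ed command are equal.
# same_pid_after_exec is a hypothetical helper name.
same_pid_after_exec() {
  out=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
  before=$(printf '%s\n' "$out" | sed -n 1p)
  after=$(printf '%s\n' "$out" | sed -n 2p)
  [ -n "$before" ] && [ "$before" = "$after" ]
}

if same_pid_after_exec; then
  echo "exec kept the same PID"
else
  echo "PID changed"
fi
```

When the server itself is PID 1, the SIGTERM that Kubernetes sends on pod termination reaches it directly.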
