For the finer points of the build process, see my previous article:

https://segmentfault.com/a/1190000045492463

Official deployment guide:
https://gitcode.com/eBackup/open-eBackup/blob/master/doc/quic...

Required environment

k8s 1.23.6
Docker 18.09.0 or later
Helm


Software installation

Helm installation

wget https://get.helm.sh/helm-v3.13.3-linux-arm64.tar.gz
tar -zxvf helm-v3.13.3-linux-arm64.tar.gz linux-arm64/helm
cp linux-arm64/helm /usr/local/bin
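
A quick check that the binary is on the PATH and runs on this architecture:

helm version --short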

Docker installation

Run the following commands to install Docker:

yum install -y docker
cat <<EOF > /etc/docker/daemon.json
{
"registry-mirrors": ["https://hub-mirror.c.163.com", "https://docker.mirrors.ustc.edu.cn", "https://registry.docker-cn.com"],
"insecure-registries": ["hub-mirror.c.163.com", "docker.mirrors.ustc.edu.cn", "registry.docker-cn.com"],
"seccomp-profile": "/etc/docker/profile.json",
"exec-opts": ["native.cgroupdriver=systemd"],
"experimental":true
}
EOF

cat <<EOF > /etc/docker/profile.json
{}
EOF

Note: the registry-mirrors from the official guide may be unreachable for various reasons, so configuring a Docker proxy is the more reliable option.

Configuring a Docker proxy

sudo mkdir -p /etc/systemd/system/docker.service.d
sudo vi /etc/systemd/system/docker.service.d/http-proxy.conf

# http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://127.0.0.1:7890/"
Environment="HTTPS_PROXY=http://127.0.0.1:7890/"
Environment="NO_PROXY=localhost,127.0.0.1"

Reload and restart for the changes to take effect:
sudo systemctl daemon-reload
sudo systemctl restart docker
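
After the restart, confirm the proxy settings and the systemd cgroup driver are in effect:

docker info | grep -i -E 'proxy|cgroup'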

Kubernetes installation

The OS here is the same as in the official guide (openEuler 22.03).

Before installing Kubernetes, disable the firewall and swap.

1. Disable the firewall

1. Check the current firewall status
systemctl status firewalld
2. Disable the firewall service
systemctl disable firewalld
3. Verify the firewall is disabled
systemctl status firewalld

2. Disable swap

1. Turn off the swap partition
swapoff -a

2. Disable swap permanently
Edit /etc/fstab and comment out every line that contains swap:

sudo vi /etc/fstab
# /dev/mapper/openeuler01-swap none   swap    defaults        0 0
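
Verify that swap is really off (both commands should show nothing in use):

free -h | grep -i swap
swapon --show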

3. Add the Kubernetes yum repository

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-aarch64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF

4. Install kubeadm, kubelet, and kubectl (pinned to the required version)

yum install -y kubelet-1.23.6 kubeadm-1.23.6 kubectl-1.23.6

Enable kubelet at boot:
systemctl enable kubelet

5. Initialize Kubernetes

Before initializing, set the node hostname:

hostnamectl set-hostname master

The management plane address below is the master node's IP:

kubeadm init --apiserver-advertise-address=<管理平面IP> --image-repository registry.aliyuncs.com/google_containers --service-cidr=10.1.0.0/16 --pod-network-cidr=10.244.0.0/16 --kubernetes-version=v1.23.6
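
After a successful init, kubeadm prints follow-up steps; set up kubectl access for the current user before continuing:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config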

6. Remove the master taint and label the node

kubectl taint node master node-role.kubernetes.io/master-

kubectl label nodes master role=MASTER --overwrite
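
Confirm the taint is gone and the label is applied:

kubectl describe node master | grep -i taints
kubectl get node master --show-labels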

Network plugin installation

1. Edit kube-flannel.yaml

---
kind: Namespace
apiVersion: v1
metadata:
  name: kube-flannel
  labels:
    k8s-app: flannel
    pod-security.kubernetes.io/enforce: privileged
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: flannel
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: flannel
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: flannel
  name: flannel
  namespace: kube-flannel
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-flannel
  labels:
    tier: node
    k8s-app: flannel
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "EnableNFTables": false,
      "Backend": {
        "Type": "vxlan"
      }
    }
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: kube-flannel-ds
  namespace: kube-flannel
  labels:
    tier: node
    app: flannel
    k8s-app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni-plugin
        image: docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel2
        command:
        - cp
        args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        volumeMounts:
        - name: cni-plugin
          mountPath: /opt/cni/bin
      - name: install-cni
        image: docker.io/flannel/flannel:v0.26.0
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: docker.io/flannel/flannel:v0.26.0
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: EVENT_QUEUE_DEPTH
          value: "5000"
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
        - name: xtables-lock
          mountPath: /run/xtables.lock
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni-plugin
        hostPath:
          path: /opt/cni/bin
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate

2. Install the flannel service

Note: Docker must have a working proxy configured, otherwise the image pulls will fail and the flannel service cannot be installed.

kubectl apply -f kube-flannel.yaml
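
If pulls are slow or blocked, you can pre-pull the two images referenced in the manifest and then watch the pods come up:

docker pull docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel2
docker pull docker.io/flannel/flannel:v0.26.0
kubectl get pods -n kube-flannel -w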

3. Install cni-plugins-linux (the Container Network Interface (CNI) plugin binaries)

Download:

https://github.com/containernetworking/plugins/releases/tag/v0.9.1


4. Extract into /opt/cni/bin

tar -zxvf cni-plugins-linux-arm64-v0.9.1.tgz  -C /opt/cni/bin
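
Check that the plugin binaries landed where the kubelet looks for them:

ls /opt/cni/bin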

Traefik installation

1. Pull the Traefik image:

docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/rancher/mirrored-library-traefik:2.10.7-linuxarm64v8
docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/rancher/mirrored-library-traefik:2.10.7-linuxarm64v8 traefik:v2.11.2

2. Edit traefik_config.yaml:

Important: set <IP_ADDR> to the current host's address. There are two 25081 entry points; the 127.0.0.1:25081 one must not be changed, or you will be unable to log in later.

sudo vi traefik_config.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: traefik-config
  namespace: kube-system
data:
  traefik.yaml: |-
    ping: ""                    ## 启用 Ping
    serversTransport:
      insecureSkipVerify: true  ## Traefik 忽略验证代理服务的 TLS 证书
    api:
      insecure: true            ## 允许 HTTP 方式访问 API
      dashboard: false           ## 启用 Dashboard
      debug: false              ## 启用 Debug 调试模式
    metrics:
      prometheus: ""            ## 配置 Prometheus 监控指标数据,并使用默认配置
    entryPoints:
      gui:
        address: "<IP_ADDR>:25080"
        http:
          redirections:
            entryPoint:
              to: gui
              scheme: https
        forwardedHeaders:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
        proxyProtocol:
              insecure: true
      third:
        address: "<IP_ADDR>:25081"
        forwardedHeaders:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
        proxyProtocol:
              insecure: true
      third_local:
        address: "127.0.0.1:25081"
        forwardedHeaders:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
        proxyProtocol:
              insecure: true
      agent:
        address: "<IP_ADDR>:25082"
        forwardedHeaders:
          insecure: true
        proxyProtocol:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
      traefik:
        address: "127.0.0.1:25083"
        forwardedHeaders:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
      internal:
        address: "<IP_ADDR>:25084"
        forwardedHeaders:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
      replica:
        address: "<IP_ADDR>:25085"
        forwardedHeaders:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
    providers:
      kubernetesCRD: ""         ## enable routing rules via Kubernetes CRDs
      kubernetesIngress: ""     ## enable routing rules via Kubernetes Ingress
    log:
      filePath: "/var/log/traefik/msg_traefik.log"              ## log file path; empty means stdout
      level: INFO               ## log level
      format: json              ## log format
    accessLog:
      filePath: ""              ## access log file path; empty means stdout
      format: json              ## access log format
      bufferingSize: 0          ## number of access log lines to buffer
      filters:
        #statusCodes: ["200"]   ## keep only access logs with the given status codes
        retryAttempts: true     ## keep access logs when a proxy retry fails
        minDuration: 20         ## keep access logs for requests longer than this duration
      fields:                   ## which access log fields to keep (keep / drop)
        defaultMode: keep       ## keep all fields by default
        names:                  ## per-field overrides
          ClientUsername: drop
        headers:                ## header field handling
          defaultMode: keep     ## keep all headers by default
          names:                ## per-header overrides
            User-Agent: redact
            Authorization: drop
            Content-Type: keep
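
Rather than typing the host address at each <IP_ADDR> by hand, you can paste the template as-is and substitute it in one pass (192.168.1.10 below is only an example address; the literal 127.0.0.1 entries are untouched because only the <IP_ADDR> token is replaced):

sed -i 's/<IP_ADDR>/192.168.1.10/g' traefik_config.yaml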

3. Edit traefik_rbac.yaml

sudo vi traefik_rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: traefik-ingress-controller

rules:
  - apiGroups:
      - ""
    resources:
      - services
      - endpoints
      - secrets
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
      - networking.k8s.io
    resources:
      - ingresses
      - ingressclasses
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - extensions
      - networking.k8s.io
    resources:
      - ingresses/status
    verbs:
      - update
  - apiGroups:
      - traefik.io
      - traefik.containo.us
    resources:
      - middlewares
      - middlewaretcps
      - ingressroutes
      - traefikservices
      - ingressroutetcps
      - ingressrouteudps
      - tlsoptions
      - tlsstores
      - serverstransports
    verbs:
      - get
      - list
      - watch

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: traefik-ingress-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: traefik-ingress-controller
subjects:
  - kind: ServiceAccount
    name: traefik-ingress-controller
    namespace: kube-system

---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: kube-system
  name: traefik-ingress-controller

4. Edit traefik_ingress_controller.yaml

sudo vi traefik_ingress_controller.yaml
apiVersion: v1
kind: Service
metadata:
  name: traefik
  namespace: kube-system
spec:
  ports:
    - name: internal
      port: 80
      targetPort: 25084
    - name: agent
      port: 90
      targetPort: 25082
  selector:
    app: traefik
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: traefik-ingress-controller
  namespace: kube-system
  labels:
    app: traefik
spec:
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      name: traefik
      labels:
        app: traefik
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: Exists
      serviceAccountName: traefik-ingress-controller
      terminationGracePeriodSeconds: 1
      hostNetwork: true
      containers:
        - image: traefik:v2.11.2
          name: traefik-ingress-lb
          securityContext:
            runAsNonRoot: false
          ports:
            - name: gui
              containerPort: 25080  # entryPoint port, must match the ConfigMap
              hostPort: 25080       # port reserved on the host
            - name: third
              containerPort: 25081  # entryPoint port
              hostPort: 25081       # port reserved on the host
            - name: agent
              containerPort: 25082  # entryPoint port
              hostPort: 25082       # port reserved on the host
            - name: traefik
              containerPort: 25083
              hostPort: 25083
            - name: internal
              containerPort: 25084
              hostPort: 25084
          command: ["/usr/local/bin/traefik"]
          args:
            - --configfile=/config/traefik.yaml
          volumeMounts:
            - mountPath: "/config"
              name: "config"
      volumes:
        - name: config
          configMap:
            name: traefik-config
      tolerations:              ## tolerate the master taint so the pod can still run if the node is tainted
        - key: node-role.kubernetes.io/master
          operator: Exists

5. Download the Traefik Helm chart and apply its CRDs

mkdir -p /opt/k8s/run/conf/traefik
cd /opt/k8s/run/conf/traefik

wget https://github.com/traefik/traefik-helm-chart/archive/refs/tags/v27.0.2.tar.gz

tar -xvf v27.0.2.tar.gz
cd traefik-helm-chart-27.0.2/traefik
kubectl apply -f crds

6. Deploy the Traefik resources

kubectl apply -f traefik_config.yaml
kubectl apply -f traefik_rbac.yaml
kubectl apply -f traefik_ingress_controller.yaml
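
Because the DaemonSet runs with hostNetwork, the entry points should now be listening on the node itself; a quick check:

kubectl get pods -n kube-system -l app=traefik
ss -ltnp | grep 2508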

Check that everything installed successfully

kubectl get pods -A

If all the pods are Running, the Kubernetes groundwork is complete.


Starting the deployment

1. Create the dpa namespace

kubectl create namespace dpa

2. Enter the installation directory

mkdir -p /open-eBackup

cd /open-eBackup

3. Upload open-ebackup-1.0.zip to /open-eBackup on the target host, then extract it

Building open-ebackup-1.0.zip is covered by the official docs together with my previous article:

https://segmentfault.com/a/1190000045492463
unzip open-ebackup-1.0.zip

4. Install MasterServer

tar -zxvf open-eBackup_1.6.RC2_MasterServer.tgz
mkdir open-eBackup_MasterServer_image
tar -zxvf open-eBackup_1.6.RC2_MasterServer.tgz -C open-eBackup_MasterServer_image
docker load -i open-eBackup_MasterServer_image/open-eBackup_1.6.RC2_MasterServer.tar.xz

tar -zxvf open-eBackup_1.6.RC2_MediaServer.tgz
mkdir open-eBackup_MediaServer_image
tar -zxvf open-eBackup_1.6.RC2_MediaServer.tgz -C open-eBackup_MediaServer_image
docker load -i open-eBackup_MediaServer_image/open-eBackup_1.6.RC2_MediaServer.tar.xz

mkdir open-eBackup_MasterServer_chart
tar -zxvf open-eBackup_MasterServer_chart.tgz -C open-eBackup_MasterServer_chart
tar -zxvf open-eBackup_MasterServer_chart/databackup-1.6.0-RC2.tgz -C open-eBackup_MasterServer_chart
helm install master-server open-eBackup_MasterServer_chart/databackup --set global.gaussdbpwd=R2F1c3NkYl8xMjM= --set global.replicas=1 --set global.deploy_type=d10 -n dpa
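
Note that global.gaussdbpwd is base64-encoded; R2F1c3NkYl8xMjM= decodes to Gaussdb_123. To supply your own password, encode it the same way:

echo R2F1c3NkYl8xMjM= | base64 -d     # prints Gaussdb_123
echo -n 'YourPass_123' | base64       # encode a custom password (example value)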

5. Check the installation result

kubectl get pods -n dpa

Any failing media-server pods can be ignored; they stem from a problem encountered while installing the media server. As long as all the other pods are running, everything is fine.

Then browse to your host's IP on port 25080 to reach the management page. If it loads, the deployment succeeded.

The default credentials are sysadmin / Admin@123; you will be asked to change the password on first login.


Problems encountered

After the first successful deployment, if you run the install again without having logged in and changed the default password, the om container of the infrastructure- pod fails to start.

To locate the error, exec into the om container:

 kubectl exec -it infrastructure-0 -n dpa -c om -- /bin/bash

Path of the om failure:

/opt/om/package/src/app/service/update_password/service_update_password.pyc


The cause: when gaussdb restarts, its init script (having found that the user already exists) never writes common-secret, which produces the om failure above.

Look at the script gaussdb executes, located at:

/open-eBackup-master/src/Infrastructure_OM/infrastructure/script/gaussdb/install_gaussdb.sh
# Check whether gaussdbremote has already been created; skip creation if it exists
username=$(gsql postgres -p 6432 -v ON_ERROR_STOP=on -Atc "select usename from pg_user where usename='gaussdbremote';")

# With ON_ERROR_STOP=on, a return code of 0 means the SQL executed successfully
if [ $? -eq 0 ] && [ "$username" != "gaussdbremote" ]; then
    log_info "${LINENO} Start to create gaussdbremote"
    log_info "${LINENO} Username:${username}"
    # Generate a random password containing at least three character classes
    while true; do
      gaussdbremote_password=$(openssl rand -base64 8)
      if [[ $(echo "$gaussdbremote_password" | grep -c '[a-z]') -ge 1 && \
            $(echo "$gaussdbremote_password" | grep -c '[A-Z]') -ge 1 && \
            $(echo "$gaussdbremote_password" | grep -c '[0-9]') -ge 1 ]]; then
              break
      fi
    done
    gsql postgres -p 6432 -v ON_ERROR_STOP=on -c "CREATE USER gaussdbremote WITH SYSADMIN password '${gaussdbremote_password}'"
    check_result "$?" "${LINENO} create gaussdbremote user"
    # Encrypt gaussdbremote_password with KMC
    kmc_password=`python3 -c 'import gaussdb_kmc; print(gaussdb_kmc.encrypt_secret("'${gaussdbremote_password}'"))'`
    if [ "${kmc_password}" != "None" ];then
      log_info "${LINENO} Succeed to decrypt gaussdbremote password"
    else
      log_error "${LINENO} Failed to decrypt gaussdbremote password"
      exit 1
    fi

    PAYLOAD="{\"data\":{\"database.remoteUsername_V5\": \"Z2F1c3NkYnJlbW90ZQ==\", \"database.remotePassword_V5\": \"${kmc_password}\"}}"
    for i in {1..3}
    do
        curl --cacert ${rootCAFile} -o /dev/null -s\
          -X PATCH \
          -H "Content-Type: application/strategic-merge-patch+json" \
          -H "Authorization: Bearer ${tokenFile}" \
          --data "${PAYLOAD}" \
          https://${KUBERNETES_SERVICE_HOST}/api/v1/namespaces/dpa/secrets/common-secret
        log_info "${LINENO} Start to check gaussdbremote secret"
        secrets=$(curl --cacert ${rootCAFile} -X GET -H "Authorization: Bearer $tokenFile" https://${KUBERNETES_SERVICE_HOST}/api/v1/namespaces/dpa/secrets/common-secret)
        is_exist=$(echo "${secrets} " | python3 -c "import sys, json;print(json.load(sys.stdin)['data'].get('database.remotePassword_V5'))")
        if [ ! -z "${is_exist}" ] && [ "${is_exist}" != "None" ];then
            log_info "${LINENO} Succeed to add gaussdbremote secret"
            break
        fi
        log_error "${LINENO} Failed to add gaussdbremote secret"
        sleep 5
    done
else
    log_info "${LINENO} Username:${username}"
    log_info "${LINENO} Gaussdbremote already exists or sql execute faild"
fi

The script queries gaussdb to see whether the initialized user exists. On the first run it does not, so the script creates it and then sends a PATCH request to write the password into the common-secret. On later runs the user already exists, so the request is never sent; om then reads an empty value from common-secret and fails:

secrets=$(curl --cacert ${rootCAFile} -X GET -H "Authorization: Bearer $tokenFile" https://${KUBERNETES_SERVICE_HOST}/api/v1/namespaces/dpa/secrets/common-secret)
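
From outside the pod you can run the same check with kubectl (empty output reproduces the failure condition; the key name comes from the script above):

kubectl get secret common-secret -n dpa -o jsonpath='{.data.database\.remotePassword_V5}'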

Solution

From the gaussdb instance the script sets up, you can find the corresponding data directory; deleting that data resolves the problem.
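
An alternative suggested by the script's logic (an untested sketch, not from the official docs; the container name gaussdb is an assumption, verify it first): drop the leftover user so the next restart takes the creation branch and rewrites the secret.

# list the container names in the pod (the database container name below is assumed)
kubectl get pod infrastructure-0 -n dpa -o jsonpath='{.spec.containers[*].name}'
# drop the stale user inside the database container
kubectl exec -it infrastructure-0 -n dpa -c gaussdb -- gsql postgres -p 6432 -c "DROP USER gaussdbremote;"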


Uninstall

helm uninstall master-server -n dpa

Reinstall

tar -zxvf open-eBackup_MasterServer_chart/databackup-1.6.0-RC2.tgz -C open-eBackup_MasterServer_chart
helm install master-server open-eBackup_MasterServer_chart/databackup --set global.gaussdbpwd=R2F1c3NkYl8xMjM= --set global.replicas=1 --set global.deploy_type=d10 -n dpa

Summary

That completes the installation. There are still some subtle pitfalls to watch for: the official docs cover most of the process but leave out the finer details, so troubleshooting can mean digging through the source 😑. Pay close attention to the notes flagged in this article; a single slip can keep the installation from succeeding. Most of this article follows the official guide, with the extra detail filled in.

