For details of the build process, see my previous article:
https://segmentfault.com/a/1190000045492463
Official deployment guide:
https://gitcode.com/eBackup/open-eBackup/blob/master/doc/quic...
Required environment
k8s 1.23.6
docker 18.09.0 or later
helm
Software installation
Installing Helm
wget https://get.helm.sh/helm-v3.13.3-linux-arm64.tar.gz
tar -zxvf helm-v3.13.3-linux-arm64.tar.gz linux-arm64/helm
cp linux-arm64/helm /usr/local/bin
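To confirm Helm is on the PATH, check the version (it should report v3.13.3 here):
helm version --short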
Installing Docker
Install Docker with the following commands.
yum install -y docker
cat <<EOF > /etc/docker/daemon.json
{
  "registry-mirrors": ["https://hub-mirror.c.163.com", "https://docker.mirrors.ustc.edu.cn", "https://registry.docker-cn.com"],
  "insecure-registries": ["hub-mirror.c.163.com", "docker.mirrors.ustc.edu.cn", "registry.docker-cn.com"],
  "seccomp-profile": "/etc/docker/profile.json",
  "exec-opts": ["native.cgroupdriver=systemd"],
  "experimental": true
}
EOF
cat <<EOF > /etc/docker/profile.json
{}
EOF
Note: the registry-mirrors from the official guide may be unreachable for well-known network reasons, so configuring a proxy for Docker is recommended.
Configuring a Docker proxy
sudo mkdir -p /etc/systemd/system/docker.service.d
sudo vi /etc/systemd/system/docker.service.d/http-proxy.conf
# http-proxy.conf
[Service]
Environment="HTTP_PROXY=http://127.0.0.1:7890/"
Environment="HTTPS_PROXY=http://127.0.0.1:7890/"
Environment="NO_PROXY=localhost,127.0.0.1"
Restart Docker for the proxy to take effect:
sudo systemctl daemon-reload
sudo systemctl restart docker
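To verify that dockerd actually picked up the proxy, you can inspect the systemd environment and the daemon info (the grep pattern is only illustrative):
systemctl show --property=Environment docker
docker info | grep -i proxy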
Installing Kubernetes
The OS here matches the official guide (openEuler 22.03).
Before installing Kubernetes, disable the firewall and swap.
1. Disable the firewall
1. Check the current firewall status
systemctl status firewalld
2. Stop and disable the firewall service
systemctl stop firewalld
systemctl disable firewalld
3. Verify the firewall is disabled
systemctl status firewalld
2. Disable swap
1. Turn swap off for the current boot
swapoff -a
2. Permanently disable swap
Edit /etc/fstab and comment out every line containing swap:
sudo vi /etc/fstab
# /dev/mapper/openeuler01-swap none swap defaults 0 0
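After both steps, confirm swap is fully off; both commands should report no swap devices:
swapon --show
free -h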
3. Add the Kubernetes yum repository
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-aarch64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
4. Install kubeadm, kubelet, and kubectl (pinned to 1.23.6)
yum install -y kubelet-1.23.6 kubeadm-1.23.6 kubectl-1.23.6
Enable kubelet to start on boot:
systemctl enable kubelet
5. Initialize Kubernetes
Before initializing, set the node's hostname:
hostnamectl set-hostname master
Use the master's IP address as the management-plane address:
kubeadm init --apiserver-advertise-address=<management-plane IP> --image-repository registry.aliyuncs.com/google_containers --service-cidr=10.1.0.0/16 --pod-network-cidr=10.244.0.0/16 --kubernetes-version=v1.23.6
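On success, kubeadm prints a join command plus instructions for pointing kubectl at the new cluster; the standard post-init step (as kubeadm itself suggests) is:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config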
6. Remove the master taint and label the node
kubectl taint node master node-role.kubernetes.io/master-
kubectl label nodes master role=MASTER --overwrite
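You can confirm the taint is gone and the label is applied:
kubectl describe node master | grep Taints
kubectl get node master --show-labels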
Installing the network plugin
1. Create kube-flannel.yaml:
---
kind: Namespace
apiVersion: v1
metadata:
  name: kube-flannel
  labels:
    k8s-app: flannel
    pod-security.kubernetes.io/enforce: privileged
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: flannel
  name: flannel
rules:
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - nodes
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - nodes/status
  verbs:
  - patch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  labels:
    k8s-app: flannel
  name: flannel
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: flannel
subjects:
- kind: ServiceAccount
  name: flannel
  namespace: kube-flannel
---
apiVersion: v1
kind: ServiceAccount
metadata:
  labels:
    k8s-app: flannel
  name: flannel
  namespace: kube-flannel
---
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
  namespace: kube-flannel
  labels:
    tier: node
    k8s-app: flannel
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "cniVersion": "0.3.1",
      "plugins": [
        {
          "type": "flannel",
          "delegate": {
            "hairpinMode": true,
            "isDefaultGateway": true
          }
        },
        {
          "type": "portmap",
          "capabilities": {
            "portMappings": true
          }
        }
      ]
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "EnableNFTables": false,
      "Backend": {
        "Type": "vxlan"
      }
    }
---
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: kube-flannel-ds
  namespace: kube-flannel
  labels:
    tier: node
    app: flannel
    k8s-app: flannel
spec:
  selector:
    matchLabels:
      app: flannel
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                - linux
      hostNetwork: true
      priorityClassName: system-node-critical
      tolerations:
      - operator: Exists
        effect: NoSchedule
      serviceAccountName: flannel
      initContainers:
      - name: install-cni-plugin
        image: docker.io/flannel/flannel-cni-plugin:v1.5.1-flannel2
        command:
        - cp
        args:
        - -f
        - /flannel
        - /opt/cni/bin/flannel
        volumeMounts:
        - name: cni-plugin
          mountPath: /opt/cni/bin
      - name: install-cni
        image: docker.io/flannel/flannel:v0.26.0
        command:
        - cp
        args:
        - -f
        - /etc/kube-flannel/cni-conf.json
        - /etc/cni/net.d/10-flannel.conflist
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      containers:
      - name: kube-flannel
        image: docker.io/flannel/flannel:v0.26.0
        command:
        - /opt/bin/flanneld
        args:
        - --ip-masq
        - --kube-subnet-mgr
        resources:
          requests:
            cpu: "100m"
            memory: "50Mi"
        securityContext:
          privileged: false
          capabilities:
            add: ["NET_ADMIN", "NET_RAW"]
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: EVENT_QUEUE_DEPTH
          value: "5000"
        volumeMounts:
        - name: run
          mountPath: /run/flannel
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
        - name: xtables-lock
          mountPath: /run/xtables.lock
      volumes:
      - name: run
        hostPath:
          path: /run/flannel
      - name: cni-plugin
        hostPath:
          path: /opt/cni/bin
      - name: cni
        hostPath:
          path: /etc/cni/net.d
      - name: flannel-cfg
        configMap:
          name: kube-flannel-cfg
      - name: xtables-lock
        hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
2. Install the flannel service
Note: Docker must have the proxy configured, otherwise the image pull will fail and flannel will not come up.
kubectl apply -f kube-flannel.yaml
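Flannel runs as a DaemonSet in the kube-flannel namespace; once the images pull, the pod should reach Running and the node should report Ready:
kubectl get pods -n kube-flannel
kubectl get nodes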
3. Install cni-plugins-linux (the CNI plugin binaries)
Download from:
https://github.com/containernetworking/plugins/releases/tag/v0.9.1
4. Extract it into /opt/cni/bin:
tar -zxvf cni-plugins-linux-arm64-v0.9.1.tgz -C /opt/cni/bin
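A quick sanity check that the binaries landed where kubelet looks for them:
ls /opt/cni/bin
# expect bridge, host-local, loopback, portmap, flannel, ...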
Installing Traefik
1. Pull the Traefik image and retag it:
docker pull swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/rancher/mirrored-library-traefik:2.10.7-linuxarm64v8
docker tag swr.cn-north-4.myhuaweicloud.com/ddn-k8s/docker.io/rancher/mirrored-library-traefik:2.10.7-linuxarm64v8 traefik:v2.11.2
2. Create traefik_config.yaml:
Important: set <IP_ADDR> to the current host's address (a sed one-liner for this follows the config). There are two entry points on port 25081; keep both port numbers as-is, and in particular do not change 127.0.0.1:25081, or you will be unable to log in later.
sudo vi traefik_config.yaml
kind: ConfigMap
apiVersion: v1
metadata:
  name: traefik-config
  namespace: kube-system
data:
  traefik.yaml: |-
    ping: ""  ## enable ping
    serversTransport:
      insecureSkipVerify: true  ## skip TLS certificate verification for proxied services
    api:
      insecure: true  ## allow HTTP access to the API
      dashboard: false  ## whether to enable the dashboard
      debug: false  ## whether to enable debug mode
    metrics:
      prometheus: ""  ## expose Prometheus metrics with the default configuration
    entryPoints:
      gui:
        address: "<IP_ADDR>:25080"
        http:
          redirections:
            entryPoint:
              to: gui
              scheme: https
        forwardedHeaders:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
        proxyProtocol:
          insecure: true
      third:
        address: "<IP_ADDR>:25081"
        forwardedHeaders:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
        proxyProtocol:
          insecure: true
      third_local:
        address: "127.0.0.1:25081"
        forwardedHeaders:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
        proxyProtocol:
          insecure: true
      agent:
        address: "<IP_ADDR>:25082"
        forwardedHeaders:
          insecure: true
        proxyProtocol:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
      traefik:
        address: "127.0.0.1:25083"
        forwardedHeaders:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
      internal:
        address: "<IP_ADDR>:25084"
        forwardedHeaders:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
      replica:
        address: "<IP_ADDR>:25085"
        forwardedHeaders:
          insecure: true
        transport:
          respondingTimeouts:
            readTimeout: 0
    providers:
      kubernetesCRD: ""  ## enable the Kubernetes CRD provider for routing rules
      kubernetesIngress: ""  ## enable the Kubernetes Ingress provider for routing rules
    log:
      filePath: "/var/log/traefik/msg_traefik.log"  ## log file path; empty means log to the console
      level: INFO  ## log level
      format: json  ## log format
    accessLog:
      filePath: ""  ## access-log file path; empty means log to the console
      format: json  ## access-log format
      bufferingSize: 0  ## number of access-log lines to buffer
      filters:
        #statusCodes: ["200"]  ## keep only access logs whose status code is in this range
        retryAttempts: true  ## keep access logs when a proxied request is retried
        minDuration: 20  ## keep access logs for requests lasting longer than this
      fields:  ## which access-log fields to keep (keep / drop)
        defaultMode: keep  ## keep all fields by default
        names:  ## per-field overrides
          ClientUsername: drop
        headers:  ## whether to keep header fields
          defaultMode: keep  ## keep all headers by default
          names:  ## per-header overrides
            User-Agent: redact
            Authorization: drop
            Content-Type: keep
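Because <IP_ADDR> appears in several entry points, it is easy to miss one when editing by hand. One option is a single-pass substitution; 192.168.1.100 below is just a placeholder for your real host IP:
sed -i 's/<IP_ADDR>/192.168.1.100/g' traefik_config.yaml
grep 25081 traefik_config.yaml   # the 127.0.0.1:25081 line must remain untouched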
3. Create traefik_rbac.yaml:
sudo vi traefik_rbac.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: traefik-ingress-controller
rules:
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  - secrets
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - extensions
  - networking.k8s.io
  resources:
  - ingresses
  - ingressclasses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - extensions
  - networking.k8s.io
  resources:
  - ingresses/status
  verbs:
  - update
- apiGroups:
  - traefik.io
  - traefik.containo.us
  resources:
  - middlewares
  - middlewaretcps
  - ingressroutes
  - traefikservices
  - ingressroutetcps
  - ingressrouteudps
  - tlsoptions
  - tlsstores
  - serverstransports
  verbs:
  - get
  - list
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: traefik-ingress-controller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: traefik-ingress-controller
subjects:
- kind: ServiceAccount
  name: traefik-ingress-controller
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: kube-system
  name: traefik-ingress-controller
4. Create traefik_ingress_controller.yaml:
sudo vi traefik_ingress_controller.yaml
apiVersion: v1
kind: Service
metadata:
  name: traefik
  namespace: kube-system
spec:
  ports:
  - name: internal
    port: 80
    targetPort: 25084
  - name: agent
    port: 90
    targetPort: 25082
  selector:
    app: traefik
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: traefik-ingress-controller
  namespace: kube-system
  labels:
    app: traefik
spec:
  selector:
    matchLabels:
      app: traefik
  template:
    metadata:
      name: traefik
      labels:
        app: traefik
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node-role.kubernetes.io/master
                operator: Exists
      serviceAccountName: traefik-ingress-controller
      terminationGracePeriodSeconds: 1
      hostNetwork: true
      containers:
      - image: traefik:v2.11.2
        name: traefik-ingress-lb
        securityContext:
          runAsNonRoot: false
        ports:
        - name: gui
          containerPort: 25080  # entryPoint port; must match the ConfigMap
          hostPort: 25080       # port reserved on the host
        - name: third
          containerPort: 25081  # entryPoint port
          hostPort: 25081       # port reserved on the host
        - name: agent
          containerPort: 25082  # entryPoint port
          hostPort: 25082       # port reserved on the host
        - name: traefik
          containerPort: 25083
          hostPort: 25083
        - name: internal
          containerPort: 25084
          hostPort: 25084
        command: ["/usr/local/bin/traefik"]
        args:
        - --configfile=/config/traefik.yaml
        volumeMounts:
        - mountPath: "/config"
          name: "config"
      volumes:
      - name: config
        configMap:
          name: traefik-config
      tolerations:  ## tolerate the master taint so the DaemonSet can run on the master node
      - key: node-role.kubernetes.io/master
        operator: Exists
5. Download the Traefik Helm chart and install its CRDs
mkdir -p /opt/k8s/run/conf/traefik
cd /opt/k8s/run/conf/traefik
wget https://github.com/traefik/traefik-helm-chart/archive/refs/tags/v27.0.2.tar.gz
tar -xvf v27.0.2.tar.gz
cd traefik-helm-chart-27.0.2/traefik
kubectl apply -f crds
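Confirm the CRDs registered before deploying anything that depends on them:
kubectl get crds | grep traefik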
6. Deploy the Traefik resources
kubectl apply -f traefik_config.yaml
kubectl apply -f traefik_rbac.yaml
kubectl apply -f traefik_ingress_controller.yaml
Check that the installation succeeded
kubectl get pods -A
If all of these pods are Running, the core Kubernetes components are in place.
Deploying open-eBackup
1. Create the dpa namespace
kubectl create namespace dpa
2. Create and enter the install directory
mkdir -p /open-eBackup
cd /open-eBackup
3. Upload open-ebackup-1.0.zip to /open-eBackup on the target host, then unzip it
To build open-ebackup-1.0.zip, see the official docs together with my previous article:
https://segmentfault.com/a/1190000045492463
unzip open-ebackup-1.0.zip
4. Install MasterServer
tar -zxvf open-eBackup_1.6.RC2_MasterServer.tgz
mkdir open-eBackup_MasterServer_image
tar -zxvf open-eBackup_1.6.RC2_MasterServer.tgz -C open-eBackup_MasterServer_image
docker load -i open-eBackup_MasterServer_image/open-eBackup_1.6.RC2_MasterServer.tar.xz
tar -zxvf open-eBackup_1.6.RC2_MediaServer.tgz
mkdir open-eBackup_MediaServer_image
tar -zxvf open-eBackup_1.6.RC2_MediaServer.tgz -C open-eBackup_MediaServer_image
docker load -i open-eBackup_MediaServer_image/open-eBackup_1.6.RC2_MediaServer.tar.xz
mkdir open-eBackup_MasterServer_chart
tar -zxvf open-eBackup_MasterServer_chart.tgz -C open-eBackup_MasterServer_chart
tar -zxvf open-eBackup_MasterServer_chart/databackup-1.6.0-RC2.tgz -C open-eBackup_MasterServer_chart
helm install master-server open-eBackup_MasterServer_chart/databackup --set global.gaussdbpwd=R2F1c3NkYl8xMjM= --set global.replicas=1 --set global.deploy_type=d10 -n dpa
5. Check the installation result
kubectl get pods -n dpa
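The pods take a while to settle, so one option is to watch them come up (Ctrl+C to stop):
kubectl get pods -n dpa -w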
The pods outlined in red (in the screenshot) can be ignored; those failures come from installing the media server.
As long as all the other pods reach Running, everything is fine.
Then browse to your host IP on port 25080 to open the management page; if it loads, the deployment succeeded.
The default credentials are sysadmin / Admin@123; you will be prompted to change the password on first login.
Problems encountered
After the first successful deployment, if you rerun helm install without having logged in and changed the default password, the om container in the infrastructure pod fails to start, producing the error shown above.
Where it fails
Exec into the om container:
kubectl exec -it infrastructure-0 -n dpa -c om -- /bin/bash
Path of the failing om module:
/opt/om/package/src/app/service/update_password/service_update_password.pyc
The root cause: when gaussdb restarts, its startup script (after finding that the user already exists) skips writing common-secret, which triggers the failure below.
Inspect the script gaussdb runs.
Script location:
/open-eBackup-master/src/Infrastructure_OM/infrastructure/script/gaussdb/install_gaussdb.sh
# Check whether gaussdbremote has already been created; skip creation if it exists
username=$(gsql postgres -p 6432 -v ON_ERROR_STOP=on -Atc "select usename from pg_user where usename='gaussdbremote';")
# With ON_ERROR_STOP=on, return code 0 means the SQL executed successfully
if [ $? -eq 0 ] && [ "$username" != "gaussdbremote" ]; then
    log_info "${LINENO} Start to create gaussdbremote"
    log_info "${LINENO} Username:${username}"
    # Generate a random password containing at least three character classes
    while true; do
        gaussdbremote_password=$(openssl rand -base64 8)
        if [[ $(echo "$gaussdbremote_password" | grep -c '[a-z]') -ge 1 && \
              $(echo "$gaussdbremote_password" | grep -c '[A-Z]') -ge 1 && \
              $(echo "$gaussdbremote_password" | grep -c '[0-9]') -ge 1 ]]; then
            break
        fi
    done
    gsql postgres -p 6432 -v ON_ERROR_STOP=on -c "CREATE USER gaussdbremote WITH SYSADMIN password '${gaussdbremote_password}'"
    check_result "$?" "${LINENO} create gaussdbremote user"
    # Encrypt gaussdbremote_password with kmc
    kmc_password=`python3 -c 'import gaussdb_kmc; print(gaussdb_kmc.encrypt_secret("'${gaussdbremote_password}'"))'`
    if [ "${kmc_password}" != "None" ];then
        log_info "${LINENO} Succeed to decrypt gaussdbremote password"
    else
        log_error "${LINENO} Failed to decrypt gaussdbremote password"
        exit 1
    fi
    PAYLOAD="{\"data\":{\"database.remoteUsername_V5\": \"Z2F1c3NkYnJlbW90ZQ==\", \"database.remotePassword_V5\": \"${kmc_password}\"}}"
    for i in {1..3}
    do
        curl --cacert ${rootCAFile} -o /dev/null -s \
            -X PATCH \
            -H "Content-Type: application/strategic-merge-patch+json" \
            -H "Authorization: Bearer ${tokenFile}" \
            --data "${PAYLOAD}" \
            https://${KUBERNETES_SERVICE_HOST}/api/v1/namespaces/dpa/secrets/common-secret
        log_info "${LINENO} Start to check gaussdbremote secret"
        secrets=$(curl --cacert ${rootCAFile} -X GET -H "Authorization: Bearer $tokenFile" https://${KUBERNETES_SERVICE_HOST}/api/v1/namespaces/dpa/secrets/common-secret)
        is_exist=$(echo "${secrets} " | python3 -c "import sys, json;print(json.load(sys.stdin)['data'].get('database.remotePassword_V5'))")
        if [ ! -z "${is_exist}" ] && [ "${is_exist}" != "None" ];then
            log_info "${LINENO} Succeed to add gaussdbremote secret"
            break
        fi
        log_error "${LINENO} Failed to add gaussdbremote secret"
        sleep 5
    done
else
    log_info "${LINENO} Username:${username}"
    log_info "${LINENO} Gaussdbremote already exists or sql execute faild"
fi
The script checks gaussdb for the bootstrap user. On the first run the user does not exist, so the script creates it and then sends a request that writes the credentials into common-secret. On later runs the user already exists, so the request is skipped, and om then reads an empty value from common-secret and fails:
secrets=$(curl --cacert ${rootCAFile} -X GET -H "Authorization: Bearer $tokenFile" https://${KUBERNETES_SERVICE_HOST}/api/v1/namespaces/dpa/secrets/common-secret)
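You can run the same check from outside the container with kubectl; if this prints nothing, the secret field is missing and om will fail exactly as described (the backslashes escape the dots in the key name):
kubectl get secret common-secret -n dpa -o jsonpath='{.data.database\.remotePassword_V5}'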
Solution
From the script you can work out where gaussdb stores its data; deleting that data clears the stale state and fixes the problem.
Uninstall:
helm uninstall master-server -n dpa
Reinstall:
tar -zxvf open-eBackup_MasterServer_chart/databackup-1.6.0-RC2.tgz -C open-eBackup_MasterServer_chart
helm install master-server open-eBackup_MasterServer_chart/databackup --set global.gaussdbpwd=R2F1c3NkYl8xMjM= --set global.replicas=1 --set global.deploy_type=d10 -n dpa
Summary
At this point the installation is complete. There are still some subtle pitfalls to watch for: the official docs cover most of the process, but the fine details are missing, which makes troubleshooting painful and sometimes means digging through the source 😑. Pay special attention to the notes flagged in this article; a single misstep can make the install fail. Most of this content follows the official guide, with the extra details filled in.