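I. Preparation
1.1 Environment overview
Note: clusters deployed this way use each node's IP address as its NodeName by default. For background on this behavior, see this issue: https://github.com/easzlab/ku...
This lab uses kubeasz to deploy the Kubernetes environment. kubeasz deploys from binaries and uses ansible-playbook to automate the rollout of a highly available Kubernetes cluster; see the official kubeasz documentation for details. All virtual machines in this lab use the Alibaba Cloud package mirrors. The operating system is a minimal installation with vim, net-tools, ssh, and other common tools preinstalled. All clocks are synchronized against Alibaba Cloud time servers, and the operating system's built-in firewall is disabled.
The software versions used in this lab are:
OS: Ubuntu Server 20.04 LTS 64-bit
Kubernetes: v1.26
Container runtime: containerd v1.6.8
Network: calico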
1.2 Address plan
Role | IP address | Hostname | VIP |
---|---|---|---|
ETCD | 192.168.10.101 | etcd01 | |
ETCD | 192.168.10.102 | etcd02 | |
ETCD | 192.168.10.103 | etcd03 | |
MASTER/ANSIBLE | 192.168.10.104 | master01 | |
MASTER | 192.168.10.105 | master02 | |
MASTER | 192.168.10.106 | master03 | |
NODE | 192.168.10.107 | node01 | |
NODE | 192.168.10.108 | node02 | |
NODE | 192.168.10.109 | node03 | |
HA | 192.168.10.110 | ha01 | 192.168.10.115 |
HA | 192.168.10.111 | ha02 | 192.168.10.115 |
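II. Environment deployment
2.1 Basic environment setup
First, fix the Ubuntu behavior where DNS is set to 127.0.0.53 at boot. This must be done on every node.
Reference: https://blog.csdn.net/qifei71...
The fix is as follows.
Edit /etc/systemd/resolved.conf: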
[Resolve]
DNS=8.8.8.8
Then run the following as root:
sudo systemctl restart systemd-resolved
sudo systemctl enable systemd-resolved
sudo mv /etc/resolv.conf /etc/resolv.conf.bak
sudo ln -s /run/systemd/resolve/resolv.conf /etc/resolv.conf
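Because this change has to be repeated on every node, it may help to script the file edit. The following is a minimal sketch, not part of kubeasz: `set_resolved_dns` is a hypothetical helper that sets `DNS=` idempotently, and the demo runs against a scratch file rather than the real /etc/systemd/resolved.conf. It assumes GNU sed, as shipped with Ubuntu.

```shell
#!/usr/bin/env bash
# Sketch: idempotently set DNS= in a systemd-resolved style config file.
# set_resolved_dns is a hypothetical helper name, not part of any tool.
set_resolved_dns() {
    local conf="$1" dns="$2"
    # Make sure the [Resolve] section header exists.
    grep -q '^\[Resolve\]' "$conf" 2>/dev/null || printf '[Resolve]\n' >> "$conf"
    if grep -Eq '^#?DNS=' "$conf"; then
        # Replace an existing (possibly commented-out) DNS= line.
        sed -i -E "s|^#?DNS=.*|DNS=${dns}|" "$conf"
    else
        # Otherwise add the setting right after the section header.
        sed -i "/^\[Resolve\]/a DNS=${dns}" "$conf"
    fi
}

# Demo against a scratch file that mimics the stock layout.
demo_conf="$(mktemp)"
printf '[Resolve]\n#DNS=\n#FallbackDNS=\n' > "$demo_conf"
set_resolved_dns "$demo_conf" 8.8.8.8
cat "$demo_conf"
```

On a real node the target would be /etc/systemd/resolved.conf, and the restart and symlink commands shown in this section would still need to be run afterwards.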
Set the time zone to Asia/Shanghai on all nodes:
root@master01:~# timedatectl set-timezone Asia/Shanghai
On master01:
Install ansible:
root@k8s-master01:~# apt install ansible
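Configure passwordless SSH login from master01 to every node. Only 192.168.10.102 (etcd02) is shown here; the other nodes are handled the same way.
Generate a key pair: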
root@master01:~# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa
Your public key has been saved in /root/.ssh/id_rsa.pub
The key fingerprint is:
SHA256:2kfD/vlpbkKZtG90oujjd90CLZZQHb4buOFSHx4p+so root@k8s-master01
The key's randomart image is:
+---[RSA 3072]----+
| ... |
| ... |
| . . |
| .. .. o |
| S +o=** |
| o o =X*+=.|
| . . =+o*++.|
| oo++.B.o|
| oE++O+. |
+----[SHA256]-----+
Distribute the public key to each node:
root@master01:~# ssh-copy-id 192.168.10.102
/usr/bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/root/.ssh/id_rsa.pub"
The authenticity of host '192.168.10.102 (192.168.10.102)' can't be established.
ECDSA key fingerprint is SHA256:LHdJ1aX0Rx+tQlCcGKwIk7aJsFjsUm4/Ze7vwhMqsS8.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@192.168.10.102's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh '192.168.10.102'"
and check to make sure that only the key(s) you wanted were added.
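The same step has to be repeated for every address in the plan from section 1.2. A minimal sketch of that loop follows. The range 101 to 111 is taken from the address table; `node_ips` and the `DRY_RUN` switch are hypothetical names added here so the loop can be previewed without touching any host.

```shell
#!/usr/bin/env bash
# Sketch: distribute the public key to every node in the address plan.
# DRY_RUN is a hypothetical switch; it defaults to 1 (print only).
DRY_RUN="${DRY_RUN:-1}"

node_ips() {
    # Prints 192.168.10.101 .. 192.168.10.111, one per line.
    local i
    for i in $(seq 101 111); do
        echo "192.168.10.${i}"
    done
}

for ip in $(node_ips); do
    if [ "$DRY_RUN" = "1" ]; then
        echo "ssh-copy-id ${ip}"
    else
        # Prompts for the root password of each node, as in the transcript.
        ssh-copy-id "${ip}"
    fi
done
```

Running it with `DRY_RUN=0` would execute the real `ssh-copy-id` calls, one password prompt per node.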
Configure hostname resolution through /etc/hosts, then sync the hosts file to all nodes.
root@master01:~# cat /etc/hosts
127.0.0.1 localhost
192.168.10.101 etcd01
192.168.10.102 etcd02
192.168.10.103 etcd03
192.168.10.104 master01
192.168.10.105 master02
192.168.10.106 master03
192.168.10.107 node01
192.168.10.108 node02
192.168.10.109 node03
192.168.10.110 ha1
192.168.10.111 ha2
Sync the hosts file to all nodes (only a few nodes are shown here):
root@master01:~# scp /etc/hosts 192.168.10.101:/etc/hosts
hosts 100% 512 349.0KB/s 00:00
root@master01:~# scp /etc/hosts 192.168.10.102:/etc/hosts
hosts 100% 512 346.3KB/s 00:00
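To cover the remaining nodes, the copy can be wrapped in a loop that skips the machine where the file was edited. This is only a sketch: `sync_targets` and `LOCAL_IP` are hypothetical names, the address range comes from the table in section 1.2, and the loop prints the `scp` commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: list every node except the local one and print the scp commands.
LOCAL_IP="192.168.10.104"   # master01, where /etc/hosts was edited

sync_targets() {
    local i ip
    for i in $(seq 101 111); do
        ip="192.168.10.${i}"
        # Skip the local machine; it already has the file.
        [ "$ip" = "$LOCAL_IP" ] && continue
        echo "$ip"
    done
}

for ip in $(sync_targets); do
    # Remove the echo to perform the copy for real.
    echo "scp /etc/hosts ${ip}:/etc/hosts"
done
```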
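2.2 Deploying the environment
On master01, download the project source, binaries, and offline images. Start by downloading the ezdown tool script; this example uses kubeasz 3.5.0.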
root@master01:~# export release=3.5.0
root@master01:~# wget https://github.com/easzlab/kubeasz/releases/download/${release}/ezdown
root@master01:~# chmod +x ezdown
Download the kubeasz code, binaries, and default container images:
root@master01:~# ./ezdown -D
When the download completes, the following message is printed:
INFO Action successed: download_all
Generate the ansible hosts file and the related configuration files:
root@master01:/etc/kubeasz# ./ezctl new k8s01
2022-12-31 22:46:36 DEBUG generate custom cluster files in /etc/kubeasz/clusters/k8s01
2022-12-31 22:46:36 DEBUG set versions
2022-12-31 22:46:36 DEBUG cluster k8s01: files successfully created.
2022-12-31 22:46:36 INFO next steps 1: to config '/etc/kubeasz/clusters/k8s01/hosts'
2022-12-31 22:46:36 INFO next steps 2: to config '/etc/kubeasz/clusters/k8s01/config.yml'
Edit the hosts file:
root@k8s-master01:/etc/kubeasz/clusters/k8s01# cat hosts
# 'etcd' cluster should have odd member(s) (1,3,5,...)
[etcd]
192.168.10.101
192.168.10.102
192.168.10.103
# master node(s)
[kube_master]
192.168.10.104
192.168.10.105
192.168.10.106
# work node(s)
[kube_node]
192.168.10.107
192.168.10.108
192.168.10.109
# [optional] harbor server, a private docker registry
# 'NEW_INSTALL': 'true' to install a harbor server; 'false' to integrate with existed one
[harbor]
#192.168.1.8 NEW_INSTALL=false
# [optional] loadbalance for accessing k8s from outside
[ex_lb]
192.168.10.110 LB_ROLE=master EX_APISERVER_VIP=192.168.10.115 EX_APISERVER_PORT=6443
192.168.10.111 LB_ROLE=backup EX_APISERVER_VIP=192.168.10.115 EX_APISERVER_PORT=6443
# [optional] ntp server for the cluster
[chrony]
#192.168.1.1
[all:vars]
# --------- Main Variables ---------------
# Secure port for apiservers
SECURE_PORT="6443"
# Cluster container-runtime supported: docker, containerd
# if k8s version >= 1.24, docker is not supported
CONTAINER_RUNTIME="containerd"
# Network plugins supported: calico, flannel, kube-router, cilium, kube-ovn
CLUSTER_NETWORK="calico"
# Service proxy mode of kube-proxy: 'iptables' or 'ipvs'
PROXY_MODE="ipvs"
# K8S Service CIDR, not overlap with node(host) networking
SERVICE_CIDR="10.68.0.0/16"
# Cluster CIDR (Pod CIDR), not overlap with node(host) networking
CLUSTER_CIDR="172.20.0.0/16"
# NodePort Range
NODE_PORT_RANGE="30000-32767"
# Cluster DNS Domain
CLUSTER_DNS_DOMAIN="cluster.local"
# -------- Additional Variables (don't change the default value right now) ---
# Binaries Directory
bin_dir="/opt/kube/bin"
# Deploy Directory (kubeasz workspace)
base_dir="/etc/kubeasz"
# Directory for a specific cluster
cluster_dir="{{ base_dir }}/clusters/k8s01"
# CA and other components cert/key Directory
ca_dir="/etc/kubernetes/ssl"
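The inventory's own comment says the etcd group should have an odd number of members. A quick check along those lines might look like the sketch below. `count_group` is a hypothetical helper, and the demo uses a small scratch inventory rather than the real /etc/kubeasz/clusters/k8s01/hosts file.

```shell
#!/usr/bin/env bash
# count_group FILE GROUP: number of non-comment host lines in an INI group.
# Hypothetical helper for illustration; not part of kubeasz or ansible.
count_group() {
    awk -v group="[$2]" '
        /^\[/ { in_group = ($1 == group); next }   # section header
        in_group && NF && $1 !~ /^#/ { n++ }       # host entry
        END { print n + 0 }
    ' "$1"
}

# Scratch inventory with the same shape as the hosts file above.
demo_inventory="$(mktemp)"
cat > "$demo_inventory" <<'EOF'
[etcd]
192.168.10.101
192.168.10.102
192.168.10.103

[kube_master]
192.168.10.104
#192.168.10.105
EOF

etcd_count="$(count_group "$demo_inventory" etcd)"
if [ $((etcd_count % 2)) -eq 1 ]; then
    echo "etcd members: ${etcd_count} (odd, ok)"
else
    echo "etcd members: ${etcd_count} (should be odd)"
fi
```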
Edit the config.yml file. The main change is enabling automatic installation of coredns and metrics-server.
root@master01:/etc/kubeasz/clusters/k8s01# cat config.yml
############################
# prepare
############################
# Install system packages offline or online (offline|online)
INSTALL_SOURCE: "online"
# Optional OS security hardening, see github.com/dev-sec/ansible-collection-hardening
OS_HARDEN: false
############################
# role:deploy
############################
# default: ca will expire in 100 years
# default: certs issued by the ca will expire in 50 years
CA_EXPIRY: "876000h"
CERT_EXPIRY: "438000h"
# force to recreate CA and other certs, not suggested to set 'true'
CHANGE_CA: false
# kubeconfig parameters
CLUSTER_NAME: "cluster1"
CONTEXT_NAME: "context-{{ CLUSTER_NAME }}"
# k8s version
K8S_VER: "1.26.0"
############################
# role:etcd
############################
# Using a separate WAL directory avoids disk I/O contention and improves performance
ETCD_DATA_DIR: "/var/lib/etcd"
ETCD_WAL_DIR: ""
############################
# role:runtime [containerd,docker]
############################
# ------------------------------------------- containerd
# [.] Enable container registry mirrors
ENABLE_MIRROR_REGISTRY: true
# [containerd] Base (sandbox) container image
SANDBOX_IMAGE: "easzlab.io.local:5000/easzlab/pause:3.9"
# [containerd] Persistent storage directory for containers
CONTAINERD_STORAGE_DIR: "/var/lib/containerd"
# ------------------------------------------- docker
# [docker] Container storage directory
DOCKER_STORAGE_DIR: "/var/lib/docker"
# [docker] Enable the RESTful API
ENABLE_REMOTE_API: false
# [docker] Trusted HTTP registries
INSECURE_REG: '["http://easzlab.io.local:5000"]'
############################
# role:kube-master
############################
# Certificate settings for the k8s master nodes; multiple IPs and domain names can be added (for example a public IP and domain name)
MASTER_CERT_HOSTS:
- "10.1.1.1"
- "k8s.easzlab.io"
- "www.snow.com"
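# Pod subnet mask length on each node (determines the maximum number of pod IP addresses a node can allocate)
# If flannel uses the --kube-subnet-mgr flag, it reads this setting to assign a pod subnet to each node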
# https://github.com/coreos/flannel/issues/847
NODE_CIDR_LEN: 24
############################
# role:kube-node
############################
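# Kubelet root directory
KUBELET_ROOT_DIR: "/var/lib/kubelet"
# Maximum number of pods per node
MAX_PODS: 110
# Resources reserved for kube components (kubelet, kube-proxy, dockerd, etc.)
# For the values, see templates/kubelet-config.yaml.j2
KUBE_RESERVED_ENABLED: "no"
# Upstream k8s advises against enabling system-reserved casually, unless long-term monitoring has shown you how the system uses resources;
# the reservation should also grow as the system stays up longer. For the values, see templates/kubelet-config.yaml.j2
# The system reservation assumes a 4c/8g VM with a minimal set of system services; on high-performance physical machines it can be increased
# In addition, apiserver and other components briefly use more resources during cluster installation, so reserving at least 1g of memory is recommended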
SYS_RESERVED_ENABLED: "no"
############################
# role:network [flannel,calico,cilium,kube-ovn,kube-router]
############################
# ------------------------------------------- flannel
# [flannel] Set the flannel backend: "host-gw", "vxlan", etc.
FLANNEL_BACKEND: "vxlan"
DIRECT_ROUTING: false
# [flannel]
flannel_ver: "v0.19.2"
# ------------------------------------------- calico
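# [calico] IPIP tunnel mode, one of [Always, CrossSubnet, Never]. Across subnets, use Always or CrossSubnet (on public clouds Always is the simplest choice; otherwise the cloud's own network configuration has to be changed, see each provider's documentation)
# CrossSubnet is a hybrid of tunneling and BGP routing that can improve network performance; within a single subnet, Never is sufficient.
CALICO_IPV4POOL_IPIP: "Always"
# [calico] Host IP used by calico-node, over which BGP peerings are established; it can be set manually or auto-detected
IP_AUTODETECTION_METHOD: "can-reach={{ groups['kube_master'][0] }}"
# [calico] Set the calico networking backend: brid, vxlan, none
CALICO_NETWORKING_BACKEND: "brid"
# [calico] Whether calico uses route reflectors
# Recommended when the cluster has more than 50 nodes
CALICO_RR_ENABLED: false
# CALICO_RR_NODES sets the route reflector nodes; if unset, the cluster master nodes are used
# CALICO_RR_NODES: ["192.168.1.1", "192.168.1.2"]
CALICO_RR_NODES: []
# [calico] Supported calico versions (updated): ["3.19", "3.23"]
calico_ver: "v3.23.5"
# [calico] calico major version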
calico_ver_main: "{{ calico_ver.split('.')[0] }}.{{ calico_ver.split('.')[1] }}"
# ------------------------------------------- cilium
# [cilium] Image version
cilium_ver: "1.12.4"
cilium_connectivity_check: true
cilium_hubble_enabled: false
cilium_hubble_ui_enabled: false
# ------------------------------------------- kube-ovn
# [kube-ovn] Node for the OVN DB and OVN Control Plane; defaults to the first master node
OVN_DB_NODE: "{{ groups['kube_master'][0] }}"
# [kube-ovn] Offline image tarball
kube_ovn_ver: "v1.5.3"
# ------------------------------------------- kube-router
# [kube-router] Public clouds impose restrictions and generally need ipinip always on; in your own environment this can be set to "subnet"
OVERLAY_TYPE: "full"
# [kube-router] NetworkPolicy support switch
FIREWALL_ENABLE: true
# [kube-router] kube-router image version
kube_router_ver: "v0.3.1"
busybox_ver: "1.28.4"
############################
# role:cluster-addon
############################
# coredns automatic installation
dns_install: "yes"
corednsVer: "1.9.3"
ENABLE_LOCAL_DNS_CACHE: true
dnsNodeCacheVer: "1.22.13"
# Local DNS cache address
LOCAL_DNS_CACHE: "169.254.20.10"
# metric server automatic installation
metricsserver_install: "yes"
metricsVer: "v0.5.2"
# dashboard automatic installation
dashboard_install: "no"
dashboardVer: "v2.7.0"
dashboardMetricsScraperVer: "v1.0.8"
# prometheus automatic installation
prom_install: "no"
prom_namespace: "monitor"
prom_chart_ver: "39.11.0"
# nfs-provisioner automatic installation
nfs_provisioner_install: "no"
nfs_provisioner_namespace: "kube-system"
nfs_provisioner_ver: "v4.0.2"
nfs_storage_class: "managed-nfs-storage"
nfs_server: "192.168.1.10"
nfs_path: "/data/nfs"
# network-check automatic installation
network_check_enabled: false
network_check_schedule: "*/5 * * * *"
############################
# role:harbor
############################
# harbor version, full version number
HARBOR_VER: "v2.1.5"
HARBOR_DOMAIN: "harbor.easzlab.io.local"
HARBOR_PATH: /var/data
HARBOR_TLS_PORT: 8443
HARBOR_REGISTRY: "{{ HARBOR_DOMAIN }}:{{ HARBOR_TLS_PORT }}"
# if set 'false', you need to put certs named harbor.pem and harbor-key.pem in directory 'down'
HARBOR_SELF_SIGNED_CERT: true
# install extra component
HARBOR_WITH_NOTARY: false
HARBOR_WITH_TRIVY: false
HARBOR_WITH_CLAIR: false
HARBOR_WITH_CHARTMUSEUM: true
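Since only a couple of switches in this long file matter for this walkthrough, it can be useful to confirm them before running the playbooks. The sketch below is one possible check. `addon_enabled` is a hypothetical helper, and the demo reads a three-line scratch file instead of the real config.yml.

```shell
#!/usr/bin/env bash
# addon_enabled FILE KEY: succeeds when KEY is set to "yes" in a config.yml.
# Hypothetical helper for illustration; it only does a line-based match.
addon_enabled() {
    grep -Eq "^${2}:[[:space:]]*\"?yes\"?[[:space:]]*$" "$1"
}

# Scratch file holding the relevant keys from the config above.
demo_config="$(mktemp)"
cat > "$demo_config" <<'EOF'
dns_install: "yes"
metricsserver_install: "yes"
dashboard_install: "no"
EOF

for key in dns_install metricsserver_install dashboard_install; do
    if addon_enabled "$demo_config" "$key"; then
        echo "${key}: enabled"
    else
        echo "${key}: disabled"
    fi
done
```

Against the real file, the first argument would be /etc/kubeasz/clusters/k8s01/config.yml.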
Deploy the k8s cluster
Initialize the base environment: prepare the CA and the basic system settings.
root@k8s-master01:/etc/kubeasz# ./ezctl setup k8s01 01
When the run finishes, output like the following indicates that the step succeeded:
PLAY RECAP *************************************************************************************************************************************************************************************************
192.168.10.101 : ok=23 changed=6 unreachable=0 failed=0 skipped=97 rescued=0 ignored=0
192.168.10.102 : ok=23 changed=6 unreachable=0 failed=0 skipped=97 rescued=0 ignored=0
192.168.10.103 : ok=23 changed=6 unreachable=0 failed=0 skipped=97 rescued=0 ignored=0
192.168.10.104 : ok=24 changed=6 unreachable=0 failed=0 skipped=96 rescued=0 ignored=0
192.168.10.105 : ok=23 changed=6 unreachable=0 failed=0 skipped=97 rescued=0 ignored=0
192.168.10.106 : ok=23 changed=6 unreachable=0 failed=0 skipped=97 rescued=0 ignored=0
192.168.10.107 : ok=23 changed=6 unreachable=0 failed=0 skipped=97 rescued=0 ignored=0
192.168.10.108 : ok=23 changed=6 unreachable=0 failed=0 skipped=97 rescued=0 ignored=0
192.168.10.109 : ok=23 changed=6 unreachable=0 failed=0 skipped=97 rescued=0 ignored=0
192.168.10.110 : ok=1 changed=0 unreachable=0 failed=0 skipped=78 rescued=0 ignored=0
192.168.10.111 : ok=1 changed=0 unreachable=0 failed=0 skipped=78 rescued=0 ignored=0
localhost : ok=31 changed=21 unreachable=0 failed=0 skipped=13 rescued=0 ignored=0
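Each step below ends with a similar recap, and the fields to look at are `unreachable` and `failed`. A small filter can do the scanning. This is a sketch only: `recap_failures` is a hypothetical helper, and the second and third sample lines are made up to show a failing case.

```shell
#!/usr/bin/env bash
# recap_failures: reads ansible output on stdin and prints every host whose
# PLAY RECAP line reports failed or unreachable tasks.
recap_failures() {
    awk '
        /unreachable=[0-9]+/ && /failed=[0-9]+/ {
            bad = 0
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^(failed|unreachable)=/) {
                    split($i, kv, "=")
                    if (kv[2] + 0 > 0) bad = 1
                }
            }
            if (bad) print $1
        }
    '
}

# Sample input: the first line is healthy, the other two are invented failures.
recap_failures <<'EOF'
192.168.10.101 : ok=23 changed=6 unreachable=0 failed=0 skipped=97 rescued=0 ignored=0
192.168.10.102 : ok=20 changed=6 unreachable=0 failed=1 skipped=97 rescued=0 ignored=0
192.168.10.103 : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
EOF
```

Saving a step's output to a log file and feeding that file to the function would give an empty result when every host is clean.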
Deploy the etcd cluster
root@k8s-master01:/etc/kubeasz# ./ezctl setup k8s01 02
When the run finishes, output like the following indicates that the step succeeded:
PLAY RECAP *************************************************************************************************************************************************************************************************
192.168.10.101 : ok=10 changed=5 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
192.168.10.102 : ok=8 changed=4 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
192.168.10.103 : ok=8 changed=4 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Verify the etcd cluster
Run the following on etcd01. Output like the following shows that the etcd cluster is healthy.
root@etcd01:~# export ETCD_IPS="192.168.10.101 192.168.10.102 192.168.10.103"
root@etcd01:~# cp -a /opt/kube/bin/etcdctl /usr/local/bin/
root@etcd01:~# for ip in ${ETCD_IPS}; do ETCDCTL_API=3 /usr/local/bin/etcdctl --endpoints=https://${ip}:2379 --cacert=/etc/kubernetes/ssl/ca.pem --cert=/etc/kubernetes/ssl/etcd.pem --key=/etc/kubernetes/ssl/etcd-key.pem endpoint health; done
https://192.168.10.101:2379 is healthy: successfully committed proposal: took = 35.684337ms
https://192.168.10.102:2379 is healthy: successfully committed proposal: took = 21.125465ms
https://192.168.10.103:2379 is healthy: successfully committed proposal: took = 28.284479ms
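Instead of looping, etcdctl also accepts a comma-separated endpoint list, which allows a single call to query all members. The sketch below only builds that list. `etcd_endpoints` is a hypothetical helper, and the etcdctl invocation is left as a comment because it needs a live cluster and the certificates from this lab.

```shell
#!/usr/bin/env bash
ETCD_IPS="192.168.10.101 192.168.10.102 192.168.10.103"

# etcd_endpoints: turns a space-separated IP list into the comma-separated
# https://IP:2379 list that etcdctl --endpoints accepts.
etcd_endpoints() {
    local ip out=""
    for ip in $1; do
        out="${out:+${out},}https://${ip}:2379"
    done
    echo "$out"
}

ENDPOINTS="$(etcd_endpoints "$ETCD_IPS")"
echo "$ENDPOINTS"

# On an etcd node, the list could then be used in one call, for example:
# ETCDCTL_API=3 etcdctl --endpoints="$ENDPOINTS" \
#   --cacert=/etc/kubernetes/ssl/ca.pem \
#   --cert=/etc/kubernetes/ssl/etcd.pem \
#   --key=/etc/kubernetes/ssl/etcd-key.pem \
#   endpoint status --write-out=table
```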
Install the container runtime
root@k8s-master01:/etc/kubeasz# ./ezctl setup k8s01 03
When the run finishes, output like the following indicates that the step succeeded:
PLAY RECAP *************************************************************************************************************************************************************************************************
192.168.10.104 : ok=2 changed=1 unreachable=0 failed=0 skipped=28 rescued=0 ignored=0
192.168.10.105 : ok=2 changed=1 unreachable=0 failed=0 skipped=25 rescued=0 ignored=0
192.168.10.106 : ok=2 changed=1 unreachable=0 failed=0 skipped=25 rescued=0 ignored=0
192.168.10.107 : ok=2 changed=1 unreachable=0 failed=0 skipped=25 rescued=0 ignored=0
192.168.10.108 : ok=2 changed=1 unreachable=0 failed=0 skipped=25 rescued=0 ignored=0
192.168.10.109 : ok=2 changed=1 unreachable=0 failed=0 skipped=25 rescued=0 ignored=0
Install the master nodes
root@k8s-master01:/etc/kubeasz# ./ezctl setup k8s01 04
When the run finishes, output like the following indicates that the step succeeded:
PLAY RECAP *************************************************************************************************************************************************************************************************
192.168.10.104 : ok=55 changed=36 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
192.168.10.105 : ok=54 changed=36 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
192.168.10.106 : ok=54 changed=36 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Deploy the worker nodes
root@master01:/etc/kubeasz# ./ezctl setup k8s01 05
When the run finishes, output like the following indicates that the step succeeded:
PLAY RECAP *************************************************************************************************************************************************************************************************
192.168.10.107 : ok=35 changed=21 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
192.168.10.108 : ok=35 changed=21 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
192.168.10.109 : ok=35 changed=21 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Deploy the network service
root@master01:/etc/kubeasz# ./ezctl setup k8s01 06
When the run finishes, output like the following indicates that the step succeeded:
PLAY RECAP *************************************************************************************************************************************************************************************************
192.168.10.104 : ok=13 changed=7 unreachable=0 failed=0 skipped=39 rescued=0 ignored=0
192.168.10.105 : ok=7 changed=3 unreachable=0 failed=0 skipped=16 rescued=0 ignored=0
192.168.10.106 : ok=7 changed=3 unreachable=0 failed=0 skipped=16 rescued=0 ignored=0
192.168.10.107 : ok=7 changed=3 unreachable=0 failed=0 skipped=16 rescued=0 ignored=0
192.168.10.108 : ok=7 changed=3 unreachable=0 failed=0 skipped=16 rescued=0 ignored=0
192.168.10.109 : ok=7 changed=3 unreachable=0 failed=0 skipped=16 rescued=0 ignored=0
Verify the network service. Output like the following indicates that it is working.
root@master01:~# calicoctl node status
Calico process is running.
IPv4 BGP status
+----------------+-------------------+-------+----------+-------------+
| PEER ADDRESS | PEER TYPE | STATE | SINCE | INFO |
+----------------+-------------------+-------+----------+-------------+
| 192.168.10.105 | node-to-node mesh | up | 15:45:31 | Established |
| 192.168.10.106 | node-to-node mesh | up | 15:45:30 | Established |
| 192.168.10.107 | node-to-node mesh | up | 15:45:31 | Established |
| 192.168.10.108 | node-to-node mesh | up | 15:45:31 | Established |
| 192.168.10.109 | node-to-node mesh | up | 15:45:30 | Established |
+----------------+-------------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.
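With more nodes the peer table gets long, so a filter that prints only the peers that are not established can be handy. This is a sketch under the assumption that the table keeps the column layout shown above. `calico_peers_down` is a hypothetical helper, and the second sample row is invented to show a failing peer.

```shell
#!/usr/bin/env bash
# calico_peers_down: reads `calicoctl node status` output on stdin and prints
# the address of every BGP peer whose INFO column is not "Established".
calico_peers_down() {
    awk -F'|' '
        NF >= 6 && $2 ~ /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/ {
            addr = $2; info = $6
            gsub(/ /, "", addr); gsub(/ /, "", info)
            if (info != "Established") print addr
        }
    '
}

# Sample input: the 192.168.10.106 row is made up for illustration.
calico_peers_down <<'EOF'
| PEER ADDRESS   |     PEER TYPE     | STATE |  SINCE   |    INFO     |
| 192.168.10.105 | node-to-node mesh | up    | 15:45:31 | Established |
| 192.168.10.106 | node-to-node mesh | start | 15:45:30 | Connect     |
EOF
```

On a cluster node, piping `calicoctl node status` into the function would print nothing when every peer is established.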
Deploy the load balancer service
root@master01:/etc/kubeasz# ./ezctl setup k8s01 10
When the run finishes, output like the following indicates that the step succeeded:
PLAY RECAP ***************************************************************************************************
192.168.10.110 : ok=17 changed=14 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
192.168.10.111 : ok=16 changed=14 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
Deploy coredns and metrics-server
root@master01:/etc/kubeasz# ./ezctl setup k8s01 07
When the run finishes, output like the following indicates that the step succeeded:
localhost : ok=8 changed=7 unreachable=0 failed=0 skipped=34 rescued=0 ignored=0
III. Cluster verification
3.1 Verifying the completed cluster deployment
root@k8s-master01:~# kubectl get no
NAME STATUS ROLES AGE VERSION
192.168.10.104 Ready,SchedulingDisabled master 51m v1.25.4
192.168.10.105 Ready,SchedulingDisabled master 51m v1.25.4
192.168.10.106 Ready,SchedulingDisabled master 51m v1.25.4
192.168.10.107 Ready node 38m v1.25.4
192.168.10.108 Ready node 38m v1.25.4
192.168.10.109 Ready node 38m v1.25.4
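For a scripted check, the STATUS column can be filtered so that only problem nodes are printed. The sketch below assumes input in the format produced by `kubectl get nodes --no-headers`. `nodes_not_ready` is a hypothetical helper, and the NotReady row in the sample is invented.

```shell
#!/usr/bin/env bash
# nodes_not_ready: reads `kubectl get nodes --no-headers` output on stdin and
# prints the name of each node whose STATUS does not start with "Ready".
# "Ready,SchedulingDisabled" (the masters above) still counts as ready.
nodes_not_ready() {
    awk '$2 !~ /^Ready/ { print $1 }'
}

# Sample input: the 192.168.10.108 row is made up for illustration.
nodes_not_ready <<'EOF'
192.168.10.104 Ready,SchedulingDisabled master 51m v1.25.4
192.168.10.107 Ready node 38m v1.25.4
192.168.10.108 NotReady node 38m v1.25.4
EOF
```

On master01 this would be used as `kubectl get nodes --no-headers | nodes_not_ready`.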
3.2 Verifying metrics-server
root@master01:/etc/kubeasz# kubectl top node
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
192.168.10.104 111m 11% 1776Mi 70%
192.168.10.105 114m 11% 669Mi 98%
192.168.10.106 130m 13% 657Mi 97%
192.168.10.107 61m 6% 482Mi 71%
192.168.10.108 82m 8% 530Mi 78%
192.168.10.109 59m 5% 513Mi 75%
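In the output above, 192.168.10.105 and 192.168.10.106 report 98% and 97% memory usage. A threshold filter makes such nodes easy to spot. This is a sketch, not a monitoring solution: `high_memory_nodes` is a hypothetical helper and the 90% threshold is an arbitrary choice for the demo.

```shell
#!/usr/bin/env bash
# high_memory_nodes THRESHOLD: reads `kubectl top node` output on stdin and
# prints "node percent" for nodes whose MEMORY% exceeds THRESHOLD.
high_memory_nodes() {
    awk -v limit="$1" '
        NR > 1 {                      # skip the header line
            pct = $5
            sub(/%$/, "", pct)        # strip the trailing percent sign
            if (pct + 0 > limit + 0) print $1, pct "%"
        }
    '
}

# Sample input copied from the output above.
high_memory_nodes 90 <<'EOF'
NAME             CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
192.168.10.104   111m         11%    1776Mi          70%
192.168.10.105   114m         11%    669Mi           98%
192.168.10.106   130m         13%    657Mi           97%
192.168.10.107   61m          6%     482Mi           71%
EOF
```

On master01 the equivalent call would be `kubectl top node | high_memory_nodes 90`.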