References

Docker docs
Installing kubeadm | Kubernetes
cri-dockerd GitHub
Flannel GitHub

Preparation

Environment
Hardware: single-GPU 4060 Ti + 4090
OS: Ubuntu 22.04

1. Install Docker

# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

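Image pulls are currently blocked in mainland China, so the Docker daemon needs a proxy. There are two ways to configure one:

  1. Create or edit the daemon's systemd drop-in file /etc/systemd/system/docker.service.d/proxy.conf with the following content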

    [Service]
    Environment="HTTP_PROXY=http://proxy.example.com:8080/"
    Environment="HTTPS_PROXY=http://proxy.example.com:8080/"
    Environment="NO_PROXY=localhost,127.0.0.1,.example.com"

    Reload the systemd configuration and restart the service

    sudo systemctl daemon-reload
    sudo systemctl restart docker
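  2. Set the proxy in the Docker daemon configuration file /etc/docker/daemon.json (Docker Engine 23.0 or later), then restart Docker. The camelCase httpProxy/httpsProxy keys under "proxies.default" belong to the client file ~/.docker/config.json, which only affects containers and builds, not image pulls

    {
      "proxies": {
        "http-proxy": "http://proxy.example.com:8080",
        "https-proxy": "http://proxy.example.com:8080",
        "no-proxy": "localhost,127.0.0.1,.example.com"
      }
    }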
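
To confirm that the daemon actually picked up a proxy after either method, one option is to look for the proxy lines in the output of `docker info`. The sketch below is a minimal check, assuming Docker is installed on the machine; the helper name `docker_proxy_settings` is made up for illustration.

```shell
# Hypothetical helper: prints the proxy-related lines from `docker info`.
# Returns non-zero when the daemon reports no proxy at all.
docker_proxy_settings() {
  docker info 2>/dev/null | grep -i 'proxy'
}

if command -v docker >/dev/null 2>&1; then
  docker_proxy_settings || echo "no proxy reported by the Docker daemon"
else
  echo "docker is not installed; skipping the proxy check"
fi
```

For the systemd method, `sudo systemctl show --property=Environment docker` shows the environment variables the service was started with.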
    

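2. Install cri-dockerd

Download the release package from the cri-dockerd GitHub releases page.

Place the executable in /usr/bin or /usr/local/bin.

Download the default systemd unit files and modify them:
https://github.com/Mirantis/cri-dockerd/tree/master/packaging/systemd

Add the --pod-infra-container-image flag to ExecStart so the pause image is pulled from a mirror reachable in China. The pause image backs the infrastructure container created in every pod; it holds the network and other namespaces that the pod's containers share. If the binary was placed in /usr/local/bin, adjust the path in ExecStart accordingly.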

ExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd:// --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.10
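
The ExecStart edit can also be scripted. The following is a rough sketch, assuming the unit file contains a single `ExecStart=` line that invokes `cri-dockerd` and that GNU sed is available (as on Ubuntu); `add_pause_image` is a hypothetical helper, and the demonstration runs on a scratch file rather than the real unit.

```shell
# Hypothetical helper: appends --pod-infra-container-image to the ExecStart
# line of a cri-docker.service file, unless the flag is already present.
add_pause_image() {
  unit_file="$1"
  image="$2"
  if grep -q -- '--pod-infra-container-image' "$unit_file"; then
    return 0
  fi
  sed -i "s|^ExecStart=.*cri-dockerd.*|& --pod-infra-container-image=${image}|" "$unit_file"
}

# Demonstration on a scratch copy, so nothing under /etc is touched.
demo_unit="$(mktemp)"
printf '[Service]\nExecStart=/usr/bin/cri-dockerd --container-runtime-endpoint fd://\n' > "$demo_unit"
add_pause_image "$demo_unit" "registry.aliyuncs.com/google_containers/pause:3.10"
grep '^ExecStart=' "$demo_unit"
```

On the real host the function would be pointed at the installed cri-docker.service file (as root). After both unit files are in place, `sudo systemctl daemon-reload` followed by `sudo systemctl enable --now cri-docker.socket` starts the service.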

3. Install kubelet, kubectl, and kubeadm

sudo apt-get update
# apt-transport-https may be a dummy package; if so, you can skip installing it
sudo apt-get install -y apt-transport-https ca-certificates curl gpg

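# Replace v1.xx below with the Kubernetes minor version you want to install.
# If the `/etc/apt/keyrings` directory does not exist, create it before running curl (see the commented command below).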
# sudo mkdir -p -m 755 /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.xx/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

# This overwrites any existing configuration in /etc/apt/sources.list.d/kubernetes.list.
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.xx/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list

sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
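
A quick sanity check after installation might look like the sketch below. It only assumes the three tools should be on PATH; `missing_k8s_tools` is a hypothetical helper name.

```shell
# Hypothetical helper: prints the name of each tool that is not on PATH.
missing_k8s_tools() {
  for tool in kubelet kubeadm kubectl; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing="$(missing_k8s_tools)"
if [ -n "$missing" ]; then
  echo "not installed:" $missing
else
  kubeadm version -o short
  kubectl version --client
  kubelet --version
  apt-mark showhold   # should list kubelet, kubeadm and kubectl
fi
```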

(Optional) Configure a proxy for Docker

Edit http-proxy.conf under /etc/systemd/system/docker.service.d or /usr/lib/systemd/system/docker.service.d

[Service]
Environment="HTTP_PROXY=http://your-proxy-server:port"
Environment="HTTPS_PROXY=http://your-proxy-server:port"
Environment="NO_PROXY=localhost,127.0.0.1,.example.com"

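cgroups and swap

If the Linux distribution uses cgroup v2 by default, Kubernetes can run with swap enabled (the kubelet must also be configured to tolerate it, for example with failSwapOn: false); otherwise swap must be turned off.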

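# Disable swap temporarily (until the next reboot)
sudo swapoff -a
# Disable swap permanently by commenting out the swap entries in /etc/fstab
sudo sed -ri 's/.*swap.*/#&/' /etc/fstab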
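
To decide which branch applies, the cgroup version can be read from the filesystem type mounted at /sys/fs/cgroup. This is a minimal sketch using GNU `stat`; the helper name `cgroup_version` is made up here, and the optional path argument exists only to make the function easy to try out.

```shell
# Hypothetical helper: maps the filesystem type of a cgroup mount to v1/v2.
cgroup_version() {
  case "$(stat -fc %T "${1:-/sys/fs/cgroup}" 2>/dev/null)" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

echo "cgroup version: $(cgroup_version)"

# An empty list from swapon means no swap is currently active.
if command -v swapon >/dev/null 2>&1; then
  swapon --show || true
fi
```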

netfilter module

Make sure traffic crossing Linux bridges is processed by iptables/ip6tables rules, which Kubernetes networking depends on.

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF

sudo sysctl --system  # apply the settings
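
The net.bridge.* keys only exist once the br_netfilter kernel module is loaded, so it can be worth checking for them before applying the settings. A minimal sketch follows; `bridge_sysctls_present` is a hypothetical helper, and its optional argument (an alternative /proc/sys root) is there only for experimentation.

```shell
# Hypothetical helper: succeeds when the bridge netfilter sysctl is exposed.
bridge_sysctls_present() {
  [ -e "${1:-/proc/sys}/net/bridge/bridge-nf-call-iptables" ]
}

if bridge_sysctls_present; then
  echo "bridge sysctls present; 'sysctl --system' can apply k8s.conf"
else
  echo "bridge sysctls missing; load the module first: sudo modprobe br_netfilter"
  echo "to load it on every boot: echo br_netfilter | sudo tee /etc/modules-load.d/k8s.conf"
fi
```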

Create the cluster

1. Create the control-plane node

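# Run as root.
# --apiserver-advertise-address: IP address of this node
# --kubernetes-version: Kubernetes version to install
# --service-cidr / --pod-network-cidr: IP ranges for Services and Pods
# --cri-socket: use cri-dockerd as the container runtime
# --ignore-preflight-errors=all: ignore all preflight check errors
# --image-repository: mirror registry in China; not needed if a proxy is configured
kubeadm init \
    --apiserver-advertise-address=xx.xxx.xx.xxx \
    --kubernetes-version v1.32.2 \
    --service-cidr=10.96.0.0/12 \
    --pod-network-cidr=10.244.0.0/16 \
    --cri-socket=unix:///var/run/cri-dockerd.sock \
    --ignore-preflight-errors=all \
    --image-repository=registry.aliyuncs.com/google_containers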

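On success, kubeadm prints the commands for setting up the cluster configuration file (kubeconfig) and for joining worker nodes.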

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a Pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  /docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

  kubeadm join <control-plane-host>:<control-plane-port> --token <token> --discovery-token-ca-cert-hash sha256:<hash>

If initialization fails, reset the existing state before retrying:

kubeadm reset --cri-socket=unix:///var/run/cri-dockerd.sock

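2. Join worker nodes

Run the join command printed after cluster creation on each worker node to add it to the cluster.

If you have lost the command, run the following on the control-plane node to generate a new one: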

kubeadm token create --print-join-command
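
Because Docker's packages also install containerd, kubeadm may find more than one CRI socket on a worker node and ask for one to be named explicitly. A minimal sketch, assuming the same cri-dockerd socket path that was passed to `kubeadm init`, appends `--cri-socket` to the printed join command; `with_cri_socket` is a hypothetical helper name.

```shell
# Socket path taken from the kubeadm init step.
CRI_SOCKET="unix:///var/run/cri-dockerd.sock"

# Hypothetical helper: appends the --cri-socket flag to a join command string.
with_cri_socket() {
  printf '%s --cri-socket=%s\n' "$1" "$CRI_SOCKET"
}

# Placeholder join command in the form kubeadm prints it.
join_cmd='kubeadm join <control-plane-host>:<control-plane-port> --token <token> --discovery-token-ca-cert-hash sha256:<hash>'
with_cri_socket "$join_cmd"
```

On the control-plane node, `with_cri_socket "$(kubeadm token create --print-join-command)"` would print a ready-to-paste command for the workers.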

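3. Configure CNI

CNI (Container Network Interface) plugins provide pod networking; their main job is to let containers on different nodes communicate. Here we use flannel: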

kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml

If the pod network is not the default 10.244.0.0/16, download the manifest and edit it before applying, or install with helm and pass --set podCidr="xx.xx.xx.xx".
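
Editing the downloaded manifest can be done with sed. This is a sketch under the assumption that the manifest's net-conf.json section contains a line of the form `"Network": "10.244.0.0/16"` and that GNU sed is available; `set_flannel_network` is a hypothetical helper, demonstrated on a scratch file.

```shell
# Hypothetical helper: rewrites the "Network" value in a flannel manifest.
set_flannel_network() {
  manifest="$1"
  cidr="$2"
  sed -i "s|\"Network\": \"[^\"]*\"|\"Network\": \"${cidr}\"|" "$manifest"
}

# Demonstration on a scratch file that mimics the net-conf.json fragment.
demo="$(mktemp)"
printf '  net-conf.json: |\n    {\n      "Network": "10.244.0.0/16",\n' > "$demo"
set_flannel_network "$demo" "10.100.0.0/16"
grep '"Network"' "$demo"
```

With the real manifest, the flow would be to fetch it with `curl -fsSLO https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml`, run the function on kube-flannel.yml with the CIDR passed to `--pod-network-cidr`, and then `kubectl apply -f kube-flannel.yml`.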

Configure plugins

1. NVIDIA

2. Prometheus


berrydreams