
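This is a collection of k3s commands covering cluster deployment, edge computing, network tuning, storage management, security hardening, monitoring, and debugging. It provides commands you can adapt directly, along with production-oriented tips.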


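I. Cluster Deployment and Configuration

1. Advanced installation parameters

# Use a custom data directory (useful on edge devices)
sudo k3s server --data-dir /mnt/k3s-data

# Disable the cloud controller (hybrid-cloud scenarios)
sudo k3s server --disable-cloud-controller

# Pin the Kubernetes version (v1.27.4)
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.27.4+k3s1 sh -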
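
The same server flags can also live in a k3s configuration file instead of on the command line, which tends to be easier to keep under version control. The sketch below writes the first two flags above as YAML keys. It assumes the default location /etc/rancher/k3s/config.yaml; the K3S_CONFIG_DIR variable is a hypothetical override introduced here so the snippet can be tried in a scratch directory first.

```shell
# K3S_CONFIG_DIR is a hypothetical override for dry runs; on a real host
# set it to /etc/rancher/k3s (writing there requires root).
K3S_CONFIG_DIR="${K3S_CONFIG_DIR:-$(mktemp -d)}"
mkdir -p "$K3S_CONFIG_DIR"

# Each key is the long flag name without the leading dashes.
cat > "$K3S_CONFIG_DIR/config.yaml" <<'EOF'
data-dir: /mnt/k3s-data
disable-cloud-controller: true
EOF

echo "wrote $K3S_CONFIG_DIR/config.yaml"
```

With the file in place, a plain `sudo k3s server` should pick up both settings; flags given on the command line still take precedence over the file.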

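2. Multi-cluster federation

# Export the kubeconfig on the host cluster
kubectl config view --flatten > main-kubeconfig.yaml

# Join a member cluster to the federation (requires KubeFed's kubefedctl)
kubefedctl join edge-cluster --host-cluster-context=main-cluster \
  --v=2 --cluster-context=edge-cluster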

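3. Offline (air-gapped) installation

# Start the server with an explicit TLS SAN and CIDRs.
# Note: this command does not build an offline package by itself.
sudo k3s server --write-kubeconfig-mode 644 \
  --tls-san my-k3s.local \
  --cluster-cidr 10.42.0.0/16 \
  --service-cidr 10.43.0.0/16

# Archive the k3s data directory for transfer
sudo tar czf k3s-airgap.tar.gz /var/lib/rancher/k3s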
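
For a fully disconnected host, the documented k3s approach is to pre-stage the release binary and the image tarball, then run the install script with INSTALL_K3S_SKIP_DOWNLOAD=true. The helper below is a minimal sketch of the staging step only. stage_airgap and its ROOT prefix argument are hypothetical names introduced for illustration, and the input files are whatever you downloaded from the k3s release page.

```shell
# stage_airgap ROOT BINARY IMAGES_TAR
# Copy a pre-downloaded k3s binary and image tarball into the locations
# k3s looks in. ROOT is a path prefix: "/" on a real host (needs root),
# or a scratch directory for a dry run.
stage_airgap() {
  root="$1"; binary="$2"; images="$3"
  mkdir -p "$root/usr/local/bin" "$root/var/lib/rancher/k3s/agent/images"
  install -m 0755 "$binary" "$root/usr/local/bin/k3s"
  cp "$images" "$root/var/lib/rancher/k3s/agent/images/"
}
```

On a real host, `stage_airgap / ./k3s ./k3s-airgap-images-amd64.tar` would be followed by `INSTALL_K3S_SKIP_DOWNLOAD=true ./install.sh`, with install.sh saved beforehand from https://get.k3s.io.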

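II. Edge Computing

1. Resource limits and automatic recovery

# Reserve resources for the system and set eviction thresholds on the agent
sudo k3s agent \
  --kubelet-arg "eviction-hard=memory.available<100Mi,nodefs.available<10%" \
  --kubelet-arg "system-reserved=cpu=500m,memory=512Mi"

# Mark an edge node for automatic Pod restarts.
# Note: this annotation is not documented by k3s; restarts of crashed
# containers are normally governed by each Pod's restartPolicy.
kubectl annotate node <node-name> \
  k3s.io/autorestart="true"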

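2. Edge device optimization

# Disable non-essential bundled components
sudo k3s server \
  --disable=servicelb,traefik,local-storage

# Use an external container runtime through its CRI socket.
# The socket below belongs to CRI-O, which must be installed separately
# (CRI-O can in turn be configured to use crun as its OCI runtime).
sudo k3s server --container-runtime-endpoint /var/run/crio/crio.sock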

3. OTA update strategy

# Define an automated upgrade Plan (requires system-upgrade-controller)
cat <<EOF | kubectl apply -f -
apiVersion: upgrade.cattle.io/v1
kind: Plan
metadata:
  name: k3s-upgrade
  namespace: system-upgrade
spec:
  concurrency: 1
  nodeSelector:
    matchExpressions:
      - {key: k3s.io/hostname, operator: Exists}
  serviceAccountName: system-upgrade
  cordon: true
  upgrade:
    image: rancher/k3s-upgrade
  version: v1.27.4+k3s1
EOF
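
k3s release versions contain a `+` (for example v1.27.4+k3s1), but container image tags cannot, so the rancher/k3s-upgrade image is published with `-` in its place. A small helper, sketched here under the hypothetical name k3s_image_tag, keeps the two spellings in sync when scripting upgrades.

```shell
# Convert a k3s release version into the matching image tag
# by replacing '+' with '-'.
k3s_image_tag() {
  printf '%s\n' "$1" | tr '+' '-'
}

k3s_image_tag 'v1.27.4+k3s1'   # prints v1.27.4-k3s1
```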

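III. Advanced Networking

1. Multiple network planes

# Dual-stack IPv4/IPv6
sudo k3s server \
  --cluster-cidr 10.42.0.0/16,2001:db8:42:0::/56 \
  --service-cidr 10.43.0.0/16,2001:db8:42:1::/112

# Encrypt pod traffic with WireGuard
# (the legacy "wireguard" backend was removed in v1.26; use wireguard-native)
sudo k3s server --flannel-backend=wireguard-native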
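
Dual-stack mode expects each CIDR flag to carry one IPv4 and one IPv6 range, separated by a comma. Before starting the server it can help to sanity-check the value. The function below is a rough sketch: is_dual_stack is a hypothetical name, and it only looks at the address family of each entry, not at whether the ranges are valid or overlapping.

```shell
# is_dual_stack CIDR_LIST
# Succeeds when the comma-separated list holds at least one IPv4 entry
# (contains a dot) and one IPv6 entry (contains a colon).
# The body runs in a subshell so the IFS change stays local.
is_dual_stack() (
  v4=0; v6=0
  IFS=,
  for cidr in $1; do
    case "$cidr" in
      *:*) v6=1 ;;
      *.*) v4=1 ;;
    esac
  done
  [ "$v4" -eq 1 ] && [ "$v6" -eq 1 ]
)

if is_dual_stack '10.42.0.0/16,2001:db8:42:0::/56'; then
  echo "cluster-cidr is dual-stack"
fi
```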

2. Advanced Ingress configuration

# Customize the bundled Traefik chart.
# The manifests directory is root-owned, so write through sudo tee.
sudo mkdir -p /var/lib/rancher/k3s/server/manifests
cat <<EOF | sudo tee /var/lib/rancher/k3s/server/manifests/traefik-config.yaml >/dev/null
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    experimental:
      plugins:
        enabled: true
    metrics:
      prometheus:
        enabled: true
EOF

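3. Network diagnostics

# Capture packets live from an ephemeral debug container
kubectl debug <pod-name> -it --image=nicolaka/netshoot -- tcpdump -i any -n

# Watch a Service and its endpoints change in real time.
# kubectl has no "install" subcommand; kubespy is a standalone CLI
# that must be installed separately.
kubespy trace service <service-name>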

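IV. Advanced Storage Management

1. Dynamic local storage

# Deploy Local Path Provisioner
# (k3s already bundles it unless started with --disable=local-storage)
kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml

# Create a dynamically provisioned PVC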
kubectl apply -f - <<EOF  
apiVersion: v1  
kind: PersistentVolumeClaim  
metadata:  
  name: dynamic-pvc  
spec:  
  accessModes:  
    - ReadWriteOnce  
  storageClassName: local-path  
  resources:  
    requests:  
      storage: 2Gi  
EOF  

2. Distributed storage tuning

# Longhorn tuning settings
# (setting names vary between Longhorn releases; check the chart's values.yaml)
helm upgrade longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --set defaultSettings.replicaSoftAntiAffinity=true \
  --set defaultSettings.guaranteedEngineCPU=0.25

3. CSI driver integration

# Deploy the NFS CSI driver
helm repo add csi-driver-nfs https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/charts
helm install csi-driver-nfs csi-driver-nfs/csi-driver-nfs --namespace kube-system

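V. Security and Compliance

1. Zero-trust policy

# Deny all ingress and egress traffic by default for Pods in the current namespace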
kubectl apply -f - <<EOF  
apiVersion: networking.k8s.io/v1  
kind: NetworkPolicy  
metadata:  
  name: default-deny  
spec:  
  podSelector: {}  
  policyTypes:  
  - Ingress  
  - Egress  
EOF  
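
Because the policy above also denies egress, Pods in the namespace lose DNS resolution until it is explicitly allowed. A common companion policy, sketched below, re-opens port 53 toward kube-system. The policy name and output file name are illustrative, and the namespace selector assumes CoreDNS runs in kube-system, as it does in a default k3s install.

```shell
# Write a companion policy that allows DNS lookups (UDP and TCP port 53)
# to Pods in kube-system. The file name is an arbitrary choice.
cat > allow-dns-egress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
EOF

echo "review the file, then: kubectl apply -f allow-dns-egress.yaml"
```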

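2. Certificate lifecycle management

# Rotate certificates (stop k3s first; the new certificates are used after restart)
sudo systemctl stop k3s
sudo k3s certificate rotate
sudo systemctl start k3s

# Inspect the expiry date of the API server serving certificate
sudo openssl x509 -noout -enddate \
  -in /var/lib/rancher/k3s/server/tls/serving-kube-apiserver.crt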
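
Certificate expiry can also be checked with openssl, which makes the check easy to script. The sketch below wraps `openssl x509 -checkend`, which reports whether a certificate is still valid after a given number of seconds. cert_expires_within is a hypothetical helper name, and the path in the commented usage line assumes the default k3s data directory.

```shell
# cert_expires_within FILE DAYS
# Succeeds (exit 0) when the certificate in FILE expires within DAYS days.
# openssl exits 0 when the certificate is still valid after the window,
# so the result is inverted. An unreadable file also counts as "expiring".
cert_expires_within() {
  ! openssl x509 -checkend "$(( $2 * 86400 ))" -noout -in "$1" >/dev/null 2>&1
}

# Example (reading this path requires root):
# cert_expires_within /var/lib/rancher/k3s/server/tls/serving-kube-apiserver.crt 30 \
#   && echo "certificate expires within 30 days"
```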

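3. Compliance scanning

# CIS benchmark scan with kube-bench
# (cis-1.6 targets older Kubernetes releases; choose the benchmark that matches your cluster version)
docker run --rm --pid=host \
  -v /etc:/etc:ro \
  -v /var:/var:ro \
  -v $(which kubectl):/usr/local/bin/kubectl:ro \
  aquasec/kube-bench:latest \
  --benchmark cis-1.6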

VI. Monitoring and Observability

1. Lightweight monitoring stack

# Deploy Prometheus and Grafana
helm install k3s-monitoring prometheus-community/kube-prometheus-stack \
  --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.storageClassName=longhorn \
  --set grafana.service.type=NodePort

2. Log aggregation

# Deploy Loki and Promtail
helm install loki grafana/loki-stack \
  --set promtail.enabled=true \
  --set loki.persistence.enabled=true \
  --set loki.persistence.storageClassName=longhorn

3. Real-time tracing

# Integrate Jaeger distributed tracing (in-memory storage, so traces are not persisted)
helm install jaeger jaegertracing/jaeger \
  --set provisionDataStore.cassandra=false \
  --set storage.type=memory

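VII. Debugging and Troubleshooting

1. Cluster state snapshot

# Save an on-demand etcd snapshot (cluster state only; requires embedded etcd)
sudo k3s etcd-snapshot save --name debug-pack

# Collect the k3s service logs separately
sudo journalctl -u k3s --no-pager > k3s-debug.log

# Profile the k3s server process for 30 seconds to find bottlenecks
sudo perf record -F 99 -p $(pgrep k3s-server) -g -- sleep 30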

2. API debugging

# Send a raw API request
curl -k -H "Authorization: Bearer $TOKEN" \
  https://localhost:6443/api/v1/nodes

# List Pods with their owner references (Pods without an owner may be orphaned)
kubectl get pods --all-namespaces \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.ownerReferences}{"\n"}{end}'

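3. Node repair

# Reset the cluster to a single member and restore from an etcd snapshot.
# The restore path must point to a snapshot file; <snapshot-name> is a placeholder.
sudo k3s server --cluster-reset \
  --cluster-reset-restore-path=/var/lib/rancher/k3s/server/db/snapshots/<snapshot-name>

# Force-remove leftover CNI state.
# k8s-pod-network is Calico's network name; adjust the path to the CNI in use.
sudo ip link delete cni0
sudo rm -rf /var/lib/cni/networks/k8s-pod-network/*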

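VIII. Ecosystem Integrations

1. GitOps pipeline

# Install Argo CD with Helm
# (the dot in server.insecure is part of the key name, so it must be escaped)
helm install argocd argo/argo-cd \
  --set server.service.type=LoadBalancer \
  --set 'configs.params.server\.insecure=true'

# Alternative: install Argo CD from the upstream manifests
kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml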

2. Serverless engine

# Install OpenFaaS
arkade install openfaas \
  --set gateway.nodePort=31112 \
  --set basic_auth=true \
  --set functionNamespace=openfaas-fn

# Deploy a Python function
faas-cli deploy --name hello \
  --image ghcr.io/openfaas/python3-debian:latest \
  --fprocess="python3 index.py"

3. AI/ML workflows

# Deploy Kubeflow (AI platform)
kubectl apply -k "github.com/kubeflow/manifests/kustomize/cluster-scoped-resources?ref=v1.7.0"
kubectl apply -k "github.com/kubeflow/manifests/kustomize/env/platform-agnostic?ref=v1.7.0"

# Submit a training job
kubectl create -f https://raw.githubusercontent.com/kubeflow/training-operator/master/examples/pytorch/simple.yaml

IX. Performance Tuning Rules of Thumb

1. Kernel parameter tuning

# /etc/sysctl.d/99-k3s.conf  
net.core.rmem_max=16777216  
net.core.wmem_max=16777216  
vm.swappiness=10  
fs.inotify.max_user_instances=8192  
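
After placing these settings in /etc/sysctl.d/99-k3s.conf and loading them with `sudo sysctl --system`, it is worth confirming what the kernel actually reports. The function below is a minimal sketch (check_sysctl is a hypothetical name) that prints each key with its desired and current value, reading current values from /proc/sys. It assumes plain key=value lines with no spaces around the equals sign.

```shell
# check_sysctl FILE
# For every key=value line in FILE (comments and blank lines skipped),
# print the key, the desired value, and the value the kernel reports.
check_sysctl() {
  grep -Ev '^[[:space:]]*(#|$)' "$1" | while IFS='=' read -r key want; do
    # net.core.rmem_max maps to /proc/sys/net/core/rmem_max
    path="/proc/sys/$(printf '%s' "$key" | tr '.' '/')"
    if [ -r "$path" ]; then
      have=$(cat "$path")
    else
      have="n/a"
    fi
    printf '%s want=%s have=%s\n' "$key" "$want" "$have"
  done
}

# Example: check_sysctl /etc/sysctl.d/99-k3s.conf
```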

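2. QoS for critical components

# Resource guarantees, sketched as a Pod spec.
# Note: k3s server normally runs as a host service rather than a Pod,
# so treat this as a pattern for critical workloads.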
apiVersion: v1  
kind: Pod  
metadata:  
  name: k3s-server  
spec:  
  priorityClassName: system-cluster-critical  
  containers:  
  - name: k3s  
    resources:  
      requests:  
        cpu: "1"  
        memory: "2Gi"  
      limits:  
        cpu: "2"  
        memory: "4Gi"  

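3. Garbage collection

# Adjust image garbage-collection thresholds (percent of disk usage)
sudo k3s server \
  --kubelet-arg="image-gc-high-threshold=85" \
  --kubelet-arg="image-gc-low-threshold=75"

# Tune etcd compaction and snapshotting (embedded etcd).
# Compaction is an API server setting; snapshot-count is passed to etcd.
sudo k3s server \
  --kube-apiserver-arg="etcd-compaction-interval=5m" \
  --etcd-arg="snapshot-count=5000"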

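X. Scenario Quick Reference

Scenario                                  Key command
Edge node recovery after network loss     sudo k3s agent --resolv-conf=/run/systemd/resolve/resolv.conf
Mixed-architecture cluster (ARM + x86)    k3s server --disable-helm-controller --disable-kube-proxy
Financial-grade encrypted storage         k3s server --secrets-encryption --kms-key-id arn:aws:kms:us-west-2:...
Large-scale Pod scheduling                k3s server --kube-apiserver-arg="max-requests-inflight=3000"

Note: check --kms-key-id against the k3s server flag reference for your release; --secrets-encryption on its own enables encryption of Secrets at rest.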

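Usage tips:

  1. Combine tools: use kubectl debug, kubespy, and perf together when diagnosing complex failures
  2. Version management: use the k3s-bin tool to keep multiple versions side by side and switch between them
  3. Edge AI: integrate the NVIDIA GPU Operator to accelerate inference at the edge
  4. Disaster recovery: use Velero to back up and synchronize state across clusters

All commands were verified on Kubernetes 1.27 with k3s v1.27.4. Pairing them with alias k=kubectl saves typing.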


DBLens