About the author: Wang Hailong is Technical Manager of the SUSE Rancher China Community, responsible for maintaining and operating the Rancher China technical community. He has 8 years of experience in cloud computing and has lived through the shift from OpenStack to Kubernetes. He has extensive operations and hands-on experience with the underlying Linux operating system, KVM virtualization, and Docker container technology.

Notice:

  • This guide applies to Rancher v2.5 and earlier; it does not apply to v2.6
  • Make a backup before you begin

Foreword

Each downstream user cluster managed by Rancher has a cluster agent, which establishes a tunnel and connects to the corresponding cluster controller in the Rancher server through this tunnel.

The cluster agent, also known as cattle-cluster-agent, is a component that runs in the downstream user cluster. One of its important functions is to communicate with the Rancher server and report events, statistics, node information, and health status.

When the IP of the Rancher server changes, the cattle-cluster-agent can no longer connect to the Rancher server through the tunnel, and logs like the following appear in the cattle-cluster-agent container of the downstream cluster:

 time="2022-04-06T03:42:22Z" level=info msg="Connecting to wss://35.183.183.66/v3/connect with token jhh9rx4zmgkrw2mz8mkvsmlnnx6q5jllnqb8jnr2vdxcgglglqbdjz"
time="2022-04-06T03:42:22Z" level=info msg="Connecting to proxy" url="wss://35.183.183.66/v3/connect"
time="2022-04-06T03:42:32Z" level=error msg="Failed to connect to proxy. Empty dialer response" error="dial tcp 35.183.183.66:443: i/o timeout"
time="2022-04-06T03:42:32Z" level=error msg="Remotedialer proxy error" error="dial tcp 35.183.183.66:443: i/o timeout"

Here, 35.183.183.66 is the original Rancher server IP.
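To confirm which address the agent is still dialing, the host can be pulled out of log lines like those above. The following is a minimal sketch; the function name `agent_target_hosts` is hypothetical, and it only understands the "Connecting to proxy" line format shown in this article.

```shell
# Read agent log lines on stdin and print each distinct host the agent dials.
# Only "Connecting to proxy" lines in the format shown above are considered.
agent_target_hosts() {
  sed -n 's|.*msg="Connecting to proxy" url="wss://\([^/"]*\)/.*|\1|p' | sort -u
}
```

Assuming the agent pods carry the default `app=cattle-cluster-agent` label, the logs could be fed in with `kubectl -n cattle-system logs -l app=cattle-cluster-agent | agent_target_hosts`.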

The Rancher UI shows the cluster status as Unavailable.

As the logs show, once the host IP of the Rancher server changes, the Rancher agent can no longer reach it at the original address, so the agent must be updated to connect to the new Rancher server IP.

Rebuild Rancher agent

Make the Rancher agent connect to the new Rancher server IP

Update server-url

Because the IP address of the Rancher server node has changed, the server-url of the Rancher server must be updated to the new host IP. The server-url option is under Settings in the Rancher UI.
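Before pointing agents at the new address, it can help to check that the Rancher server actually answers there. The sketch below relies on Rancher's `/ping` health endpoint, which replies with `pong`; the function name `rancher_ping` is hypothetical, and `-k` is only appropriate when the server uses a self-signed certificate.

```shell
# Return success if the Rancher server at $1 answers "pong" on /ping.
# -k skips certificate verification, which suits self-signed setups only.
rancher_ping() {
  url="${1%/}"   # drop a trailing slash, if any
  reply=$(curl -s -k --max-time 10 "${url}/ping") || return 1
  [ "$reply" = "pong" ]
}
```

For the address used in this article, that would be `rancher_ping "https://35.183.24.89" && echo reachable`.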

Get the kubeconfig of the downstream cluster

Recreating the Rancher agent requires connecting to the downstream cluster through kubectl, so before operating, first obtain the kubeconfig file of the downstream cluster.

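There are several ways to do this. The method shown here reads the kube-admin kubeconfig from the full-cluster-state ConfigMap and writes it to kubeconfig_admin.yaml: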

 docker run --rm --net=host -v $(docker inspect kubelet --format '{{ range .Mounts }}{{ if eq .Destination "/etc/kubernetes" }}{{ .Source }}{{ end }}{{ end }}')/ssl:/etc/kubernetes/ssl:ro --entrypoint bash $(docker inspect $(docker images -q --filter=label=io.cattle.agent=true) --format='{{index .RepoTags 0}}' | tail -1) -c 'kubectl --kubeconfig /etc/kubernetes/ssl/kubecfg-kube-node.yaml get configmap -n kube-system full-cluster-state -o json | jq -r .data.\"full-cluster-state\" | jq -r .currentState.certificatesBundle.\"kube-admin\".config | sed -e "/^[[:space:]]*server:/ s_:.*_: \"https://127.0.0.1:6443\"_"' > kubeconfig_admin.yaml
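Because the command above is a long pipeline, a failure in the middle can leave an empty kubeconfig_admin.yaml behind. A quick sanity check before continuing might look like the following sketch; `check_kubeconfig` is a hypothetical helper name, not part of Rancher or kubectl.

```shell
# Sanity-check a kubeconfig file before relying on it.
# $1: path to the kubeconfig (defaults to kubeconfig_admin.yaml)
check_kubeconfig() {
  cfg="${1:-kubeconfig_admin.yaml}"
  # An empty file usually means the extraction pipeline failed silently.
  [ -s "$cfg" ] || { echo "empty or missing: $cfg" >&2; return 1; }
  # A usable kubeconfig must name an API server.
  grep -q '^[[:space:]]*server:' "$cfg" || { echo "no server entry in $cfg" >&2; return 1; }
  # Finally, confirm the API server actually answers.
  kubectl --kubeconfig "$cfg" get nodes
}
```

If the check passes, `export KUBECONFIG=$PWD/kubeconfig_admin.yaml` lets the later kubectl commands use the file without extra flags.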

Regenerate Rancher agent definitions

Generate an API token in the UI (User -> API & Keys) and save the Bearer Token.

In this example: token-rfv84:86v2wxpzh8mtgvzxpsnwnvrx5nlc424tf8tvrnpzckdxdpt2vfltqq

Find the cluster ID (in the format c-xxxxx) in the Rancher UI. If you are unsure where to find it, go to the home page and click the corresponding cluster name; the browser address bar will then show a cluster ID of the form c-xxxxx.

In this example: c-s8t7s
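When copying the ID out of the address bar, a small helper avoids typos. This sketch assumes the cluster page URL contains a `/c/c-xxxxx` path segment; both the function name `cluster_id_from_url` and the host `rancher.example.com` in the usage note are hypothetical.

```shell
# Print the cluster ID found in a Rancher UI URL of the form .../c/c-xxxxx/...
# Prints nothing if the URL has no such segment.
cluster_id_from_url() {
  printf '%s\n' "$1" | sed -n 's|.*/c/\(c-[a-z0-9]\{5\}\).*|\1|p'
}
```

For example, `CLUSTERID=$(cluster_id_from_url "https://rancher.example.com/c/c-s8t7s/monitoring")` would set CLUSTERID to the ID used in this article.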

Generate agent definition (requires curl, jq)

 # Rancher URL
RANCHERURL="https://35.183.24.89"
# Cluster ID
CLUSTERID="c-s8t7s"
# Token
TOKEN="token-rfv84:86v2wxpzh8mtgvzxpsnwnvrx5nlc424tf8tvrnpzckdxdpt2vfltqq"
# Valid certificates
curl -s -H "Authorization: Bearer ${TOKEN}" "${RANCHERURL}/v3/clusterregistrationtokens?clusterId=${CLUSTERID}" | jq -r '.data[] | select(.name != "system") | .command'
# Self signed certificates
curl -s -k -H "Authorization: Bearer ${TOKEN}" "${RANCHERURL}/v3/clusterregistrationtokens?clusterId=${CLUSTERID}" | jq -r '.data[] | select(.name != "system") | .insecureCommand'

On success, this prints the command that applies the agent definition, for example:

 root@ip-172-31-6-210:~# curl -s -k -H "Authorization: Bearer ${TOKEN}" "${RANCHERURL}/v3/clusterregistrationtokens?clusterId=${CLUSTERID}" | jq -r '.data[] | select(.name != "system") | .insecureCommand'

curl --insecure -sfL https://35.183.24.89/v3/import/98bvp7cpc7m7xqccxqwsghbnb6pvm9b2lcz7jz4xlfdlsc9lh5tmv8_c-s8t7s.yaml | kubectl apply -f -
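Rather than piping the manifest straight into kubectl, it can be saved first and checked to confirm that it references the new Rancher address. The sketch below assumes the manifest sets a `CATTLE_SERVER` environment variable whose value is on the following line; `manifest_points_to` and the file name `agent.yaml` are hypothetical.

```shell
# Return success if the saved agent manifest sets CATTLE_SERVER to the given URL.
# $1: manifest file, $2: expected Rancher server URL
manifest_points_to() {
  awk -v url="$2" '
    /name: CATTLE_SERVER/ {
      # The value is expected on the line right after the variable name.
      if ((getline line) > 0 && index(line, url)) found = 1
    }
    END { exit !found }
  ' "$1"
}

# Possible usage, with IMPORT_URL set to the .yaml URL printed above:
#   curl --insecure -sfL "$IMPORT_URL" -o agent.yaml
#   manifest_points_to agent.yaml "https://35.183.24.89" && kubectl apply -f agent.yaml
```

If the check fails, the manifest most likely still carries the old address, which suggests that server-url has not been updated yet.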

Apply the definition

On a host that has kubectl and the kubeconfig, run the command generated in the previous step to reconfigure the Rancher agent:

 root@ip-172-31-6-210:~# curl --insecure -sfL https://35.183.24.89/v3/import/98bvp7cpc7m7xqccxqwsghbnb6pvm9b2lcz7jz4xlfdlsc9lh5tmv8_c-s8t7s.yaml | kubectl apply -f -
clusterrole.rbac.authorization.k8s.io/proxy-clusterrole-kubeapiserver unchanged
clusterrolebinding.rbac.authorization.k8s.io/proxy-role-binding-kubernetes-master unchanged
namespace/cattle-system unchanged
serviceaccount/cattle unchanged
clusterrolebinding.rbac.authorization.k8s.io/cattle-admin-binding unchanged
secret/cattle-credentials-6f51cbe created
clusterrole.rbac.authorization.k8s.io/cattle-admin unchanged
deployment.apps/cattle-cluster-agent configured
daemonset.apps/cattle-node-agent configured

Verify

After a few moments, the cattle-cluster-agent and cattle-node-agent pods are running again:

 root@ip-172-31-6-210:~# kubectl -n cattle-system get pods
NAME                                    READY   STATUS    RESTARTS   AGE
cattle-cluster-agent-77f864c76f-qrjs2   1/1     Running   0          38s
cattle-node-agent-znrv5                 1/1     Running   0          4s
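Instead of polling `get pods` by hand, the rollout can be awaited explicitly. This is a minimal sketch using `kubectl rollout status`; the function name `wait_for_agents` and the default timeout of 120s are arbitrary choices for illustration.

```shell
# Wait until both agents have finished rolling out, or fail after the timeout.
# $1: timeout accepted by kubectl, e.g. 120s (default)
wait_for_agents() {
  timeout="${1:-120s}"
  kubectl -n cattle-system rollout status deployment/cattle-cluster-agent --timeout="$timeout" &&
  kubectl -n cattle-system rollout status daemonset/cattle-node-agent --timeout="$timeout"
}
```

A non-zero exit status means at least one of the two agents did not become ready in time, in which case the agent logs are the next place to look.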

The downstream cluster status becomes Active again.

Postscript

Changing the IP address of the Rancher server is strongly discouraged, and even changing the server-url can introduce hidden risks.

Even for a Rancher server installed on a single node, it is recommended to register downstream clusters through a domain name. This makes a later migration from a single node to high availability possible, and if the Rancher server node IP changes, only the IP that the domain name maps to needs to be updated.

