In this article, we will install an etcd cluster and configure monitoring for it with Prometheus and Grafana. All of these operations are performed through Rancher.
We will see how Rancher's app store lets us achieve this without installing any extra tooling. In this article, we will not need:
- A dedicated machine configured to run kubectl and pointed at the Kubernetes cluster
- Knowledge of kubectl, because everything can be done through the Rancher UI
- An installed and configured Helm binary
Prerequisites for the demo
You will need:
- A Google Cloud Platform account (the free tier is enough). Any other cloud will also work.
- Rancher v2.4.7 (the latest version at the time of writing)
- A Kubernetes cluster running on GKE (version 1.16.3-gke.1); EKS or AKS will also work
Start a Rancher instance
First, start your Rancher instance. You can view the quick start guide at the following link:
https://www.rancher.cn/quick-start/
Use Rancher to deploy a GKE cluster
To set up and configure a Kubernetes cluster with Rancher, see the related documentation:
https://docs.rancher.cn/docs/rancher2/cluster-provisioning/production/_index/
Deploy etcd, Prometheus and Grafana
We can use Rancher's application store to install all the software. The application store is a collection of Helm charts, which allows users to easily deploy these applications repeatedly.
Once the cluster is up and running, select the Default project created for it and, in the Apps tab, click [Launch].
The first application we want to install is etcd-operator. Keep all of its pre-populated default values, and make sure etcd cluster creation is enabled. To keep the demo simple, we deselect etcd Backup Operator and etcd Restore Operator.
The role of an Operator is to observe, analyze and act. It uses the Kubernetes API to observe the current state of the cluster. If the actual state differs from the desired state, it detects the difference and repairs it.
For example, suppose we are running an etcd cluster with three members. If something happens and one of the members fails, the Operator will observe this. It computes the difference between the current state and the desired state, and then restores the lost member based on that difference. As a result, we have a healthy cluster without human intervention.
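The observe, analyze and act cycle described above can be sketched in a few lines of Python. This is only a toy model under our own assumptions: the `Cluster` class, the `reconcile` function and the member names are hypothetical and are not taken from etcd-operator, which works against the Kubernetes API rather than an in-memory object.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for an etcd cluster (hypothetical, not the operator's model)."""
    desired_size: int
    members: set = field(default_factory=set)
    _next_id: int = 0

    def add_member(self) -> str:
        # Create a new member with a fresh, never-reused name.
        name = f"etcd-{self._next_id}"
        self._next_id += 1
        self.members.add(name)
        return name


def reconcile(cluster: Cluster) -> list:
    """Run one observe/analyze/act pass and return the actions taken."""
    actions = []
    # Observe: read the current state.
    current = len(cluster.members)
    # Analyze: compare it with the desired state.
    missing = cluster.desired_size - current
    # Act: create replacements until the two states match.
    for _ in range(max(missing, 0)):
        actions.append("added " + cluster.add_member())
    return actions


if __name__ == "__main__":
    cluster = Cluster(desired_size=3)
    reconcile(cluster)                 # bootstrap three members
    cluster.members.discard("etcd-1")  # simulate a failed member
    print(reconcile(cluster))          # the pass that restores the lost member
    print(sorted(cluster.members))
```

A real Operator runs this pass continuously, so a second call on a healthy cluster takes no action.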
To install Prometheus and Grafana, activate the integrated cluster monitoring support in Rancher. From the [Global] view, select the cluster you want to configure, and select [Tools] → [Monitoring] to enable it. So that changes made in Grafana persist, make sure to enable persistent storage for both Grafana and Prometheus. If you haven't set up any persistent storage yet, take a look at Longhorn, a cloud-native distributed block storage system for Kubernetes.
When everything is installed, you can explore some of the tabs. Check the progress of the workloads (Pods, Deployments, DaemonSets) and the services that were created.
Let's connect to an etcd Pod in order to run some basic etcdctl commands (see the previous article for more details). Select a Pod, click its vertical ellipsis (three vertical dots) menu button, and then select Execute Shell.
Configure Prometheus and Grafana
One of the best and easiest ways to monitor etcd clusters is to use Prometheus and Grafana. To open Grafana, click any Grafana icon in the cluster overview.
Grafana comes pre-configured with Prometheus as a data source and includes several dashboards that visualize the cluster status.
Log in to Grafana to add a dashboard for etcd. The default username and password are both "admin" (you will be prompted to change the password when you log in for the first time). Then import the default etcd dashboard template using ID 3070. Click Load; the only remaining step is to select the Prometheus data source.
The dashboard is imported successfully and we can see its charts, but no data is displayed. Why? Prometheus is running and Grafana is integrated with it. The problem is that we have not told Prometheus to scrape the targets that belong to our etcd cluster.
Let us return to Rancher to solve this problem. Open the System project and click Import YAML under the [Resources] tab. Then import the following resource into the cattle-prometheus namespace:
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    source: rancher-monitoring
  name: etcd
  namespace: cattle-prometheus
spec:
  endpoints:
  - port: client
  namespaceSelector:
    matchNames:
    - etcd-operator
  selector:
    matchLabels:
      app: etcd
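
To get a feel for what Prometheus collects from this target, it helps to look at the text format that a metrics endpoint returns. etcd serves Prometheus-format metrics at /metrics on its client port, and /metrics is also the default path for a ServiceMonitor endpoint. The following Python sketch parses such a payload; the `parse_metrics` helper and the sample values are our own illustration, and the parser assumes samples without labels or timestamps.

```python
def parse_metrics(text: str) -> dict:
    """Parse unlabeled samples from the Prometheus text exposition format."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # Assumption: each line is "name value", with no trailing timestamp.
        name, _, value = line.rpartition(" ")
        # Keep the sketch simple: ignore series that carry labels.
        if "{" in name:
            continue
        samples[name] = float(value)
    return samples


# Hand-written sample in the exposition format; the values are illustrative.
SAMPLE = """\
# HELP etcd_server_has_leader Whether or not a leader exists. 1 is existence, 0 is not.
# TYPE etcd_server_has_leader gauge
etcd_server_has_leader 1
# TYPE etcd_server_leader_changes_seen_total counter
etcd_server_leader_changes_seen_total 2
"""

if __name__ == "__main__":
    print(parse_metrics(SAMPLE))
```

Prometheus performs this scrape itself once the ServiceMonitor is in place; the sketch only shows the shape of the data behind the dashboard charts.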
How can we verify that the new configuration is valid and that Prometheus is actually scraping etcd? In the System project, open the Apps tab and click the second /index.html link in the cluster-monitoring application.
This opens the Prometheus web UI. Go to the Graph page and run a few queries manually. If data is displayed, the setup is complete.
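The same check can be scripted against the Prometheus HTTP API, whose instant-query endpoint is /api/v1/query. The sketch below is a minimal example under stated assumptions: the base URL is hypothetical (for instance a locally forwarded Prometheus port), authentication through the Rancher proxy is not handled, and the helper names are our own.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def extract_values(payload: dict) -> dict:
    """Map each series' `instance` label to its sample value."""
    if payload.get("status") != "success":
        raise ValueError("query failed: " + str(payload.get("error")))
    return {
        series["metric"].get("instance", "unknown"): float(series["value"][1])
        for series in payload["data"]["result"]
    }


def query(base_url: str, promql: str) -> dict:
    """Run an instant query against a reachable Prometheus server."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return extract_values(json.load(resp))


if __name__ == "__main__":
    # Hand-written response in the API's documented shape; values are illustrative.
    sample = {
        "status": "success",
        "data": {
            "resultType": "vector",
            "result": [
                {
                    "metric": {"__name__": "etcd_server_has_leader",
                               "instance": "10.0.0.5:2379"},
                    "value": [1600000000.0, "1"],
                }
            ],
        },
    }
    print(build_query_url("http://localhost:9090", "etcd_server_has_leader"))
    print(extract_values(sample))
```

An empty result for an etcd metric such as etcd_server_has_leader would suggest that the ServiceMonitor is not matching the etcd service yet.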
The last step is to return to Grafana and confirm that the etcd dashboard charts now display data.
Clean up the applications and cluster
To clean up the resources used in this article, we only need to select our cluster in the Global view and click [Delete].
This deletes everything except the persistent storage created for Prometheus, which we need to remove from our cloud provider's console.
We can also perform the entire cleanup from Rancher, but the steps are slightly different:
- Disable monitoring: At the global level, navigate to the cluster, select [Tools] → [Monitoring] and click the [Disable] button.
- Remove persistent storage: Go to "System Project" → "Resources" → "Workload" → "Volume"; select your volume and click "Delete".
- Delete cluster: Select the cluster at the global level and delete it.
Summary
In this demo, we saw how to use Rancher to install etcd (using etcd-operator), Prometheus and Grafana. The integrations are available out of the box: we only needed to add a ServiceMonitor and import a dashboard to complete the configuration. Rancher also provides the visibility needed to troubleshoot easily when necessary.