
Background

eBPF is a very promising technology; a dedicated foundation (https://ebpf.io/) has even been established to promote the development and standardization of its ecosystem.

There is already plenty of material on the basics of eBPF, so I won't repeat it here.

This article explores what happens when eBPF is combined with Kubernetes, and how existing toolchains can be put together to solve practical problems.

The related open source projects involved are mainly as follows:

  • bcc
  • bpftrace
  • kubectl-trace
  • kubectl-flame
  • cilium

Prerequisites

Kernel

The concept of eBPF has been around for a long time, so some features are available even on older kernels. On x86_64, for example, JIT compilation has been supported since version 3.16.

To use most features properly, however, upgrading to the latest LTS kernel is recommended. For example, the CentOS 7.9 machine used here runs kernel 5.15.4 after the upgrade.

The following link lists the kernel version in which each eBPF feature was introduced.

https://github.com/iovisor/bcc/blob/master/docs/kernel-versions.md
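Checking a kernel against a minimum version from that table can be scripted. The following is a minimal sketch, not part of bcc or bpftrace: the helper names `version_ge` and `kernel_base_version` are made up for illustration, and it assumes a `sort` that supports `-V` (GNU coreutils does).

```shell
# Succeed (exit status 0) if version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip the distribution suffix from a kernel release string,
# e.g. "5.15.4-1.el7.elrepo.x86_64" becomes "5.15.4".
kernel_base_version() {
  printf '%s\n' "${1%%-*}"
}
```

For example, `version_ge "$(kernel_base_version "$(uname -r)")" 3.16` succeeds when the running kernel is at least 3.16, the version mentioned above for JIT on x86_64.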

Kernel headers

Some features depend on kernel header files, so the kernel headers matching the running kernel version must be installed.
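Whether the headers match the kernel can be checked with a small helper. This is only a sketch under assumptions: `headers_status` is a hypothetical name, and it assumes the headers package version, written as VERSION-RELEASE.ARCH, is the same string that `uname -r` prints, which is how the ELRepo kernel names shown later in this article look.

```shell
# Compare an installed headers version against a kernel release.
# $1: headers version as "VERSION-RELEASE.ARCH" (empty if not installed)
# $2: kernel release, as printed by `uname -r`
# Prints one of: missing, match, mismatch.
headers_status() {
  if [ -z "$1" ]; then
    echo missing
  elif [ "$1" = "$2" ]; then
    echo match
  else
    echo mismatch
  fi
}
```

On an RPM-based host the first argument could come from something like `v=$(rpm -q --qf '%{VERSION}-%{RELEASE}.%{ARCH}' kernel-ml-headers) || v=`, followed by `headers_status "$v" "$(uname -r)"`. The package name depends on which kernel flavor is installed, and the check is only meaningful once the machine is running the kernel you intend to trace.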

Introduction to related projects

bcc

BCC makes BPF programs easier to write, with kernel instrumentation in C (and includes a C wrapper around LLVM),
and front-ends in Python and lua. It is suited for many tasks, including performance analysis and network traffic control.

bcc provides a set of easy-to-use programming interfaces that let developers write eBPF-based scripts in Python or Lua without detailed knowledge of the kernel source (millions of lines of code).

bpftrace

bpftrace is a high-level tracing language for Linux enhanced Berkeley Packet Filter (eBPF) available in recent Linux kernels (4.x).
bpftrace uses LLVM as a backend to compile scripts to BPF-bytecode and makes use of BCC for interacting with the Linux BPF system,
as well as existing Linux tracing capabilities: kernel dynamic tracing (kprobes), user-level dynamic tracing (uprobes), and tracepoints.

bpftrace is built on top of bcc and implements a DSL inspired by C and awk. It is best suited to simple one-liners for monitoring or tracing, or to running the official example tools directly.

kubectl-trace

kubectl-trace is a kubectl plug-in for eBPF that uses bpftrace to trace Kubernetes resources such as nodes and pods. The latest release, v0.1.2 (July 2021), supports only bpftrace; future releases will support both bpftrace and bcc (the code has been merged into the main branch but not yet released).

The overall experience is fairly smooth.

kubectl-flame

kubectl-flame is a kubectl plug-in open-sourced by Yahoo that generates flame graphs for running programs. According to the project, it can attach to a business container for profiling without modifying the application.

In my tests, the community version had many bugs and did not run smoothly. In practice it also depends on what is inside the business container, including the JDK. The experience was mediocre, and large-scale adoption looks difficult.

cilium

cilium is a Kubernetes network plug-in that uses eBPF to improve network forwarding performance and enhance observability.

Get Your Hands Dirty

The CentOS 7 kernel version used in the following experiments is 4.4.

Update kernel

yum -y update
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-3.el7.elrepo.noarch.rpm
yum --disablerepo="*" --enablerepo="elrepo-kernel" list available

The list shows two kernel flavors, lt and ml. lt stands for long-term, similar to Ubuntu's LTS; ml stands for mainline. Either works, but the headers installed later must match the flavor you choose. ml is used here.

yum --enablerepo=elrepo-kernel install -y kernel-ml

Install headers

Remove the old headers before installing the new ones. If headers were installed previously, installing again may fail with an incompatibility error.

yum remove kernel-headers

Install the new headers

yum --enablerepo=elrepo-kernel install -y kernel-ml-headers

Use new kernel
View all currently available kernels

$ sudo awk -F\' '$1=="menuentry " {print i++ " : " $2}' /etc/grub2.cfg
0 : CentOS Linux (4.18.7-1.el7.elrepo.x86_64) 7 (Core)
1 : CentOS Linux (3.10.0-862.11.6.el7.x86_64) 7 (Core)
2 : CentOS Linux (3.10.0-514.el7.x86_64) 7 (Core)
3 : CentOS Linux (0-rescue-063ec330caa04d4baae54c6902c62e54) 7 (Core)

Set the default kernel in GRUB, then restart the machine.

grub2-set-default 0
reboot
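Instead of reading the menu index by eye, the lookup can be scripted. The sketch below reuses the awk pattern from the listing above; `grub_index_for` is a hypothetical helper name, and it assumes every boot entry starts at column 0 with `menuentry '`, as in the CentOS 7 config shown.

```shell
# Print the zero-based menu index of the first GRUB entry whose title
# contains the given kernel version; print nothing if none matches.
# $1: kernel version substring, e.g. "4.18.7"
# $2: GRUB config file (defaults to /etc/grub2.cfg)
grub_index_for() {
  awk -F\' -v want="$1" '
    $1 == "menuentry " {
      if (index($2, want) > 0) { print i + 0; exit }
      i++
    }
  ' "${2:-/etc/grub2.cfg}"
}
```

With this, `grub2-set-default "$(grub_index_for 4.18.7)"` would select the new kernel. It is worth checking that the output is non-empty first, since no match prints nothing.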

Install bcc/bpftrace

# bcc
yum install bcc-tools

# bpftrace
curl https://repos.baslab.org/rhel/7/bpftrace-daily/bpftrace-daily.repo --output /etc/yum.repos.d/bpftrace-daily.repo
curl https://repos.baslab.org/rhel/7/bpftools/bpftools.repo --output /etc/yum.repos.d/bpftools.repo
yum install bpftrace bpftrace-tools bpftrace-doc

Install kubectl-trace

Unlike bcc and bpftrace, which must be installed on every host, kubectl-trace is a client-side plug-in that only needs to be installed on the machine where kubectl runs.

kubectl krew install trace

Use bcc

git clone https://github.com/iovisor/bcc.git
cd ./bcc/tools/
# View the GC events of a Java process
./javagc.sh -l java 24682

Use bpftrace

git clone https://github.com/iovisor/bpftrace.git
cd ./bpftrace/tools/
# View the latency of DNS resolution requests
bpftrace gethostlatency.bt

The path referenced in the official tools is hardcoded, so an error may be reported (see https://github.com/iovisor/bpftrace/issues/2075#issuecomment-977648027). If that happens, change it to the correct path for your system. (This problem should be fixed upstream later.)

kubectl trace node

Whether the target is a node or a pod, kubectl-trace starts a job on the corresponding node, which then probes the host or attaches to the pod's container.

There is a bug here that shows up on CentOS: even with kernel-headers installed on the host, the command only succeeds with --fetch-headers.

k trace run node1 -e "tracepoint:syscalls:sys_enter_* { @[probe] = count(); }" --fetch-headers

Another problem is that --fetch-headers may pull a tarball from the public internet, which can fail depending on the network environment. The workaround is to download it in advance and, following the official documentation, manually build an initContainer image, as in the following example.

k trace run node1 -e "tracepoint:syscalls:sys_enter_* { @[probe] = count(); }" --fetch-headers
k trace run -nkube-system pod/calico-kube-controllers-7d5d95c8c9-mkp54 -e "tracepoint:syscalls:sys_enter_* { @[probe] = count(); }" --fetch-headers --init-imagename=docker.4pd.io/tmp/kubectl-trace-init:5.15.4

kubectl trace pod

k trace run -nkube-system pod/calico-kube-controllers-7d5d95c8c9-mkp54 -e "tracepoint:syscalls:sys_enter_* { @[probe] = count(); }" --fetch-headers
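Since the node and pod invocations differ only in the target, namespace, and optional init image, they can be wrapped in one helper. This is a rough sketch: `run_trace` and the `KUBECTL` override variable are hypothetical names added for illustration, and only the flags already used above (`-n`, `-e`, `--fetch-headers`, `--init-imagename`) appear.

```shell
# Run a bpftrace program through kubectl-trace.
# $1: target, e.g. "node1" or "pod/my-pod"
# $2: bpftrace program text
# $3: optional custom init image (for hosts that cannot fetch headers)
# $4: optional namespace of the target pod
# KUBECTL may be set to another command (e.g. echo) for a dry run.
# The body runs in a subshell, so its variables do not leak to the caller.
run_trace() (
  target=$1
  program=$2
  init_image=${3:-}
  namespace=${4:-}
  set -- trace run
  if [ -n "$namespace" ]; then
    set -- "$@" -n "$namespace"
  fi
  set -- "$@" "$target" -e "$program" --fetch-headers
  if [ -n "$init_image" ]; then
    set -- "$@" "--init-imagename=$init_image"
  fi
  ${KUBECTL:-kubectl} "$@"
)
```

Running it with `KUBECTL=echo` prints the argument list instead of contacting the cluster, which is a cheap way to check the command before submitting a job.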

Next Steps

Whether you use native eBPF, bcc, or bpftrace, there is a real learning curve. It is therefore worth packaging ready-made scripts tailored to actual needs (or reusing the official tools and providing help documentation) for developers to use.

For example, scripts can be prepared in advance for scenarios such as the following.

  • mysql slow query
  • fd leak
  • Memory leak
  • Frequent gc
  • tcp packet loss
  • DNS lookup failed
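One possible starting point for such packaging is a small lookup from scenario to an existing tool. The mapping below is my own assumption about which upstream bcc/bpftrace tools come closest to each scenario, not an official recommendation; `suggest_tool` and the scenario keys are hypothetical names, and gethostlatency measures lookup latency rather than failures.

```shell
# Map a troubleshooting scenario to an existing bcc/bpftrace tool.
# Prints the suggestion and succeeds, or prints a notice and returns 1
# when no tool has been mapped to the scenario yet.
suggest_tool() {
  case "$1" in
    mysql-slow-query) echo "bcc: mysqld_qslower" ;;
    memory-leak)      echo "bcc: memleak" ;;
    frequent-gc)      echo "bcc: javagc" ;;
    tcp-packet-loss)  echo "bcc: tcpdrop" ;;
    dns-lookup)       echo "bpftrace: gethostlatency.bt" ;;
    *)                echo "no packaged tool yet"; return 1 ;;
  esac
}
```

Scenarios without an obvious upstream tool, such as fd leaks, fall through to the default branch and would need a custom script.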

FingerLiu