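Beyla is an eBPF-based application auto-instrumentation tool developed by Grafana. Without any changes to application code, it hooks kernel-space system calls and user-space library function calls, parses what it observes into HTTP/gRPC metrics, and so gives the application observability.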

I. Usage

Beyla requires Linux kernel >= 4.18. The walkthrough below uses a Go net/http service as the example.

1. Install Beyla

Go 1.21 is required:

# go install github.com/grafana/beyla/cmd/beyla@latest

2. Run the application

The service listens on port 8080:

# curl -OL https://raw.githubusercontent.com/grafana/beyla/main/examples/example-http-service/example-http-service.go
# go run ./example-http-service.go

3. Send an HTTP request

# curl -s http://localhost:8080

4. Start Beyla

# BEYLA_PROMETHEUS_PORT=9400 OPEN_PORT=8080 beyla

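Where:

  • BEYLA_PROMETHEUS_PORT=9400: Beyla serves the application's metrics over HTTP at /metrics on port 9400;
  • OPEN_PORT=8080: Beyla instruments the process that is listening on port 8080 on this machine (the command in step 6 uses the BEYLA_-prefixed spelling, BEYLA_OPEN_PORT);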

5. View the metrics

# curl localhost:9400/metrics
# HELP http_server_duration_seconds duration of HTTP service calls from the server side, in seconds
# TYPE http_server_duration_seconds histogram
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="0"} 0
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="0.005"} 0
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="0.01"} 0
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="0.025"} 0
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="0.05"} 0
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="0.075"} 1
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="0.1"} 1
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="0.25"} 3
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="0.5"} 6
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="0.75"} 6
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="1"} 6
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="2.5"} 6
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="5"} 6
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="7.5"} 6
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="10"} 6
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="+Inf"} 6
http_server_duration_seconds_sum{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service"} 1.536171785
http_server_duration_seconds_count{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service"} 6
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="0"} 0
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="0.005"} 0
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="0.01"} 1
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="0.025"} 1
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="0.05"} 1
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="0.075"} 1
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="0.1"} 1
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="0.25"} 1
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="0.5"} 2
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="0.75"} 2
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="1"} 2
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="2.5"} 2
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="5"} 2
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="7.5"} 2
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="10"} 2
http_server_duration_seconds_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="+Inf"} 2
http_server_duration_seconds_sum{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service"} 0.39173739999999996
http_server_duration_seconds_count{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service"} 2
# HELP http_server_request_size_bytes size, in bytes, of the HTTP request body as received at the server side
# TYPE http_server_request_size_bytes histogram
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="0"} 6
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="32"} 6
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="64"} 6
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="128"} 6
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="256"} 6
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="512"} 6
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="1024"} 6
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="2048"} 6
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="4096"} 6
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="8192"} 6
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service",le="+Inf"} 6
http_server_request_size_bytes_sum{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service"} 0
http_server_request_size_bytes_count{http_method="GET",http_route="/**",http_status_code="200",service_name="example-http-service"} 6
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="0"} 2
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="32"} 2
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="64"} 2
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="128"} 2
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="256"} 2
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="512"} 2
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="1024"} 2
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="2048"} 2
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="4096"} 2
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="8192"} 2
http_server_request_size_bytes_bucket{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service",le="+Inf"} 2
http_server_request_size_bytes_sum{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service"} 0
http_server_request_size_bytes_count{http_method="GET",http_route="/**",http_status_code="500",service_name="example-http-service"} 2
# HELP promhttp_metric_handler_errors_total Total number of internal errors encountered by the promhttp metric handler.
# TYPE promhttp_metric_handler_errors_total counter
promhttp_metric_handler_errors_total{cause="encoding"} 0
promhttp_metric_handler_errors_total{cause="gathering"} 0
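
The histogram above can be read by hand: the _sum and _count series give a mean latency, and the cumulative buckets allow a rough quantile estimate. The Go sketch below does this for the http_status_code="200" series, using the bucket values copied from the output above. The type and function names (bucket, quantile, okBuckets) are made up for this illustration, and the linear interpolation inside a bucket is an approximation chosen for the sketch, not something Beyla itself computes.

```go
package main

import (
	"fmt"
	"math"
)

// bucket is one cumulative histogram bucket: count is the number of
// observations less than or equal to the upper bound le.
type bucket struct {
	le    float64
	count float64
}

// okBuckets holds the status-200 buckets copied from the /metrics output above.
var okBuckets = []bucket{
	{0, 0}, {0.005, 0}, {0.01, 0}, {0.025, 0}, {0.05, 0},
	{0.075, 1}, {0.1, 1}, {0.25, 3}, {0.5, 6}, {0.75, 6},
	{1, 6}, {2.5, 6}, {5, 6}, {7.5, 6}, {10, 6}, {math.Inf(1), 6},
}

// quantile estimates the q-quantile by linear interpolation inside the first
// bucket whose cumulative count reaches the target rank q*total.
func quantile(q float64, buckets []bucket) float64 {
	if len(buckets) == 0 {
		return math.NaN()
	}
	total := buckets[len(buckets)-1].count
	if total == 0 {
		return math.NaN()
	}
	rank := q * total
	prevLe, prevCount := 0.0, 0.0
	for _, b := range buckets {
		if b.count >= rank {
			if math.IsInf(b.le, 1) {
				return prevLe // cannot interpolate into the +Inf bucket
			}
			if b.count == prevCount {
				return b.le // empty bucket, nothing to interpolate
			}
			return prevLe + (b.le-prevLe)*(rank-prevCount)/(b.count-prevCount)
		}
		prevLe, prevCount = b.le, b.count
	}
	return prevLe
}

func main() {
	// Values of the _sum and _count series for the status-200 requests.
	sum, count := 1.536171785, 6.0
	fmt.Printf("mean=%.3fs p50=%.3fs p90=%.3fs\n",
		sum/count, quantile(0.5, okBuckets), quantile(0.9, okBuckets))
}
```

With only six observations the estimates are coarse; they mainly show how the bucket boundaries limit the resolution of any quantile derived from this output.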

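6. Other Beyla parameters

The metrics above carry neither the request URL nor the client address. The following parameters add them:

  • BEYLA_METRICS_REPORT_TARGET: defaults to false; when set to true, the metrics carry a url_path label, e.g. url_path="/*";
  • BEYLA_METRICS_REPORT_PEER: defaults to false; when set to true, the metrics carry a client_address label, e.g. client_address="$ip";

# BEYLA_PROMETHEUS_PORT=9400 BEYLA_OPEN_PORT=8080 BEYLA_LOG_LEVEL=DEBUG BEYLA_METRICS_REPORT_TARGET=true BEYLA_METRICS_REPORT_PEER=true beyla

After restarting Beyla, run curl localhost:9400/metrics again. The output becomes: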

# HELP http_server_duration_seconds duration of HTTP service calls from the server side, in seconds
# TYPE http_server_duration_seconds histogram
http_server_duration_seconds_bucket{client_address="127.0.0.1",http_request_method="GET",http_response_status_code="200",http_route="/**",service_name="example-http-service",service_namespace="",url_path="/0",le="0"} 0
http_server_duration_seconds_bucket{client_address="127.0.0.1",http_request_method="GET",http_response_status_code="200",http_route="/**",service_name="example-http-service",service_namespace="",url_path="/0",le="0.005"} 1
http_server_duration_seconds_bucket{client_address="127.0.0.1",http_request_method="GET",http_response_status_code="200",http_route="/**",service_name="example-http-service",service_namespace="",url_path="/0",le="0.01"} 1
......
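
To pull the new url_path and client_address labels out of such a line programmatically, a small parser is enough. The following Go sketch handles only the simple `name{k="v",...} value` shape shown above; the names sample and parseSample are invented for this example, and escaped quotes inside label values or trailing timestamps are deliberately not handled.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sample is one parsed line of the /metrics text output.
type sample struct {
	name   string
	labels map[string]string
	value  float64
}

// parseSample parses `name{k="v",...} value` or `name value`.
// Sketch only: escaped quotes in label values are not supported.
func parseSample(line string) (sample, error) {
	s := sample{labels: map[string]string{}}
	line = strings.TrimSpace(line)
	rest := ""
	if start := strings.IndexByte(line, '{'); start >= 0 {
		end := strings.LastIndexByte(line, '}')
		if end < start {
			return s, fmt.Errorf("unbalanced braces in %q", line)
		}
		s.name = line[:start]
		body := line[start+1 : end]
		for body != "" {
			eq := strings.Index(body, `="`)
			if eq < 0 {
				return s, fmt.Errorf("malformed label in %q", line)
			}
			key := body[:eq]
			body = body[eq+2:]
			endQuote := strings.IndexByte(body, '"')
			if endQuote < 0 {
				return s, fmt.Errorf("unterminated label value in %q", line)
			}
			s.labels[key] = body[:endQuote]
			body = strings.TrimPrefix(body[endQuote+1:], ",")
		}
		rest = line[end+1:]
	} else {
		name, val, ok := strings.Cut(line, " ")
		if !ok {
			return s, fmt.Errorf("missing value in %q", line)
		}
		s.name, rest = name, val
	}
	v, err := strconv.ParseFloat(strings.TrimSpace(rest), 64)
	if err != nil {
		return s, fmt.Errorf("bad value in %q: %w", line, err)
	}
	s.value = v
	return s, nil
}

func main() {
	line := `http_server_duration_seconds_bucket{client_address="127.0.0.1",http_request_method="GET",url_path="/0",le="0.005"} 1`
	s, err := parseSample(line)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(s.name, s.labels["client_address"], s.labels["url_path"], s.value)
}
```

For anything beyond a quick check, a maintained Prometheus text-format parser is the safer choice, since it covers the escaping rules this sketch skips.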

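II. Analysis

The Beyla application is written in Go, while the BPF programs inside Beyla are written in C.

In Beyla's Makefile, go generate and the bpf2go tool compile the C BPF programs and generate Go bindings for them, so that the Go code in Beyla can load and use them.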

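1. eBPF program

The eBPF program below, from beyla/bpf/go_nethttp.c, attaches a uprobe (uprobe/ServeHTTP) to a user-space library function: the ServeHTTP method that Go's net/http server calls to handle each request: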

// This instrumentation attaches uprobe to the following function:
// func (mux *ServeMux) ServeHTTP(w ResponseWriter, r *Request)
// or other functions sharing the same signature (e.g http.Handler.ServeHTTP)
SEC("uprobe/ServeHTTP")
int uprobe_ServeHTTP(struct pt_regs *ctx) {
    bpf_dbg_printk("=== uprobe/ServeHTTP === ");
    void *goroutine_addr = GOROUTINE_PTR(ctx);
    bpf_dbg_printk("goroutine_addr %lx", goroutine_addr);

    func_invocation invocation = {
        .start_monotime_ns = bpf_ktime_get_ns(),
        .regs = *ctx,
    };

    // Write event
    if (bpf_map_update_elem(&ongoing_server_requests, &goroutine_addr, &invocation, BPF_ANY)) {
        bpf_dbg_printk("can't update map element");
    }

    return 0;
}
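
The probe above only records the start of a request: it stores a timestamp in the ongoing_server_requests map, keyed by the goroutine address. The plain-Go model below illustrates that bookkeeping pattern in user space. It is not Beyla code: the tracker type and the onExit step (look up the entry, compute the elapsed time, delete it) are assumptions made for this sketch, since the article does not show the code that consumes the map.

```go
package main

import "fmt"

// invocation is a simplified stand-in for func_invocation: only the start
// timestamp is kept, the saved registers are omitted.
type invocation struct {
	startMonotimeNs uint64
}

// tracker plays the role of the ongoing_server_requests BPF map, keyed by
// goroutine address.
type tracker struct {
	ongoing map[uint64]invocation
}

func newTracker() *tracker {
	return &tracker{ongoing: map[uint64]invocation{}}
}

// onEnter mirrors what uprobe_ServeHTTP does: remember when the request
// started on this goroutine. Like BPF_ANY, it creates or overwrites the entry.
func (tr *tracker) onEnter(goroutineAddr, nowNs uint64) {
	tr.ongoing[goroutineAddr] = invocation{startMonotimeNs: nowNs}
}

// onExit is the assumed counterpart: find the matching start, report the
// elapsed nanoseconds, and drop the entry so the map does not grow forever.
func (tr *tracker) onExit(goroutineAddr, nowNs uint64) (uint64, bool) {
	inv, ok := tr.ongoing[goroutineAddr]
	if !ok {
		return 0, false // no start was recorded for this goroutine
	}
	delete(tr.ongoing, goroutineAddr)
	return nowNs - inv.startMonotimeNs, true
}

func main() {
	tr := newTracker()
	tr.onEnter(0xc000100000, 1000) // request A starts
	tr.onEnter(0xc000200000, 1500) // request B starts on another goroutine
	if d, ok := tr.onExit(0xc000100000, 4000); ok {
		fmt.Println("request A took", d, "ns")
	}
}
```

Keying by goroutine works because a Go HTTP handler runs on one goroutine for the duration of ServeHTTP, so concurrent requests do not overwrite each other's entries.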

2. Makefile

The generate target in the Makefile:

# As generated artifacts are part of the code repo (pkg/ebpf packages), you don't have
# to run this target for each build. Only when you change the C code inside the bpf folder.
# You might want to use the docker-generate target instead of this.
.PHONY: generate
generate: export BPF_CLANG := $(CLANG)
generate: export BPF_CFLAGS := $(CFLAGS)
generate: export BPF2GO := $(BPF2GO)
generate: prereqs
   @echo "### Generating BPF Go bindings"
   go generate ./pkg/...

In effect, it simply runs the go generate command.

3. go generate

Taking beyla/bpf/go_nethttp.c as an example, go generate produces:

  • beyla/pkg/internal/ebpf/nethttp/bpf_bpfel_x86.o
  • beyla/pkg/internal/ebpf/nethttp/bpf_bpfel_x86.go

The go generate log:

### Generating BPF Go bindings
go generate ./pkg/...
…
Compiled /root/go/src/github.com/grafana/beyla/pkg/internal/ebpf/nethttp/bpf_bpfel_x86.o
Stripped /root/go/src/github.com/grafana/beyla/pkg/internal/ebpf/nethttp/bpf_bpfel_x86.o
Wrote /root/go/src/github.com/grafana/beyla/pkg/internal/ebpf/nethttp/bpf_bpfel_x86.go
..

The probes defined in beyla/bpf/go_nethttp.c are:

SEC("uprobe/ServeHTTP")
int uprobe_ServeHTTP(struct pt_regs *ctx)

SEC("uprobe/startBackgroundRead")
int uprobe_startBackgroundRead(struct pt_regs *ctx)

SEC("uprobe/WriteHeader")
int uprobe_WriteHeader(struct pt_regs *ctx)

SEC("uprobe/roundTrip")
int uprobe_roundTrip(struct pt_regs *ctx)

SEC("uprobe/roundTrip_return")
int uprobe_roundTripReturn(struct pt_regs *ctx)

The go:generate directives live in beyla/pkg/internal/ebpf/nethttp/nethttp.go:

  • $BPF2GO is the bpf2go tool;
  • as the directives show, go_nethttp.c is the input file;

//go:generate $BPF2GO -cc $BPF_CLANG -cflags $BPF_CFLAGS -target amd64,arm64 bpf ../../../../bpf/go_nethttp.c -- -I../../../../bpf/headers
//go:generate $BPF2GO -cc $BPF_CLANG -cflags $BPF_CFLAGS -target amd64,arm64 bpf_debug ../../../../bpf/go_nethttp.c -- -I../../../../bpf/headers -DBPF_DEBUG

The same file, beyla/pkg/internal/ebpf/nethttp/nethttp.go, also maps Go function symbols to the probes defined in go_nethttp.c:

func (p *Tracer) GoProbes() map[string]ebpfcommon.FunctionPrograms {
   return map[string]ebpfcommon.FunctionPrograms{
      "net/http.HandlerFunc.ServeHTTP": {
         Start: p.bpfObjects.UprobeServeHTTP,
      },
      "net/http.(*connReader).startBackgroundRead": {
         Start: p.bpfObjects.UprobeStartBackgroundRead,
      },
      "net/http.(*response).WriteHeader": {
         Start: p.bpfObjects.UprobeWriteHeader,
      },
      "net/http.(*Transport).roundTrip": { // HTTP client, works with Client.Do as well as using the RoundTripper directly
         Start: p.bpfObjects.UprobeRoundTrip,
         End:   p.bpfObjects.UprobeRoundTripReturn,
      },
   }
}
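
One way to read this table: each key is a Go symbol to locate in the target binary, Start is the program to run at function entry, and End is the program to run when the function returns. The sketch below turns such a table into a flat, ordered list of attach points. It is an illustration only: functionPrograms, attachment, and plan are hypothetical names, program names are held as plain strings instead of loaded eBPF programs, and how Beyla actually attaches the probes is not covered by this article.

```go
package main

import (
	"fmt"
	"sort"
)

// functionPrograms is a hypothetical stand-in for ebpfcommon.FunctionPrograms,
// with program names as strings.
type functionPrograms struct {
	Start string // program to run at function entry
	End   string // program to run at function return; empty if none
}

// attachment is one attach point a loader would have to act on.
type attachment struct {
	Symbol   string
	AtReturn bool
	Program  string
}

// goProbes mirrors the table returned by GoProbes() above.
var goProbes = map[string]functionPrograms{
	"net/http.HandlerFunc.ServeHTTP":             {Start: "UprobeServeHTTP"},
	"net/http.(*connReader).startBackgroundRead": {Start: "UprobeStartBackgroundRead"},
	"net/http.(*response).WriteHeader":           {Start: "UprobeWriteHeader"},
	"net/http.(*Transport).roundTrip":            {Start: "UprobeRoundTrip", End: "UprobeRoundTripReturn"},
}

// plan flattens the table into attach points, sorted by symbol so the result
// does not depend on Go's randomized map iteration order.
func plan(probes map[string]functionPrograms) []attachment {
	symbols := make([]string, 0, len(probes))
	for sym := range probes {
		symbols = append(symbols, sym)
	}
	sort.Strings(symbols)

	var out []attachment
	for _, sym := range symbols {
		p := probes[sym]
		if p.Start != "" {
			out = append(out, attachment{Symbol: sym, AtReturn: false, Program: p.Start})
		}
		if p.End != "" {
			out = append(out, attachment{Symbol: sym, AtReturn: true, Program: p.End})
		}
	}
	return out
}

func main() {
	for _, a := range plan(goProbes) {
		where := "entry"
		if a.AtReturn {
			where = "return"
		}
		fmt.Printf("%-6s %-45s -> %s\n", where, a.Symbol, a.Program)
	}
}
```

Only roundTrip has an End program in the table, which matches the C file: uprobe/roundTrip_return is the only return-side probe listed there.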

References:

1. https://github.com/grafana/beyla
2. https://mp.weixin.qq.com/s/Oj4kvUy_5LaRz4kUbBwo5Q


a朋