
The golang-app application comes from the example provided by SigNoz. In that example, the opentelemetry-sdk produces traces and metrics, which are reported to the otel-collector over gRPC.

I. Overall architecture

[Figure: overall architecture diagram]

golang-app uses the opentelemetry-sdk to produce traces and metrics and reports them to signoz-otel-collector over gRPC:

  • gRPC method for traces: /opentelemetry.proto.collector.trace.v1.TraceService/Export
  • gRPC method for metrics: /opentelemetry.proto.collector.metrics.v1.MetricsService/Export
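
Both method names follow the usual gRPC convention of "/" + protobuf package + "." + service + "/" + method. As a minimal sketch (the helper name fullMethod is hypothetical and not part of any library), the two paths above can be rebuilt like this:

```go
package main

import "fmt"

// fullMethod builds a gRPC full method name of the form
// "/<protobuf package>.<Service>/<Method>".
// It is a hypothetical helper written for this illustration.
func fullMethod(pkg, service, method string) string {
	return "/" + pkg + "." + service + "/" + method
}

func main() {
	fmt.Println(fullMethod("opentelemetry.proto.collector.trace.v1", "TraceService", "Export"))
	fmt.Println(fullMethod("opentelemetry.proto.collector.metrics.v1", "MetricsService", "Export"))
}
```

Seeing the name split this way makes it easier to spot the two calls in gRPC access logs or in a packet capture on the collector port.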

signoz-otel-collector builds on the standard otel-collector and adds ClickHouse exporters, which write trace and metrics data to clickhouse.

clickhouse holds one database for traces and another for metrics.

II. Golang-app

1. trace

The app uses gin as its HTTP framework. The tracing code does two things:

  • initialize the TracerProvider;
  • register the otel middleware;

func main() {
   cleanup := initTracer()        // initialize the TracerProvider
   defer cleanup(context.Background())
   // ...
   r := gin.Default()
   r.Use(otelgin.Middleware(serviceName))  // register the otel middleware
   // ...
}

Steps to initialize the TracerProvider:

  • create the exporter: it selects gRPC as the transport and sets the collector URL;
  • create the resource: it describes the application, e.g. its name, language and version;
  • build the TracerProvider from the exporter and the resource above;

func initTracer() func(context.Context) error {
    // Create the exporter: gRPC transport plus the collector URL.
    exporter, err := otlptrace.New(
        context.Background(),
        otlptracegrpc.NewClient(
            secureOption,
            otlptracegrpc.WithEndpoint(collectorURL),
        ),
    )
    if err != nil {
        log.Fatalf("failed to create trace exporter: %v", err)
    }
    // Create the resource: service name and language.
    resources, err := resource.New(
        context.Background(),
        resource.WithAttributes(
            attribute.String("service.name", serviceName),
            attribute.String("library.language", "go"),
        ),
    )
    if err != nil {
        log.Fatalf("failed to create resource: %v", err)
    }
    // Build the TracerProvider from the exporter and resource and register it globally.
    otel.SetTracerProvider(
        sdktrace.NewTracerProvider(
            sdktrace.WithSampler(sdktrace.AlwaysSample()),
            sdktrace.WithBatcher(exporter),
            sdktrace.WithResource(resources),
        ),
    )
    return exporter.Shutdown
}
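
The snippet above refers to serviceName, collectorURL and secureOption without showing where they come from. One plausible arrangement is to read them from the environment; the sketch below assumes the variable names SERVICE_NAME, OTEL_EXPORTER_OTLP_ENDPOINT and INSECURE_MODE and the fallback values goApp and localhost:4317, all of which are illustrative choices and not confirmed by the text. The exporterConfig type and loadConfig function are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// exporterConfig groups the values that the snippets call
// serviceName, collectorURL and secureOption. Hypothetical type.
type exporterConfig struct {
	ServiceName  string
	CollectorURL string // host:port of the collector's OTLP gRPC receiver
	Insecure     bool   // true means plain-text gRPC without TLS
}

// loadConfig reads the settings through getenv so that a fake
// environment can be supplied in tests. The variable names and the
// defaults are assumptions made for this sketch.
func loadConfig(getenv func(string) string) exporterConfig {
	cfg := exporterConfig{
		ServiceName:  getenv("SERVICE_NAME"),
		CollectorURL: getenv("OTEL_EXPORTER_OTLP_ENDPOINT"),
		Insecure:     strings.EqualFold(getenv("INSECURE_MODE"), "true"),
	}
	if cfg.ServiceName == "" {
		cfg.ServiceName = "goApp"
	}
	if cfg.CollectorURL == "" {
		cfg.CollectorURL = "localhost:4317"
	}
	return cfg
}

func main() {
	fmt.Printf("%+v\n", loadConfig(os.Getenv))
}
```

The Insecure flag would then decide whether the exporter is given an insecure option or TLS credentials.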

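With the otel middleware in place, every incoming HTTP request is traced through the TracerProvider. Because the sampler is AlwaysSample, every request is recorded.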
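
To make the role of the middleware concrete, here is a simplified sketch built only on net/http. It is not how otelgin is implemented: the span type, traceMiddleware and serveOnce are hypothetical names, and the "span" holds just a name, a status code and a duration. It only shows the idea of wrapping a handler so that each request produces one record.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// span is a simplified, hypothetical stand-in for an OpenTelemetry span.
type span struct {
	Name     string
	Status   int
	Duration time.Duration
}

// statusRecorder remembers the status code set by the wrapped handler.
type statusRecorder struct {
	http.ResponseWriter
	status int
}

func (r *statusRecorder) WriteHeader(code int) {
	r.status = code
	r.ResponseWriter.WriteHeader(code)
}

// traceMiddleware wraps next and reports one span per request to record.
func traceMiddleware(next http.Handler, record func(span)) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		start := time.Now()
		rec := &statusRecorder{ResponseWriter: w, status: http.StatusOK}
		next.ServeHTTP(rec, req)
		record(span{
			Name:     req.Method + " " + req.URL.Path,
			Status:   rec.status,
			Duration: time.Since(start),
		})
	})
}

// serveOnce sends one in-memory request through the middleware
// and returns the spans that were recorded.
func serveOnce(method, path string) []span {
	var spans []span
	handler := http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "[]")
	})
	h := traceMiddleware(handler, func(s span) { spans = append(spans, s) })
	h.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(method, path, nil))
	return spans
}

func main() {
	for _, s := range serveOnce(http.MethodGet, "/books") {
		fmt.Printf("%s status=%d\n", s.Name, s.Status)
	}
}
```

The real middleware additionally propagates trace context and attaches HTTP attributes such as those visible in the span output later in this article.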

2. metrics

Steps to use metrics:

  • create the MeterProvider;
  • obtain a meter from the MeterProvider;
  • use the meter in business code to create and update the various metric instruments;

func main() {
    // ...
    // create the MeterProvider
    provider := metrics.InitMeter()
    defer provider.Shutdown(context.Background())
    // obtain a meter
    meter := provider.Meter("sample-golang-app")
    // use the meter to create the metric instruments
    metrics.GenerateMetrics(meter)
    // ...
}

Steps to create the MeterProvider:

  • create the exporter: it selects gRPC as the transport and sets the collector URL;
  • create the resource: it specifies the service name, language and so on;
  • build the MeterProvider from the exporter and the resource above;

func InitMeter() *metricsdk.MeterProvider {
    // Create the exporter: gRPC transport plus the collector URL.
    exporter, err := otlpmetricgrpc.New(
        context.Background(),
        secureOption,
        otlpmetricgrpc.WithEndpoint(collectorURL),
    )
    if err != nil {
        log.Fatalf("failed to create metric exporter: %v", err)
    }
    // Create the resource: service name and language.
    res, err := resource.New(
        context.Background(),
        resource.WithAttributes(
            attribute.String("service.name", serviceName),
            attribute.String("library.language", "go"),
        ),
    )
    if err != nil {
        log.Fatalf("failed to create resource: %v", err)
    }
    // Create the MeterProvider.
    // Register the exporter with an SDK via a periodic reader.
    provider := metricsdk.NewMeterProvider(
        metricsdk.WithResource(res),
        metricsdk.WithReader(metricsdk.NewPeriodicReader(exporter)),
    )
    return provider
}

The MeterProvider creates a meter, and the meter creates and updates the metrics used in business code.

For example, the following creates a counter-type metric named exceptions:

func exceptionsCounter(meter api.Meter) {
   counter, err := meter.Int64Counter("exceptions", api.WithUnit("1"),
      api.WithDescription("Counts exceptions since start"),
   )
   ...
   for {
      // Increment the counter by 1.
      // The attributes describe the exception.
      counter.Add(context.Background(), 1, api.WithAttributes(attribute.KeyValue{
         Key: attribute.Key("exception_type"), Value: attribute.StringValue("NullPointerException"),
      }))
      time.Sleep(time.Duration(rand.Int63n(5)) * time.Millisecond)
   }
}
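
The series row shown later in this article marks this metric as Cumulative, meaning each exported value is the running total since start, not the change since the last export. A minimal stdlib-only sketch of that behaviour follows; cumulativeCounter is a hypothetical type and not the SDK's implementation, and it keys totals by a single attribute value for brevity.

```go
package main

import (
	"fmt"
	"sync"
)

// cumulativeCounter keeps one running total per attribute value.
// Hypothetical type written for this illustration.
type cumulativeCounter struct {
	mu     sync.Mutex
	totals map[string]int64
}

func newCumulativeCounter() *cumulativeCounter {
	return &cumulativeCounter{totals: make(map[string]int64)}
}

// Add increments the running total for the given attribute value.
func (c *cumulativeCounter) Add(attr string, n int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.totals[attr] += n
}

// Collect returns a copy of the totals without resetting them,
// which is what cumulative temporality means.
func (c *cumulativeCounter) Collect() map[string]int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	out := make(map[string]int64, len(c.totals))
	for k, v := range c.totals {
		out[k] = v
	}
	return out
}

func main() {
	c := newCumulativeCounter()
	for i := 0; i < 3; i++ {
		c.Add("NullPointerException", 1)
	}
	fmt.Println(c.Collect()["NullPointerException"]) // first export
	c.Add("NullPointerException", 2)
	fmt.Println(c.Collect()["NullPointerException"]) // second export keeps growing
}
```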

3. Reporting trace and metrics

Traces are reported to the otel-collector over gRPC by the traceServiceClient used by the opentelemetry-sdk.

The gRPC method is visible in the code:

// go.opentelemetry.io/proto/otlp/collector/trace/v1/trace_service_grpc.pb.go
func (c *traceServiceClient) Export(ctx context.Context, in *ExportTraceServiceRequest, opts ...grpc.CallOption) (*ExportTraceServiceResponse, error) {
    out := new(ExportTraceServiceResponse)
    err := c.cc.Invoke(ctx, "/opentelemetry.proto.collector.trace.v1.TraceService/Export", in, out, opts...)
    if err != nil {
        return nil, err
    }
    return out, nil
}

Metrics are reported to the otel-collector over gRPC by the metricsServiceClient used by the opentelemetry-sdk.

The gRPC method is visible in the code:

// go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service_grpc.pb.go
func (c *metricsServiceClient) Export(ctx context.Context, in *ExportMetricsServiceRequest, opts ...grpc.CallOption) (*ExportMetricsServiceResponse, error) {
    out := new(ExportMetricsServiceResponse)
    err := c.cc.Invoke(ctx, "/opentelemetry.proto.collector.metrics.v1.MetricsService/Export", in, out, opts...)
    if err != nil {
        return nil, err
    }
    return out, nil
}

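III. signoz-otel-collector

signoz-otel-collector builds on the standard otel-collector and adds ClickHouse exporters, which write trace and metrics data to clickhouse.

The signoz-otel-collector configuration:

  • Trace: receives data over the jaeger and otlp protocols and exports it to clickhouse with clickhousetraces;
    • the clickhousetraces exporter is implemented by SigNoz;
  • Metrics: receives data over the otlp protocol and exports it to clickhouse with clickhousemetricswrite;
    • the clickhousemetricswrite exporter is implemented by SigNoz;
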
service:
  ...
  pipelines:
    traces:
      receivers: [jaeger, otlp]
      processors: [signozspanmetrics/prometheus, batch]
      exporters: [clickhousetraces]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [clickhousemetricswrite]
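
Both pipelines pass data through the batch processor before it reaches the exporter. The sketch below shows only the size-based half of that idea: items accumulate until a threshold is reached and are then handed to the exporter in one call. The batcher type and the runBatches helper are hypothetical, and the timeout-based flush that a real batch processor also performs is left out.

```go
package main

import "fmt"

// batcher groups items and passes full batches to export.
// Hypothetical type; a sketch of size-based batching only.
type batcher struct {
	size    int
	pending []string
	export  func([]string)
}

func newBatcher(size int, export func([]string)) *batcher {
	return &batcher{size: size, export: export}
}

// Add queues one item and flushes when the batch is full.
func (b *batcher) Add(item string) {
	b.pending = append(b.pending, item)
	if len(b.pending) >= b.size {
		b.Flush()
	}
}

// Flush exports whatever is pending, if anything.
func (b *batcher) Flush() {
	if len(b.pending) == 0 {
		return
	}
	batch := b.pending
	b.pending = nil
	b.export(batch)
}

// runBatches feeds n items through a batcher and returns the size
// of every exported batch, including the final partial one.
func runBatches(n, size int) []int {
	var sizes []int
	b := newBatcher(size, func(batch []string) { sizes = append(sizes, len(batch)) })
	for i := 0; i < n; i++ {
		b.Add(fmt.Sprintf("span-%d", i))
	}
	b.Flush()
	return sizes
}

func main() {
	fmt.Println(runBatches(7, 3))
}
```

Batching trades a little latency for far fewer writes to clickhouse, which generally handles a few large inserts better than many small ones.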

1. receiver module

For otlp data, the gRPC receiver listens on port 4317 and the HTTP receiver on port 4318:

receivers:
  ...
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

2. exporter

Trace data is exported to the signoz_traces database in clickhouse;
metrics data is exported to the signoz_metrics database in clickhouse.

exporters:
  clickhousetraces:
    datasource: tcp://clickhouse:9000/?database=signoz_traces
    docker_multi_node_cluster: ${DOCKER_MULTI_NODE_CLUSTER}
    low_cardinal_exception_grouping: ${LOW_CARDINAL_EXCEPTION_GROUPING}
  clickhousemetricswrite:
    endpoint: tcp://clickhouse:9000/?database=signoz_metrics
    resource_to_telemetry_conversion:
      enabled: true
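
The target database is carried in the query string of each connection URL. As a small sketch of how such a value can be taken apart with the standard net/url package (parseDatasource is a hypothetical helper, not the exporter's own parsing code):

```go
package main

import (
	"fmt"
	"net/url"
)

// parseDatasource splits a connection URL such as
// tcp://clickhouse:9000/?database=signoz_traces into the server
// address and the database name. Hypothetical helper.
func parseDatasource(dsn string) (addr, database string, err error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return "", "", err
	}
	return u.Host, u.Query().Get("database"), nil
}

func main() {
	for _, dsn := range []string{
		"tcp://clickhouse:9000/?database=signoz_traces",
		"tcp://clickhouse:9000/?database=signoz_metrics",
	} {
		addr, db, err := parseDatasource(dsn)
		if err != nil {
			fmt.Println("invalid datasource:", err)
			continue
		}
		fmt.Printf("addr=%s database=%s\n", addr, db)
	}
}
```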

IV. clickhouse

Traces and metrics queried in the SigNoz UI are both read from clickhouse.

In clickhouse you can see:

  • the database that stores traces: signoz_traces;
  • the database that stores metrics: signoz_metrics;

clickhouse :) show databases;
┌─name───────────────┐
│ INFORMATION_SCHEMA │
│ default            │
│ information_schema │
│ signoz_metrics     │
│ signoz_traces      │
│ system             │
└────────────────────┘

Look at the spans of one trace; this trace belongs to the GET /books endpoint:

clickhouse :) use signoz_traces;
clickhouse :) select * from signoz_spans where traceID='00b6a4fd7a63e9bb71c58feaf4b5d451';
Query id: 837fd791-0405-412d-b746-aaddf3124df1

┌─────────────────────timestamp─┬─traceID──────────────────────────┬─model──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 2023-10-10 09:32:41.514338000 │ 00b6a4fd7a63e9bb71c58feaf4b5d451 │ {"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","spanId":"a4d8a3c2d39bce4c","name":"gorm.Query","durationNano":249773,"startTimeUnixNano":1696930361514338000,"serviceName":"goApp","kind":3,"references":[{"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","spanId":"84245fdc686341fa","refType":"CHILD_OF"}],"tagMap":{"db.rows_affected":"3","db.sql.table":"books","db.statement":"SELECT * FROM `books`","db.system":"sqlite","library.language":"go","service.name":"goApp","signoz.collector.id":"92ad0181-f387-4ed2-b23e-cd176a4042d9"},"stringTagMap":{"db.sql.table":"books","db.statement":"SELECT * FROM `books`","db.system":"sqlite","library.language":"go","service.name":"goApp","signoz.collector.id":"92ad0181-f387-4ed2-b23e-cd176a4042d9"},"numberTagMap":{"db.rows_affected":3}} │
│ 2023-10-10 09:32:41.514319000 │ 00b6a4fd7a63e9bb71c58feaf4b5d451 │ {"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","spanId":"84245fdc686341fa","name":"/books","durationNano":302063,"startTimeUnixNano":1696930361514319000,"serviceName":"goApp","kind":2,"references":[{"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","refType":"CHILD_OF"}],"tagMap":{"controller":"books","http.flavor":"1.1","http.method":"GET","http.route":"/books","http.scheme":"http","http.status_code":"200","http.user_agent":"curl/7.54.0","library.language":"go","net.host.name":"goApp","net.host.port":"8090","net.sock.peer.addr":"127.0.0.1","net.sock.peer.port":"49360","service.name":"goApp","signoz.collector.id":"92ad0181-f387-4ed2-b23e-cd176a4042d9"},"stringTagMap":{"controller":"books","http.flavor":"1.1","http.method":"GET","http.route":"/books","http.scheme":"http","http.user_agent":"curl/7.54.0","library.language":"go","net.host.name":"goApp","net.sock.peer.addr":"127.0.0.1","service.name":"goApp","signoz.collector.id":"92ad0181-f387-4ed2-b23e-cd176a4042d9"},"numberTagMap":{"http.status_code":200,"net.host.port":8090,"net.sock.peer.port":49360},"event":["{\"name\":\"This is a sample event\",\"timeUnixNano\":1696930361514327000,\"attributeMap\":{\"pid\":\"4328\",\"sampleAttribute\":\"Test\"}}"]} │
└───────────────────────────────┴──────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
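
The model column holds each span as JSON, and the references array links a span to its parent. The sketch below decodes a trimmed copy of the two rows above and reads the parent span ID; the struct covers only a subset of the fields, and spanModel, decodeSpan and ParentID are hypothetical names chosen for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed copies of the two model values shown in the query output.
const (
	childJSON = `{"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","spanId":"a4d8a3c2d39bce4c","name":"gorm.Query","durationNano":249773,"serviceName":"goApp","kind":3,"references":[{"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","spanId":"84245fdc686341fa","refType":"CHILD_OF"}]}`
	rootJSON  = `{"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","spanId":"84245fdc686341fa","name":"/books","durationNano":302063,"serviceName":"goApp","kind":2,"references":[{"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","refType":"CHILD_OF"}]}`
)

type reference struct {
	TraceID string `json:"traceId"`
	SpanID  string `json:"spanId"`
	RefType string `json:"refType"`
}

// spanModel maps a subset of the fields in the model column.
type spanModel struct {
	TraceID      string      `json:"traceId"`
	SpanID       string      `json:"spanId"`
	Name         string      `json:"name"`
	DurationNano int64       `json:"durationNano"`
	ServiceName  string      `json:"serviceName"`
	Kind         int         `json:"kind"`
	References   []reference `json:"references"`
}

// decodeSpan parses one model value.
func decodeSpan(raw string) (spanModel, error) {
	var s spanModel
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

// ParentID returns the span ID of the CHILD_OF reference,
// or "" when the span has no parent (a root span).
func (s spanModel) ParentID() string {
	for _, r := range s.References {
		if r.RefType == "CHILD_OF" {
			return r.SpanID
		}
	}
	return ""
}

func main() {
	for _, raw := range []string{childJSON, rootJSON} {
		s, err := decodeSpan(raw)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("%s (span %s) parent=%q\n", s.Name, s.SpanID, s.ParentID())
	}
}
```

Here gorm.Query points at the /books span as its parent, while /books has an empty parent and is therefore the root. The kind values 3 and 2 appear to match the OTLP span kinds CLIENT and SERVER.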

Now the metrics data. Metric data is stored in two tables, one for series and one for samples.
First the series:

clickhouse :) use signoz_metrics;
clickhouse :) select * from time_series_v2 where metric_name='exceptions';

SELECT *
FROM time_series_v2
WHERE metric_name = 'exceptions'

Query id: f3673d5b-75aa-409e-bbad-2f228ca0f46d

┌─metric_name─┬──────────fingerprint─┬──timestamp_ms─┬─labels──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─temporality─┐
│ exceptions  │ 13922121808371638338 │ 1696930256875 │ {"__name__":"exceptions","__temporality__":"Cumulative","exception_type":"NullPointerException","library_language":"go","service_name":"goApp"} │ Cumulative  │
└─────────────┴──────────────────────┴───────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴─────────────┘
1 row in set. Elapsed: 0.006 sec.
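
The fingerprint column identifies one unique label set, so the samples table can refer to a series without repeating its labels. How SigNoz actually computes this value is not shown here; the sketch below only illustrates the general idea, and assumes FNV-1a over the sorted label pairs purely for demonstration. It will not reproduce the fingerprint in the output above.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// fingerprint hashes a label set in a stable order so that the same
// labels always yield the same value. The choice of FNV-1a and the
// separator byte are assumptions made for this illustration.
func fingerprint(labels map[string]string) uint64 {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random, so sort first

	h := fnv.New64a()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{0}) // separator between key and value
		h.Write([]byte(labels[k]))
		h.Write([]byte{0}) // separator between pairs
	}
	return h.Sum64()
}

func main() {
	labels := map[string]string{
		"__name__":         "exceptions",
		"__temporality__":  "Cumulative",
		"exception_type":   "NullPointerException",
		"library_language": "go",
		"service_name":     "goApp",
	}
	fmt.Println(fingerprint(labels))
}
```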

Then the samples:

clickhouse :) select * from samples_v2 where metric_name='exceptions' order by timestamp_ms desc limit 10;
SELECT *
FROM samples_v2
WHERE metric_name = 'exceptions'
ORDER BY timestamp_ms DESC
LIMIT 10

Query id: 948abe53-92d8-4aae-a914-2d835a83ba57

┌─metric_name─┬──────────fingerprint─┬──timestamp_ms─┬───value─┐
│ exceptions  │ 13922121808371638338 │ 1697004493806 │ 5896945 │
└─────────────┴──────────────────────┴───────────────┴─────────┘
┌─metric_name─┬──────────fingerprint─┬──timestamp_ms─┬───value─┐
│ exceptions  │ 13922121808371638338 │ 1697004433806 │ 5870732 │
│ exceptions  │ 13922121808371638338 │ 1697004373807 │ 5844194 │
│ exceptions  │ 13922121808371638338 │ 1697004313807 │ 5817643 │
│ exceptions  │ 13922121808371638338 │ 1697004253807 │ 5791466 │
│ exceptions  │ 13922121808371638338 │ 1697004193807 │ 5765159 │
│ exceptions  │ 13922121808371638338 │ 1697004133808 │ 5738712 │
│ exceptions  │ 13922121808371638338 │ 1697004073808 │ 5712423 │
│ exceptions  │ 13922121808371638338 │ 1697004013808 │ 5686062 │
│ exceptions  │ 13922121808371638338 │ 1697003953809 │ 5659481 │
└─────────────┴──────────────────────┴───────────────┴─────────┘

10 rows in set. Elapsed: 0.015 sec. Processed 16.41 thousand rows, 213.78 KB (1.12 million rows/s., 14.56 MB/s.)
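
Because the counter is cumulative, a single sample value says little on its own; the useful figure is the increase between two samples. The rows above are 60000 ms apart, so a per-second rate can be derived as in the following sketch (the sample type and ratePerSecond function are hypothetical names):

```go
package main

import (
	"errors"
	"fmt"
)

// sample mirrors the timestamp_ms and value columns of samples_v2.
type sample struct {
	TimestampMs int64
	Value       float64
}

// ratePerSecond returns the average per-second increase between two
// samples of a cumulative counter.
func ratePerSecond(older, newer sample) (float64, error) {
	dtMs := newer.TimestampMs - older.TimestampMs
	if dtMs <= 0 {
		return 0, errors.New("samples must be in increasing time order")
	}
	if newer.Value < older.Value {
		return 0, errors.New("counter went backwards, probably a reset")
	}
	return (newer.Value - older.Value) / (float64(dtMs) / 1000), nil
}

func main() {
	// The two most recent rows from the query output above.
	older := sample{TimestampMs: 1697004433806, Value: 5870732}
	newer := sample{TimestampMs: 1697004493806, Value: 5896945}

	rate, err := ratePerSecond(older, newer)
	if err != nil {
		fmt.Println("cannot compute rate:", err)
		return
	}
	fmt.Printf("%.2f exceptions per second\n", rate)
}
```

For these two rows the increase is 26213 over 60 seconds, roughly 437 increments per second, which fits a loop that sleeps a random 0 to 4 ms between increments.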

References:

1. https://signoz.io/docs/instrumentation/golang/
2. https://github.com/SigNoz/sample-golang-app
3. https://signoz.io/opentelemetry/go/


a朋