The golang-app comes from the example provided by SigNoz. In that example, the opentelemetry-sdk generates traces and metrics and reports them to the otel-collector over gRPC.
I. Overall architecture
The golang-app uses the opentelemetry-sdk to generate traces and metrics and reports them over gRPC to the signoz-otel-collector:
- gRPC method for traces: /opentelemetry.proto.collector.trace.v1.TraceService/Export
- gRPC method for metrics: /opentelemetry.proto.collector.metrics.v1.MetricsService/Export
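A gRPC method path has the form /<package>.<Service>/<Method>. The minimal sketch below splits the two paths listed above into service and method. `splitFullMethod` is a hypothetical helper written for this illustration; it is not part of gRPC or OpenTelemetry.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFullMethod is a hypothetical helper: it splits a gRPC full method
// name of the form "/<service>/<method>" into its two parts.
func splitFullMethod(full string) (service, method string, err error) {
	if !strings.HasPrefix(full, "/") {
		return "", "", fmt.Errorf("full method %q must start with '/'", full)
	}
	i := strings.LastIndex(full, "/")
	if i == 0 || i == len(full)-1 {
		return "", "", fmt.Errorf("full method %q must look like /service/method", full)
	}
	return full[1:i], full[i+1:], nil
}

func main() {
	for _, m := range []string{
		"/opentelemetry.proto.collector.trace.v1.TraceService/Export",
		"/opentelemetry.proto.collector.metrics.v1.MetricsService/Export",
	} {
		service, method, err := splitFullMethod(m)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("service=%s method=%s\n", service, method)
	}
}
```

Both paths end in the same method name, Export; only the service part tells traces and metrics apart.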
The signoz-otel-collector builds on the standard otel-collector and adds a clickhouse-exporter, which exports trace and metrics data to ClickHouse.
ClickHouse holds one database for traces and another for metrics.
II. Golang-app
1. trace
The app uses gin as its HTTP framework. The tracing code does two things:
- initialize the TracerProvider;
- register the otel middleware;
func main() {
	cleanup := initTracer() // initialize the TracerProvider
	defer cleanup(context.Background())
	...
	r := gin.Default()
	r.Use(otelgin.Middleware(serviceName)) // register the otel middleware
	...
}
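Initializing the TracerProvider takes three steps:
- create the exporter: specify gRPC as the transport and the URL to report to;
- create the resource: describe the application, such as its name, language and version;
- create the TracerProvider from the exporter and resource above;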
func initTracer() func(context.Context) error {
	// Create the exporter: report over gRPC to collectorURL.
	exporter, err := otlptrace.New(
		context.Background(),
		otlptracegrpc.NewClient(
			secureOption,
			otlptracegrpc.WithEndpoint(collectorURL),
		),
	)
	if err != nil {
		log.Fatalf("failed to create trace exporter: %v", err)
	}
	// Create the resource: service name and language.
	resources, err := resource.New(
		context.Background(),
		resource.WithAttributes(
			attribute.String("service.name", serviceName),
			attribute.String("library.language", "go"),
		),
	)
	if err != nil {
		log.Printf("could not set resources: %v", err)
	}
	// Create the TracerProvider from the exporter and resource above
	// and register it as the global provider.
	otel.SetTracerProvider(
		sdktrace.NewTracerProvider(
			sdktrace.WithSampler(sdktrace.AlwaysSample()),
			sdktrace.WithBatcher(exporter),
			sdktrace.WithResource(resources),
		),
	)
	return exporter.Shutdown
}
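sdktrace.WithBatcher in the code above means that finished spans are buffered and exported in groups rather than one at a time. The sketch below illustrates only that idea in plain Go. The `batcher` type and its `maxBatch` field are hypothetical, and this is not the SDK's implementation, which also flushes on a timer and bounds its queue.

```go
package main

import "fmt"

// batcher is a hypothetical, simplified stand-in for a batching span
// processor: it buffers span names and hands them to export in groups.
type batcher struct {
	maxBatch int
	buf      []string
	export   func(batch []string)
}

// onEnd is called when a span finishes; it flushes once the buffer is full.
func (b *batcher) onEnd(span string) {
	b.buf = append(b.buf, span)
	if len(b.buf) >= b.maxBatch {
		b.flush()
	}
}

// flush exports whatever is buffered and resets the buffer.
func (b *batcher) flush() {
	if len(b.buf) == 0 {
		return
	}
	batch := make([]string, len(b.buf))
	copy(batch, b.buf)
	b.buf = b.buf[:0]
	b.export(batch)
}

func main() {
	b := &batcher{maxBatch: 2, export: func(batch []string) {
		fmt.Println("export", batch)
	}}
	b.onEnd("/books")
	b.onEnd("gorm.Query") // buffer is full: both spans are exported together
	b.onEnd("/books")
	b.flush() // final flush, as a shutdown hook would do
}
```

The last call shows why a flush at shutdown matters in a batching design: without it, the spans still sitting in the buffer would never be exported.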
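With the otel middleware registered, every incoming HTTP request goes through the TracerProvider, whose sampler (AlwaysSample here) decides whether the request's trace is recorded.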
2. metrics
Using metrics takes three steps:
- create the MeterProvider;
- obtain a Meter from the MeterProvider;
- use the Meter in business code to create and update the various kinds of metric instruments;
func main() {
	...
	// Create the MeterProvider.
	provider := metrics.InitMeter()
	defer provider.Shutdown(context.Background())
	// Obtain a Meter from the provider.
	meter := provider.Meter("sample-golang-app")
	// Use the Meter to create the individual metric instruments.
	metrics.GenerateMetrics(meter)
	...
}
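Creating the MeterProvider takes three steps:
- create the exporter: specify gRPC as the transport and the URL to report to;
- create the resource: specify the service name, language and so on;
- create the MeterProvider from the exporter and resource above;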
func InitMeter() *metricsdk.MeterProvider {
	// Create the exporter: report over gRPC to collectorURL.
	exporter, err := otlpmetricgrpc.New(
		context.Background(),
		secureOption,
		otlpmetricgrpc.WithEndpoint(collectorURL),
	)
	if err != nil {
		log.Fatalf("failed to create metric exporter: %v", err)
	}
	// Create the resource: service name and language.
	res, err := resource.New(
		context.Background(),
		resource.WithAttributes(
			attribute.String("service.name", serviceName),
			attribute.String("library.language", "go"),
		),
	)
	if err != nil {
		log.Fatalf("could not set resources: %v", err)
	}
	// Create the MeterProvider.
	// Register the exporter with an SDK via a periodic reader.
	provider := metricsdk.NewMeterProvider(
		metricsdk.WithResource(res),
		metricsdk.WithReader(metricsdk.NewPeriodicReader(exporter)),
	)
	return provider
}
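The MeterProvider creates a Meter, and the Meter in turn creates and updates the metric instruments used in business code.
For example, the code below creates a counter instrument named exceptions: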
func exceptionsCounter(meter api.Meter) {
	counter, err := meter.Int64Counter("exceptions", api.WithUnit("1"),
		api.WithDescription("Counts exceptions since start"),
	)
	...
	for {
		// Increment the counter by 1.
		// The attributes describe the exception.
		counter.Add(context.Background(), 1, api.WithAttributes(attribute.KeyValue{
			Key: attribute.Key("exception_type"), Value: attribute.StringValue("NullPointerException"),
		}))
		time.Sleep(time.Duration(rand.Int63n(5)) * time.Millisecond)
	}
}
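A counter keeps one running total per distinct set of attributes, so every Add with exception_type=NullPointerException lands in the same series. The sketch below models that behaviour in plain Go. The `counter` type and `seriesKey` function are hypothetical and much simpler than the SDK's aggregation.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// counter is a hypothetical, simplified cumulative counter: one running
// total per distinct attribute set, never reset between reads.
type counter struct {
	mu     sync.Mutex
	totals map[string]int64
}

func newCounter() *counter { return &counter{totals: make(map[string]int64)} }

// seriesKey builds a stable key from the attributes by sorting their names.
func seriesKey(attrs map[string]string) string {
	keys := make([]string, 0, len(attrs))
	for k := range attrs {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, k+"="+attrs[k])
	}
	return strings.Join(parts, ",")
}

// add increments the total of the series identified by attrs.
func (c *counter) add(n int64, attrs map[string]string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.totals[seriesKey(attrs)] += n
}

// value returns the cumulative total of the series identified by attrs.
func (c *counter) value(attrs map[string]string) int64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.totals[seriesKey(attrs)]
}

func main() {
	exceptions := newCounter()
	npe := map[string]string{"exception_type": "NullPointerException"}
	for i := 0; i < 3; i++ {
		exceptions.add(1, npe)
	}
	fmt.Println(seriesKey(npe), exceptions.value(npe))
}
```

A different attribute value would start a separate total, which is why attribute sets should stay low in cardinality.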
3. Reporting trace and metrics
Traces are reported to the otel-collector over gRPC through the traceServiceClient used by the opentelemetry-sdk;
the gRPC method is visible in the code:
// go.opentelemetry.io/proto/otlp/collector/trace/v1/trace_service_grpc.pb.go
func (c *traceServiceClient) Export(ctx context.Context, in *ExportTraceServiceRequest, opts ...grpc.CallOption) (*ExportTraceServiceResponse, error) {
	out := new(ExportTraceServiceResponse)
	err := c.cc.Invoke(ctx, "/opentelemetry.proto.collector.trace.v1.TraceService/Export", in, out, opts...)
	if err != nil {
		return nil, err
	}
	return out, nil
}
Metrics are reported to the otel-collector over gRPC through the metricsServiceClient used by the opentelemetry-sdk;
the gRPC method is visible in the code:
// go.opentelemetry.io/proto/otlp/collector/metrics/v1/metrics_service_grpc.pb.go
func (c *metricsServiceClient) Export(ctx context.Context, in *ExportMetricsServiceRequest, opts ...grpc.CallOption) (*ExportMetricsServiceResponse, error) {
	out := new(ExportMetricsServiceResponse)
	err := c.cc.Invoke(ctx, "/opentelemetry.proto.collector.metrics.v1.MetricsService/Export", in, out, opts...)
	if err != nil {
		return nil, err
	}
	return out, nil
}
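III. signoz-otel-collector
The signoz-otel-collector builds on the standard otel-collector and adds a clickhouse-exporter, which exports trace and metrics data to ClickHouse.
Looking at the signoz-otel-collector configuration:
Trace: receives data in the jaeger and otlp protocols and exports it to ClickHouse with clickhousetraces;
- the clickhousetraces exporter is implemented by SigNoz itself;
Metrics: receives data in the otlp protocol and exports it to ClickHouse with clickhousemetricswrite;
- the clickhousemetricswrite exporter is implemented by SigNoz itself;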
service:
  ...
  pipelines:
    traces:
      receivers: [jaeger, otlp]
      processors: [signozspanmetrics/prometheus, batch]
      exporters: [clickhousetraces]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [clickhousemetricswrite]
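Each pipeline in this configuration reads as a chain: data from the receivers passes through the processors in the listed order and is then handed to the exporters. The sketch below models that flow with plain Go functions. `stage`, `tag` and `runPipeline` are hypothetical names; the real collector components do far more than label data.

```go
package main

import "fmt"

// stage is a hypothetical processor: it takes a batch of items and returns
// the batch that the next stage should see.
type stage func(items []string) []string

// tag returns a made-up processor that marks each item with its label,
// so the order of processing stays visible in the output.
func tag(label string) stage {
	return func(items []string) []string {
		out := make([]string, len(items))
		for i, it := range items {
			out[i] = it + "|" + label
		}
		return out
	}
}

// runPipeline passes items through the processors in order, then hands the
// result to every exporter.
func runPipeline(items []string, processors []stage, exporters []func([]string)) {
	for _, p := range processors {
		items = p(items)
	}
	for _, e := range exporters {
		e(items)
	}
}

func main() {
	runPipeline(
		[]string{"span-1", "span-2"}, // what a receiver would hand over
		[]stage{tag("signozspanmetrics/prometheus"), tag("batch")},
		[]func([]string){
			func(items []string) { fmt.Println("clickhousetraces <-", items) },
		},
	)
}
```

In this model, swapping two entries in the processors list changes what the exporters receive, which is why the order in the YAML list matters.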
1. receiver module
When receiving data in the otlp format, the gRPC port is 4317:
receivers:
  ...
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
2. exporter
Trace data is exported to the signoz_traces database in ClickHouse;
metrics data is exported to the signoz_metrics database in ClickHouse;
exporters:
  clickhousetraces:
    datasource: tcp://clickhouse:9000/?database=signoz_traces
    docker_multi_node_cluster: ${DOCKER_MULTI_NODE_CLUSTER}
    low_cardinal_exception_grouping: ${LOW_CARDINAL_EXCEPTION_GROUPING}
  clickhousemetricswrite:
    endpoint: tcp://clickhouse:9000/?database=signoz_metrics
    resource_to_telemetry_conversion:
      enabled: true
IV. clickhouse
Trace and metrics queries on the SigNoz web UI are all read from ClickHouse.
On ClickHouse you can see:
- the database that stores traces: signoz_traces;
- the database that stores metrics: signoz_metrics;
clickhouse :) show databases;
┌─name───────────────┐
│ INFORMATION_SCHEMA │
│ default │
│ information_schema │
│ signoz_metrics │
│ signoz_traces │
│ system │
└────────────────────┘
Look at the spans of one trace; this trace belongs to the GET /books endpoint:
clickhouse :) use signoz_traces;
clickhouse :) select * from signoz_spans where traceID='00b6a4fd7a63e9bb71c58feaf4b5d451';
Query id: 837fd791-0405-412d-b746-aaddf3124df1
┌─────────────────────timestamp─┬─traceID──────────────────────────┬─model──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ 2023-10-10 09:32:41.514338000 │ 00b6a4fd7a63e9bb71c58feaf4b5d451 │ {"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","spanId":"a4d8a3c2d39bce4c","name":"gorm.Query","durationNano":249773,"startTimeUnixNano":1696930361514338000,"serviceName":"goApp","kind":3,"references":[{"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","spanId":"84245fdc686341fa","refType":"CHILD_OF"}],"tagMap":{"db.rows_affected":"3","db.sql.table":"books","db.statement":"SELECT * FROM `books`","db.system":"sqlite","library.language":"go","service.name":"goApp","signoz.collector.id":"92ad0181-f387-4ed2-b23e-cd176a4042d9"},"stringTagMap":{"db.sql.table":"books","db.statement":"SELECT * FROM `books`","db.system":"sqlite","library.language":"go","service.name":"goApp","signoz.collector.id":"92ad0181-f387-4ed2-b23e-cd176a4042d9"},"numberTagMap":{"db.rows_affected":3}} │
│ 2023-10-10 09:32:41.514319000 │ 00b6a4fd7a63e9bb71c58feaf4b5d451 │ {"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","spanId":"84245fdc686341fa","name":"/books","durationNano":302063,"startTimeUnixNano":1696930361514319000,"serviceName":"goApp","kind":2,"references":[{"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","refType":"CHILD_OF"}],"tagMap":{"controller":"books","http.flavor":"1.1","http.method":"GET","http.route":"/books","http.scheme":"http","http.status_code":"200","http.user_agent":"curl/7.54.0","library.language":"go","net.host.name":"goApp","net.host.port":"8090","net.sock.peer.addr":"127.0.0.1","net.sock.peer.port":"49360","service.name":"goApp","signoz.collector.id":"92ad0181-f387-4ed2-b23e-cd176a4042d9"},"stringTagMap":{"controller":"books","http.flavor":"1.1","http.method":"GET","http.route":"/books","http.scheme":"http","http.user_agent":"curl/7.54.0","library.language":"go","net.host.name":"goApp","net.sock.peer.addr":"127.0.0.1","service.name":"goApp","signoz.collector.id":"92ad0181-f387-4ed2-b23e-cd176a4042d9"},"numberTagMap":{"http.status_code":200,"net.host.port":8090,"net.sock.peer.port":49360},"event":["{\"name\":\"This is a sample event\",\"timeUnixNano\":1696930361514327000,\"attributeMap\":{\"pid\":\"4328\",\"sampleAttribute\":\"Test\"}}"]} │
└───────────────────────────────┴──────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
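The model column holds each span as JSON, and the references array links a span to its parent. The sketch below decodes trimmed copies of the two rows above (tag maps and events omitted) and prints the parent of each span. The struct names and the helpers `decodeSpan` and `parentID` are hypothetical; only the JSON field names come from the query output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// reference and spanModel mirror a subset of the JSON in the model column.
type reference struct {
	TraceID string `json:"traceId"`
	SpanID  string `json:"spanId"`
	RefType string `json:"refType"`
}

type spanModel struct {
	TraceID      string      `json:"traceId"`
	SpanID       string      `json:"spanId"`
	Name         string      `json:"name"`
	DurationNano int64       `json:"durationNano"`
	ServiceName  string      `json:"serviceName"`
	References   []reference `json:"references"`
}

// Trimmed copies of the two rows shown above.
const childJSON = `{"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","spanId":"a4d8a3c2d39bce4c","name":"gorm.Query","durationNano":249773,"serviceName":"goApp","references":[{"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","spanId":"84245fdc686341fa","refType":"CHILD_OF"}]}`

const rootJSON = `{"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","spanId":"84245fdc686341fa","name":"/books","durationNano":302063,"serviceName":"goApp","references":[{"traceId":"00b6a4fd7a63e9bb71c58feaf4b5d451","refType":"CHILD_OF"}]}`

// decodeSpan parses one value of the model column.
func decodeSpan(raw string) (spanModel, error) {
	var s spanModel
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

// parentID returns the span ID of the CHILD_OF reference, or "" when the
// reference carries no span ID, as is the case for the root span here.
func parentID(s spanModel) string {
	for _, r := range s.References {
		if r.RefType == "CHILD_OF" && r.SpanID != "" {
			return r.SpanID
		}
	}
	return ""
}

func main() {
	for _, raw := range []string{rootJSON, childJSON} {
		s, err := decodeSpan(raw)
		if err != nil {
			fmt.Println("decode error:", err)
			continue
		}
		fmt.Printf("span=%s name=%s durationNano=%d parent=%q\n",
			s.SpanID, s.Name, s.DurationNano, parentID(s))
	}
}
```

The gorm.Query span names the /books span as its parent, while the /books span has no parent span ID, which marks it as the root of the trace.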
Look at the metrics data; series and samples are stored in separate tables, linked by the fingerprint column.
First the series:
clickhouse :) use signoz_metrics;
clickhouse :) select * from time_series_v2 where metric_name='exceptions';
SELECT *
FROM time_series_v2
WHERE metric_name = 'exceptions'
Query id: f3673d5b-75aa-409e-bbad-2f228ca0f46d
┌─metric_name─┬──────────fingerprint─┬──timestamp_ms─┬─labels──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┬─temporality─┐
│ exceptions │ 13922121808371638338 │ 1696930256875 │ {"__name__":"exceptions","__temporality__":"Cumulative","exception_type":"NullPointerException","library_language":"go","service_name":"goApp"} │ Cumulative │
└─────────────┴──────────────────────┴───────────────┴─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┴─────────────┘
1 row in set. Elapsed: 0.006 sec.
Then the samples:
clickhouse :) select * from samples_v2 where metric_name='exceptions' order by timestamp_ms desc limit 10;
SELECT *
FROM samples_v2
WHERE metric_name = 'exceptions'
ORDER BY timestamp_ms DESC
LIMIT 10
Query id: 948abe53-92d8-4aae-a914-2d835a83ba57
┌─metric_name─┬──────────fingerprint─┬──timestamp_ms─┬───value─┐
│ exceptions │ 13922121808371638338 │ 1697004493806 │ 5896945 │
└─────────────┴──────────────────────┴───────────────┴─────────┘
┌─metric_name─┬──────────fingerprint─┬──timestamp_ms─┬───value─┐
│ exceptions │ 13922121808371638338 │ 1697004433806 │ 5870732 │
│ exceptions │ 13922121808371638338 │ 1697004373807 │ 5844194 │
│ exceptions │ 13922121808371638338 │ 1697004313807 │ 5817643 │
│ exceptions │ 13922121808371638338 │ 1697004253807 │ 5791466 │
│ exceptions │ 13922121808371638338 │ 1697004193807 │ 5765159 │
│ exceptions │ 13922121808371638338 │ 1697004133808 │ 5738712 │
│ exceptions │ 13922121808371638338 │ 1697004073808 │ 5712423 │
│ exceptions │ 13922121808371638338 │ 1697004013808 │ 5686062 │
│ exceptions │ 13922121808371638338 │ 1697003953809 │ 5659481 │
└─────────────┴──────────────────────┴───────────────┴─────────┘
10 rows in set. Elapsed: 0.015 sec. Processed 16.41 thousand rows, 213.78 KB (1.12 million rows/s., 14.56 MB/s.)
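Because the series is Cumulative, each sample is the running total since the app started, so the rate of exceptions is the difference between two samples divided by the time between them. The sketch below applies that to the two newest rows above. `sample` and `ratePerSecond` are hypothetical names for this illustration; this is not how SigNoz computes rates internally.

```go
package main

import "fmt"

// sample is one row of samples_v2, reduced to timestamp and value.
type sample struct {
	TimestampMs int64
	Value       float64
}

// ratePerSecond returns the average per-second increase between two samples
// of a cumulative counter. It reports false when the samples are not in
// time order or the value went down (for example after a restart).
func ratePerSecond(older, newer sample) (float64, bool) {
	dtMs := newer.TimestampMs - older.TimestampMs
	dv := newer.Value - older.Value
	if dtMs <= 0 || dv < 0 {
		return 0, false
	}
	return dv / (float64(dtMs) / 1000), true
}

func main() {
	// The two newest rows from the query output above.
	older := sample{TimestampMs: 1697004433806, Value: 5870732}
	newer := sample{TimestampMs: 1697004493806, Value: 5896945}

	if rate, ok := ratePerSecond(older, newer); ok {
		// 26213 increments over 60000 ms.
		fmt.Printf("%.1f exceptions/s\n", rate) // prints "436.9 exceptions/s"
	}
}
```

The two samples are 60000 ms apart and differ by 26213, which gives roughly 437 increments per second. That fits the loop in exceptionsCounter, which sleeps between 0 and 4 ms per iteration.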
References:
1. https://signoz.io/docs/instrumentation/golang/
2. https://github.com/SigNoz/sample-golang-app
3. https://signoz.io/opentelemetry/go/