Extension: XX core business scenarios

Tagging, propagating, and searching route labels

Route labels

Trace label coloring and propagation

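SW: short for SkyWalking
  1. The user request carries the HTTP header X-sw8-correlation

    • "X-sw8-correlation: key1:value1,key2:value2,key3:value3"
  2. The gateway reads and parses the X-sw8-correlation header, then propagates the entries through SW's correlation context

    • ContextManager.getCorrelationContext().put(key, value)

      • Used by SW plugins
    • TraceContext.putCorrelation(key, value)

      • Used by applications
  3. Upstream business applications propagate the route labels through SW's correlation context

    • CorrelationContext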

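Trace label coloring and propagation: approach

Business applications configure, through a JVM system property or an environment variable, the set of correlation-context keys that SW automatically writes as span tags, so that spans are tagged automatically and can be searched by those tags.

  • JVM system property: java -Dskywalking.correlation.auto_tag_keys=autotag1,autotag2
  • Environment variable: SW_CORRELATION_AUTO_TAG_KEYS=autotag1,autotag2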
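The auto-tagging rule can be pictured with a small model. The sketch below is not the agent's code: AutoTagSketch, parseKeys, and selectSpanTags are hypothetical names, and it only illustrates the idea that a correlation entry becomes a span tag when its key appears in the configured key list.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Simplified model of auto-tagging; a hypothetical helper, not the SkyWalking agent's code. */
final class AutoTagSketch {

    /** Parses a comma-separated key list such as the value of SW_CORRELATION_AUTO_TAG_KEYS. */
    static List<String> parseKeys(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.stream(raw.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    /** Returns only the correlation entries whose keys are listed in autoTagKeys. */
    static Map<String, String> selectSpanTags(Map<String, String> correlation, List<String> autoTagKeys) {
        Map<String, String> tags = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : correlation.entrySet()) {
            if (autoTagKeys.contains(entry.getKey())) {
                tags.put(entry.getKey(), entry.getValue());
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        Map<String, String> correlation = new LinkedHashMap<>();
        correlation.put("autotag1", "value1");
        correlation.put("not-listed", "value2");
        // Only the entry whose key is in the configured list is selected as a span tag.
        System.out.println(selectSpanTags(correlation, parseKeys("autotag1,autotag2")));
    }
}
```

Entries whose keys are not listed still travel in the correlation context; in this model they are simply not written onto the span.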

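Application changes

  • apm-gateway-trace-plugin, an SW plugin, on branch feature/guangyi/20240531_span_tag_sw8_correlation

    • Java agent for the gateway application
  • [x] Trace label coloring and propagation

Implementation

Reactive gateway context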

public class GatewayContextConstructorInterceptor implements InstanceConstructorInterceptor {
    @Override
    public void onConstruct(EnhancedInstance objInst, Object[] allArguments) throws Throwable {
        String url = "gtx";
        String method = "none";
        ContextCarrier carrier = new ContextCarrier();
        Map<String, String> data = null;
        if (allArguments.length > 0 && allArguments[0] instanceof HttpServerRequest) {
            // HTTP request
            HttpServerRequest request = (HttpServerRequest) allArguments[0];
            HttpHeaders headers = request.requestHeaders();
            // Extract the sw8 protocol headers into the carrier
            CarrierItem next = carrier.items();
            while (next.hasNext()) {
                next = next.next();
                String header = headers.get(next.getHeadKey());
                if (!StringUtils.isBlank(header)) {
                    next.setHeadValue(header);
                }
            }
            url = request.uri();
            method = request.method().name();
            String sw8Correlation = headers.get(GatewayPluginConst.X_SW8_CORRELATION);
            if (StringUtils.isNotBlank(sw8Correlation)) {
                data = split(sw8Correlation);
            }
        }
        // ...
        // Tracing context: create the entry span from the extracted carrier
        AbstractSpan span = ContextManager.createEntrySpan(url, carrier);
        span.setComponent(ComponentsDefine.NETTY_HTTP);
        SpanLayer.asHttp(span);
        Tags.URL.set(span, url);
        Tags.HTTP.METHOD.set(span, method);
        if (Objects.nonNull(data)) {
            // Tag the span directly (disabled)
//            data.forEach((key, value) -> span.tag(new StringTag(key), value));
            // Propagate the label data through the correlation context of the tracing context
            CorrelationContext correlationContext = ContextManager.getCorrelationContext();
            if (Objects.nonNull(correlationContext)) {
                data.forEach(correlationContext::put);
            }
        }
        // user async interface
        // Prepare the span for the asynchronous request
        span.prepareForAsync();
        // ...
        // Dynamic field: cache the snapshot and the entry span on the enhanced instance
        ContextSnapshot capture = ContextManager.capture();
        EnhanceObjectCache cache2 = new EnhanceObjectCache();
        cache2.setSnapshot(capture);
        cache2.setGwEntrySpan(span);
        objInst.setSkyWalkingDynamicField(cache2);
        // ...
        ContextManager.stopSpan(span);
    }

    private static Map<String, String> split(String value) {
        final Map<String, String> data = new HashMap<>(8);
        for (String perData : value.split(",")) {
            final String[] parts = perData.split(":");
            if (parts.length != 2) {
                continue;
            }
            data.put(parts[0], parts[1]);
        }
        return data;
    }
}
public class GatewayPluginConst {
    /**
     * sw8-correlation
     * Correlation context header
     */
    public static final String X_SW8_CORRELATION = "X-sw8-correlation";
}
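The split helper above drops any entry that does not split into exactly two parts on ":", so a value that itself contains a colon, or an entry with a space after the comma, is lost or stored with a padded key. A slightly more tolerant variant could look like the sketch below. CorrelationHeaderParser is a hypothetical name, and the trimming and first-colon rules are assumptions, not behavior confirmed for the plugin.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical, more tolerant parser for "key1:value1,key2:value2" header values. */
final class CorrelationHeaderParser {

    static Map<String, String> parse(String headerValue) {
        Map<String, String> data = new LinkedHashMap<>();
        if (headerValue == null) {
            return data;
        }
        for (String pair : headerValue.split(",")) {
            // Split on the first colon only, so values such as "http://host" stay intact.
            int idx = pair.indexOf(':');
            if (idx <= 0) {
                continue; // no separator, or an empty key
            }
            String key = pair.substring(0, idx).trim();
            String value = pair.substring(idx + 1).trim();
            if (key.isEmpty() || value.isEmpty()) {
                continue;
            }
            data.put(key, value);
        }
        return data;
    }

    public static void main(String[] args) {
        System.out.println(parse("cyborg-flow:true, scene-label:biz-route,scene-tag:stress-test,"));
    }
}
```

Whether to accept colons inside values is a design choice; rejecting them, as the original helper does, is stricter but silently discards such entries.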

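Usage example: a hands-on case

Request example
SkyWalking console

Verifying with Arthas commands

monitor/watch/trace related commands - Arthas command list

Observe the headers returned by HttpServerRequest.requestHeaders()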

[arthas@1]$ watch reactor.netty.http.server.HttpServerRequest requestHeaders '{returnObj}' 'returnObj.contains("X-sw8-correlation")'

ts=2024-06-10 14:23:50; [cost=0.007682ms] result=@DefaultHttpHeaders[
    headers=@DefaultHeadersImpl[DefaultHeadersImpl[host: xxx.com, user-agent: PostmanRuntime-ApipostRuntime/1.1.0, cache-control: no-cache, content-type: application/json, accept: */*, X-sw8-correlation: cyborg-flow:true,scene-label:biz-route,scene-tag:stress-test, content-length: 241, x-forwarded-for: xxx.xxx.xxx.xxx, x-forwarded-proto: https, x-envoy-external-address: xxx.xxx.xxx.xxx, x-request-id: 25c629d1-3fa6-4f9f-9fa7-d2872b4f60b9, x-envoy-attempt-count: 1, x-forwarded-client-cert: By=spiffe://cluster.local/ns/sit/sa/default;Hash=cae34793a85e9861fa9a5cf1518b4d84344719e59b5a3c499a73edd158e3da1e;Subject="";URI=spiffe://cluster.local/ns/istio-system/sa/istio-public-api-ingress-gateway-service-account]],
]

X-sw8-correlation: cyborg-flow:true,scene-label:biz-route,scene-tag:stress-test,

Observe the correlation context instance returned by ContextManager.getCorrelationContext()

[arthas@1]$ watch org.apache.skywalking.apm.agent.core.context.ContextManager getCorrelationContext '{returnObj}' -x 3

ts=2024-06-10 14:55:17; [cost=0.0627ms] result=@ArrayList[
    @CorrelationContext[
        data=@ConcurrentHashMap[isEmpty=true;size=0],
        AUTO_TAG_KEYS=@ArrayList[
            @String[cyborg-flow],
            @String[scene-label],
            @String[scene-tag],
        ],
    ],
]

Observe how the state of the correlation context changes after each CorrelationContext.put(key, value) call

[arthas@1]$ watch org.apache.skywalking.apm.agent.core.context.CorrelationContext put '{params, target, returnObj}' -x 3

method=org.apache.skywalking.apm.agent.core.context.CorrelationContext.put location=AtExit
ts=2024-06-10 15:01:22; [cost=0.134165ms] result=@ArrayList[
    @Object[][
        @String[scene-label],
        @String[biz-route],
    ],
    @CorrelationContext[
        data=@ConcurrentHashMap[
            @String[scene-label]:@String[biz-route],
        ],
        AUTO_TAG_KEYS=@ArrayList[
            @String[cyborg-flow],
            @String[scene-label],
            @String[scene-tag],
        ],
    ],
]
method=org.apache.skywalking.apm.agent.core.context.CorrelationContext.put location=AtExit
ts=2024-06-10 15:01:22; [cost=0.01605ms] result=@ArrayList[
    @Object[][
        @String[cyborg-flow],
        @String[true],
    ],
    @CorrelationContext[
        data=@ConcurrentHashMap[
            @String[scene-label]:@String[biz-route],
            @String[cyborg-flow]:@String[true],
        ],
        AUTO_TAG_KEYS=@ArrayList[
            @String[cyborg-flow],
            @String[scene-label],
            @String[scene-tag],
        ],
    ],
]
method=org.apache.skywalking.apm.agent.core.context.CorrelationContext.put location=AtExit
ts=2024-06-10 15:01:22; [cost=0.007927ms] result=@ArrayList[
    @Object[][
        @String[scene-tag],
        @String[stress-test],
    ],
    @CorrelationContext[
        data=@ConcurrentHashMap[
            @String[scene-label]:@String[biz-route],
            @String[cyborg-flow]:@String[true],
            @String[scene-tag]:@String[stress-test],
        ],
        AUTO_TAG_KEYS=@ArrayList[
            @String[cyborg-flow],
            @String[scene-label],
            @String[scene-tag],
        ],
    ],
]

Observe the invocation passed to the provider-side context filter ContextFilter.invoke(invoker, invocation)

[arthas@7]$ watch org.apache.dubbo.rpc.filter.ContextFilter invoke '{params[1].getAttachments(), returnObj}' -x 3

method=org.apache.dubbo.rpc.filter.ContextFilter.invoke location=AtExit
ts=2024-06-10 15:16:30; [cost=24.479313ms] result=@ArrayList[
    @ObjectToStringMap[
        @String[traceid]:@String[0a57ddf0732748208240f278a248de88.66.17181765903061237],
        @String[x-request-id]:@String[0fe97869-15d9-452f-9374-228f23e56f43],
        @String[x-forwarded-proto]:@String[http],
        @String[sw8-correlation]:@String[c2NlbmUtbGFiZWw=:Yml6LXJvdXRl,Y3lib3JnLWZsb3c=:dHJ1ZQ==,c2NlbmUtdGFn:c3RyZXNzLXRlc3Q=],
        @String[timeout]:@String[5000],
        @String[generic]:@String[gson],
        @String[x-envoy-attempt-count]:@String[1],
        @String[remote.application]:@String[xxx-reactor-gateway],
        @String[sw8-x]:@String[0- ],
        @String[sw8]:@String[1-MGE1N2RkZjA3MzI3NDgyMDgyNDBmMjc4YTI0OGRlODguNjYuMTcxODE3NjU5MDMwNjEyMzc=-MGE1N2RkZjA3MzI3NDgyMDgyNDBmMjc4YTI0OGRlODguNjYuMTcxODE3NjU5MDMwODEyMzg=-0-bGVmaXQtcmVhY3Rvci1nYXRld2F5fHxzaXQ=-ZjdjNjRjNjcwYjcyNDkxZGFmNGQ5YTIyOTc5ZGZjZjdAMTkyLjE2OC4xMTAuMjUx-bnVsbC5nZXRBZHZlcnRpc2VDb25maWdOZXcoKQ==-c2l0L2xlZml0LWNtcy5zaXQuc3ZjLmNsdXN0ZXIubG9jYWw6MA==],
        @String[x-forwarded-client-cert]:@String[By=spiffe://cluster.local/ns/sit/sa/default;Hash=7e7ef818f1a9cd3156d98010276ff6004b5439ce8548d1b5972066e4138a8e0f;Subject="";URI=spiffe://cluster.local/ns/sit/sa/default],
        @String[id]:@String[605975],
    ],
    @AsyncRpcResult[
        invocation=@RpcInvocation[
            targetServiceUniqueName=@String[com.xxx.cms.api.XxxService],
            protocolServiceKey=@String[com.xxx.cms.api.XxxService:tri],
            serviceModel=@ProviderModel[org.apache.dubbo.rpc.model.ProviderModel@4142a666],
            methodName=@String[getXxxConfig],
            interfaceName=@String[com.xxx.cms.api.XxxService],
            parameterTypes=@Class[][isEmpty=false;size=1],
            parameterTypesDesc=@String[Lcom/xxx/dubbo/common/CommonRequest;],
            arguments=@Object[][isEmpty=false;size=1],
            attachments=@HashMap[isEmpty=false;size=18],
            invoker=@CopyOfFilterChainNode[org.apache.dubbo.registry.integration.RegistryProtocol$InvokerDelegate@3537aa7b],
            returnType=@Class[class com.xxx.dubbo.common.CommonResponse],
        ],
        async=@Boolean[false],
        responseFuture=@CompletableFuture[
            result=@AppResponse[AppResponse [value=com.xxx.dubbo.common.CommonResponse@697af028[code=00000,msg=<null>,data=AppPlaceResponse(type=1, advertiseBoxId=984056422369603584, boxTemplatePlaceNum=1, boxTemplateId=null, name=首页营销专区, intro=null, list=[XxxAppResponse(advertisePlaceId=984148049901760512, name=首页营销专区, subhead=null, pictureUrl=https://img.xxx.com/xxx.png, extraPictureUrl=null, concernment=999, boxTemplatePlace=1, colorType=null)]),page=<null>], exception=null]],
        ],
    ],
]

@String[sw8-correlation]:@String[c2NlbmUtbGFiZWw=:Yml6LXJvdXRl,Y3lib3JnLWZsb3c=:dHJ1ZQ==,c2NlbmUtdGFn:c3RyZXNzLXRlc3Q=],
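The sw8-correlation attachment above carries each key and value Base64-encoded, joined as key:value pairs separated by commas. The sketch below encodes and decodes that format so the captured value can be read back as plain text. Sw8CorrelationCodec is a hypothetical helper written for illustration, not a class from the SkyWalking agent.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical codec for the "base64(key):base64(value),..." format seen in the capture. */
final class Sw8CorrelationCodec {

    static String encode(Map<String, String> data) {
        StringJoiner joiner = new StringJoiner(",");
        for (Map.Entry<String, String> entry : data.entrySet()) {
            joiner.add(toBase64(entry.getKey()) + ":" + toBase64(entry.getValue()));
        }
        return joiner.toString();
    }

    static Map<String, String> decode(String headerValue) {
        Map<String, String> data = new LinkedHashMap<>();
        if (headerValue == null || headerValue.isEmpty()) {
            return data;
        }
        for (String pair : headerValue.split(",")) {
            String[] parts = pair.split(":");
            if (parts.length != 2) {
                continue; // skip malformed pairs
            }
            data.put(fromBase64(parts[0]), fromBase64(parts[1]));
        }
        return data;
    }

    private static String toBase64(String text) {
        return Base64.getEncoder().encodeToString(text.getBytes(StandardCharsets.UTF_8));
    }

    private static String fromBase64(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Decodes the value captured in the Dubbo attachments above.
        System.out.println(decode(
            "c2NlbmUtbGFiZWw=:Yml6LXJvdXRl,Y3lib3JnLWZsb3c=:dHJ1ZQ==,c2NlbmUtdGFn:c3RyZXNzLXRlc3Q="));
    }
}
```

Decoding the captured value yields the three labels sent in the X-sw8-correlation request header, which shows that the labels reached the Dubbo provider unchanged.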

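How to define P0 core applications

Traverse the application dependencies layer by layer and display the topology
  1. Labeled manually by the business team
  2. Output each application's 1-degree dependencies (the 1-degree topology from tracing)
  3. For a P0 scenario, output the list of applications and resources on the call chain by trace ID

Layered topology display

The N-ary tree level-order traversal framework: the "traversal" way of thinking

Extend LeetCode's N-ary tree level-order traversal framework to a directed acyclic graph (DAG).
Start from the list of P0 seed applications and expand layer by layer along the 1-degree application dependencies obtained from tracing.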
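A minimal sketch of that layered expansion follows, assuming the 1-degree dependencies are already available as a map from each application to the applications it calls. DependencyLayers and the application names are hypothetical. It is the usual level-order loop, with a visited set added because in a graph, unlike a tree, a node can be reached along more than one path.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Level-order (BFS) expansion of application dependencies from a list of seed applications. */
final class DependencyLayers {

    static List<List<String>> layers(Collection<String> seeds, Map<String, List<String>> oneHopDeps) {
        List<List<String>> result = new ArrayList<>();
        Set<String> visited = new LinkedHashSet<>(seeds);
        Deque<String> queue = new ArrayDeque<>(visited);
        while (!queue.isEmpty()) {
            int size = queue.size(); // number of applications in the current layer
            List<String> layer = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                String app = queue.poll();
                layer.add(app);
                for (String dep : oneHopDeps.getOrDefault(app, Collections.emptyList())) {
                    // Each application is placed in the first layer that reaches it.
                    if (visited.add(dep)) {
                        queue.offer(dep);
                    }
                }
            }
            result.add(layer);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new HashMap<>();
        deps.put("gateway", Arrays.asList("user", "order"));
        deps.put("user", Arrays.asList("db"));
        deps.put("order", Arrays.asList("db", "user"));
        System.out.println(layers(Arrays.asList("gateway"), deps));
    }
}
```

The visited set also keeps the loop from running forever if the dependency data contains a cycle.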

Searching trace label data

{"query":"query queryTraces($condition: TraceQueryCondition) {\n data: queryBasicTraces(condition: $condition) {\n traces {\n key: segmentId\n endpointNames\n duration\n start\n isError\n traceIds\n }\n }}","variables":{"condition":{"queryDuration":{"start":"2024-06-03 0624","end":"2024-06-03 0654","step":"MINUTE"},"traceState":"ALL","queryOrder":"BY_START_TIME","paging":{"pageNum":1,"pageSize":20},"tags":[{"key":"http.status_code","value":"200"}],"minTraceDuration":null,"maxTraceDuration":null,"serviceId":"bGVmaXQtdXNlcnx8cHJvZA==.1","serviceInstanceId":"bGVmaXQtdXNlcnx8cHJvZA==.1_YmRiZGU2MTMyM2NmNGVmMDhmNDYyNzQxOWY1MDg1ZDNAMTkyLjE2OC4xMDcuMjA4"}}}
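To search by a propagated label instead of http.status_code, the tags array in the query condition would carry the label's key and value. The sketch below only builds that array as a JSON string. TraceTagCondition is a hypothetical helper, and the label used in main is an example taken from the request shown earlier in this article.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper that renders the "tags" array of a trace query condition. */
final class TraceTagCondition {

    static String tagsJson(Map<String, String> tags) {
        StringJoiner joiner = new StringJoiner(",", "[", "]");
        for (Map.Entry<String, String> entry : tags.entrySet()) {
            joiner.add("{\"key\":\"" + escape(entry.getKey())
                    + "\",\"value\":\"" + escape(entry.getValue()) + "\"}");
        }
        return joiner.toString();
    }

    /** Escapes backslashes and double quotes only; enough for simple label strings. */
    private static String escape(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Map<String, String> tags = new LinkedHashMap<>();
        tags.put("scene-label", "biz-route");
        System.out.println(tagsJson(tags));
    }
}
```

In a real client a JSON library would be the safer choice; string concatenation is used here only to keep the sketch dependency-free.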

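Pitfalls

The auto span tag keys of the correlation context are not configured or do not take effect

[Cause] Most likely the searchableTracesTags: ${SW_SEARCHABLE_TAG_KEYS: setting in application.yml was changed at the time, but OAP and UI were not restarted. Both OAP and UI need the configuration and a restart before the change takes effect.

How to check whether the auto span tag keys of the correlation context have taken effect

1. Business application side (the user of skywalking-agent.jar)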

[arthas@1]$ getstatic org.apache.skywalking.apm.agent.core.context.CorrelationContext AUTO_TAG_KEYS
field: AUTO_TAG_KEYS
@ArrayList[
    @String[sw8_userId],
    @String[scene.label],
    @String[scene],
]

[arthas@1]$ getstatic org.apache.skywalking.apm.agent.core.conf.Config$Correlation ELEMENT_MAX_NUMBER
field: ELEMENT_MAX_NUMBER
@Integer[8]

config/agent.config

# Max element count in the correlation context
correlation.element_max_number=${SW_CORRELATION_ELEMENT_MAX_NUMBER:8}

# Max value length of each element.
correlation.value_max_length=${SW_CORRELATION_VALUE_MAX_LENGTH:128}

# Tag the span by the key/value in the correlation context, when the keys listed here exist.
correlation.auto_tag_keys=${SW_CORRELATION_AUTO_TAG_KEYS:sw8_userId,scene.label,scene}
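The two limits above bound how much data the correlation context will carry. The sketch below is a simplified model of such a bounded map, not the agent's implementation: BoundedCorrelation is a hypothetical class, and the exact rejection rules in the agent may differ.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified model of a correlation map bounded by element count and value length. */
final class BoundedCorrelation {
    private final int maxElements;     // models correlation.element_max_number
    private final int maxValueLength;  // models correlation.value_max_length
    private final Map<String, String> data = new LinkedHashMap<>();

    BoundedCorrelation(int maxElements, int maxValueLength) {
        this.maxElements = maxElements;
        this.maxValueLength = maxValueLength;
    }

    /** Returns true if the entry was stored, false if a limit rejected it. */
    boolean put(String key, String value) {
        if (key == null || value == null) {
            return false;
        }
        if (value.length() > maxValueLength) {
            return false; // value too long
        }
        if (!data.containsKey(key) && data.size() >= maxElements) {
            return false; // context already full, and this is a new key
        }
        data.put(key, value);
        return true;
    }

    Map<String, String> snapshot() {
        return Collections.unmodifiableMap(new LinkedHashMap<>(data));
    }

    public static void main(String[] args) {
        BoundedCorrelation ctx = new BoundedCorrelation(8, 128);
        System.out.println(ctx.put("scene-label", "biz-route"));
        System.out.println(ctx.snapshot());
    }
}
```

With the defaults shown above (8 elements, 128 characters per value), a gateway that forwards more labels or longer values than that should expect some of them not to propagate.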

2. SkyWalking OAP/UI server side

[arthas@1]$ vmtool -x 3 --action getInstances --className org.apache.skywalking.oap.server.core.CoreModuleConfig  --express 'instances[0].searchableTracesTags'
@String[http.method,http.status_code,rpc.status_code,db.type,db.instance,mq.queue,mq.topic,mq.broker,sw8_userId,scene.label,scene]

[arthas@1]$ vmtool -x 3 --action getInstances --className org.apache.skywalking.oap.server.core.config.SearchableTracesTagsWatcher  --express 'instances[0].searchableTags'
@HashSet[
    @String[db.instance],
    @String[mq.topic],
    @String[http.status_code],
    @String[db.type],
    @String[scene.label],
    @String[mq.queue],
    @String[sw8_userId],
    @String[http.method],
    @String[rpc.status_code],
    @String[mq.broker],
    @String[scene],
]

config/application.yml

# Define the set of span tag keys, which should be searchable through the GraphQL.
# The max length of key=value should be less than 256 or will be dropped.
searchableTracesTags: ${SW_SEARCHABLE_TAG_KEYS:http.method,http.status_code,rpc.status_code,db.type,db.instance,mq.queue,mq.topic,mq.broker,sw8_userId,scene.label,scene}
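Since a key must appear both in the agent's correlation.auto_tag_keys and in the OAP's searchableTracesTags before it can be searched, comparing the two lists is a quick sanity check. The sketch below reports the agent keys that are missing on the OAP side. SearchableTagCheck is a hypothetical helper, and both inputs are plain comma-separated strings copied from the configuration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical check: which auto-tag keys are not declared searchable on the OAP side? */
final class SearchableTagCheck {

    static Set<String> missing(String autoTagKeys, String searchableTags) {
        Set<String> missing = split(autoTagKeys);
        missing.removeAll(split(searchableTags));
        return missing;
    }

    private static Set<String> split(String csv) {
        Set<String> keys = new LinkedHashSet<>();
        if (csv == null) {
            return keys;
        }
        for (String key : csv.split(",")) {
            String trimmed = key.trim();
            if (!trimmed.isEmpty()) {
                keys.add(trimmed);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        String searchable = "http.method,http.status_code,rpc.status_code,db.type,db.instance,"
                + "mq.queue,mq.topic,mq.broker,sw8_userId,scene.label,scene";
        // Keys printed here would be tagged on spans but would not be searchable.
        System.out.println(missing("sw8_userId,scene.label,scene", searchable));
        System.out.println(missing("cyborg-flow,scene-label,scene-tag", searchable));
    }
}
```

An empty result means every auto-tagged key is also declared searchable; any key that is printed needs to be added to searchableTracesTags.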

ElasticSearch: inspect the span tag index data of SkyWalking trace segments

# Search the span tags of trace segments
GET observability_segment-20240724/_search
{
  "size": 0,
  "aggs": {
    "unique_tags": {
      "terms": {
        "field": "tags",
        "size": 50000000
      }
    }
  }
}
{
  "took" : 3586,
  "timed_out" : false,
  "_shards" : {
    "total" : 18,
    "successful" : 18,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "unique_tags" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "scene=p0",
          "doc_count" : 77
        },
        {
          "key" : "sw8_userId=123456789",
          "doc_count" : 76
        },
        {
          "key" : "cyborg-flow=true",
          "doc_count" : 33
        },
        {
          "key" : "scene-tag=p0",
          "doc_count" : 33
        }
      ]
    }
  }
}


简放视野

Microservices, Cloud Native, Service Mesh. Java, Go.