Extension: XX core business scenarios
Route label tagging, propagation, and retrieval
Trace label coloring and propagation
SW: short for SkyWalking
The user request carries the HTTP header X-sw8-correlation, a comma-separated list of key:value pairs:
- "X-sw8-correlation: key1:value1,key2:value2,key3:value3"
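The gateway reads and parses the X-sw8-correlation HTTP header, then propagates the values through the SW correlation context:
- ContextManager.getCorrelationContext().put(key, value): used by SW plugins
- TraceContext.putCorrelation(key, value): used by applications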
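Upstream business applications propagate route labels through the SW correlation context, CorrelationContext.
Business applications configure, via a system property or an environment variable, the set of correlation context keys that SW automatically copies onto spans as tags, so that traces can then be searched by those tags.
- JVM startup property:
java -Dskywalking.correlation.auto_tag_keys=autotag1,autotag2
- Environment variable:
SW_CORRELATION_AUTO_TAG_KEYS=autotag1,autotag2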
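The effect of the auto-tag key set can be pictured with a small model. The sketch below is a conceptual illustration only, not the SkyWalking agent's code: the class AutoTagSketch and the method tagsFor are hypothetical names, and the model simply assumes that every configured key that is present in the correlation context becomes a span tag with the same key and value.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual model only: this is NOT the SkyWalking agent's implementation.
final class AutoTagSketch {

    // Returns the tags a span would receive under the assumed rule:
    // one tag per configured key that exists in the correlation context.
    static Map<String, String> tagsFor(Map<String, String> correlation, List<String> autoTagKeys) {
        Map<String, String> tags = new LinkedHashMap<>();
        for (String key : autoTagKeys) {
            String value = correlation.get(key);
            if (value != null) {
                tags.put(key, value);
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        Map<String, String> correlation = new LinkedHashMap<>();
        correlation.put("autotag1", "a");
        correlation.put("other", "b"); // not listed in auto_tag_keys, so never tagged
        List<String> autoTagKeys = Arrays.asList("autotag1", "autotag2");
        System.out.println(tagsFor(correlation, autoTagKeys)); // prints {autotag1=a}
    }
}
```

In this model a key that is propagated but not listed (other) travels with the request without ever showing up as a span tag, and a listed key that was never put into the context (autotag2) produces no tag.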
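Application changes
apm-gateway-trace-plugin (SW plugin), branch feature/guangyi/20240531_span_tag_sw8_correlation: Java agent for the gateway application
- [x] Trace label coloring and propagation
Implementation
Reactive gateway context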
public class GatewayContextConstructorInterceptor implements InstanceConstructorInterceptor {
    @Override
    public void onConstruct(EnhancedInstance objInst, Object[] allArguments) throws Throwable {
        String url = "gtx";
        String method = "none";
        ContextCarrier carrier = new ContextCarrier();
        Map<String, String> data = null;
        // instanceof is false for null, so no separate null check is needed
        if (allArguments.length > 0 && allArguments[0] instanceof HttpServerRequest) {
            // HTTP request
            HttpServerRequest request = (HttpServerRequest) allArguments[0];
            HttpHeaders headers = request.requestHeaders();
            // Fill the carrier from the sw8 protocol headers
            CarrierItem next = carrier.items();
            while (next.hasNext()) {
                next = next.next();
                String header = headers.get(next.getHeadKey());
                if (StringUtils.isNotBlank(header)) {
                    next.setHeadValue(header);
                }
            }
            url = request.uri();
            method = request.method().name();
            String sw8Correlation = headers.get(GatewayPluginConst.X_SW8_CORRELATION);
            if (StringUtils.isNotBlank(sw8Correlation)) {
                data = split(sw8Correlation);
            }
        }
        // ...
        // Tracing context
        AbstractSpan span = ContextManager.createEntrySpan(url, carrier);
        span.setComponent(ComponentsDefine.NETTY_HTTP);
        SpanLayer.asHttp(span);
        Tags.URL.set(span, url);
        Tags.HTTP.METHOD.set(span, method);
        if (Objects.nonNull(data)) {
            // Tag the span directly (disabled)
            // data.forEach((key, value) -> span.tag(new StringTag(key), value));
            // Propagate the label data through the correlation context (tracing context)
            CorrelationContext correlationContext = ContextManager.getCorrelationContext();
            if (Objects.nonNull(correlationContext)) {
                data.forEach(correlationContext::put);
            }
        }
        // user async interface
        // Prepare the span for the asynchronous request
        span.prepareForAsync();
        // ...
        // Dynamic field
        ContextSnapshot capture = ContextManager.capture();
        EnhanceObjectCache cache2 = new EnhanceObjectCache();
        cache2.setSnapshot(capture);
        cache2.setGwEntrySpan(span);
        objInst.setSkyWalkingDynamicField(cache2);
        // ...
        ContextManager.stopSpan(span);
    }

    // Parses "key1:value1,key2:value2"; items that do not split into exactly
    // two parts on ':' are skipped.
    private static Map<String, String> split(String value) {
        final Map<String, String> data = new HashMap<>(8);
        for (String perData : value.split(",")) {
            final String[] parts = perData.split(":");
            if (parts.length != 2) {
                continue;
            }
            data.put(parts[0], parts[1]);
        }
        return data;
    }
}
public class GatewayPluginConst {
    /**
     * sw8-correlation
     * Correlation context
     */
    public static final String X_SW8_CORRELATION = "X-sw8-correlation";
}
Usage example: a worked case
Verifying with Arthas commands
Reference: monitor/watch/trace commands - Arthas command list
Observe the headers returned by HttpServerRequest.requestHeaders()
[arthas@1]$ watch reactor.netty.http.server.HttpServerRequest requestHeaders '{returnObj}' 'returnObj.contains("X-sw8-correlation")'
ts=2024-06-10 14:23:50; [cost=0.007682ms] result=@DefaultHttpHeaders[
headers=@DefaultHeadersImpl[DefaultHeadersImpl[host: xxx.com, user-agent: PostmanRuntime-ApipostRuntime/1.1.0, cache-control: no-cache, content-type: application/json, accept: */*, X-sw8-correlation: cyborg-flow:true,scene-label:biz-route,scene-tag:stress-test, content-length: 241, x-forwarded-for: xxx.xxx.xxx.xxx, x-forwarded-proto: https, x-envoy-external-address: xxx.xxx.xxx.xxx, x-request-id: 25c629d1-3fa6-4f9f-9fa7-d2872b4f60b9, x-envoy-attempt-count: 1, x-forwarded-client-cert: By=spiffe://cluster.local/ns/sit/sa/default;Hash=cae34793a85e9861fa9a5cf1518b4d84344719e59b5a3c499a73edd158e3da1e;Subject="";URI=spiffe://cluster.local/ns/istio-system/sa/istio-public-api-ingress-gateway-service-account]],
]
The header of interest in the output above:
X-sw8-correlation: cyborg-flow:true,scene-label:biz-route,scene-tag:stress-test
Observe the correlation context instance returned by ContextManager.getCorrelationContext()
[arthas@1]$ watch org.apache.skywalking.apm.agent.core.context.ContextManager getCorrelationContext '{returnObj}' -x 3
ts=2024-06-10 14:55:17; [cost=0.0627ms] result=@ArrayList[
@CorrelationContext[
data=@ConcurrentHashMap[isEmpty=true;size=0],
AUTO_TAG_KEYS=@ArrayList[
@String[cyborg-flow],
@String[scene-label],
@String[scene-tag],
],
],
]
Observe how the state of the correlation context changes after each CorrelationContext.put(key, value) call
[arthas@1]$ watch org.apache.skywalking.apm.agent.core.context.CorrelationContext put '{params, target, returnObj}' -x 3
method=org.apache.skywalking.apm.agent.core.context.CorrelationContext.put location=AtExit
ts=2024-06-10 15:01:22; [cost=0.134165ms] result=@ArrayList[
@Object[][
@String[scene-label],
@String[biz-route],
],
@CorrelationContext[
data=@ConcurrentHashMap[
@String[scene-label]:@String[biz-route],
],
AUTO_TAG_KEYS=@ArrayList[
@String[cyborg-flow],
@String[scene-label],
@String[scene-tag],
],
],
]
method=org.apache.skywalking.apm.agent.core.context.CorrelationContext.put location=AtExit
ts=2024-06-10 15:01:22; [cost=0.01605ms] result=@ArrayList[
@Object[][
@String[cyborg-flow],
@String[true],
],
@CorrelationContext[
data=@ConcurrentHashMap[
@String[scene-label]:@String[biz-route],
@String[cyborg-flow]:@String[true],
],
AUTO_TAG_KEYS=@ArrayList[
@String[cyborg-flow],
@String[scene-label],
@String[scene-tag],
],
],
]
method=org.apache.skywalking.apm.agent.core.context.CorrelationContext.put location=AtExit
ts=2024-06-10 15:01:22; [cost=0.007927ms] result=@ArrayList[
@Object[][
@String[scene-tag],
@String[stress-test],
],
@CorrelationContext[
data=@ConcurrentHashMap[
@String[scene-label]:@String[biz-route],
@String[cyborg-flow]:@String[true],
@String[scene-tag]:@String[stress-test],
],
AUTO_TAG_KEYS=@ArrayList[
@String[cyborg-flow],
@String[scene-label],
@String[scene-tag],
],
],
]
Observe the invocation object passed to the provider-side context filter ContextFilter.invoke(invoker, invocation)
[arthas@7]$ watch org.apache.dubbo.rpc.filter.ContextFilter invoke '{params[1].getAttachments(), returnObj}' -x 3
method=org.apache.dubbo.rpc.filter.ContextFilter.invoke location=AtExit
ts=2024-06-10 15:16:30; [cost=24.479313ms] result=@ArrayList[
@ObjectToStringMap[
@String[traceid]:@String[0a57ddf0732748208240f278a248de88.66.17181765903061237],
@String[x-request-id]:@String[0fe97869-15d9-452f-9374-228f23e56f43],
@String[x-forwarded-proto]:@String[http],
@String[sw8-correlation]:@String[c2NlbmUtbGFiZWw=:Yml6LXJvdXRl,Y3lib3JnLWZsb3c=:dHJ1ZQ==,c2NlbmUtdGFn:c3RyZXNzLXRlc3Q=],
@String[timeout]:@String[5000],
@String[generic]:@String[gson],
@String[x-envoy-attempt-count]:@String[1],
@String[remote.application]:@String[xxx-reactor-gateway],
@String[sw8-x]:@String[0- ],
@String[sw8]:@String[1-MGE1N2RkZjA3MzI3NDgyMDgyNDBmMjc4YTI0OGRlODguNjYuMTcxODE3NjU5MDMwNjEyMzc=-MGE1N2RkZjA3MzI3NDgyMDgyNDBmMjc4YTI0OGRlODguNjYuMTcxODE3NjU5MDMwODEyMzg=-0-bGVmaXQtcmVhY3Rvci1nYXRld2F5fHxzaXQ=-ZjdjNjRjNjcwYjcyNDkxZGFmNGQ5YTIyOTc5ZGZjZjdAMTkyLjE2OC4xMTAuMjUx-bnVsbC5nZXRBZHZlcnRpc2VDb25maWdOZXcoKQ==-c2l0L2xlZml0LWNtcy5zaXQuc3ZjLmNsdXN0ZXIubG9jYWw6MA==],
@String[x-forwarded-client-cert]:@String[By=spiffe://cluster.local/ns/sit/sa/default;Hash=7e7ef818f1a9cd3156d98010276ff6004b5439ce8548d1b5972066e4138a8e0f;Subject="";URI=spiffe://cluster.local/ns/sit/sa/default],
@String[id]:@String[605975],
],
@AsyncRpcResult[
invocation=@RpcInvocation[
targetServiceUniqueName=@String[com.xxx.cms.api.XxxService],
protocolServiceKey=@String[com.xxx.cms.api.XxxService:tri],
serviceModel=@ProviderModel[org.apache.dubbo.rpc.model.ProviderModel@4142a666],
methodName=@String[getXxxConfig],
interfaceName=@String[com.xxx.cms.api.XxxService],
parameterTypes=@Class[][isEmpty=false;size=1],
parameterTypesDesc=@String[Lcom/xxx/dubbo/common/CommonRequest;],
arguments=@Object[][isEmpty=false;size=1],
attachments=@HashMap[isEmpty=false;size=18],
invoker=@CopyOfFilterChainNode[org.apache.dubbo.registry.integration.RegistryProtocol$InvokerDelegate@3537aa7b],
returnType=@Class[class com.xxx.dubbo.common.CommonResponse],
],
async=@Boolean[false],
responseFuture=@CompletableFuture[
result=@AppResponse[AppResponse [value=com.xxx.dubbo.common.CommonResponse@697af028[code=00000,msg=<null>,data=AppPlaceResponse(type=1, advertiseBoxId=984056422369603584, boxTemplatePlaceNum=1, boxTemplateId=null, name=首页营销专区, intro=null, list=[XxxAppResponse(advertisePlaceId=984148049901760512, name=首页营销专区, subhead=null, pictureUrl=https://img.xxx.com/xxx.png, extraPictureUrl=null, concernment=999, boxTemplatePlace=1, colorType=null)]),page=<null>], exception=null]],
],
],
]
The attachment of interest in the output above, with Base64-encoded keys and values:
@String[sw8-correlation]:@String[c2NlbmUtbGFiZWw=:Yml6LXJvdXRl,Y3lib3JnLWZsb3c=:dHJ1ZQ==,c2NlbmUtdGFn:c3RyZXNzLXRlc3Q=],
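The sw8-correlation attachment seen on the Dubbo provider is a comma-separated list of Base64(key):Base64(value) pairs, so it is not readable at a glance. A minimal sketch for decoding it while debugging follows; Sw8CorrelationDecoder is a hypothetical helper written for this note, not a SkyWalking class.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical debugging helper, not part of SkyWalking.
final class Sw8CorrelationDecoder {

    // Decodes "base64(key):base64(value),..." into plain-text pairs, keeping order.
    static Map<String, String> decode(String headerValue) {
        Map<String, String> pairs = new LinkedHashMap<>();
        if (headerValue == null || headerValue.isEmpty()) {
            return pairs;
        }
        for (String item : headerValue.split(",")) {
            // ':' is not in the Base64 alphabet, so it safely separates key and value.
            String[] parts = item.split(":");
            if (parts.length != 2) {
                continue; // skip malformed items
            }
            pairs.put(decodeBase64(parts[0]), decodeBase64(parts[1]));
        }
        return pairs;
    }

    // Throws IllegalArgumentException if the text is not valid Base64.
    private static String decodeBase64(String text) {
        return new String(Base64.getDecoder().decode(text), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String value = "c2NlbmUtbGFiZWw=:Yml6LXJvdXRl,Y3lib3JnLWZsb3c=:dHJ1ZQ==,c2NlbmUtdGFn:c3RyZXNzLXRlc3Q=";
        decode(value).forEach((k, v) -> System.out.println(k + " = " + v));
        // scene-label = biz-route
        // cyborg-flow = true
        // scene-tag = stress-test
    }
}
```

The decoded pairs match the three labels sent in the X-sw8-correlation request header, which confirms that the gateway propagated them to the downstream Dubbo provider.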
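How to define P0 core applications
Traverse level by level and display application dependencies and topology
- Manually flagged by the business
- Output of first-degree application dependencies (the first-degree topology from distributed tracing)
- For a P0 scenario, output the application list and resource list of the call chain by trace ID
Layered topology display
The N-ary tree level-order traversal framework: thinking in terms of "traversal"
Extend the LeetCode N-ary tree level-order traversal framework to a directed acyclic graph (DAG).
Start from the P0 seed application list and expand one level at a time along the first-degree application dependencies obtained from tracing.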
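The layered expansion described above can be sketched as a breadth-first traversal over a dependency map. This is a minimal illustration under assumptions: TopologyLayers, the dependsOn map, and the application names are hypothetical, and in practice the first-degree dependencies would come from the tracing topology rather than a hand-built map.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: level-order traversal from seed applications over
// first-degree dependencies.
final class TopologyLayers {

    // Layer 0 holds the seeds; layer n holds applications first reached after n hops.
    static List<List<String>> layers(List<String> seeds, Map<String, List<String>> dependsOn) {
        List<List<String>> result = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        for (String seed : seeds) {
            if (visited.add(seed)) {
                queue.add(seed);
            }
        }
        while (!queue.isEmpty()) {
            int size = queue.size(); // number of applications in the current layer
            List<String> layer = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                String app = queue.poll();
                layer.add(app);
                for (String dep : dependsOn.getOrDefault(app, Collections.<String>emptyList())) {
                    // Unlike a tree, a DAG node can have several parents: enqueue it only once.
                    if (visited.add(dep)) {
                        queue.add(dep);
                    }
                }
            }
            result.add(layer);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> dependsOn = new HashMap<>();
        dependsOn.put("gateway", Arrays.asList("user", "cms"));
        dependsOn.put("user", Arrays.asList("cms", "order"));
        System.out.println(layers(Arrays.asList("gateway"), dependsOn));
        // prints [[gateway], [user, cms], [order]]
    }
}
```

The visited set is the only change from the tree version: it places an application in the shallowest layer that reaches it and also keeps the loop finite if the real dependency data happens to contain a cycle.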
Retrieving trace label data
{"query":"query queryTraces($condition: TraceQueryCondition) {\n data: queryBasicTraces(condition: $condition) {\n traces {\n key: segmentId\n endpointNames\n duration\n start\n isError\n traceIds\n }\n }}","variables":{"condition":{"queryDuration":{"start":"2024-06-03 0624","end":"2024-06-03 0654","step":"MINUTE"},"traceState":"ALL","queryOrder":"BY_START_TIME","paging":{"pageNum":1,"pageSize":20},"tags":[{"key":"http.status_code","value":"200"}],"minTraceDuration":null,"maxTraceDuration":null,"serviceId":"bGVmaXQtdXNlcnx8cHJvZA==.1","serviceInstanceId":"bGVmaXQtdXNlcnx8cHJvZA==.1_YmRiZGU2MTMyM2NmNGVmMDhmNDYyNzQxOWY1MDg1ZDNAMTkyLjE2OC4xMDcuMjA4"}}}
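Pitfalls
The correlation context's auto-tag key set is not configured or does not take effect
[Cause] Most likely the searchableTracesTags: ${SW_SEARCHABLE_TAG_KEYS: entry in application.yml was changed at the time, but OAP and UI were not restarted. Both OAP and UI need the configuration and a restart before the change takes effect.
How to check whether the correlation context's auto-tag key set has taken effect
1. Business application side (users of skywalking-agent.jar)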
[arthas@1]$ getstatic org.apache.skywalking.apm.agent.core.context.CorrelationContext AUTO_TAG_KEYS
field: AUTO_TAG_KEYS
@ArrayList[
@String[sw8_userId],
@String[scene.label],
@String[scene],
]
[arthas@1]$ getstatic org.apache.skywalking.apm.agent.core.conf.Config$Correlation ELEMENT_MAX_NUMBER
field: ELEMENT_MAX_NUMBER
@Integer[8]
config/agent.config
# Max element count in the correlation context
correlation.element_max_number=${SW_CORRELATION_ELEMENT_MAX_NUMBER:8}
# Max value length of each element.
correlation.value_max_length=${SW_CORRELATION_VALUE_MAX_LENGTH:128}
# Tag the span by the key/value in the correlation context, when the keys listed here exist.
correlation.auto_tag_keys=${SW_CORRELATION_AUTO_TAG_KEYS:sw8_userId,scene.label,scene}
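The two limits above bound how many labels a request can carry and how long each value may be. The sketch below shows a pre-check a caller could run before putting labels into the correlation context. It is an illustration under assumptions: CorrelationLimitCheck is a hypothetical helper, the constants mirror the defaults shown in agent.config, and it assumes that entries beyond the limits are dropped rather than truncated.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical pre-check, not SkyWalking code.
final class CorrelationLimitCheck {
    static final int ELEMENT_MAX_NUMBER = 8;  // default of correlation.element_max_number
    static final int VALUE_MAX_LENGTH = 128;  // default of correlation.value_max_length

    // Returns the labels that fit within both limits, in insertion order.
    static Map<String, String> keepWithinLimits(Map<String, String> labels) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : labels.entrySet()) {
            String value = entry.getValue();
            if (value == null || value.length() > VALUE_MAX_LENGTH) {
                continue; // assumption: over-long values are dropped, not truncated
            }
            if (kept.size() >= ELEMENT_MAX_NUMBER) {
                break; // no room left for further elements
            }
            kept.put(entry.getKey(), value);
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> labels = new LinkedHashMap<>();
        for (int i = 0; i < 10; i++) {
            labels.put("k" + i, "v");
        }
        System.out.println(keepWithinLimits(labels).size()); // prints 8
    }
}
```

A check like this makes it visible when a label would not fit, instead of discovering later that a tag is missing from the trace.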
2. SkyWalking OAP/UI server side
[arthas@1]$ vmtool -x 3 --action getInstances --className org.apache.skywalking.oap.server.core.CoreModuleConfig --express 'instances[0].searchableTracesTags'
@String[http.method,http.status_code,rpc.status_code,db.type,db.instance,mq.queue,mq.topic,mq.broker,sw8_userId,scene.label,scene]
[arthas@1]$ vmtool -x 3 --action getInstances --className org.apache.skywalking.oap.server.core.config.SearchableTracesTagsWatcher --express 'instances[0].searchableTags'
@HashSet[
@String[db.instance],
@String[mq.topic],
@String[http.status_code],
@String[db.type],
@String[scene.label],
@String[mq.queue],
@String[sw8_userId],
@String[http.method],
@String[rpc.status_code],
@String[mq.broker],
@String[scene],
]
config/application.yml
# Define the set of span tag keys, which should be searchable through the GraphQL.
# The max length of key=value should be less than 256 or will be dropped.
searchableTracesTags: ${SW_SEARCHABLE_TAG_KEYS:http.method,http.status_code,rpc.status_code,db.type,db.instance,mq.queue,mq.topic,mq.broker,sw8_userId,scene.label,scene}
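Because a span tag is searchable only when its key is listed in searchableTracesTags, the agent-side correlation.auto_tag_keys and the OAP-side list have to agree. A minimal sketch of that consistency check follows; SearchableTagCheck is a hypothetical helper, and the sample inputs are the key lists that appear in this document.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical configuration check, not part of SkyWalking.
final class SearchableTagCheck {

    // Splits a comma-separated key list, trimming whitespace and dropping empty items.
    static Set<String> parseKeys(String csv) {
        Set<String> keys = new LinkedHashSet<>();
        for (String key : csv.split(",")) {
            String trimmed = key.trim();
            if (!trimmed.isEmpty()) {
                keys.add(trimmed);
            }
        }
        return keys;
    }

    // Returns the auto-tag keys that are absent from the searchable tag list.
    static List<String> missingFromSearchable(String autoTagKeys, String searchableTags) {
        Set<String> searchable = parseKeys(searchableTags);
        List<String> missing = new ArrayList<>();
        for (String key : parseKeys(autoTagKeys)) {
            if (!searchable.contains(key)) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String searchable = "http.method,http.status_code,rpc.status_code,db.type,db.instance,"
                + "mq.queue,mq.topic,mq.broker,sw8_userId,scene.label,scene";
        System.out.println(missingFromSearchable("sw8_userId,scene.label,scene", searchable));
        // prints []
        System.out.println(missingFromSearchable("cyborg-flow,scene-label,scene-tag", searchable));
        // prints [cyborg-flow, scene-label, scene-tag]
    }
}
```

An empty result means every auto-tagged key can be queried; any key in the output would be tagged on spans by the agent but not be searchable until it is added to searchableTracesTags and the servers are restarted.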
Elasticsearch: inspect the span tag index data of SkyWalking trace segments
# Span tag search on trace segments
GET observability_segment-20240724/_search
{
"size": 0,
"aggs": {
"unique_tags": {
"terms": {
"field": "tags",
"size": 50000000
}
}
}
}
{
"took" : 3586,
"timed_out" : false,
"_shards" : {
"total" : 18,
"successful" : 18,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 10000,
"relation" : "gte"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"unique_tags" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "scene=p0",
"doc_count" : 77
},
{
"key" : "sw8_userId=123456789",
"doc_count" : 76
},
{
"key" : "cyborg-flow=true",
"doc_count" : 33
},
{
"key" : "scene-tag=p0",
"doc_count" : 33
}
]
}
}
}