Jaeger's client-side sampling configuration (Java Edition)

Welcome to my GitHub

Content: a categorized index of all my original articles and their supporting source code, covering Java, Docker, Kubernetes, DevOps, and more.

About sampling

Sampling is easy to understand: when using Jaeger, you do not have to report every request. Often it is enough to observe a subset of them, selected according to some strategy.
The Jaeger SDK supports several sampling configurations. In a distributed system they all follow the consistent upfront (head-based) principle. Put simply, if the consumer service calls the provider service and the consumer decides not to sample a request, the provider will not sample it either when it handles that request. In other words, for a complete trace, if the first service does not report to Jaeger, none of the services involved in that trace will report to Jaeger.
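The head-based principle above can be illustrated with a minimal sketch. It is not Jaeger's actual code: the class, the method names, and the x-sampled header are hypothetical and chosen only for illustration (Jaeger carries the decision inside its own propagated trace context). The point is that the first service decides once, and every downstream service reuses that decision instead of consulting its own sampler.

```java
import java.util.HashMap;
import java.util.Map;

final class HeadBasedSamplingSketch {
    // Hypothetical header name used here to carry the decision downstream.
    static final String SAMPLED_HEADER = "x-sampled";

    /** The first service in the trace decides once and writes the decision into the outgoing headers. */
    static Map<String, String> consumerOutgoingHeaders(boolean consumerSamplerDecision) {
        Map<String, String> outgoing = new HashMap<>();
        outgoing.put(SAMPLED_HEADER, Boolean.toString(consumerSamplerDecision));
        return outgoing;
    }

    /**
     * A downstream service reuses the upstream decision when one is present.
     * Its own sampler is consulted only when it starts the trace itself.
     */
    static boolean providerDecision(Map<String, String> incoming, boolean providerSamplerDecision) {
        String upstream = incoming.get(SAMPLED_HEADER);
        if (upstream != null) {
            return Boolean.parseBoolean(upstream);
        }
        return providerSamplerDecision;
    }

    public static void main(String[] args) {
        // The consumer decides not to sample; the provider's own sampler would have said yes.
        Map<String, String> headers = consumerOutgoingHeaders(false);
        System.out.println("provider sampled = " + providerDecision(headers, true));

        // No upstream decision: the provider starts the trace and decides for itself.
        System.out.println("provider sampled = " + providerDecision(new HashMap<>(), true));
    }
}
```

The first call prints provider sampled = false, because the consumer's decision wins even though the provider's own sampler would have sampled; the second prints provider sampled = true.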
Jaeger sampling can be configured on the client or on the server; by default, the server configuration is used.
In this article we look at how to configure sampling on the client (that is, the application that reports to Jaeger) and verify the effect hands-on. There are three commonly used client sampling strategies:

Fixed: sample either every request or none
Proportional: sample a specified fraction of requests
Rate limiting: sample a fixed number of requests per time period, for example one per second

Next, we configure and try out each of these three strategies in turn.

About the hands-on project

Configuring sampling does not involve any coding; only some configuration needs to change, so there is no need to create a new project. We reuse the two Maven sub-projects from the article "Introduction to Jaeger Development (Java Edition)": the service provider jaeger-service-provider and the service caller jaeger-service-consumer. Both are built into Docker images and started with docker-compose. The network architecture is as follows:

[Figure: network architecture]

Make sure the log output includes the traceId, spanId, and sampled variables, as shown in the red box in the figure below. With these in place, the log shows whether each trace was sampled (this step is very important):

[Figure: configuration with traceId, spanId, and sampled marked in the red box]

To make it easy to rebuild and redeploy after each change, I wrote a shell script named full.sh. It builds the latest images from the modified code and starts them with docker-compose:

#!/bin/bash
echo "Stopping docker-compose"
cd jaeger-service-provider && docker-compose down && cd ..

echo "Compiling and building"
mvn clean package -U -DskipTests

echo "Building provider image"
cd jaeger-service-provider && docker build -t bolingcavalry/jaeger-service-provider:0.0.1 . && cd ..

echo "Building consumer image"
cd jaeger-service-consumer && docker build -t bolingcavalry/jaeger-service-consumer:0.0.1 . && cd ..

# Caution: this prunes stopped containers, unused networks, dangling images, and unused volumes on this host
echo "Cleaning up unused resources"
docker system prune --volumes -f

echo "Starting docker-compose"
cd jaeger-service-provider && docker-compose up -d && cd ..

If you use IDEA, add a custom run configuration at the position marked by the red box in the figure below and select the shell file above. You can then compile, build, and deploy with IDEA's run command:

[Figure: IDEA run configuration entry marked in the red box]

With the preparation complete, let's start the hands-on work, beginning with the simplest strategy, fixed sampling.

Fixed sampling

The logic of fixed sampling is very simple: either every request is reported or none is.
The fixed sampling configuration is shown in the red box in the figure below:

[Figure: fixed sampling configuration in the red box]

Note that, by the consistent upfront (head-based) principle, this configuration only needs to go into the configuration file of the jaeger-service-consumer project; jaeger-service-provider stays unchanged.
Run the full.sh script written earlier to compile, build, and deploy.
Visit http://localhost:18080/hello in a browser several times to generate some web requests.
Look at the log of the jaeger-service-consumer container, shown in the figure below. The sampled=false in the red box means the trace was not sampled. The logs of all three requests look like this:

[Figure: jaeger-service-consumer log with sampled=false in the red box]

Now look at the log of the jaeger-service-provider container. As the red box in the figure below shows, nothing was sampled there either. This confirms Jaeger's consistent upfront (head-based) principle: jaeger-service-consumer is the origin of the trace, so once it turns sampling off, the downstream services turn it off automatically:

[Figure: jaeger-service-provider log, not sampled, in the red box]

Jaeger's web page shows no data, and neither jaeger-service-consumer nor jaeger-service-provider appears in the service list:

[Figure: Jaeger web page with an empty service list]

Having tried sampling nothing, let's now try the configuration that samples everything, shown in the red box in the figure below:

[Figure: fixed sampling configuration that samples everything, in the red box]

Redeploy, generate a few more requests, and check the log of the jaeger-service-consumer container. As the red box in the figure below shows, every request was sampled:

[Figure: jaeger-service-consumer log, all sampled, in the red box]

The log of the jaeger-service-provider container shows the same thing; every trace was sampled:

[Figure: jaeger-service-provider log, all sampled]

On Jaeger's web page, the traces for all three requests to jaeger-service-consumer have been reported:

[Figure: Jaeger web page listing the three traces]

That completes fixed sampling, the simplest strategy. Next, let's look at the more practical proportional sampling.

Proportional sampling

As the name implies, a fixed proportion of requests is sampled. The configuration is shown in the figure below:

[Figure: proportional sampling configuration]

Run the full.sh script written earlier to compile, build, and deploy.
To test proportional sampling, send many requests and check whether the sampled traces make up one-tenth of the total. I use JMeter here; you can pick any tool you like, write code or a script, or even visit the page manually many times.
In JMeter, a Loop Controller controls the number of requests, as shown in the red box in the following figure:

[Figure: JMeter Loop Controller settings in the red box]

After sending one hundred requests to the /hello endpoint of jaeger-service-consumer, check the sampling result in the Docker container log. Here grep and wc are combined to count the log lines containing sampled=true or sampled=false. The complete command is as follows:

docker logs jaeger-service-consumer | grep 'sampled=true' | wc -l

With 100 requests and a 10% sampling rate, the command above returns 8 rather than exactly 10. Counting the unsampled log lines (change true to false) gives 92. The totals add up, but the sampled count is not exactly ten percent, as shown in the figure below:

[Figure: command output showing 8 sampled and 92 unsampled log lines]

Increasing the total to one thousand requests brings the sampled proportion close to 10%, as follows:

[Figure: command output for one thousand requests]

On Jaeger's web page, there are indeed only 106 traces:

[Figure: Jaeger web page showing 106 traces]
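The gap between 8 and the expected 10 is what independent per-request decisions tend to produce. Here is a minimal simulation, assuming each request is sampled independently with probability 0.1. This is an illustration only: Jaeger's real sampler may derive its decision differently (for example from the trace ID), and the class and method names are hypothetical.

```java
import java.util.Random;

final class ProbabilisticSamplingSketch {
    /** Counts how many of the given requests are sampled when each one is kept with probability rate. */
    static int countSampled(int requests, double rate, long seed) {
        Random random = new Random(seed); // fixed seed so a run can be repeated
        int sampled = 0;
        for (int i = 0; i < requests; i++) {
            // nextDouble() is uniform in [0, 1), so this is true with probability rate
            if (random.nextDouble() < rate) {
                sampled++;
            }
        }
        return sampled;
    }

    public static void main(String[] args) {
        for (int requests : new int[] {100, 1000, 100_000}) {
            int sampled = countSampled(requests, 0.1, 42L);
            System.out.printf("%d requests -> %d sampled (%.1f%%)%n",
                    requests, sampled, 100.0 * sampled / requests);
        }
    }
}
```

With a small number of requests the sampled share can drift noticeably from 10%; it tightens as the count grows, which matches the 100-request and 1000-request runs above.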

That completes proportional sampling. Next comes rate-limiting sampling.

Rate limited sampling

The term "rate limiting" on its own is not very specific, but the official documentation uses the keyword leaky bucket, as shown in the red box in the figure below. That points straight to the leaky bucket rate-limiting algorithm (note that it is a leaky bucket, not a token bucket; the peak rate of the leaky bucket algorithm depends on the bucket size):

[Figure: official documentation with the keyword "leaky bucket" in the red box]
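To make the leaky bucket idea concrete, here is a minimal sketch of a leaky bucket used as a meter: each sampled trace pours one unit into the bucket, the bucket drains at a constant rate, and a trace is sampled only if its unit still fits. Whether Jaeger's Java client implements its limiter exactly this way is an assumption; the class name, the parameters (including bucketSize), and the injectable clock are hypothetical and exist only to keep the example deterministic.

```java
import java.util.function.LongSupplier;

final class LeakyBucketSamplerSketch {
    private final long costNanos;     // "water" one sampled trace adds, measured as the time it takes to leak out
    private final long capacityNanos; // bucket size in the same unit; bounds the burst of sampled traces
    private final LongSupplier clock; // nanosecond clock, injectable so the demo is deterministic
    private long levelNanos = 0L;     // current amount of water in the bucket
    private long lastNanos;           // time of the previous check

    LeakyBucketSamplerSketch(long tracesPerSecond, long bucketSize, LongSupplier clock) {
        this.costNanos = 1_000_000_000L / tracesPerSecond;
        this.capacityNanos = bucketSize * costNanos;
        this.clock = clock;
        this.lastNanos = clock.getAsLong();
    }

    synchronized boolean isSampled() {
        long now = clock.getAsLong();
        // The bucket leaks one nanosecond of "water" per elapsed nanosecond.
        levelNanos = Math.max(0L, levelNanos - (now - lastNanos));
        lastNanos = now;
        // Sample only if this trace's water still fits into the bucket.
        if (levelNanos + costNanos <= capacityNanos) {
            levelNanos += costNanos;
            return true;
        }
        return false;
    }

    /** Sends requests at a fixed interval against a simulated clock and counts the sampled ones. */
    static int simulate(long tracesPerSecond, long bucketSize, int requests, long intervalNanos) {
        long[] now = {0L};
        LeakyBucketSamplerSketch sampler =
                new LeakyBucketSamplerSketch(tracesPerSecond, bucketSize, () -> now[0]);
        int sampled = 0;
        for (int i = 0; i < requests; i++) {
            now[0] = i * intervalNanos;
            if (sampler.isSampled()) {
                sampled++;
            }
        }
        return sampled;
    }

    public static void main(String[] args) {
        // 100 requests, one every 100 ms (about 10 seconds of traffic), limit of 1 trace per second
        System.out.println(simulate(1, 1, 100, 100_000_000L)); // prints 10
        System.out.println(simulate(1, 3, 100, 100_000_000L)); // prints 12
    }
}
```

With a bucket size of 1, exactly one request per second is sampled. With a bucket size of 3, the first three requests are sampled in a burst before the steady one-per-second rate takes over, which is the sense in which the peak depends on the bucket size.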

The configuration is shown in the red box in the following figure:

[Figure: rate-limiting sampling configuration in the red box]

Run the full.sh script written earlier to compile, build, and deploy.
Our configuration samples once per second, so verification needs to control how long requests are sent. I again use JMeter to send the requests. As shown in the red box in the figure below, JMeter has a Runtime Controller that limits how long requests keep being sent; I set it to 10 seconds here:

[Figure: JMeter Runtime Controller set to 10 seconds, in the red box]

Use JMeter to send requests continuously for 10 seconds. The JMeter summary report shows that 70 requests were sent in total:

[Figure: JMeter summary report showing 70 requests]

Use the command docker logs jaeger-service-consumer | grep 'sampled=true' | wc -l to count the sampled requests. The expected value for 10 seconds is 10. The actual result, shown below, is close but not exact:

[Figure: command output with the sampled count for the 10-second run]

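Clear all the data, change the duration to 100 seconds, and try again; this time 852 requests are sent in total:

[Figure: JMeter summary report showing 852 requests]

The total number of samples is 96, close to the expected 100:

[Figure: command output showing 96 sampled log lines]

Jaeger's web page also shows 96 traces:

[Figure: Jaeger web page showing 96 traces]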

A glimpse of server configuration

Remember the earlier articles such as "Distributed call chain tracing tool Jaeger? A two-minute quick experience" and "Introduction to Jaeger Development (Java Edition)"? Back then we added no sampling-related configuration, yet a trace could be found on Jaeger's web page for every request, which means every request was sampled. Why?
If the configuration file contains nothing about sampling, remote configuration is used by default. The details are in Jaeger's all-in-one container; run the following command to see the remote sampling configuration:

docker exec jaeger cat /etc/jaeger/sampling_strategies.json

The command prints the contents of sampling_strategies.json, shown below. The default server configuration turns out to be proportional sampling with a ratio of 100%, which explains why a trace could be found on Jaeger's web page for every request:

{
  "default_strategy": {
    "type": "probabilistic",
    "param": 1
  }
}

That completes the hands-on sampling configuration. I hope it serves as a useful reference for tailoring a sampling strategy to your own situation.

You are not alone; Xin Chen is with you all the way

Follow my WeChat public account: Programmer Xin Chen

Search for "Programmer Xin Chen" on WeChat. I am Xin Chen, and I look forward to exploring the Java world with you...
https://github.com/zq2599/blog_demos
