Author: Yan Huqing

He leads the DBLE open source project and is responsible for the research and development of distributed database middleware. He has long focused on database technology and remains a hands-on developer, with in-depth understanding and practical experience of data replication, read/write splitting, and database sharding.

Source of this article: original submission

* Produced by the Aikesheng open source community. Original content may not be used without authorization; for reprinting, please contact the editor and credit the source.


The story goes like this: one day, a tester on the dble team needed to do performance tuning and wanted to learn the principles and methods for tuning dble, so I tossed him a document to study:

https://actiontech.github.io/dble-docs-cn/2.Function/2.18_performance_observation.html

The tester skimmed the document dismissively and then asked me: what is the command in this picture for?

Isn't the top command in Linux enough? Why bother with this one?

Here was my chance to show off, so I patiently explained: the native top command gives more accurate CPU statistics, but it is not friendly to Java threads. Every thread shows up under the name java, which does not match the actual thread names in the application, so you have to use jstack to look them up. While talking, I started a demo.

First, the top command provides a dynamic real-time view of the running system. It can display system summary information and a list of the processes or threads currently managed by the Linux kernel. The types and order of the displayed information are configurable, and the configuration can be made persistent... ("Get to the point, stop padding the word count," the tester interrupted me mid-sentence.)

Okay, let's get to the point. By default, top shows, for each process, the aggregated figures of all its threads. The -H option makes it display individual threads, and the -p option restricts the display to a specific process ID.

Run it and see:

top -H -p `pidof java`

The result looks roughly like this:

What use is this result on its own? You also need a thread dump printed with the jstack command (of course, doing this in a production environment carries some risk):

jstack -l `pidof java` > /tmp/dble_jstack.log

With these two results, we can work out which application thread a given thread ID corresponds to.

How exactly? For example, take a thread ID such as 10849 and convert it to hexadecimal:

printf "%x\n" 10849
2a61

Then look up the thread name in the jstack output:

grep "nid=0x2a61" /tmp/dble_jstack.log
"BusinessExecutor0" #23 daemon prio=5 os_prio=0 tid=0x00007f95dc620800 nid=0x2a61 waiting on condition [0x00007f96281f6000]

Of course, if all you need is the thread name, this is enough, and the two commands can be combined into one. If you need more context, check the grep manual for the relevant options such as -A and -B, or simply open the file and search it.
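As a minimal sketch of combining the two steps, the function below converts a thread ID to hexadecimal and pulls the matching name out of a dump file. The function name thread_name_for_tid is made up for illustration, and the sketch assumes the JDK 8 jstack line format shown above, where the native ID appears as nid=0x... followed by a space.

```shell
# Hypothetical helper: look up the Java thread name for a native thread ID
# (the ID that `top -H` shows for each thread).
# Usage: thread_name_for_tid <tid> <jstack-dump-file>
thread_name_for_tid() {
    tid=$1
    dump=$2
    # jstack prints the native thread ID in hexadecimal as nid=0x...
    nid=$(printf '%x' "$tid")
    # Match the nid followed by a space so that 0x2a6 does not also match
    # 0x2a61, then keep only the quoted thread name at the start of the line.
    grep "nid=0x${nid} " "$dump" | sed 's/^"\([^"]*\)".*/\1/'
}

# Example, using the dump produced earlier:
#   thread_name_for_tid 10849 /tmp/dble_jstack.log
```

If the thread is not in the dump (for example because it exited in the meantime), the function prints nothing.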

I also told him how to troubleshoot a Java process with high CPU usage in day-to-day operations work. The method is the same: first find the thread with high CPU usage using top, then use jstack to find out what that thread is doing, and work on the cause from there. In my experience, most cases come down to GC problems, and I have also run into NIO's epoll bug.
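The troubleshooting flow above could be scripted roughly as follows. This is only a sketch: annotate_hot_threads is a made-up name, the input is assumed to be plain "thread ID, CPU percentage" pairs, and the dump is assumed to use the JDK 8 jstack format with nid=0x... followed by a space.

```shell
# Hypothetical helper: read "<tid> <cpu%>" pairs on stdin and print the N
# busiest threads together with the names found in a jstack dump.
# Usage: ... | annotate_hot_threads <jstack-dump-file> [count]
annotate_hot_threads() {
    dump=$1
    count=${2:-5}
    # Sort numerically by the CPU column, highest first.
    LC_ALL=C sort -k2,2nr | head -n "$count" | while read -r tid cpu; do
        nid=$(printf '%x' "$tid")
        name=$(grep "nid=0x${nid} " "$dump" | sed 's/^"\([^"]*\)".*/\1/')
        # Threads that are missing from the dump are reported as <unknown>.
        printf '%s %s %s\n' "$tid" "$cpu" "${name:-<unknown>}"
    done
}

# One possible way to feed it, assuming procps ps and a single java process:
#   ps -L -o tid= -o pcpu= -p "$(pidof java)" | annotate_hot_threads /tmp/dble_jstack.log 5
```

Note that the pcpu value reported by ps is averaged over the lifetime of the thread, so it is not the same as the instantaneous figure in top; for a short spike, the thread IDs read off the top -H screen are the better input.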

The tester listened quietly to the second half of my talk and demonstration, then typed the same command into his own Ubuntu terminal:

top -H -p `pidof java`

Then he pointed at the result and asked me: isn't that the thread name right there?

Well... That was embarrassment in capital letters, a slap in the face that left me full of question marks. I had to meekly admit defeat and go find out why the top command behaved better on his machine than on mine.

After several sleepless nights of investigation (not really), I finally figured out what determines whether the thread name is displayed. It turns out the issue was raised as early as 2011 (see JDK-7102541), and JDK-8179011 gives a simpler description. The fix was made as early as 2019, and OpenJDK backported it to 8u222, which means OpenJDK versions from then on have the problem fixed. As of this writing, however, Oracle JDK 8u301 still does not have the fix. So readers know which JDK to choose.

Now that the bug has been fixed, I could not resist looking at how the fix is implemented. Here we are mainly concerned with the Linux implementation. One interesting detail: we can see that this feature was actually implemented by an expert from SAP back in 2014.

Let's look at the code in detail. First, the thread name is set at the thread implementation level:

Next is the implementation of the set_native_thread_name method; on Linux the code is as follows:

As you can see, the thread name is truncated to its first 15 characters, and then the Linux::_pthread_setname_np method is called. Let's take a look at what that method does:

As you can see, this actually calls the operating system's pthread_setname_np function, resolved through dlsym.
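On Linux, a name set through pthread_setname_np ends up in /proc/<pid>/task/<tid>/comm (the pthread_setname_np man page listed in the references describes this), so whether a given JVM sets its thread names can be checked without top. A minimal sketch, with list_thread_names as a made-up helper name:

```shell
# Hypothetical helper: print "<tid> <name>" for every thread of a process,
# using the names the kernel exposes in /proc/<pid>/task/<tid>/comm.
# Usage: list_thread_names <pid>
list_thread_names() {
    pid=$1
    for task in /proc/"$pid"/task/*; do
        # A thread may exit between listing and reading; skip it quietly.
        [ -r "$task/comm" ] || continue
        printf '%s %s\n' "${task##*/}" "$(cat "$task/comm")"
    done
}

# Example:
#   list_thread_names "$(pidof java)"
```

If every line shows the name java, the JVM is not setting native thread names; with the fix in place, the (truncated) Java thread names should appear instead.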

At this point, the investigation is over. To summarize:

  1. If you use Oracle JDK 8 or an earlier JDK, you have to use jstack or some other method to map thread IDs to logical thread names.
  2. If you use OpenJDK 8, upgrading to 8u222 or later is recommended, so that you can see thread names directly in top and speed up diagnosis.
  3. When naming threads, an application should try to convey a unique meaning within the first 15 characters, which makes observation and analysis easier. dble does not do this well yet, and it will be adjusted later.
  4. Of course, the community has other tools. Alibaba's Arthas, for example, should also be able to map thread IDs to names, but bringing in a third-party tool is always a hassle, and the native approach is more appealing.
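To illustrate the third point, the sketch below shows what survives of a name after truncation to 15 characters. BusinessExecutor0 is the thread name from the jstack output earlier; BusinessExecutor1 is a hypothetical sibling used only for comparison, and kernel_visible_name is a made-up helper that assumes ASCII names.

```shell
# Hypothetical helper: show a thread name cut to its first 15 characters,
# as described for set_native_thread_name above (ASCII names assumed).
kernel_visible_name() {
    printf '%.15s\n' "$1"
}

kernel_visible_name BusinessExecutor0   # prints BusinessExecuto
kernel_visible_name BusinessExecutor1   # prints BusinessExecuto
```

Both names collapse to the same string, so the distinguishing suffix is lost; putting the unique part first, or shortening the prefix, keeps the threads distinguishable in top.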

Reference documents:

http://hg.openjdk.java.net/jdk8u/jdk8u-dev/hotspot/rev/bf1c9a3312a4
https://bugs.openjdk.java.net/browse/JDK-8224140
https://bugs.openjdk.java.net/browse/JDK-7102541
https://bugs.openjdk.java.net/browse/JDK-8179011
https://man7.org/linux/man-pages/man3/pthread_setname_np.3.html
https://man7.org/linux/man-pages/man3/dlsym.3.html


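Aikesheng Open Source Community

Founded in 2017, the community takes as its mission open-sourcing high-quality operations tools, regularly sharing practical technical content, and running ongoing nationwide community events. Its current open source products are the SQL audit tool SQLE, the distributed middleware DBLE, and the data transfer component DTLE.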