One: background
1. Tell a story
At the end of last month, a friend on WeChat reached out and said that his multi-threaded program was processing RabbitMQ messages too slowly, and asked me to help analyze why. The screenshot is as follows:
Stated like this, the problem is a bit daunting: no CPU spike, no memory leak, no hang... who knows why RabbitMQ processing is so slow 🤣🤣🤣
Since there was no obvious cause, I decided to approach it as a freeze and let windbg do the talking.
Two: Windbg analysis
1. Look for the cause of freeze
There are many possible reasons for slow processing: maybe there are too few workers, or maybe the workers are slacking off; either way, the work is not getting done at full capacity. So let's first check whether this crew is slacking, starting from the thread pool.
0:000> !tp
CPU utilization: 81%
Worker Thread: Total: 187 Running: 172 Idle: 0 MaxLimit: 32767 MinLimit: 8
Work Request in Queue: 0
--------------------------------------
Number of Timers: 0
--------------------------------------
Completion Port Thread:Total: 1 Free: 1 MaxFree: 16 CurrentLimit: 1 MaxLimit: 1000 MinLimit: 8
From the output, the current CPU = 81%, and it is most likely this program driving it. A CPU this high may come from computationally intensive work, from context switching caused by heavy lock contention, or from frequently triggered GCs.
2. Is it a GC trigger?
At this point in the series, I believe many friends already know how to tell from the thread stacks whether GC is involved. Running !dumpstack across all threads with ~*e !dumpstack shows that thread 214 triggered the GC, as shown in the following figure.
The !dumpstack output is too verbose, so to keep things simple I will use !clrstack instead.
0:214> !clrstack
OS Thread Id: 0x290c (214)
Child SP IP Call Site
000000bb2983c7d0 00007ffed8c3e335 System.Text.UTF8Encoding.GetString(Byte[], Int32, Int32)
000000bb2983c810 00007ffe7aaecee9 StackExchange.Redis.RedisValue.op_Implicit(StackExchange.Redis.RedisValue)
000000bb2983c860 00007ffe7aaecd92 xxxx.Util.Helper.RedisHelper.ConvertObj[[System.__Canon, mscorlib]](StackExchange.Redis.RedisValue)
000000bb2983c8f0 00007ffe7abc3997 xxxx.Util.Helper.RedisHelper+c__DisplayClass29_0`1[[System.__Canon, mscorlib]].b__0(StackExchange.Redis.IDatabase)
000000bb2983ca20 00007ffe7aaeaca0 xxxx.Util.Helper.RedisHelper.Do[[System.__Canon, mscorlib]](System.Func`2<StackExchange.Redis.IDatabase,System.__Canon>)
000000bb2983caa0 00007ffe7abc3762 xxxx.Util.Helper.RedisHelper.HashGet[[System.__Canon, mscorlib]](System.String, System.String)
000000bb2983cb40 00007ffe7abc2f3d xxxx.Implementation.xxx.GetCompany(System.String)
...
000000bb2983e170 00007ffe7ad2fadf xxx.xxx.MQ.xxx+c__DisplayClass21_0`1[[System.__Canon, mscorlib]].b__1()
000000bb2983e1c0 00007ffed8c11862 System.Threading.Tasks.Task`1[[System.__Canon, mscorlib]].InnerInvoke()
000000bb2983e220 00007ffed8c10937 System.Threading.Tasks.Task.Execute()
000000bb2983e260 00007ffed8bc674e System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
000000bb2983e330 00007ffed8bc65e7 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
000000bb2983e360 00007ffed8c10bdd System.Threading.Tasks.Task.ExecuteWithThreadLocal(System.Threading.Tasks.Task ByRef)
000000bb2983e410 00007ffed8c10303 System.Threading.Tasks.Task.ExecuteEntry(Boolean)
000000bb2983e450 00007ffed8bcfa10 System.Threading.ThreadPoolWorkQueue.Dispatch()
000000bb2983e8e8 00007ffed9d465d3 [DebuggerU2MCatchHandlerFrame: 000000bb2983e8e8]
From the stack, the program is sitting in RedisHelper.HashGet. Hmm, could it be reading some huge payload from redis? Next, use !clrstack -a to see exactly what was read from redis.
0:214> !clrstack -a
OS Thread Id: 0x290c (214)
Child SP IP Call Site
000000bb2983c860 00007ffe7aaecd92 xxx.Util.Helper.RedisHelper.ConvertObj[[System.__Canon, mscorlib]](StackExchange.Redis.RedisValue)
PARAMETERS:
this (0x000000bb2983c8f0) = 0x000000bb182da3d0
value (0x000000bb2983c960) = 0x000000bb18440778
LOCALS:
0x000000bb2983c8c4 = 0x0000000000000000
0x000000bb2983c8a0 = 0x0000000000000000
0x000000bb2983c898 = 0x0000000000000000
0:214> !do 0x000000bb18440778
Name: System.Byte[]
MethodTable: 00007ffed8db93d0
EEClass: 00007ffed87f4dc8
Size: 6679(0x1a17) bytes
Array: Rank 1, Number of elements 6655, Type Byte (Print Array)
Content: {"ID":104x,"Code":"130x","xxx":"1304","xxx":"8888","Name":"...............","xxx":"...............","MqConnStr"
As you can see, the payload read from redis is only about 6.6 KB. That's tiny; I was expecting tens of MB 🤣🤣🤣. I don't believe something this small could trigger a GC on its own.
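As an aside, the RedisHelper on the stack is presumably just a thin wrapper over StackExchange.Redis. A minimal sketch of what HashGet / ConvertObj might look like (the class and method names follow the stack trace; the connection string and the JSON deserialization are my assumptions, since the payload above is clearly JSON):
using System;
using Newtonsoft.Json;
using StackExchange.Redis;

// Hypothetical reconstruction of the helper seen on the stack:
// HashGet reads one hash field, ConvertObj turns the raw RedisValue into an object.
public static class RedisHelper
{
    private static readonly ConnectionMultiplexer Conn =
        ConnectionMultiplexer.Connect("localhost:6379"); // assumed connection string

    public static T HashGet<T>(string key, string field)
    {
        return Do(db => ConvertObj<T>(db.HashGet(key, field)));
    }

    private static T Do<T>(Func<IDatabase, T> func)
    {
        return func(Conn.GetDatabase());
    }

    private static T ConvertObj<T>(RedisValue value)
    {
        // the RedisValue -> string conversion is the UTF8Encoding.GetString call
        // sitting at the top of the captured stack
        return value.IsNull ? default(T) : JsonConvert.DeserializeObject<T>((string)value);
    }
}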
3. What caused the GC to trigger?
So who is the real culprit? If you think about it carefully, triggering a GC is nothing more than objects being pushed from gen0 to gen1, and then from gen1 to gen2. And since my friend's RabbitMQ processing is visibly slow, it must have triggered many GCs along the way.
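A tiny refresher on that promotion path (the exact numbers printed can vary with JIT and GC settings, but the gen0 -> gen1 -> gen2 order is fixed):
using System;

class GenDemo
{
    static void Main()
    {
        var data = new byte[1024];                 // small object, allocated in gen0
        Console.WriteLine(GC.GetGeneration(data)); // 0

        GC.Collect();                              // survives a collection -> promoted to gen1
        Console.WriteLine(GC.GetGeneration(data)); // 1

        GC.Collect();                              // survives again -> promoted to gen2
        Console.WriteLine(GC.GetGeneration(data)); // 2, and it stays there until a full GC frees it
    }
}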
Here is the breakthrough: since multiple GCs have been triggered, there must be a lot of rooted objects sitting on gen2, so that is the place to start. Use the !dumpgen command to take a look.
0:214> !dumpgen 2 -stat
Count Total Size Type
-------------------------------------------------
87,738 24,925,316 System.Int32[]
509,643 56,399,716 System.String
83,077 65,760,888 System.Reflection.Emit.__FixupData[]
2,158,346 69,067,072 System.Reflection.Emit.GenericMethodInfo
92,388 172,765,392 System.Object[]
796,870 179,587,796 **** FREE ****
14,030,441 336,730,584 System.RuntimeMethodHandle
428,376 348,743,274 System.Byte[]
Whoa, there are actually 14 million System.RuntimeMethodHandle objects. That is very curious, so let me pick a few addresses and see what their reference roots are.
0:214> !dumpgen 2 -type System.RuntimeMethodHandle
Object MT Size Name
-------------------------------------------------------------------
000000bac9f8fe68 00007FFED8DB9978 24 System.RuntimeMethodHandle
000000bac9f8fe80 00007FFED8DB9978 24 System.RuntimeMethodHandle
000000bac9f8fe98 00007FFED8DB9978 24 System.RuntimeMethodHandle
000000bac9f8feb0 00007FFED8DB9978 24 System.RuntimeMethodHandle
000000bac9f8fec8 00007FFED8DB9978 24 System.RuntimeMethodHandle
...
000000baca1af510 00007FFED8DB9978 24 System.RuntimeMethodHandle
000000baca1af548 00007FFED8DB9978 24 System.RuntimeMethodHandle
000000baca1af560 00007FFED8DB9978 24 System.RuntimeMethodHandle
000000baca1afd90 00007FFED8DB9978 24 System.RuntimeMethodHandle
000000baca1afde0 00007FFED8DB9978 24 System.RuntimeMethodHandle
0:214> !gcroot 000000baca1afde0
Thread 187c0:
000000baeeb5cb30 00007ffed8ba9c60 System.Collections.Generic.Dictionary`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib]].FindEntry(System.__Canon)
rsi:
-> 000000baca4c6ef0 System.Collections.Generic.Dictionary`2[[System.Tuple`4[[System.String, mscorlib],[System.String, mscorlib],[System.Int32, mscorlib],[System.Int32, mscorlib]], mscorlib],[System.Delegate, mscorlib]]
-> 000000badab3b8a0 System.Collections.Generic.Dictionary`2+Entry[[System.Tuple`4[[System.String, mscorlib],[System.String, mscorlib],[System.Int32, mscorlib],[System.Int32, mscorlib]], mscorlib],[System.Delegate, mscorlib]][]
-> 000000baca1b5210 System.Func`2[[System.Data.IDataReader, System.Data],[xxx.BAR_AccountDC, xxxx]]
-> 000000baca1ae8b0 System.Reflection.Emit.DynamicMethod+RTDynamicMethod
-> 000000baca1ae7f8 System.Reflection.Emit.DynamicMethod
-> 000000baca1b51b0 System.RuntimeMethodInfoStub
-> 000000baca1b4a18 System.Reflection.Emit.DynamicResolver
-> 000000baca1aea60 System.Reflection.Emit.DynamicScope
-> 000000baca1aea78 System.Collections.Generic.List`1[[System.Object, mscorlib]]
-> 000000baca1b3b20 System.Object[]
-> 000000baca1afde0 System.RuntimeMethodHandle
The reference chain ends up in a Dictionary, so let's dump this dictionary and see what it holds.
0:214> !do 000000baca4c6ef0
Name: System.Collections.Generic.Dictionary`2[[System.Tuple`4[[System.String, mscorlib],[System.String, mscorlib],[System.Int32, mscorlib],[System.Int32, mscorlib]], mscorlib],[System.Delegate, mscorlib]]
MethodTable: 00007ffe7a937f18
EEClass: 00007ffed87f7cb8
Size: 80(0x50) bytes
File: C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
Fields:
MT Field Offset Type VT Attr Value Name
00007ffed8dc3e30 400182a 8 System.Int32[] 0 instance 000000badaaa2c10 buckets
00007ffed9874d38 400182b 10 ...non, mscorlib]][] 0 instance 000000badab3b8a0 entries
00007ffed8dc3e98 400182c 38 System.Int32 1 instance 83000 count
00007ffed8dc3e98 400182d 3c System.Int32 1 instance 83000 version
00007ffed8dc3e98 400182e 40 System.Int32 1 instance -1 freeList
00007ffed8dc3e98 400182f 44 System.Int32 1 instance 0 freeCount
00007ffed8d9a430 4001830 18 ...Canon, mscorlib]] 0 instance 000000baca2ec958 comparer
00007ffed8d9c550 4001831 20 ...Canon, mscorlib]] 0 instance 0000000000000000 keys
00007ffed8dcef28 4001832 28 ...Canon, mscorlib]] 0 instance 0000000000000000 values
00007ffed8dc1c98 4001833 30 System.Object 0 instance 0000000000000000 _syncRoot
0:214> !objsize 000000baca4c6ef0
sizeof(000000baca4c6ef0) = 1116325152 (0x4289c520) bytes (System.Collections.Generic.Dictionary`2[[System.Tuple`4[[System.String, mscorlib],[System.String, mscorlib],[System.Int32, mscorlib],[System.Int32, mscorlib]], mscorlib],[System.Delegate, mscorlib]])
As you can see, the dictionary currently holds count = 83,000 entries with a total retained size of about 1.1 GB, which works out to roughly 13 KB per entry. Next, use !mdt to inspect the contents of the dictionary.
It turns out the items in the dictionary look roughly like key = select * from xxx where AccountNo= xxxx, value = Func.
4. View source code
From the earlier !gcroot output we know the root is held by thread 187c0, so let's switch to that thread and walk its stack to find the source code.
0:089> ~~[187c0]s
ntdll!NtWaitForSingleObject+0xa:
00007ffe`e9ea06fa c3 ret
0:089> !clrstack
OS Thread Id: 0x187c0 (89)
Child SP IP Call Site
000000baeeb5caa0 00007ffed9718dfe System.Tuple`4[[System.__Canon, mscorlib],[System.__Canon, mscorlib],[System.Int32, mscorlib],[System.Int32, mscorlib]].System.Collections.IStructuralEquatable.GetHashCode(System.Collections.IEqualityComparer)
000000baeeb5cb00 00007ffed8ba90a7 System.Collections.Generic.ObjectEqualityComparer`1[[System.__Canon, mscorlib]].GetHashCode(System.__Canon)
000000baeeb5cb30 00007ffed8ba9c60 System.Collections.Generic.Dictionary`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib]].FindEntry(System.__Canon)
000000baeeb5cba0 00007ffed8bc53a4 System.Collections.Generic.Dictionary`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib]].TryGetValue(System.__Canon, System.__Canon ByRef)
000000baeeb5cbe0 00007ffe7abd8069 xxx.Internal.Cache`2[[System.__Canon, mscorlib],[System.__Canon, mscorlib]].Get(System.__Canon, System.Func`1<System.__Canon>)
000000baeeb5cc50 00007ffe7abd6d52 xxx.Database+d__49`1[[System.__Canon, mscorlib]].MoveNext()
000000baeeb5ccd0 00007ffed733768e System.Linq.Enumerable.FirstOrDefault[[System.__Canon, mscorlib]](System.Collections.Generic.IEnumerable`1<System.__Canon>)
000000baeeb5cd40 00007ffe7af5489e xxx.xxx.ParsePapPayDebitRecord(xxx, xxx.BAR_AccountDC ByRef, System.String ByRef, xxx ByRef, Boolean, xxx, Boolean)
...
From the thread stack, the program is performing a db query. The source code is roughly as follows:
public static InvalidTxnCode ParsePapPayDebitRecord(xxx)
{
    // the SQL text is concatenated per account number, so every statement is unique
    string sql = "select * FROM [dbo].[xxx] where AccountNO = '" + transRecord.AccountNO + "'";
    IEnumerable<BAR_AccountDC> enumerable3 = new YiBll<BAR_AccountDC>(transRecord.AccountNO).QuerySql(sql);
}
The problem lies in this SQL string concatenation. To speed up mapping, the underlying framework caches each SQL text together with its corresponding Func<DataReader,Data> mapper. Because every concatenated SQL text is different, the underlying dict keeps growing, all the way up to the current 1.1 GB.
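To make that failure mode concrete, here is a minimal sketch of such a mapper cache, assuming a plain dictionary keyed by the SQL text (the real key in the dump is a Tuple`4 that contains the SQL string, and the real values are delegates emitted via DynamicMethod, but the effect is the same): every distinct AccountNO produces a new SQL text, hence a new entry and a new emitted mapper that is never evicted.
using System;
using System.Collections.Generic;
using System.Data;

// Assumed simplification of the framework's mapper cache seen in the dump:
// one Func<IDataReader, T> is generated per distinct SQL text and cached forever,
// with no size limit and no eviction.
static class MapperCache<T>
{
    private static readonly Dictionary<string, Func<IDataReader, T>> Cache =
        new Dictionary<string, Func<IDataReader, T>>();

    public static Func<IDataReader, T> Get(string sql, Func<Func<IDataReader, T>> factory)
    {
        lock (Cache)
        {
            Func<IDataReader, T> mapper;
            if (!Cache.TryGetValue(sql, out mapper))
            {
                mapper = factory();  // expensive: Reflection.Emit work for every new SQL text
                Cache[sql] = mapper; // "... where AccountNO = '0001'", "... = '0002'", ... pile up here
            }
            return mapper;
        }
    }
}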
Three: Summary
In short, this incident was caused by the ever-growing dict continuously squeezing the small object heap, while the program's many threads kept allocating objects at full speed, so the GC had to collect the small object heap over and over again, which in turn made the RabbitMQ processing too slow.
The fix is to parameterize the SQL.
sql = "select * FROM [dbo].[xxx] where AccountNO = '" + transRecord.AccountNO + "'";
Change to
sql = "select * FROM [dbo].[xxx] where AccountNO = @AccountNO";
Finally, this dump is a bit deceptive: the thread that triggered the GC is not the culprit, it is merely the last straw that broke the camel's back.