One: background
1. Tell a story
A few days ago, a friend of wx asked for help. Its program CPU was often full, and I didn't find the reason. I hope to help.
I have received several dumps with high CPU bursts in a row these days. I am tired of seeing it. I hope that there will be a few other dumps later. From the communication point of view, the performance of this friend is very miserable, which may actually be worse. Now that I have found me, I will try my best to help him find the real murderer behind the scenes, without much to say, go to windbg.
Two: windbg analysis
1. View managed threads
Because threads are supported by cpu, it is also a good idea to start with threads. To view all managed threads of the program, you can use the !t
command.
0:000> !t
ThreadCount: 38
UnstartedThread: 0
BackgroundThread: 34
PendingThread: 0
DeadThread: 3
Hosted Runtime: no
Lock
ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt Exception
0 1 105c 000000000067f600 2a020 Preemptive 0000000000000000:0000000000000000 0000000000671ec0 0 MTA
2 2 13d0 00000000192c4f40 2b220 Preemptive 0000000000000000:0000000000000000 0000000000671ec0 0 MTA (Finalizer)
...
XXXX 15 0 000000001bc64970 8039820 Preemptive 0000000000000000:0000000000000000 0000000000671ec0 0 MTA (Threadpool Completion Port)
24 23 1380 000000001bc660e0 8029220 Preemptive 0000000000000000:0000000000000000 0000000000671ec0 0 MTA (Threadpool Completion Port)
XXXX 53 0 000000001bc63200 8039820 Preemptive 0000000000000000:0000000000000000 0000000000671ec0 0 MTA (Threadpool Completion Port)
XXXX 27 10dc 000000001bd0dbf0 1029220 Preemptive 0000000002CB40F8:0000000002CB4108 0000000000671ec0 1 MTA (GC) (Threadpool Worker)
From the perspective of the hexagram: the program has 38 threads, 3 dead, I went, there was a bright spot, the last line appeared a familiar MTA (GC)
, what does this mean? This means that the current thread triggered the GC, but the strange thing is that the thread that triggered the GC is dead. You must ask how to see it. Look at the XXXX
. Forget it, the dead horse is a living doctor, and the thread is called out. For all managed and unmanaged stacks, look for the words WaitUntilGCComplete
and try_allocate_more_space
2. View the thread stack
To view the thread stacks of all threads, you can use the ~*e !dumpstack
command.
- Search for the
WaitUntilGCComplete
keyword.
From the picture, hey, there are really 18 threads waiting, and I also saw System.String.Concat
. Is it the same as the his
cpu burst height that I posted in the last article? 😘😘😘
- Search for the
try_allocate_more_space
keyword.
I went, there is no try_allocate_more_space
keyword, which is not a routine with his. It is possible that this dump is not a particularly good time, and it is possible that the program is in some weird behavior.
It seems that this path has come to an end, but again, the thread is raised by the cpu, then bite the bullet and see what each thread is doing. In order to make the result clearer, change the command ~*e !clrstack
.
It can be seen from the figure that there are currently 25 threads stuck at FindEntry(System.__Canon)
, and from the call stack, it looks like a dll encapsulated by aliyun. Why are so many threads stuck here? Does this give people a big question mark? Next, I will export this code of Alibaba Cloud to see what happened.
3. View the problem code
To export the problem code, use the classic !ip2md + !savemodule
combination command.
0:000> !ip2md 000007fe9a1a0641
MethodDesc: 000007fe9a5678e0
Method Name: Aliyun.Acs.Core.Utils.CacheTimeHelper.AddLastClearTimePerProduct(System.String, System.String, System.DateTime)
Class: 000007fe9a595a08
MethodTable: 000007fe9a567900
mdToken: 00000000060000a6
Module: 000007fe9a561f58
IsJitted: yes
CodeAddr: 000007fe9a1a0610
Transparency: Critical
0:000> !savemodule 000007fe9a561f58 E:\dumps\AddLastClearTimePerProduct.dll
3 sections in file
section 0 - VA=2000, VASize=14148, FileAddr=200, FileSize=14200
section 1 - VA=18000, VASize=3fc, FileAddr=14400, FileSize=400
section 2 - VA=1a000, VASize=c, FileAddr=14800, FileSize=200
Then use ILSpy to decompile this dll, because it is Alibaba Cloud's code, I can release it with confidence.
// Aliyun.Acs.Core.Utils.CacheTimeHelper
using System;
using System.Collections.Generic;
public class CacheTimeHelper
{
private static Dictionary<string, DateTime> lastClearTimePerProduct = new Dictionary<string, DateTime>();
private const int ENDPOINT_CACHE_TIME = 3600;
public static bool CheckCacheIsExpire(string product, string regionId)
{
string key = product + "_" + regionId;
DateTime dateTime;
if (lastClearTimePerProduct.ContainsKey(key))
{
dateTime = lastClearTimePerProduct[key];
}
else
{
dateTime = DateTime.Now;
lastClearTimePerProduct.Add(key, dateTime);
}
if (3600.0 < (DateTime.Now - dateTime).TotalSeconds)
{
return true;
}
return false;
}
public static void AddLastClearTimePerProduct(string product, string regionId, DateTime lastClearTime)
{
string key = product + "_" + regionId;
if (lastClearTimePerProduct.ContainsKey(key))
{
lastClearTimePerProduct.Remove(key);
}
lastClearTimePerProduct.Add(key, lastClearTime);
}
}
It can be seen that the above code can not if (lastClearTimePerProduct.ContainsKey(key))
. If you chase down, you can refer to the source code of Dictionary.
public class Dictionary<TKey, TValue>
{
// System.Collections.Generic.Dictionary<TKey,TValue>
public bool ContainsKey(TKey key)
{
return FindEntry(key) >= 0;
}
}
CacheTimeHelper
to this point, do you see what is wrong with this 060ab01a2d7d25? Yes, the non-thread-safe Dictionary<string, DateTime>
is used in a multi-threaded environment, which is very problematic.
4. What are the problems with Dictionary?
Dictionary
in a multithreaded environment will definitely cause data confusion. This is undoubtedly, and there will be some exceptions iterations, but if this misuse will cause the CPU to explode, I haven’t seen it in my field of vision. been. . . To be sure, search for
on bing.
Hey, there really is such a case: High CPU in .NET app using a static , and then cut a picture.
From the description of the article, it is exactly the same 😁😁😁, which also concludes that operating the Dictionary in a multi-threaded environment may cause FindEntry(key)
, and then 25 infinite loops will lift the cpu together to supplement the current Full CPU utilization. . .
0:000> !tp
CPU utilization: 100%
Worker Thread: Total: 27 Running: 27 Idle: 0 MaxLimit: 32767 MinLimit: 4
Work Request in Queue: 0
--------------------------------------
Number of Timers: 1
--------------------------------------
Completion Port Thread:Total: 4 Free: 3 MaxFree: 8 CurrentLimit: 3 MaxLimit: 1000 MinLimit: 4
Three: Summary
Since it is a bug in the sdk of Cloud 160ab01a2d7eae, this problem is troublesome. . . You can't change it, and then tell your friend to submit a work order to solve it.
I thought the matter was over like this, but I think about it, other Alibaba Cloud SDKs used a few years ago also encountered a problem similar to the high CPU. Later, I got it through upgrading the SDK. This time I also gambled and watched. Here is the assembly information.
[assembly: CompilationRelaxations(8)]
[assembly: RuntimeCompatibility(WrapNonExceptionThrows = true)]
[assembly: Debuggable(DebuggableAttribute.DebuggingModes.IgnoreSymbolStoreSequencePoints)]
[assembly: TargetFramework(".NETStandard,Version=v2.0", FrameworkDisplayName = "")]
[assembly: AssemblyCompany("Alibaba Cloud")]
[assembly: AssemblyConfiguration("Release")]
[assembly: AssemblyCopyright("©2009-2018 Alibaba Cloud")]
[assembly: AssemblyDescription("Alibaba Cloud SDK for C#")]
[assembly: AssemblyFileVersion("1.1.12.0")]
[assembly: AssemblyInformationalVersion("1.1.12")]
[assembly: AssemblyProduct("aliyun-net-sdk-core")]
[assembly: AssemblyTitle("aliyun-net-sdk-core")]
[assembly: AssemblyVersion("1.1.12.0")]
You can see that your friend is currently using the 1.1.12.0
version, then upgrade aliyun-net-sdk-core to the latest version and see if this CacheTimeHelper has been repaired?
As expected, it has been fixed in the new version, so experience tells me that with Alibaba Cloud's SDK, remember to upgrade frequently, otherwise all kinds of big pits are waiting for you. . . 😔😔😔
More high-quality dry goods: see my GitHub: dotnetfly
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。