One: background
1. Tell a story
I haven't written a blog post for about two months. Friends who follow me know that I have been putting my energy into my "planet" community recently. Over these two months, people have kept asking me for help with analyzing dumps. Some friends were too polite and even sent a big red envelope, haha 😅. I now have more than 10 dumps of different problem types on hand, and I will share the analysis ideas one by one in follow-up posts.
This dump was given to me by a friend about a month ago. Since many friends ask for help over wx, I couldn't find the relevant chat screenshots this time, so I had to break my usual habit of opening with them. 😭😭😭
My friend said the API interface had stopped responding and appeared to hang. From past experience, there are probably only three possible causes.
- A lot of lock waiting
- Not enough threads
- Deadlock
With these preconceptions in mind, let's open windbg and let it do the talking.
Two: windbg analysis
1. Is there a lot of lock waiting?
To check for lock waits, as usual, look at the sync block table.
0:000> !syncblk
Index SyncBlock MonitorHeld Recursion Owning Thread Info SyncBlock Owner
-----------------------------
Total 1673
CCW 3
RCW 4
ComClassFactory 0
Free 397
The table is empty, so there is nothing to find here. Let's take the brute-force approach and look at all the thread stacks.
I got a shock when I looked: 339 threads are stuck at System.Threading.Monitor.ObjWait(Boolean, Int32, System.Object). But on reflection, even with 339 threads stuck there, would that really cause the program to hang? Not necessarily. After all, I have seen programs with 1000+ threads that did not hang, although the CPU was very high. Next, let's determine whether there are not enough threads, starting from the thread pool task queue.
2. Explore the thread pool queue
You can use the !tp command to view it.
0:000> !tp
CPU utilization: 10%
Worker Thread: Total: 328 Running: 328 Idle: 0 MaxLimit: 32767 MinLimit: 4
Work Request in Queue: 74
Unknown Function: 00007ffe91cc17d0 Context: 000001938b5d8d98
Unknown Function: 00007ffe91cc17d0 Context: 000001938b540238
Unknown Function: 00007ffe91cc17d0 Context: 000001938b5eec08
...
Unknown Function: 00007ffe91cc17d0 Context: 0000019390552948
Unknown Function: 00007ffe91cc17d0 Context: 0000019390562398
Unknown Function: 00007ffe91cc17d0 Context: 0000019390555b30
--------------------------------------
Number of Timers: 0
--------------------------------------
Completion Port Thread:Total: 5 Free: 4 MaxFree: 8 CurrentLimit: 4 MaxLimit: 1000 MinLimit: 4
From the output, all 328 thread pool worker threads are running (Idle: 0), and 74 guests are waiting in the work queue. Combining these two pieces of information makes the picture clear: this hang happened because a large number of guests arrived and exceeded the reception capacity of the thread pool.
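The same kind of numbers that !tp reports can also be sampled from inside a running process. Below is a minimal sketch, not part of the original analysis: the class and method names are made up for illustration, and "busy" is simply the maximum worker count minus the available worker count, which only approximates the "Running" figure shown by !tp.

```csharp
using System;
using System.Threading;

// Hypothetical helper: samples thread pool usage from inside the process.
public static class ThreadPoolSnapshot
{
    // Worker threads currently in use = max workers - available workers.
    public static int BusyWorkerThreads()
    {
        ThreadPool.GetMaxThreads(out int maxWorkers, out _);
        ThreadPool.GetAvailableThreads(out int freeWorkers, out _);
        return maxWorkers - freeWorkers;
    }

    // One-line summary suitable for a periodic log entry.
    public static string Describe()
    {
        ThreadPool.GetMinThreads(out int minWorkers, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);
        ThreadPool.GetAvailableThreads(out int freeWorkers, out int freeIo);

        return "Worker: busy=" + (maxWorkers - freeWorkers)
             + " min=" + minWorkers + " max=" + maxWorkers
             + " | IOCP: busy=" + (maxIo - freeIo)
             + " min=" + minIo + " max=" + maxIo;
    }
}
```

Logging ThreadPoolSnapshot.Describe() every few seconds would show the busy worker count climbing well before the service stops responding, which can help decide when to capture a dump.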
3. Is the reception capacity really bad?
I think this heading asks a good question: is the capacity really that bad? We can judge it from two angles:
- Is the code badly written?
- Does the QPS really exceed the reception capacity?
To find out, start with the 339 stuck threads and carefully study the call stack of each one. They are mostly stuck in these three places.
<1>. GetModel
public static T GetModel<T, K>(string url, K content)
{
    T result = default(T);
    HttpClientHandler httpClientHandler = new HttpClientHandler();
    httpClientHandler.AutomaticDecompression = DecompressionMethods.GZip;
    HttpClientHandler handler = httpClientHandler;
    using (HttpClient httpClient = new HttpClient(handler))
    {
        string content2 = JsonConvert.SerializeObject((object)content);
        HttpContent httpContent = new StringContent(content2);
        httpContent.Headers.ContentType = new MediaTypeHeaderValue("application/json");
        string mD5ByCrypt = Md5.GetMD5ByCrypt(ConfigurationManager.AppSettings["SsoToken"] + DateTime.Now.ToString("yyyyMMdd"));
        httpClient.DefaultRequestHeaders.Add("token", mD5ByCrypt);
        httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
        HttpResponseMessage result2 = httpClient.PostAsync(url, httpContent).Result;
        if (result2.IsSuccessStatusCode)
        {
            string result3 = result2.Content.ReadAsStringAsync().Result;
            return JsonConvert.DeserializeObject<T>(result3);
        }
        return result;
    }
}
<2>. Get
public static T Get<T>(string url, string serviceModuleName)
{
    try
    {
        T val3 = default(T);
        HttpClient httpClient = TryGetClient(serviceModuleName, true);
        using (HttpResponseMessage httpResponseMessage = httpClient.GetAsync(GetRelativeRquestUrl(url, serviceModuleName, true)).Result)
        {
            if (httpResponseMessage.IsSuccessStatusCode)
            {
                string result = httpResponseMessage.Content.ReadAsStringAsync().Result;
                if (!string.IsNullOrEmpty(result))
                {
                    val3 = JsonConvert.DeserializeObject<T>(result);
                }
            }
        }
        return val3;
    }
    catch (Exception exception)
    {
        throw;
    }
}
<3>. GetStreamByApi
public static Stream GetStreamByApi<T>(string url, T content)
{
    Stream result = null;
    HttpClientHandler httpClientHandler = new HttpClientHandler();
    httpClientHandler.AutomaticDecompression = DecompressionMethods.GZip;
    HttpClientHandler handler = httpClientHandler;
    using (HttpClient httpClient = new HttpClient(handler))
    {
        httpClient.DefaultRequestHeaders.Accept.Add(new MediaTypeWithQualityHeaderValue("application/octet-stream"));
        string content2 = JsonConvert.SerializeObject((object)content);
        HttpContent httpContent = new StringContent(content2);
        httpContent.Headers.ContentType = new MediaTypeHeaderValue("application/json");
        HttpResponseMessage result2 = httpClient.PostAsync(url, httpContent).Result;
        if (result2.IsSuccessStatusCode)
        {
            result = result2.Content.ReadAsStreamAsync().Result;
        }
        httpContent.Dispose();
        return result;
    }
}
4. Find the truth
I have listed the code of the three methods above. Can you see what the problem is? Yes: asynchronous methods are being called synchronously, with each call blocking on .Result. This style is inherently inefficient, mainly in two respects.
- Every blocked call pins a thread, so the pool has to keep creating threads, and creating and destroying threads is a relatively resource-consuming and inefficient operation.
- Frequent thread scheduling puts heavy pressure on the CPU.
This style shows no problems when the request volume is small. Once the request volume grows even a little, you end up having to capture a dump.
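To make the cost of blocking concrete, here is a small, hedged experiment. It is not taken from the dump: Task.Delay stands in for the HTTP call, the class and method names are invented, and the 500 ms / 150 ms timings are arbitrary. It starts several simulated I/O calls on the thread pool and samples how many worker threads are occupied while they are in flight, once with a blocking wait and once with await.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo: compares worker thread usage of blocking vs. awaiting.
public static class BlockingDemo
{
    private static int BusyWorkers()
    {
        ThreadPool.GetMaxThreads(out int max, out _);
        ThreadPool.GetAvailableThreads(out int free, out _);
        return max - free;
    }

    // Starts `count` simulated I/O calls (500 ms each) and returns the number of
    // busy worker threads sampled 150 ms in, while every call is still pending.
    public static int BusyWorkersDuring(bool blockOnResult, int count)
    {
        // Raise the minimum so the pool can start `count` threads without delay.
        ThreadPool.GetMinThreads(out int minWorkers, out int minIo);
        ThreadPool.SetMinThreads(Math.Max(minWorkers, count), minIo);

        Task[] calls = Enumerable.Range(0, count).Select(i => Task.Run(async () =>
        {
            Task io = Task.Delay(500);   // stand-in for PostAsync/GetAsync
            if (blockOnResult)
                io.Wait();               // sync over async: the worker thread is held
            else
                await io;                // the worker thread goes back to the pool
        })).ToArray();

        Thread.Sleep(150);
        int busy = BusyWorkers();

        Task.WaitAll(calls);
        ThreadPool.SetMinThreads(minWorkers, minIo);  // restore the original minimum
        return busy;
    }
}
```

Calling BlockingDemo.BusyWorkersDuring(true, 8) should typically report about one busy worker per call, while BlockingDemo.BusyWorkersDuring(false, 8) should report close to none. Exact numbers depend on the machine and runtime version.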
Three: Summary
Taken together, this hang was caused by the developer calling asynchronous methods in a synchronous, blocking way. The fix is simple: convert the call chain to pure async (async, await), which frees the calling thread while the request is in flight and makes full use of the capabilities of the device driver.
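As an illustration only, an async version of GetModel might look like the sketch below. Several things here are assumptions rather than the original author's code: the names ApiClient, GetModelAsync and CannedResponseHandler are made up, System.Text.Json stands in for the original's JsonConvert so that the sketch has no external dependency, a single shared HttpClient replaces the per-call instance, and the application-specific token header is left out.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

public static class ApiClient
{
    // Assumption: one shared client per process instead of one per call.
    private static readonly HttpClient SharedClient = new HttpClient(
        new HttpClientHandler { AutomaticDecompression = DecompressionMethods.GZip });

    public static Task<T> GetModelAsync<T, K>(string url, K content)
    {
        return GetModelAsync<T, K>(SharedClient, url, content);
    }

    // Overload that accepts the client, so a stubbed handler can be injected.
    public static async Task<T> GetModelAsync<T, K>(HttpClient client, string url, K content)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Post, url))
        {
            request.Content = new StringContent(JsonSerializer.Serialize(content));
            request.Content.Headers.ContentType = new MediaTypeHeaderValue("application/json");
            request.Headers.Accept.Add(new MediaTypeWithQualityHeaderValue("application/json"));
            // The original "token" header is omitted here; with a shared client it
            // would be set per request via request.Headers.Add("token", ...).

            // await returns the thread to the pool while the request is pending.
            using (HttpResponseMessage response = await client.SendAsync(request).ConfigureAwait(false))
            {
                if (!response.IsSuccessStatusCode)
                {
                    return default(T);
                }
                string body = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
                return JsonSerializer.Deserialize<T>(body);
            }
        }
    }
}

// Hypothetical test double: answers every request with a fixed JSON body,
// so the method above can be exercised without a network.
public sealed class CannedResponseHandler : HttpMessageHandler
{
    private readonly string _json;

    public CannedResponseHandler(string json) { _json = json; }

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent(_json)
        };
        return Task.FromResult(response);
    }
}
```

The conversion only pays off if every caller up the stack also uses await, for example `var model = await ApiClient.GetModelAsync<MyModel, MyQuery>(url, query);` inside an async controller action (MyModel and MyQuery are placeholders). A single remaining .Result anywhere in the chain would block a thread again.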
This dump also reminds me of the example in the book CLR via C# (P646, 647), where await and async are used to transform synchronous requests.
I think this dump is the best proof of this example! 😄😄😄
For more high-quality content, see my GitHub: dotnetfly