One: Background

1. Tell a story

The day before yesterday, a friend came to me again. Last time I solved his high-CPU problem, and he seems to trust me now. This time a different program of his has a memory leak, and he hoped I could help diagnose it.

This friend is actually a strong engineer, so if he is sending me a dump, the problem must be a hard one and I had better be mentally prepared. From our conversation, the symptom is roughly this: the program's memory grows slowly until the process blows itself up. That is all there is to it. Time to bring out my trusty tool, WinDbg.

Two: WinDbg analysis

1. Where is the leak?

I have said in many previous articles that with this kind of memory leak, the first thing to determine is whether the leak is in the managed heap or the unmanaged heap. If it is the latter, in most cases you can only raise your hands in surrender, because those waters run deep. Don't be fooled by demos that allocate unmanaged memory with AllocHGlobal and then track it down with !heap; that is child's play, and reality is far more complicated.

Next, use !address -summary to look at the committed memory of the current process.


0:000> !address -summary

--- Usage Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
Free                                    345     7dfd`ca3ca000 ( 125.991 TB)           98.43%
<unknown>                             37399      201`54dbf000 (   2.005 TB)  99.83%    1.57%
Heap                                  29887        0`d179b000 (   3.273 GB)   0.16%    0.00%
Image                                  1312        0`0861b000 ( 134.105 MB)   0.01%    0.00%
Stack                                   228        0`06e40000 ( 110.250 MB)   0.01%    0.00%
Other                                    10        0`001d8000 (   1.844 MB)   0.00%    0.00%
TEB                                      76        0`00098000 ( 608.000 kB)   0.00%    0.00%
PEB                                       1        0`00001000 (   4.000 kB)   0.00%    0.00%

--- Type Summary (for busy) ------ RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_MAPPED                              352      200`00a40000 (   2.000 TB)  99.57%    1.56%
MEM_PRIVATE                           67249        2`2cbcb000 (   8.699 GB)   0.42%    0.01%
MEM_IMAGE                              1312        0`0861b000 ( 134.105 MB)   0.01%    0.00%

--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_FREE                                345     7dfd`ca3ca000 ( 125.991 TB)           98.43%
MEM_RESERVE                           11805      200`22ae8000 (   2.001 TB)  99.60%    1.56%
MEM_COMMIT                            57108        2`1313e000 (   8.298 GB)   0.40%    0.01%

Judging from the output, the process has committed MEM_COMMIT = 8.298 GB. Next, look at the size of the managed heap with the !eeheap -gc command.


0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x0000027795928060
generation 1 starts at 0x000002779572F0D0
generation 2 starts at 0x000002763DCE1000

Total Size:              Size: 0xcd28c510 (3442001168) bytes.
------------------------------
GC Heap Size:    Size: 0xcd28c510 (3442001168) bytes.

As the last line shows, the GC heap is 3442001168 / 1024 / 1024 / 1024 ≈ 3.2 GB. In other words, roughly 8.3 GB - 3.2 GB ≈ 5 GB of committed memory lives outside the managed heap. Damn, a typical unmanaged memory leak. It is exactly the case I was hoping not to get, and this time I might really be stuck.
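The subtraction can be checked directly from the hex sizes printed by the two commands. The following is a small Python sketch; the variable and function names are only illustrative, and the values are copied from the output above.

```python
# Sizes copied from the WinDbg output above (hex values as printed).
mem_commit = 0x2_1313e000   # MEM_COMMIT from !address -summary (2`1313e000)
gc_heap = 0xcd28c510        # GC Heap Size from !eeheap -gc

GIB = 1024 ** 3


def to_gib(n_bytes):
    """Convert a byte count to units of 1024^3 bytes (what WinDbg labels GB)."""
    return n_bytes / GIB


# Committed memory that is not part of the managed (GC) heap.
outside_gc = mem_commit - gc_heap

print(f"committed : {to_gib(mem_commit):.3f} GB")
print(f"GC heap   : {to_gib(gc_heap):.3f} GB")
print(f"outside GC: {to_gib(outside_gc):.3f} GB")
```

The remainder is only an upper bound on the leak: it also includes normal non-GC memory such as native heaps, stacks, and loaded images.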

2. Look for unmanaged memory leaks

Besides the GC heap, the process also has a loader heap, which holds many things. It includes the high-frequency heap, low-frequency heap, stub heap, JIT heap, and so on, and stores information about AppDomains, modules, method descriptors, method tables, EEClass, and the like. In my experience, the loader heap is a common place for unmanaged leaks in a .NET process. You can inspect it with the !eeheap -loader command.


0:000> !eeheap -loader
...
Module 00007ffe2b1b6ca8: Size: 0x0 (0) bytes.
Module 00007ffe2b1b7e80: Size: 0x0 (0) bytes.
Module 00007ffe2b1b9058: Size: 0x0 (0) bytes.
Module 00007ffe2b1ba230: Size: 0x0 (0) bytes.
Module 00007ffe2b1bb408: Size: 0x0 (0) bytes.
Module 00007ffe2b1bc280: Size: 0x0 (0) bytes.
Module 00007ffe2b1bd458: Size: 0x0 (0) bytes.
Module 00007ffe2b1be630: Size: 0x0 (0) bytes.
Module 00007ffe2b1bf808: Size: 0x0 (0) bytes.
Module 00007ffe2b1f0a50: Size: 0x0 (0) bytes.
Module 00007ffe2b1f1c28: Size: 0x0 (0) bytes.
Module 00007ffe2b1f2aa0: Size: 0x0 (0) bytes.
Total size:      Size: 0x0 (0) bytes.
--------------------------------------
Total LoaderHeap size:   Size: 0xc0fb9000 (3237711872) bytes total, 0x5818000 (92372992) bytes wasted.

I was not prepared for what this command produced: the WinDbg window kept scrolling for several minutes before it stopped. Two things can be read from the output:

  • Total loader heap size: 3237711872 / 1024 / 1024 / 1024 ≈ 3.01 GB
  • A huge number of modules have been created, tens of thousands by my estimate.

To satisfy my curiosity, I decided to write a small script to count exactly how many modules there are.
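The original script is not shown. One way to do the count, assuming the command output has been saved to a text file (for example with WinDbg's .logopen), is to count the Module lines. Below is a minimal Python sketch; count_modules and the sample data are hypothetical names introduced here for illustration.

```python
import re

# Matches lines such as "Module 00007ffe2b1b6ca8: Size: 0x0 (0) bytes."
MODULE_LINE = re.compile(r"^\s*Module\s+([0-9a-fA-F]{8,16}):\s+Size:")


def count_modules(lines):
    """Count distinct module addresses in '!eeheap -loader' output lines."""
    addresses = set()
    for line in lines:
        match = MODULE_LINE.match(line)
        if match:
            addresses.add(match.group(1).lower())
    return len(addresses)


# A few lines taken from the output above, used as a self-check.
sample = """\
Module 00007ffe2b1f0a50: Size: 0x0 (0) bytes.
Module 00007ffe2b1f1c28: Size: 0x0 (0) bytes.
Module 00007ffe2b1f2aa0: Size: 0x0 (0) bytes.
Total size:      Size: 0x0 (0) bytes.
""".splitlines()

print(count_modules(sample))  # prints 3
```

The function only needs an iterable of lines, so for a real dump an open log file can be passed in directly. Using a set guards against the same module being counted twice if the log contains repeated output.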

Wow, there are about 190,000 modules. No wonder they take up more than 3 GB. The truth feels close now. The next question: what are these modules, and where do they come from?
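A rough back-of-the-envelope check shows how these numbers fit together. The sketch below assumes the module count is about 190,000, as reported above; the exact figure is not given, so the per-module average is only approximate.

```python
# Totals copied from the last line of the !eeheap -loader output above.
loader_heap_bytes = 0xc0fb9000   # Total LoaderHeap size
wasted_bytes = 0x5818000         # "bytes wasted"
module_count = 190_000           # approximate count reported in the text

# Average loader-heap footprint attributed to each dynamic module.
per_module_kib = loader_heap_bytes / module_count / 1024

print(f"loader heap: {loader_heap_bytes / 1024**3:.2f} GB")
print(f"wasted     : {wasted_bytes / 1024**2:.1f} MB")
print(f"per module : {per_module_kib:.1f} KB on average")
```

Each module is small on its own; it is the sheer count that adds up to gigabytes.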

3. Find the source of the module

To find the source, think it through: modules nest as Module -> Assembly -> AppDomain, so inspecting the AppDomain may tell us more. Next, use !DumpDomain to dump all the application domains of the current process. After a few more minutes of scrolling, here is the result. Screenshot below:

As the figure shows, there are a large number of assemblies of type Dynamic. What does that mean? These are assemblies created dynamically by code, about 190,000 of them. The next question to answer: how are these assemblies created?

4. Export module content

Long-time readers know how I export problem code from a module: just look up the module's start address. Here I pick one of the modules: 00007ffe2b1f2aa0.


2:2:152> !dumpmodule 00007ffe2b1f2aa0
Name: Unknown Module
Attributes:              Reflection SupportsUpdateableMethods IsDynamic IsInMemory 
Assembly:                000002776c1d8470
BaseAddress:             0000000000000000
PEFile:                  000002776C1D8BF0
ModuleId:                00007FFE2B1F2EB8
ModuleIndex:             00000000000177CF
LoaderHeap:              0000000000000000
TypeDefToMethodTableMap: 00007FFE2B1EE8C0
TypeRefToMethodTableMap: 00007FFE2B1EE8E8
MethodDefToDescMap:      00007FFE2B1EE910
FieldDefToDescMap:       00007FFE2B1EE960
MemberRefToDescMap:      0000000000000000
FileReferencesMap:       00007FFE2B1EEA00
AssemblyReferencesMap:   00007FFE2B1EEA28

Ugh, BaseAddress is empty, which is really unlucky: it means the module cannot be exported. That makes sense on reflection, since the module is dynamically generated, and perhaps even the person who wrote the code does not know what is in it. Is there really no way forward? As the saying goes, there is always a way out. The !dumpmodule command has an -mt (method table) parameter that shows which types are in the current module. This is a major clue.


||2:2:152> !dumpmodule -mt 00007ffe2b1f2aa0 
Name: Unknown Module
Attributes:              Reflection SupportsUpdateableMethods IsDynamic IsInMemory 
Assembly:                000002776c1d8470

Types defined in this module

              MT          TypeDef Name
------------------------------------------------------------------------------
00007ffe2b1f3168 0x02000002 <Unloaded Type>
00007ffe2b1f2f60 0x02000003 <Unloaded Type>

Types referenced in this module

              MT            TypeRef Name
------------------------------------------------------------------------------
00007ffdb9f70af0 0x02000001 System.Object
00007ffdbaed3730 0x02000002 Castle.DynamicProxy.IProxyTargetAccessor
00007ffdbaec8f98 0x02000003 Castle.DynamicProxy.ProxyGenerationOptions
00007ffdbaec7fe8 0x02000004 Castle.DynamicProxy.IInterceptor

The output shows two types defined in the module, each with its own method table address. Next, use the mt to get the md (method descriptors), for example with !dumpmt -md, and arrive at the final module content.

At this point it finally clicked: my friend used Castle to implement AOP, and the AOP was presumably not used correctly, which generated 190,000+ dynamic assemblies. No wonder memory eventually blew up. The root cause has been found; how should it be fixed?

5. Fix the problematic Castle AOP code

This one stumped me, since I have never really used Castle 😥😥😥. But as usual, I searched Bing to see whether anyone else had suffered the same fate, and sure enough there is an article about Castle AOP causing a memory leak: Castle Windsor Interceptor memory leak. It also provides a solution; screenshot below:

I quickly sent the link to my friend. That was as far as I could help; the rest was up to luck.

Three: Summary

As luck would have it, my friend fixed it quickly, finished his own testing, and deployed that same night.


I asked him how he changed it, and he shared the source code without hesitation. Sure enough, he followed the article's suggestion and made ProxyGenerator static; otherwise, every new instance generates a new assembly. Here is the code before the change; screenshot below:

After solving these two hard problems for him, don't you think I deserve a small trophy? 😕😕😕
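Why a static (shared) ProxyGenerator stops the growth can be illustrated with a toy model. The Python sketch below is only an analogy, not Castle's API: ToyProxyGenerator and OrderService are hypothetical classes, and the sketch assumes, as the fix suggests, that generated proxy types are cached per generator instance, so a fresh generator per call can never hit the cache.

```python
class ToyProxyGenerator:
    """Hypothetical stand-in: caches generated proxy types per instance."""

    def __init__(self):
        self._cache = {}

    def proxy_type_for(self, target_type):
        # Generate a new proxy type only on a cache miss.
        if target_type not in self._cache:
            name = target_type.__name__ + "Proxy"
            self._cache[target_type] = type(name, (target_type,), {})
        return self._cache[target_type]


class OrderService:
    """Hypothetical class that gets proxied."""


# Plays the role of the static field: one generator for the whole process.
SHARED = ToyProxyGenerator()


def distinct_types_new_generator_each_call(calls):
    """Leaky pattern: every call starts with an empty cache."""
    return len({ToyProxyGenerator().proxy_type_for(OrderService)
                for _ in range(calls)})


def distinct_types_shared_generator(calls):
    """Fixed pattern: every call reuses the same cache."""
    return len({SHARED.proxy_type_for(OrderService) for _ in range(calls)})


print(distinct_types_new_generator_each_call(1000))  # prints 1000
print(distinct_types_shared_generator(1000))         # prints 1
```

In the toy model the number of generated types grows with the number of calls in the first pattern and stays constant in the second, which mirrors the difference between the code before and after the change.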

More high-quality content: see my GitHub: dotnetfly


一线码农