One: Background
1. Tell a story
A few days ago a friend added me on WeChat and said his program's memory had surged, asking how to analyze it.
After chatting with him it turned out this dump also came from a HIS (hospital information system). As he put it, I really do seem to be tangled up with hospital systems lately 🤣🤣🤣. Fine by me, it saves me hunting for material 😁😁😁. Enough chit-chat, let's talk to WinDbg.
Two: WinDbg analysis
1. Managed or unmanaged?
Since memory is skyrocketing, let's first see how much memory the current process has committed.
0:000> !address -summary
--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_FREE 174 7ffe`baac0000 ( 127.995 TB) 100.00%
MEM_COMMIT 1153 1`33bd3000 ( 4.808 GB) 94.59% 0.00%
MEM_RESERVE 221 0`1195d000 ( 281.363 MB) 5.41% 0.00%
You can see the process has committed roughly 4.8 GB. Next, let's look at the managed heap.
0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x00000207a4fc48c8
generation 1 starts at 0x00000207a3dc3138
generation 2 starts at 0x0000020697fc1000
ephemeral segment allocation context: none
------------------------------
GC Heap Size: Size: 0x1241b3858 (4900730968) bytes.
The last line shows that the managed heap occupies 4,900,730,968 / 1024 / 1024 / 1024 ≈ 4.56 GB. Comparing the two figures, almost all of the committed memory is managed heap, so this is a managed-memory problem, which is the easier kind to handle...
2. View the managed heap
Since the memory is being eaten by the managed heap, let's see what's actually sitting on it.
0:000> !dumpheap -stat
Statistics:
MT Count TotalSize Class Name
...
00007ffd00397b98 1065873 102323808 System.Data.DataRow
00000206978b8250 1507805 223310768 Free
00007ffd20d216b8 4668930 364025578 System.String
00007ffd20d22aa8 797 403971664 System.String[]
00007ffd20d193d0 406282 3399800382 System.Byte[]
Total 9442152 objects
One look and I was shocked: System.Byte[] alone occupies almost 3.3 GB, which means the GC heap has pretty much been eaten up by it. Experience says something big is hiding behind this. So how do we analyze it? Besides writing a script to iterate over every byte[], is there a purely manual technique? Of course: you can use !heapstat to see which generations these objects live in.
0:000> !heapstat
Heap Gen0 Gen1 Gen2 LOH
Heap0 2252000 18880400 3968704192 910894376
Free space: Percentage
Heap0 43128 770160 185203264 39849984   SOH: 4% LOH: 4%
The output shows that the bulk of the data sits in Gen2, so next we can use !eeheap -gc to find the segment address ranges that make up Gen2 and narrow down what we dump from the heap.
0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x00000207a4fc48c8
generation 1 starts at 0x00000207a3dc3138
generation 2 starts at 0x0000020697fc1000
ephemeral segment allocation context: none
segment begin allocated size
0000020697fc0000 0000020697fc1000 00000206a7fbec48 0xfffdc48(268426312)
00000206bbeb0000 00000206bbeb1000 00000206cbeaef50 0xfffdf50(268427088)
00000206ccc40000 00000206ccc41000 00000206dcc3f668 0xfffe668(268428904)
00000206dcc40000 00000206dcc41000 00000206ecc3f098 0xfffe098(268427416)
0000020680000000 0000020680001000 000002068ffff8c0 0xfffe8c0(268429504)
00000206ff4d0000 00000206ff4d1000 000002070f4cf588 0xfffe588(268428680)
000002070f4d0000 000002070f4d1000 000002071f4cf9f0 0xfffe9f0(268429808)
000002071f4d0000 000002071f4d1000 000002072f4cfef0 0xfffeef0(268431088)
000002072f4d0000 000002072f4d1000 000002073f4cf748 0xfffe748(268429128)
000002073f4d0000 000002073f4d1000 000002074f4ce900 0xfffd900(268425472)
00000207574d0000 00000207574d1000 00000207674cfe70 0xfffee70(268430960)
00000207674d0000 00000207674d1000 00000207774ceaf8 0xfffdaf8(268425976)
00000207774d0000 00000207774d1000 00000207874cf270 0xfffe270(268427888)
00000207874d0000 00000207874d1000 00000207974cf7a8 0xfffe7a8(268429224)
00000207974d0000 00000207974d1000 00000207a51ea5a8 0xdd195a8(231839144)
The ephemeral segment — the one whose range contains the gen0 and gen1 start addresses, here the last one listed — holds gen0 and gen1, while the remaining segments belong to gen2. Next I pick the segment 00000206dcc41000 - 00000206ecc3f098 and use !dumpheap to list all objects in that range.
0:000> !dumpheap -stat 00000206dcc41000 00000206ecc3f098
Statistics:
MT Count TotalSize Class Name
00007ffd00397b98 191803 18413088 System.Data.DataRow
00007ffd20d216b8 662179 37834152 System.String
00007ffd20d193d0 23115 187896401 System.Byte[]
Within this one segment there are about 23k Byte[] instances — not a huge number, so let's dump them all and see what they look like.
0:000> !dumpheap -mt 00007ffd20d193d0 00000206dcc41000 00000206ecc3f098
Address MT Size
00000206dcc410e8 00007ffd20d193d0 8232
00000206dcc43588 00007ffd20d193d0 8232
00000206dcc45a48 00007ffd20d193d0 8232
00000206dcc47d78 00007ffd20d193d0 8232
00000206dcc4a028 00007ffd20d193d0 8232
00000206dcc4c4b0 00007ffd20d193d0 8232
00000206dcc4eb08 00007ffd20d193d0 8232
00000206dcc50e88 00007ffd20d193d0 8232
00000206dcc535b0 00007ffd20d193d0 8232
00000206dcc575d8 00007ffd20d193d0 8232
00000206dcc5a5a8 00007ffd20d193d0 8232
00000206dcc5cbf8 00007ffd20d193d0 8232
00000206dcc5eef8 00007ffd20d193d0 8232
00000206dcc611f8 00007ffd20d193d0 8232
00000206dcc634e8 00007ffd20d193d0 8232
00000206dcc657f0 00007ffd20d193d0 8232
00000206dcc67af8 00007ffd20d193d0 8232
00000206dcc69e00 00007ffd20d193d0 8232
...
Wow, 99% of them are 8,232 bytes each, i.e. roughly 8 KB buffers. (If most of the 406,282 Byte[] instances are this size, that alone accounts for roughly 3.3 GB, which lines up with the total we saw earlier.) So who is holding on to them? Use !gcroot to trace the reference root.
0:000> !gcroot 00000206dcc410e8
Thread 8c1c:
rsi:
-> 00000206983d5730 System.ServiceProcess.ServiceBase[]
...
-> 000002069dcb6d38 OracleInternal.ConnectionPool.OraclePool
...
-> 000002069dc949c0 OracleInternal.TTC.OraBufReader
-> 000002069dc94a70 System.Collections.Generic.List`1[[OracleInternal.Network.OraBuf, Oracle.ManagedDataAccess]]
-> 00000206ab8c2200 OracleInternal.Network.OraBuf[]
-> 00000206dcc41018 OracleInternal.Network.OraBuf
-> 00000206dcc410e8 System.Byte[]
Judging from the reference chain, these byte[] buffers are held by an OracleInternal.Network.OraBuf[] array backing a List<OraBuf> inside OracleInternal.TTC.OraBufReader, which is rather puzzling. Could this be a bug in the Oracle managed driver? Curiosity piqued, let's look at the element count and size of that array.
0:000> !do 00000206ab8c2200
Name: OracleInternal.Network.OraBuf[]
MethodTable: 00007ffcc7833c68
EEClass: 00007ffd20757728
Size: 4194328(0x400018) bytes
Array: Rank 1, Number of elements 524288, Type CLASS (Print Array)
Fields:
None
0:000> !objsize 00000206ab8c2200
sizeof(00000206ab8c2200) = -1086824024 (0xbf3861a8) bytes (OracleInternal.Network.OraBuf[])
The array holds 524,288 elements, and !objsize even prints a negative total 😓 — the true transitive size is 0xbf3861a8 ≈ 3.2 GB, which simply overflows the signed 32-bit decimal display.
3. Find the problem code
Now that the symptom is clear, the next step is to decompile the Oracle SDK with ILSpy and match the types from the reference chain against the decompiled source.
It turns out that the m_tempOBList field is the culprit behind the memory surge, which is rather awkward. But why does it grow, and why is it never released? Since I'm not familiar with Oracle's driver internals, I turned to the ever-reliable StackOverflow, and sure enough there is a matching question: "Huge managed memory allocation when reading (iterating) data with DbDataReader". In short, the growth is caused by a bug in the Oracle SDK when reading Clob-type fields, and the fix is simple: release the clob right after use (see the code in the summary at the end).
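To make the mechanism easier to picture, here is a rough illustrative sketch of the growth pattern (my own simplification, not the actual OraBufReader source — a plain byte[] stands in for OraBuf): every chunk read for a Clob gets parked in a temporary list that keeps growing while the reader is in use.

using System.Collections.Generic;

// Illustrative sketch only -- NOT the real OracleInternal.TTC.OraBufReader code.
// It just shows the growth pattern: each ~8 KB chunk read for a Clob is added to
// a temporary list that is not trimmed while the big read is still running.
class OraBufReaderSketch
{
    private readonly List<byte[]> m_tempOBList = new List<byte[]>();

    public byte[] ReadNextChunk()
    {
        var chunk = new byte[8192];   // one network buffer per Clob chunk
        m_tempOBList.Add(chunk);      // accumulates for the lifetime of the read
        return chunk;
    }
}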
4. Find the truth
Since the post says the problem occurs when reading Clob columns, let's pull up all the thread stacks and see whether any of them shows traces of Clob handling.
Sure enough, the thread stack shows the code calling a ToDataTable method to convert the IDataReader into a DataTable, and while the large fields are read during that conversion, GetCompleteClobData naturally appears on the stack — a perfect match for what the post describes. To make the conclusion more solid, let's dig out how many rows the current DataReader has already read.
0:028> !clrstack -a
OS Thread Id: 0xbab0 (28)
000000e78ef7d520 00007ffd00724458 System.Data.DataTable.Load(System.Data.IDataReader, System.Data.LoadOption, System.Data.FillErrorEventHandler)
PARAMETERS:
this = <no data>
reader (<CLR reg>) = 0x00000206a530ac20
loadOption = <no data>
errorHandler = <no data>
0:028> !do 0x00000206a530ac20
Name: Oracle.ManagedDataAccess.Client.OracleDataReader
MethodTable: 00007ffcc7933b10
EEClass: 00007ffcc78efd30
Size: 256(0x100) bytes
File: D:\xxx.dll
Fields:
00007ffd20d23e98 4000337 d0 System.Int32 1 instance 1061652 m_RowNumber
m_RowNumber shows that 1,061,652 rows have already been read. Pulling back a million-plus records in one go is not exactly common, and doing it with large Clob fields on top is quite something 🐂👃.
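Based on the DataTable.Load frame in the stack, the application-side helper presumably looks something like the sketch below (the real ToDataTable isn't visible in the dump, so the name and shape are an assumed reconstruction):

using System.Data;

// Assumed shape of the application's ToDataTable helper, inferred from the
// DataTable.Load frame on the thread stack. Loading the whole reader pulls
// every row -- and every Clob -- into memory at once.
static class DataReaderExtensionsSketch
{
    public static DataTable ToDataTable(this IDataReader reader)
    {
        var table = new DataTable();
        table.Load(reader);   // materializes all ~1.06 million rows in this dump
        return table;
    }
}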
Three: Summary
Putting it all together, this incident was caused by reading millions of rows containing large Clob fields into a DataTable in one go. The fix is simple: iterate the DataReader yourself and release each OracleClob immediately after processing it. Code adapted from the post:
// Inside the reader loop: fetch the provider-specific value, use the clob,
// then close it immediately so the driver can release its internal buffers.
var item = oracleDataReader.GetOracleValue(columnIndex);
if (item is OracleClob clob)
{
    // use clob.Value ...
    clob.Close();
}
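For completeness, here is a minimal end-to-end sketch of streaming the result set instead of materializing it into a DataTable; the connection string, SQL text and column handling are placeholders rather than the app's real code.

using Oracle.ManagedDataAccess.Client;
using Oracle.ManagedDataAccess.Types;

// Minimal sketch: stream the rows and dispose each OracleClob immediately.
static class ClobSafeReaderSketch
{
    public static void ProcessRows(string connectionString)
    {
        using (var conn = new OracleConnection(connectionString))
        using (var cmd = new OracleCommand("SELECT id, big_clob_column FROM some_table", conn))
        {
            conn.Open();
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    for (int i = 0; i < reader.FieldCount; i++)
                    {
                        object item = reader.GetOracleValue(i);
                        if (item is OracleClob clob)
                        {
                            string text = clob.Value;  // consume the clob content
                            clob.Close();              // release the driver's internal buffers right away
                        }
                    }
                    // process the current row here instead of buffering everything in a DataTable
                }
            }
        }
    }
}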
For more quality content, see my GitHub: dotnetfly