One: Background
1. Tell a story
In mid-July a friend added me on WeChat to ask for help. The memory of his program kept climbing in production and showed no sign of coming back down. He asked how to solve it and sent the screenshot below:
After chatting with him, I got the impression of a small boss in a small county town: a regular life, local resources, all sorts of small connections, and a taste of financial freedom. That is the way of life I have always longed for 😄😄😄.
Since my friend came to me, I had to find a way to solve the problem for him. With memory skyrocketing, my bet was on the managed layer. Let WinDbg do the talking.
Two: WinDbg analysis
1. Managed or unmanaged
Readers who have been following this series will know that I have used !address -summary and !eeheap -gc countless times to determine whether the memory in question belongs to the managed layer or the unmanaged layer.
0:000> !address -summary
--- State Summary ---------------- RgnCount ----------- Total Size -------- %ofBusy %ofTotal
MEM_FREE 393 7dfe`f2105000 ( 125.996 TB) 98.43%
MEM_RESERVE 1691 200`0f1e4000 ( 2.000 TB) 99.81% 1.56%
MEM_COMMIT 6191 0`fed07000 ( 3.981 GB) 0.19% 0.00%
0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x000001D2E572BBC8
generation 1 starts at 0x000001D2E54F70E0
generation 2 starts at 0x000001D252051000
ephemeral segment allocation context: none
segment begin allocated size
000001D252050000 000001D252051000 000001D26204FFE0 0xfffefe0(268431328)
Large object heap starts at 0x000001D262051000
segment begin allocated size
000001D262050000 000001D262051000 000001D2655F3F80 0x35a2f80(56242048)
Total Size: Size: 0xbf4dbf80 (3209543552) bytes.
------------------------------
GC Heap Size: Size: 0xbf4dbf80 (3209543552) bytes.
The process has 3.98 GB of committed memory, and the GC heap accounts for 3209543552 bytes, roughly 3 GB. Clearly this incident belongs to the managed layer.
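As a quick sanity check on those two figures, the arithmetic can be sketched in Python. The hex values are copied from the WinDbg output above; the function name is only illustrative, and the comparison against MEM_COMMIT is a rough rule of thumb rather than an exact accounting.

```python
# Values copied from the WinDbg output above.
MEM_COMMIT_BYTES = 0xFED07000  # MEM_COMMIT total from !address -summary
GC_HEAP_BYTES = 0xBF4DBF80     # GC Heap Size from !eeheap -gc

GIB = 1024 ** 3


def managed_share(gc_heap: int, committed: int) -> float:
    """Return the fraction of committed memory that sits on the GC heap."""
    return gc_heap / committed


print(f"committed: {MEM_COMMIT_BYTES / GIB:.2f} GiB")
print(f"gc heap:   {GC_HEAP_BYTES / GIB:.2f} GiB")
print(f"share:     {managed_share(GC_HEAP_BYTES, MEM_COMMIT_BYTES):.0%}")
```

When the GC heap makes up most of the committed memory, as it does here, the managed layer is the natural place to keep digging.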
2. Look for large objects in the managed layer
C# is a managed language, so every object, useful or not, cannot escape the GC heap. That means the next step is to look at the GC heap and pick out a few large objects.
0:000> !dumpheap -stat
Statistics:
MT Count TotalSize Class Name
00007ff98a68f090 391475 43869284 System.Int32[]
00007ff98b6adfa0 1902760 45666240 System.Collections.ObjectModel.ReadOnlyCollection`1[[System.Linq.Expressions.Expression, System.Linq.Expressions]]
00007ff98b6ac3c0 1951470 46835280 System.Linq.Expressions.ConstantExpression
00007ff98bc452e0 1681178 53797696 System.Linq.Expressions.TypedConstantExpression
00007ff98eacb6b8 1902708 60886656 System.Dynamic.Utils.ListArgumentProvider
00007ff98f236518 1774982 70999280 Microsoft.EntityFrameworkCore.Query.Expressions.ColumnExpression
00007ff98c650c58 1681142 80694816 System.Linq.Expressions.MethodCallExpression3
00007ff98a82bc38 3414094 81938256 System.RuntimeMethodHandle
00007ff98fd96fc0 17750 83936016 System.Collections.Generic.Dictionary`2+Entry[[System.Reflection.MemberInfo, System.Private.CoreLib],[System.Linq.Expressions.Expression, System.Linq.Expressions]][]
00007ff98e5ed5d8 35493 101740504 System.Collections.Generic.Dictionary`2+Entry[[Microsoft.Extensions.DependencyInjection.ServiceLookup.ServiceCacheKey, Microsoft.Extensions.DependencyInjection],[System.Object, System.Private.CoreLib]][]
00007ff98bcff6a8 3639389 116460448 System.Linq.Expressions.PropertyExpression
00007ff98b85cf00 5028347 160907104 System.Reflection.Emit.GenericFieldInfo
00007ff98a671e18 2178117 168395994 System.String
00007ff98a5b6610 160565 171498416 System.Object[]
00007ff98eaa8ab0 4981589 199263560 System.Linq.Expressions.MemberAssignment
00007ff98a672360 398740 391928469 System.Byte[]
00007ff98a746d68 181886 486150592 System.Char[]
On the managed heap there are as many as 4.98 million System.Linq.Expressions.MemberAssignment objects, which is clearly a problem. Judging from the class name, it is probably related to ExpressionTree. Next, take a few of these objects and check whether anything unusually large sits in their reference chain.
0:000> !gcroot 000001d25399f690
HandleTable:
000001D251B715A8 (pinned handle)
-> 000001D262068CF0 System.Object[]
-> 000001D2531C3B78 Microsoft.EntityFrameworkCore.Internal.ServiceProviderCache
-> 000001D25399E3D0 Remotion.Linq.QueryModel
-> 000001D25399E3B8 Remotion.Linq.Clauses.SelectClause
-> 000001D25442C068 System.Linq.Expressions.MemberInitExpression
-> 000001D25442C050 System.Runtime.CompilerServices.TrueReadOnlyCollection`1[[System.Linq.Expressions.MemberBinding, System.Linq.Expressions]]
-> 000001D2539A0290 System.Linq.Expressions.MemberBinding[]
-> 000001D25399F690 System.Linq.Expressions.MemberAssignment
The reference chain is very long, so I have cut it off here. After some digging, I found that the large object is actually Remotion.Linq.Clauses.SelectClause, and !objsize overflows outright on it, which is really odd, as the following output shows:
0:000> !objsize 000001D25399E3B8
sizeof(000001D25399E3B8) = -1187378032 (0xb93a0c90) bytes (Remotion.Linq.Clauses.SelectClause)
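The negative size above looks like a signed 32-bit wraparound: the hex value printed on the same line, read as an unsigned number, is a perfectly plausible size. A minimal sketch of that reinterpretation, assuming the decimal value was printed through a signed 32-bit integer (the helper name is made up for illustration):

```python
def as_unsigned32(value: int) -> int:
    """Reinterpret a signed 32-bit integer as an unsigned one."""
    return value & 0xFFFFFFFF


reported = -1187378032            # decimal value printed by !objsize
actual = as_unsigned32(reported)

print(hex(actual))                # 0xb93a0c90, the hex shown by !objsize
print(f"{actual / 1024 ** 3:.2f} GiB")
```

Read that way, the object graph under SelectClause is close to 2.9 GiB, which is nearly the entire GC heap.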
I was a bit confused: this object is actually the culprit. From the reference chain, it is a component that EF uses to build expression trees. What is certain is that something is wrong with the way my friend uses EF, but I still had to bite the bullet and dig through the char[] arrays. I found that there are a large number of such arrays, and after exporting one it looks like the following.
Logistics.Text30),
| Text31 = string TryReadValue(t1.Outer.Outer, 42, WmsOutboundConfirmLogistics.Text31),
| Text32 = string TryReadValue(t1.Outer.Outer, 43, WmsOutboundConfirmLogistics.Text32),
| Text33 = string TryReadValue(t1.Outer.Outer, 44, WmsOutboundConfirmLogistics.Text33),
| Text34 = string TryReadValue(t1.Outer.Outer, 45, WmsOutboundConfirmLogistics.Text34),
| Text35 = string TryReadValue(t1.Outer.Outer, 46, WmsOutboundConfirmLogistics.Text35),
| IsQueue = Nullable<bool> TryReadValue(t1.Outer.Outer, 47, WmsOutboundConfirmLogistics.IsQueue),
| IsStop = Nullable<bool> TryReadValue(t1.Outer.Outer, 48, WmsOutboundConfirmLogistics.IsStop),
| CheckCode = string TryReadValue(t1.Outer.Outer, 49, WmsOutboundConfirmLogistics.CheckCode),
| ClientCode = string TryReadValue(t1.Outer.Inner, 50, WmsOutboundOrder.ClientCode),
| WarehouseCode = string TryReadValue(t1.Outer.Inner, 51, WmsOutboundOrder.WarehouseCode),
| ErpNumber = string TryReadValue(t1.Outer.Inner, 52, WmsOutboundOrder.ErpNumber),
| OrderCategory = string TryReadValue(t1.Outer.Inner, 53, WmsOutboundOrder.OrderCategory),
| OrderStatus = string TryReadValue(t1.Outer.Inner, 54, WmsOutboundOrder.OrderStatus),
| OrderType = string TryReadValue(t1.Outer.Inner, 55, WmsOutboundOrder.OrderType),
| SendCompany = string TryReadValue(t1.Outer.Inner, 56, WmsOutboundOrder.SendCompany),
| SendName = string TryReadValue(t1.Outer.Inner, 57, WmsOutboundOrder.SendName),
| SendTel = string TryReadValue(t1.Outer.Inner, 58, WmsOutboundOrder.SendTel),
| SendMobile = string TryReadValue(t1.Outer.Inner, 59, WmsOutboundOrder.SendMobile),
| SendProvince = string TryReadValue(t1.Outer.Inner, 60, WmsOutboundOrder.SendProvince),
| SendCity = string TryReadValue(t1.Outer.Inner, 61, WmsOutboundOrder.SendCity),
| SendArea = string TryReadValue(t1.Outer.Inner, 62, WmsOutboundOrder.SendArea),
| ...
| CategoryName = string TryReadValue(t1.Outer.Inner, 88, WmsOutboundOrder.CategoryName),
| SourcePlatformCode = string TryReadValue(t1.Outer.Inner, 89, WmsOutboundOrder.SourcePlatformCode),
| PayMode = (string)string TryReadValue(t1.Outer.Outer, 90, null),
| List = List<WmsOutboundConfirmLogisticsLinesDTO> WmsOutboundConfirmLogisticsBusiness.GetOrderLines(string TryReadValue(t1.Outer.Outer, 5, WmsOutboundConfirmLogistics.OrderNumber)),
| ConfirmTime = DateTime TryReadValue(t1.Inner, 91, WmsOutboundOrderConfirmation.CreateTime),
| ReturnUrl = (string)string TryReadValue(t1.Outer.Outer, 92, null)
| }
|__ ),
|__ contextType: Core.DataRepository.BaseDbContext,
|__ logger: DiagnosticsLogger<Query>,
|__ queryContext: Unhandled parameter: queryContext)
Judging from the content, this is the ExpressionTree of a select statement. I asked my friend, who said it probably belongs to a reporting feature, but that information did not seem to help much. To be honest, I did not know how to continue the investigation and fell into despair.
3. Finding hope in despair
I kept thinking: since EF has built so many ExpressionTrees like this, something must be wrong, but I could not figure out what. After a long while, inspiration struck. If EF has built the tree, the generated SQL has probably been produced as well, so why not search the heap directly for select statements?
0:000> !strings /m:*select*
Address Gen Length Value
000001d2e4de64e0 2 1964 SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4e11e78 2 1964 SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4e3d1f0 2 1964 SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4e673c8 2 1964 SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4e91760 2 1964 SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4ebb2e8 2 1964 SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4ee54f8 2 1964 SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4f10758 2 1964 SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
000001d2e4f398d0 2 1964 SELECT a."Id", a."CreateTime" AS "CreateTime0", a."CreatorId", a."CreatorRealName", a."Deleted", a."OrderNumber", a."CarrierId",...
---------------------------------------
18128 matching strings
Sure enough, a large number of duplicate select statements turned up, and the memory addresses in the leftmost column are very close to each other, which suggests they were generated at the same time in a single operation. I then exported a few of the SQL statements.
SELECT a."Id", ....
FROM "WmsOutboundConfirmLogistics" AS a
INNER JOIN "WmsOutboundOrder" AS b ON a."OrderNumber" = b."OrderNumber"
INNER JOIN "WmsOutboundOrderConfirmation" AS c ON a."OrderNumber" = c."OrderNumber"
WHERE (a."OrderNumber" = @__pagination_OrderNumber_0) AND (b."FreezeStatus" = FALSE)
ORDER BY a."Id";
SELECT a."Id", ....
FROM "WmsOutboundConfirmLogistics" AS a
INNER JOIN "WmsOutboundOrder" AS b ON a."OrderNumber" = b."OrderNumber"
INNER JOIN "WmsOutboundOrderConfirmation" AS c ON a."OrderNumber" = c."OrderNumber"
WHERE (a."OrderNumber" = @__pagination_OrderNumber_0) AND (b."FreezeStatus" = FALSE)
ORDER BY a."Id"
I sent these roughly 18,000 statements to my friend, who said this is the SQL for the report query.
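To put a number on the duplication, the rows printed by !strings can be grouped by their Value column. The sketch below is a hypothetical helper, not part of WinDbg: it assumes the output has been saved as text lines in the "Address Gen Length Value" layout shown above, and it groups on the text as displayed, which may be truncated.

```python
from collections import Counter


def count_duplicate_values(lines):
    """Group !strings rows by their Value column and count repeats."""
    counts = Counter()
    for line in lines:
        parts = line.split(None, 3)  # address, gen, length, value
        if len(parts) < 4:
            continue  # separator rows and the "N matching strings" footer
        _address, _gen, length, value = parts
        if not length.isdigit():
            continue  # header row: "Address Gen Length Value"
        counts[value] += 1
    return counts


sample = [
    "Address Gen Length Value",
    '000001d2e4de64e0 2 1964 SELECT a."Id", a."CreateTime" AS "CreateTime0",...',
    '000001d2e4e11e78 2 1964 SELECT a."Id", a."CreateTime" AS "CreateTime0",...',
    "000001d2e4e3d1f0 2 12 SELECT 1",
    "---------------------------------------",
    "3 matching strings",
]

for value, count in count_duplicate_values(sample).most_common():
    print(count, value[:40])
```

A single value with a count in the thousands is a strong hint that the same statement is being generated over and over.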
4. Putting all the clues together
This is where the big problem shows up. If this is a report query, why are there 18,000 copies of the same SQL whose only difference is the order number bound in a."OrderNumber" = @__pagination_OrderNumber_0? Shouldn't it be a.OrderNumber in (xxxx), or a join between tables? Putting it all together, my conjecture is the following:
-- ideal
select * from a where a.id in (1,2,3)
-- reality
select * from a where a.id=1;
select * from a where a.id=2;
select * from a where a.id=3;
Combined with the closely spaced memory addresses of the SQL strings and the Remotion.Linq.Clauses.SelectClause object whose size went off the charts, the whole process is probably this: what should have been a table join or an in operation turned into countless single-statement queries, causing explosive memory growth inside EF.
Three: Summary
After reading how my friend wrote the EF query, my guess is that most of the ExpressionTree is built by hand to query the database. Seriously impressive 🐂👃, as in the picture below:
The solution is to ask my friend to review how the expression tree is written, or simply to write proper SQL directly. To be honest, this dump took a lot of effort. I thought it would be very simple, but in practice I still ran into some difficulties. Every such experience is a chance to grow!
More quality content: see my GitHub: dotnetfly