I recently ran into a memory-related problem whose debugging turned out to be quite interesting, so I'm writing it down here in the hope that it helps someone.
Problem
The memory usage of one container in a pod kept growing. It started out at about 60Mi, and after two days of running, kubectl top pod --containers reported that it had reached 400Mi. Looking at the same container with docker stats, however, the memory usage was a normal value. For example:
- Via kubectl, memory usage is 19Mi:
[@ ~]$kubectl top pod nginx-deployment-66979f666d-wd24b --containers
POD NAME CPU(cores) MEMORY(bytes)
nginx-deployment-66979f666d-wd24b nginx 500m 19Mi
- Via docker, however, memory usage is a normal 2.461MiB:
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
5d14f804062d k8s_nginx_nginx-deployment-66979f666d-wd24b_default_65cad64e-9696-4a3a-9bc6-08f93c6d263b_0 49.94% 2.461MiB / 20MiB 12.30% 0B / 0B 0B / 0B 4
Debugging
1. Why do kubectl top and docker stats disagree?
Because the two tools count memory differently. The following command dumps the container's detailed memory stats:
curl --unix-socket /var/run/docker.sock "http://localhost/v1.24/containers/5d14f804062d/stats"
{
  "memory_stats": {
    "usage": 20332544,
    "max_usage": 20971520,
    "stats": {
      "active_anon": 1990656,
      "active_file": 17420288,
      "cache": 17608704,
      "dirty": 0,
      "hierarchical_memory_limit": 20971520,
      "hierarchical_memsw_limit": 20971520,
      "inactive_anon": 4096,
      "inactive_file": 184320,
      "mapped_file": 4096,
      "pgfault": 7646701079,
      "pgmajfault": 0,
      "pgpgin": 1265901953,
      "pgpgout": 1265897156,
      "rss": 2039808,
      "rss_huge": 0,
      "total_active_anon": 1990656,
      "total_active_file": 17420288,
      "total_cache": 17608704,
      "total_dirty": 0,
      "total_inactive_anon": 4096,
      "total_inactive_file": 184320,
      "total_mapped_file": 4096,
      "total_pgfault": 7646701079,
      "total_pgmajfault": 0,
      "total_pgpgin": 1265901953,
      "total_pgpgout": 1265897156,
      "total_rss": 2039808,
      "total_rss_huge": 0,
      "total_unevictable": 0,
      "total_writeback": 0,
      "unevictable": 0,
      "writeback": 0
    },
    "failcnt": 10181490,
    "limit": 20971520
  },
}
kubectl top counts "usage": 20332544, i.e. roughly 19Mi. (It is not exactly equal because kubelet actually reports the working set, which is usage minus total_inactive_file.) docker stats counts usage - cache, i.e. 20332544 - 17608704, roughly 2.5MiB.
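These numbers can also be cross-checked directly against the container's memory cgroup on the node. A minimal sketch, assuming cgroup v1 with the memory controller mounted at /sys/fs/cgroup/memory (the exact cgroup path layout depends on your kubelet/docker configuration):

# Locate the container's memory cgroup; its directory is named after the full container ID
CG=$(find /sys/fs/cgroup/memory -type d -name '5d14f804062d*' | head -n 1)

cat "$CG/memory.usage_in_bytes"                                # the "usage" field above
grep -E '^total_(cache|rss|inactive_file) ' "$CG/memory.stat"

# kubectl top reports the working set:  usage - total_inactive_file
# docker stats (cgroup v1) reports:     usage - cache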
Sampling the stats repeatedly also shows that cache keeps growing. So the next questions are: what is cache, and why does it keep growing?
What is cache?
From https://docs.docker.com/confi...:

cache
The amount of memory used by the processes of this control group that can be associated precisely with a block on a block device. When you read from and write to files on disk, this amount increases. This is the case if you use “conventional” I/O (open, read, write syscalls) as well as mapped files (with mmap). It also accounts for the memory used by tmpfs mounts, though the reasons are unclear.
2. Why does cache keep growing?
The container in question is very simple: it only performs some trivial logic and writes log lines to a logfile. Having ruled out the business logic, could the logging be the cause?
Experiments confirmed it: when log writing is paused, the cache stops growing. The logfile lives on an EmptyDir volume, i.e. on the host's disk. docker inspect shows the mount path (a sketch of such an experiment follows the mount listing below):
"Mounts": [
{
"Type": "bind",
"Source": "/var/lib/kubelet/pods/65cad64e-9696-4a3a-9bc6-08f93c6d263b/volumes/kubernetes.io~empty-dir/nginx-vol",
"Destination": "/tmp/memorytest",
"Mode": "Z",
"RW": true,
"Propagation": "rprivate"
},
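To reproduce the effect by hand, something along these lines works. The file name test.log is made up and it assumes the image ships dd; the container ID and mount path are taken from the output above:

# Write a few MB into the EmptyDir-backed directory from inside the container...
docker exec 5d14f804062d sh -c 'dd if=/dev/zero of=/tmp/memorytest/test.log bs=1M count=5'

# ...then sample the memory stats again: the written pages show up under "cache", not "rss"
# (reclaim keeps the total under the cgroup limit). Stop writing and the counter stops growing.
curl --unix-socket /var/run/docker.sock \
    "http://localhost/v1.24/containers/5d14f804062d/stats?stream=false" | grep -o '"cache":[0-9]*'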
So the next question: why does writing logs to disk cause the memory cache to grow?
3. Why does writing to disk increase the memory cache?
The reason is that Linux is borrowing unused memory for disk caching (the page cache). See https://www.linuxatemyram.com... for a detailed explanation.
The important point is that
disk caching only borrows the ram that applications don't currently want. It will not use swap. If applications want more memory, they just take it back from the disk cache. They will not start swapping.
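This is easy to observe on any Linux host. A small sketch (dropping caches requires root; it is harmless, but the dropped data will simply be re-read from disk later):

free -m                             # "buff/cache" looks large, but most of it counts towards "available"
sync                                # flush dirty pages to disk first
echo 3 > /proc/sys/vm/drop_caches   # ask the kernel to drop clean page cache (run as root)
free -m                             # "buff/cache" shrinks, "free" grows, "available" barely changes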
Since logging drives the reported memory usage up, could it reach the memory limit and get the container killed?
4. Will the growing memory usage lead to an OOMKill?
No, for two reasons:
- As described above, when memory approaches the limit, applications simply take it back from the disk cache. In other words, near the limit the logs are still written to disk, but the page cache stops growing.
- The container's resource memory limit in Kubernetes is passed down to docker, equivalent to docker run --memory. That is, OOMKilled only happens when memory is exceeded in docker's sense; reclaimable page cache does not push the container over the limit. A quick way to verify the limit hand-off is shown below.
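As a quick check of that hand-off (a sketch; 20Mi is this example's limit, set via the pod spec's resources.limits.memory):

# The memory limit kubelet passes to docker is visible in the container's HostConfig
docker inspect -f '{{.HostConfig.Memory}}' 5d14f804062d
# 20971520   bytes = 20Mi, the same value as hierarchical_memory_limit in the stats above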
Reference Links:
https://docs.docker.com/confi...
https://www.linuxatemyram.com...
https://www.ibm.com/support/p...