cms yonggc频率突然暴增,cpu被打满---如何定位是什么原因导致的这个现象?

1.昨天中午服务器监控提示:服务器xx.xx.xx.xx机器cpu使用率连续36次超过20%
2.服务器基础信息:javaweb项目,提供查询,导出等功能。
3.监控信息如下:
yonggc监控:

clipboard.png
堆内存监控:

clipboard.png
cpu监控:

clipboard.png
full-gc监控:

clipboard.png
机器运行实例监控

clipboard.png

当时机器的gclog:
以上是监控信息~
由于其他同事在14:32时手动执行了一次fullgc导致,内存的一回收,才恢复平静。
当时,启动参数:-Xms4g -Xmx4g -XX:PermSize=300m -XX:MaxPermSize=300m -XX:NewSize=2g -XX:MaxNewSize=2g -XX:SurvivorRatio=20 -XX:MaxTenuringThreshold=10 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:+CMSParallelRemarkEnabled -XX:+ScavengeBeforeFullGC -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingPermOccupancyFraction=70 -XX:+ExplicitGCInvokesConcurrent -XX:CMSFullGCsBeforeCompaction=5 -XX:+CMSClassUnloadingEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=30 -Xloggc:/tomcat7-gc-cms.log -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintReferenceGC -XX:+PrintTenuringDistribution
通过查看tomcat7-gc-cms.log分析过程:
14:23有一次fullgc log位置:
fullgc--start

2018-04-16T14:23:45.747+0800: 356595.145: [CMS-concurrent-mark: 1.508/1.538 secs] [Times: user=12.08 sys=0.00, real=1.54 secs] 
2018-04-16T14:23:45.748+0800: 356595.145: [CMS-concurrent-preclean-start]
2018-04-16T14:23:45.748+0800: 356595.145: [Preclean SoftReferences, 0.0005960 secs]2018-04-16T14:23:45.748+0800: 356595.145: [Preclean WeakReferences, 0.0001810 secs]2018-04-16T14:23:45.748+0800: 356595.146: [Preclean FinalReferences, 0.0000650 secs]2018-04-16T14:23:45.748+0800: 356595.146: [Preclean PhantomReferences, 0.0000160 secs]2018-04-16T14:23:45.756+0800: 356595.153: [CMS-concurrent-preclean: 0.008/0.009 secs] [Times: user=0.07 sys=0.00, real=0.01 secs] 
2018-04-16T14:23:45.756+0800: 356595.154: [CMS-concurrent-abortable-preclean-start]
{Heap before GC invocations=652 (full 1):
 par new generation   total 2001856K, used 1915155K [0x00000006ed400000, 0x000000076d400000, 0x000000076d400000)
  eden space 1906560K, 100% used [0x00000006ed400000, 0x00000007619e0000, 0x00000007619e0000)
  from space 95296K,   9% used [0x00000007619e0000, 0x0000000762244da8, 0x00000007676f0000)
  to   space 95296K,   0% used [0x00000007676f0000, 0x00000007676f0000, 0x000000076d400000)
 concurrent mark-sweep generation total 2097152K, used 629223K [0x000000076d400000, 0x00000007ed400000, 0x00000007ed400000)
 concurrent-mark-sweep perm gen total 307200K, used 143890K [0x00000007ed400000, 0x0000000800000000, 0x0000000800000000)
2018-04-16T14:23:45.976+0800: 356595.373: [GC2018-04-16T14:23:45.976+0800: 356595.373: [ParNew2018-04-16T14:23:45.988+0800: 356595.385: [SoftReference, 0 refs, 0.0000510 secs]2018-04-16T14:23:45.988+0800: 356595.385: [WeakReference, 25 refs, 0.0000140 secs]2018-04-16T14:23:45.988+0800: 356595.385: [FinalReference, 45 refs, 0.0000290 secs]2018-04-16T14:23:45.988+0800: 356595.385: [PhantomReference, 0 refs, 0.0000100 secs]2018-04-16T14:23:45.988+0800: 356595.385: [JNI Weak Reference, 0.0000100 secs]
Desired survivor size 48791552 bytes, new threshold 10 (max 10)
- age   1:    3537312 bytes,    3537312 total
- age   2:    1668536 bytes,    5205848 total
- age   3:     337152 bytes,    5543000 total
- age   4:      14496 bytes,    5557496 total
- age   5:      16456 bytes,    5573952 total
- age   6:      10904 bytes,    5584856 total
- age   7:      12664 bytes,    5597520 total
- age   8:      14056 bytes,    5611576 total
- age   9:      11808 bytes,    5623384 total
- age  10:      18184 bytes,    5641568 total
: 1915155K->8115K(2001856K), 0.0122020 secs] 2544378K->637347K(4099008K), 0.0124670 secs] [Times: user=0.04 sys=0.00, real=0.01 secs] 
Heap after GC invocations=653 (full 1):
 par new generation   total 2001856K, used 8115K [0x00000006ed400000, 0x000000076d400000, 0x000000076d400000)
  eden space 1906560K,   0% used [0x00000006ed400000, 0x00000006ed400000, 0x00000007619e0000)
  from space 95296K,   8% used [0x00000007676f0000, 0x0000000767edcca0, 0x000000076d400000)
  to   space 95296K,   0% used [0x00000007619e0000, 0x00000007619e0000, 0x00000007676f0000)
 concurrent mark-sweep generation total 2097152K, used 629232K [0x000000076d400000, 0x00000007ed400000, 0x00000007ed400000)
 concurrent-mark-sweep perm gen total 307200K, used 143890K [0x00000007ed400000, 0x0000000800000000, 0x0000000800000000)
}
2018-04-16T14:23:46.346+0800: 356595.743: [CMS-concurrent-abortable-preclean: 0.274/0.590 secs] [Times: user=4.61 sys=0.00, real=0.59 secs] 
2018-04-16T14:23:46.351+0800: 356595.748: [GC[YG occupancy: 1249269 K (2001856 K)]{Heap before GC invocations=653 (full 1):
 par new generation   total 2001856K, used 1249269K [0x00000006ed400000, 0x000000076d400000, 0x000000076d400000)
  eden space 1906560K,  65% used [0x00000006ed400000, 0x00000007390109d8, 0x00000007619e0000)
  from space 95296K,   8% used [0x00000007676f0000, 0x0000000767edcca0, 0x000000076d400000)
  to   space 95296K,   0% used [0x00000007619e0000, 0x00000007619e0000, 0x00000007676f0000)
 concurrent mark-sweep generation total 2097152K, used 629232K [0x000000076d400000, 0x00000007ed400000, 0x00000007ed400000)
 concurrent-mark-sweep perm gen total 307200K, used 143890K [0x00000007ed400000, 0x0000000800000000, 0x0000000800000000)
2018-04-16T14:23:46.351+0800: 356595.748: [GC2018-04-16T14:23:46.351+0800: 356595.748: [ParNew2018-04-16T14:23:46.363+0800: 356595.760: [SoftReference, 0 refs, 0.0000360 secs]2018-04-16T14:23:46.363+0800: 356595.760: [WeakReference, 24 refs, 0.0000140 secs]2018-04-16T14:23:46.363+0800: 356595.760: [FinalReference, 48 refs, 0.0000200 secs]2018-04-16T14:23:46.363+0800: 356595.760: [PhantomReference, 0 refs, 0.0000100 secs]2018-04-16T14:23:46.363+0800: 356595.760: [JNI Weak Reference, 0.0000090 secs]
Desired survivor size 48791552 bytes, new threshold 10 (max 10)
- age   1:    2807512 bytes,    2807512 total
- age   2:    1240272 bytes,    4047784 total
- age   3:     487568 bytes,    4535352 total
- age   4:      86712 bytes,    4622064 total
- age   5:       8384 bytes,    4630448 total
- age   6:      12944 bytes,    4643392 total
- age   7:       9152 bytes,    4652544 total
- age   8:      10872 bytes,    4663416 total
- age   9:      11560 bytes,    4674976 total
- age  10:       8456 bytes,    4683432 total
: 1249269K->5485K(2001856K), 0.0116790 secs] 1878501K->634726K(4099008K), 0.0119130 secs] [Times: user=0.04 sys=0.00, real=0.02 secs] 
Heap after GC invocations=654 (full 1):
 par new generation   total 2001856K, used 5485K [0x00000006ed400000, 0x000000076d400000, 0x000000076d400000)
  eden space 1906560K,   0% used [0x00000006ed400000, 0x00000006ed400000, 0x00000007619e0000)
  from space 95296K,   5% used [0x00000007619e0000, 0x0000000761f3b450, 0x00000007676f0000)
  to   space 95296K,   0% used [0x00000007676f0000, 0x00000007676f0000, 0x000000076d400000)
 concurrent mark-sweep generation total 2097152K, used 629241K [0x000000076d400000, 0x00000007ed400000, 0x00000007ed400000)
 concurrent-mark-sweep perm gen total 307200K, used 143890K [0x00000007ed400000, 0x0000000800000000, 0x0000000800000000)
}
2018-04-16T14:23:46.363+0800: 356595.760: [Rescan (parallel) , 0.0037940 secs]2018-04-16T14:23:46.367+0800: 356595.764: [weak refs processing2018-04-16T14:23:46.367+0800: 356595.764: [SoftReference, 14978 refs, 0.0015960 secs]2018-04-16T14:23:46.368+0800: 356595.766: [WeakReference, 449 refs, 0.0000620 secs]2018-04-16T14:23:46.369+0800: 356595.766: [FinalReference, 1472 refs, 0.0051820 secs]2018-04-16T14:23:46.374+0800: 356595.771: [PhantomReference, 1 refs, 0.0000130 secs]2018-04-16T14:23:46.374+0800: 356595.771: [JNI Weak Reference, 0.0000100 secs], 0.0069140 secs]2018-04-16T14:23:46.374+0800: 356595.771: [class unloading, 0.0269340 secs]2018-04-16T14:23:46.401+0800: 356595.798: [scrub symbol table, 0.0164990 secs]2018-04-16T14:23:46.417+0800: 356595.815: [scrub string table, 0.0036960 secs] [1 CMS-remark: 629241K(2097152K)] 634726K(4099008K), 0.0799200 secs] [Times: user=0.12 sys=0.00, real=0.08 secs] 
2018-04-16T14:23:46.431+0800: 356595.828: [CMS-concurrent-sweep-start]
2018-04-16T14:23:46.802+0800: 356596.199: [CMS-concurrent-sweep: 0.363/0.371 secs] [Times: user=2.89 sys=0.00, real=0.37 secs] 
2018-04-16T14:23:46.802+0800: 356596.199: [CMS-concurrent-reset-start]
2018-04-16T14:23:46.835+0800: 356596.233: [CMS-concurrent-reset: 0.033/0.033 secs] [Times: user=0.27 sys=0.00, real=0.04 secs] 
**fullgc-end**
通过fullgc后的一次yonggc分析发现:
{Heap before GC invocations=654 (full 1):
 par new generation   total 2001856K, used 1912541K [0x00000006ed400000, 0x000000076d400000, 0x000000076d400000)
  eden space 1906560K, 100% used [0x00000006ed400000, 0x00000007619e0000, 0x00000007619e0000)
  from space 95296K,   6% used [0x00000007619e0000, 0x0000000761fb7548, 0x00000007676f0000)
  to   space 95296K,   0% used [0x00000007676f0000, 0x00000007676f0000, 0x000000076d400000)
 concurrent mark-sweep generation total 2097152K, used 259145K [0x000000076d400000, 0x00000007ed400000, 0x00000007ed400000)
 concurrent-mark-sweep perm gen total 307200K, used 143429K [0x00000007ed400000, 0x0000000800000000, 0x0000000800000000)
2018-04-16T14:23:47.037+0800: 356596.434: [GC2018-04-16T14:23:47.037+0800: 356596.434: [ParNew2018-04-16T14:23:47.049+0800: 356596.446: [SoftReference, 0 refs, 0.0000370 secs]2018-04-16T14:23:47.049+0800: 356596.446: [WeakReference, 26 refs, 0.0000130 secs]2018-04-16T14:23:47.049+0800: 356596.446: [FinalReference, 43 refs, 0.0000470 secs]2018-04-16T14:23:47.049+0800: 356596.446: [PhantomReference, 0 refs, 0.0000090 secs]2018-04-16T14:23:47.049+0800: 356596.446: [JNI Weak Reference, 0.0000100 secs]
Desired survivor size 48791552 bytes, new threshold 10 (max 10)
- age   1:    4083168 bytes,    4083168 total
- age   2:    1029504 bytes,    5112672 total
- age   3:     362720 bytes,    5475392 total
- age   4:      28200 bytes,    5503592 total
- age   5:      56616 bytes,    5560208 total
- age   6:       8384 bytes,    5568592 total
- age   7:      12944 bytes,    5581536 total
- age   8:       7328 bytes,    5588864 total
- age   9:       8472 bytes,    5597336 total
- age  10:      11560 bytes,    5608896 total
: 1912541K->8400K(2001856K), 0.0122020 secs] **2171686K->267554K**(4099008K), 0.0124480 secs] [Times: user=0.05 sys=0.00, real=0.01 secs] 
Heap after GC invocations=655 (full 1):
 par new generation   total 2001856K, used 8400K [0x00000006ed400000, 0x000000076d400000, 0x000000076d400000)
  eden space 1906560K,   0% used [0x00000006ed400000, 0x00000006ed400000, 0x00000007619e0000)
  from space 95296K,   8% used [0x00000007676f0000, 0x0000000767f24158, 0x000000076d400000)
  to   space 95296K,   0% used [0x00000007619e0000, 0x00000007619e0000, 0x00000007676f0000)
 concurrent mark-sweep generation total 2097152K, used 259153K [0x000000076d400000, 0x00000007ed400000, 0x00000007ed400000)
 concurrent-mark-sweep perm gen total 307200K, used 143429K [0x00000007ed400000, 0x0000000800000000, 0x0000000800000000)
}

老年代确实降下来了:2171686K->267554K
但是新生代,为何频繁回收呢?
当时线程gclog-链接: https://pan.baidu.com/s/1jz5I... 密码: ddrz

暂时自己的解决办法是扩大新生代为3g
有没有遇到过这类问题的?

阅读 6k
4 个回答

1、从第一图和第三图的曲线看,CPU高-->频繁ygc造成
2、通常调整survivorRatio的效益不是那么明显,整体收益不如调整新生代大小。从gc日志看,年老代最大的时候才使用600多M,是可以选择扩大新生代大小再观察下是否还有问题。
3、新生代频繁gc,这个跟你的服务性质有关。提供查询,导出等功能,导出功能的话,看实现的机制,如果是一次性load到内存计算,最后再一次性写到response,那么这段期间会占用内存,如果同一时刻再多来几次同类的导出请求,则新生代垃圾回收压力大。从年老代使用峰值看,没有内存泄露,也就是这些对象基本在几次ygc之后就回收了。于是就是新生代不断产生对象,不断回收部分,不断晋升到年老代。之所以频繁ygc,主要是垃圾回收速度跟不上新生对象创建速度导致。

新生代回收频繁比较正常,朝生夕死。
看到有个参数:
-XX:SurvivorRatio=20
Survivor区是否有点小了?

survivor 是故意设置小一点,让回收不掉的对象,加速进入到老年代

光看内存日志几乎不可能找到原因。必须结合访问日志和业务日志,定位问题可能出现在哪块业务逻辑上,然后再去在代码中排查。

撰写回答
你尚未登录,登录后可以
  • 和开发者交流问题的细节
  • 关注并接收问题和回答的更新提醒
  • 参与内容的编辑和改进,让解决方法与时俱进
推荐问题