I'm trying to troubleshoot extremely long pause times with the CMS collector. I'm on Java 1.6.0u20; an upgrade to 1.7.0u71 is planned, but for now we are stuck on this older version.
I'm wondering if anyone has any insight into these long "real" pauses.
The machine is a VM, but there are only 2 VMs on the ESX host and together they use fewer cores and less RAM than the host has available, so swapping shouldn't be an issue, though I'm not 100% sure. Any tips on running a JVM inside a VM would be appreciated as well.
Increasing the heap doesn't help. We started with 1 GB on the throughput collector and went to 1.5, 2, 4, 5, 6, ... and just last night I increased the heap to 10 GB. The problem persists whether the new generation is sized larger or smaller.
Here is a concurrent mode failure:
2014-11-13T09:36:12.805-0700: 34537.287: [GC 34537.288: [ParNew: 2836628K->2836628K(3058944K), 0.0000296 secs]34537.288: [CMS: 3532075K->1009314K(6989824K), 298.2601836 secs] 6368704K->1009314K(10048768K), [CMS Perm : 454750K->105512K(524288K)], 298.2603873 secs] [Times: user=5.89 sys=31.00, real=297.67 secs]
Total time for which application threads were stopped: 298.2647309 seconds
Here is a promotion failure:
2014-11-13T11:23:30.395-0700: 40974.985: [GC 40974.985: [ParNew (promotion failed)
Desired survivor size 223739904 bytes, new threshold 7 (max 7)
- age 1: 126097168 bytes, 126097168 total
: 3058944K->2972027K(3058944K), 1.6271403 secs]40976.612: [CMS: 6369748K->1735350K(6989824K), 26.6789774 secs] 9103364K->1735350K(10048768K), [CMS Perm : 129283K->105970K(524288K)], 28.3063205 secs] [Times: user=8.05 sys=2.08, real=28.38 secs]
Total time for which application threads were stopped: 28.3069287 seconds
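To put the two full collections side by side, here is a minimal Java sketch that pulls the old generation figures out of the `[CMS: before->after(capacity)` part of each line and turns them into occupancy percentages. The class and method names (`CmsOccupancy`, `parse`, `percent`) are made up for illustration, and the regex only assumes the log layout shown above.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CmsOccupancy {
    // Matches "[CMS: <before>K-><after>K(<capacity>K)". The "[CMS Perm : ..." section
    // does not match because of the " Perm " between "CMS" and the colon.
    private static final Pattern CMS =
            Pattern.compile("\\[CMS: (\\d+)K->(\\d+)K\\((\\d+)K\\)");

    /** Returns {beforeK, afterK, capacityK}, or null if the line has no CMS section. */
    public static long[] parse(String line) {
        Matcher m = CMS.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    /** Occupancy as a percentage of capacity. */
    public static double percent(long usedK, long capacityK) {
        return 100.0 * usedK / capacityK;
    }

    public static void main(String[] args) {
        // Old generation fragments copied from the two log entries above.
        String[] lines = {
            "[CMS: 3532075K->1009314K(6989824K), 298.2601836 secs]",
            "[CMS: 6369748K->1735350K(6989824K), 26.6789774 secs]"
        };
        for (String line : lines) {
            long[] v = parse(line);
            System.out.printf(Locale.ROOT, "old gen: %.1f%% -> %.1f%% of %dK%n",
                    percent(v[0], v[2]), percent(v[1], v[2]), v[2]);
        }
    }
}
```

With the numbers above, the old generation was roughly half full (about 50.5%) before the first collection and about 91% full before the second, yet the first one took more than ten times longer.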
Why are the "real" times so much longer than the CPU (user + sys) times?
[Times: user=5.89 sys=31.00, real=297.67 secs]
[Times: user=8.05 sys=2.08, real=28.38 secs]
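To quantify the gap, here is a minimal Java sketch that parses a `[Times: ...]` entry and subtracts the CPU time (user + sys) from the wall-clock time. The class and method names (`GcTimes`, `parse`, `unaccounted`) are hypothetical, and the regex assumes only the format shown above. As far as I understand it, `user` and `sys` are CPU time consumed during the pause while `real` is elapsed wall-clock time, so a large remainder would mean the GC threads spent most of the pause not running on a CPU at all (for example waiting on paging or on the hypervisor), but treat that reading as an assumption to verify.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcTimes {
    // Matches the "[Times: user=... sys=..., real=... secs]" suffix of a GC log line.
    private static final Pattern TIMES = Pattern.compile(
            "\\[Times: user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+) secs\\]");

    /** Returns {user, sys, real} in seconds, or null if the line has no Times entry. */
    public static double[] parse(String line) {
        Matcher m = TIMES.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new double[] {
            Double.parseDouble(m.group(1)),
            Double.parseDouble(m.group(2)),
            Double.parseDouble(m.group(3))
        };
    }

    /** Wall-clock seconds not covered by CPU time: real - (user + sys). */
    public static double unaccounted(double[] t) {
        return t[2] - (t[0] + t[1]);
    }

    public static void main(String[] args) {
        // The two Times entries quoted in the question.
        String[] lines = {
            "[Times: user=5.89 sys=31.00, real=297.67 secs]",
            "[Times: user=8.05 sys=2.08, real=28.38 secs]"
        };
        for (String line : lines) {
            double[] t = parse(line);
            System.out.printf(Locale.ROOT, "cpu=%.2fs real=%.2fs off-cpu=%.2fs%n",
                    t[0] + t[1], t[2], unaccounted(t));
        }
    }
}
```

For the two entries above, this leaves roughly 261 of 298 seconds and 18 of 28 seconds unaccounted for by CPU time. The first entry is also unusual in that `sys` (31.00) is far larger than `user` (5.89).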