Troubleshooting a ~40-second minor collection
Bengt Rutisson
bengt.rutisson at oracle.com
Mon Dec 2 23:46:38 PST 2013
Hi Aaron,
I just had a quick look at the GC log and one thing that sticks out is
the user vs. real time data for the long GCs that you mention.
On 2013-12-03 06:12, Aaron Daubman wrote:
>
> Would it help to narrow the GC logs I have to around the time of
> the long minor GC events and include them?
>
>
> I've attached (trying one more time with higher bz2 compression to get
> through the 100K limit) the log for the 12minute period where the two
> largest pauses occurred and Context Switching became high on one of
> the servers.
> Over this 12 minutes GCViewer reports there were a total of 243 pauses
> causing 10.2s total pause time.
> Max pause was 3.45s and the max pause interval was 5.5s
>
> The log starts off close to the largest 3.45s collection time:
> 2013-12-02T16:28:37.838-0500: 2060.900: [GC pause (young) 2060.900:
> [G1Ergonomics (CSet Construction) start choosing CSet, predicted base
> time: 5.92 ms, remaining time: 44.08 ms, target pause time: 50.00 ms]
> 2064.332: [G1Ergonomics (CSet Construction) add young regions to
> CSet, eden: 1238 regions, survivors: 8 regions, predicted young region
> time: 13.50 ms]
> 2064.332: [G1Ergonomics (CSet Construction) finish choosing CSet,
> eden: 1238 regions, survivors: 8 regions, old: 0 regions, predicted
> pause time: 19.42 ms, target pause time: 50.00 ms]
> , 3.45061000 secs]
This GC has this user/real time info:
[Times: user=0.17 sys=0.00, real=3.45 secs]
That means that during those 3.45 seconds of wall-clock time, all of the VM threads combined only got scheduled to actually run on the CPUs for 0.17 seconds. So it looks like the OS kept the VM threads scheduled out for most of that pause. The most likely reason is that the system is running too many other applications at the same time, so the GC threads have to compete for CPU.
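If you want to spot more pauses like this in the full log, one rough way (not from the thread, just a sketch) is to scan for the "[Times: ...]" summaries and flag collections where the real time is much larger than the user time. A minimal example in Java, assuming the log lines look like the ones quoted above; the file name and the threshold are arbitrary:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcTimesCheck {
    // Matches lines like: [Times: user=0.17 sys=0.00, real=3.45 secs]
    private static final Pattern TIMES = Pattern.compile(
            "\\[Times: user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+) secs\\]");

    public static void main(String[] args) throws IOException {
        String logFile = args.length > 0 ? args[0] : "gc.log"; // assumed file name
        for (String line : Files.readAllLines(Paths.get(logFile))) {
            Matcher m = TIMES.matcher(line);
            if (!m.find()) {
                continue;
            }
            double user = Double.parseDouble(m.group(1));
            double real = Double.parseDouble(m.group(3));
            // Flag pauses where wall-clock time dwarfs CPU time,
            // i.e. the GC threads spent most of the pause off-CPU.
            if (real > 0.1 && user < 0.5 * real) {
                System.out.printf("suspicious pause: user=%.2fs real=%.2fs%n", user, real);
            }
        }
    }
}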
>
> It finishes near the second largest time:
> 2013-12-02T16:40:43.116-0500: 2786.177: [GC pause (young) 2786.177:
> [G1Ergonomics (CSet Construction) start choosing CSet, predicted base
> time: 5.88 ms, remaining time: 44.12 ms, target pause time: 50.00 ms]
> 2786.177: [G1Ergonomics (CSet Construction) add young regions to
> CSet, eden: 1196 regions, survivors: 4 regions, predicted young region
> time: 21.35 ms]
> 2788.769: [G1Ergonomics (CSet Construction) finish choosing CSet,
> eden: 1196 regions, survivors: 4 regions, old: 0 regions, predicted
> pause time: 27.23 ms, target pause time: 50.00 ms]
> , 2.60856700 secs]
The user/real time info for this GC is:
[Times: user=0.14 sys=0.00, real=2.61 secs]
So, the issue seems to be the same.
For your normal young collections the user/real time looks more like this:
[Times: user=0.12 sys=0.00, real=0.01 secs]
That makes more sense. You are using 23 GC threads, so when they actually get
scheduled to run in parallel, the accumulated user time is higher than the real (wall-clock) time.
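As a rough sanity check: 23 GC threads running in parallel for about 0.01 seconds of wall-clock time can accumulate up to roughly 23 * 0.01 = 0.23 seconds of CPU time, so the observed user=0.12 is what you would expect when the threads really do get CPU during the pause. In the 3.45 second pause above, user=0.17 means the threads were off-CPU for almost the entire pause.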
Hths,
Bengt
>
> This was all from minor (young) collections.
>
> Thanks again,
> Aaron
>
>