ridiculous ParNew pause times
Vitaly Davidovich
vitalyd at gmail.com
Wed Apr 9 15:40:03 UTC 2014
That's an interesting theory. Does the jvm fsync log writes? If it doesn't
then I'd expect the write syscalls to not actually block. Perhaps there's
some contention in the kernel though?
Cornelius, maybe you can strace the jvm and see what syscalls are made in
this scenario?
Sent from my phone
On Apr 9, 2014 11:29 AM, "Wolfgang Pedot" <wolfgang.pedot at finkzeit.at>
wrote:
> A crazy idea: could the gc-logging be the problem here because the
> redis-dump is using up all the IO bandwidth?
>
> Wolfgang
>
> Am 09.04.2014 17:06, schrieb Vitaly Davidovich:
>
>> Interesting. How much RAM does this machine have? You're using about
>> 21-22g for jvm and another 5g for redis, right? Is there room to spare
>> in physical memory?
>>
>> If there's no swap activity, then I don't think there's any I/O
>> interference. I'm guessing that redis writing out 5g of memory is
>> blowing the cpu caches for the jvm, and causing page faults. Although
>> if there's no swap activity, it seems these faults would be soft and
>> it's hard to explain how that would balloon to 30 sec pauses and not
>> show up in sys time then.
>>
>> Are there lots of context switches in the java process when this happens?
>>
>> Sent from my phone
>>
>> On Apr 9, 2014 10:41 AM, "Cornelius Riemenschneider" <cri at itscope.de
>> <mailto:cri at itscope.de>> wrote:
>>
>> Hey,____
>>
>> thanks for the hints :-)____
>>
>> The server runs one JVM and one redis instance (A event-based,
>> single threaded in-memory nosql-datastore).____
>>
>> Redis stores about 5G of data, which are written to disk from time
>> to time - it now turns out, that the redis saves align perfectly
>> with our long ParNew times.____
>>
>> By initiating a redis save I was able to get these garbage
>> collections.____
>>
>> 2014-04-09T15:06:07.892+0200: 85043.119:
>> [GC2014-04-09T15:06:18.089+0200: 85053.315: [ParNew:
>> 2964426K->338263K(3145728K), 0.1532950 secs]
>> 7167668K->4541505K(21495808K), 10.3502410 secs] [Times: user=1.93
>> sys=0.01, real=10.35 secs]____
>>
>> 2014-04-09T15:06:39.203+0200: 85074.429:
>> [GC2014-04-09T15:07:11.026+0200: 85106.252: [ParNew:
>> 3145728K->405516K(3145728K), 0.3429470 secs]
>> 7645283K->5212053K(21495808K), 32.1663210 secs] [Times: user=2.96
>> sys=0.01, real=32.17 secs]____
>>
>> I monitored the system during the GCs and swapping definitly does
>> not happen.____
>>
>> Because redis is single-threaded (and only used by the JVM which is
>> during ParNew in a STW phase), during ParNew there is only redis
>> active, trying to write 5G of memory to disk as fast as possible and
>> the GC.____
>>
>> I wasn't able to pinpoint the issue yet, do you have an idea why the
>> jvm could block on I/O during a GC?____
>>
>> Disk access is during these phases probably way slower than usual,
>> as everything is on the / partition, a RAID 1 disk array.____
>>
>> Or does anyone know the right perf options for the perf tool to
>> monitor blocking i/o during garbage collections?____
>>
>> __ __
>>
>> Regards,____
>>
>> Cornelius Riemenschneider____
>>
>> --____
>>
>> ITscope GmbH____
>>
>> Ludwig-Erhard-Allee 20____
>>
>> 76131 Karlsruhe____
>>
>> Email: cornelius.riemenschneider at itscope.de
>> <mailto:cornelius.riemenschneider at itscope.de>____
>>
>> https://www.itscope.com____
>>
>> Handelsregister: AG Mannheim, HRB 232782____
>>
>> Sitz der Gesellschaft: Karlsruhe____
>>
>> Geschäftsführer: Alexander Münkel, Benjamin Mund, Stefan Reger____
>>
>> __ __
>>
>> *Von:*Vitaly Davidovich [mailto:vitalyd at gmail.com
>> <mailto:vitalyd at gmail.com>]
>> *Gesendet:* Mittwoch, 9. April 2014 14:49
>> *An:* Cornelius Riemenschneider
>> *Cc:* hotspot-gc-use
>> *Betreff:* Re: ridiculous ParNew pause times____
>>
>> __ __
>>
>> This is typically caused by one of:____
>>
>> 1) heavy swapping (this won't count towards sys time because kernel
>> is not using cpu while waiting for I/O to complete)
>> 2) oversubscribed machine where gc threads aren't getting enough cpu
>> time due____
>>
>> Have you looked at stats on the machine when these pauses occur,
>> specifically around swap activity? Is your machine running multiple
>> JVMs or any other noisy neighbor apps?____
>>
>> Sent from my phone____
>>
>> On Apr 9, 2014 7:26 AM, "Cornelius Riemenschneider" <cri at itscope.de
>> <mailto:cri at itscope.de>> wrote:____
>>
>> Hello,____
>>
>> we've got the following problem with the ParNew collector:____
>>
>> Our log.gc usually looks like this:____
>>
>> 2014-04-09T12:58:02.485+0200: 77357.712:
>> [GC2014-04-09T12:58:02.485+0200: 77357.712: [ParNew:
>> 2722925K->100795K(3145728K), 0.3202010 secs]
>> 6998057K->4375934K(21495808K), 0.3205670 secs] [Times: user=4.10
>> sys=0.02, real=0.32 secs] ____
>>
>> 2014-04-09T12:58:06.256+0200: 77361.483:
>> [GC2014-04-09T12:58:06.257+0200: 77361.483: [ParNew:
>> 2722235K->101011K(3145728K), 0.3229910 secs]
>> 6997374K->4376165K(21495808K), 0.3233580 secs] [Times: user=4.13
>> sys=0.02, real=0.33 secs] ____
>>
>> 2014-04-09T12:58:12.295+0200: 77367.522:
>> [GC2014-04-09T12:58:12.296+0200: 77367.522: [ParNew:
>> 2722451K->101057K(3145728K), 0.3215320 secs]
>> 6997605K->4376216K(21495808K), 0.3219080 secs] [Times: user=4.12
>> sys=0.01, real=0.32 secs] ____
>>
>> 2014-04-09T12:58:18.461+0200: 77373.688:
>> [GC2014-04-09T12:58:18.462+0200: 77373.688: [ParNew:
>> 2722497K->2232K(3145728K), 0.2944540 secs]
>> 6997656K->4376242K(21495808K), 0.2948280 secs] [Times: user=3.79
>> sys=0.00, real=0.29 secs]____
>>
>> But occasionally we have entries like these:____
>>
>> 2014-04-09T09:56:12.840+0200: 66448.066:
>> [GC2014-04-09T09:56:38.154+0200: 66473.381: [ParNew:
>> 3139808K->524288K(3145728K), 0.8355990 secs]
>> 6845512K->4585563K(21495808K), 26.1502640 secs] [Times: user=5.59
>> sys=0.16, real=26.16 secs] ____
>>
>> 2014-04-09T09:57:09.173+0200: 66504.400:
>> [GC2014-04-09T09:57:24.871+0200: 66520.098: [ParNew:
>> 2950573K->488555K(3145728K), 0.1876660 secs]
>> 8701082K->6239064K(21495808K), 15.8858250 secs] [Times: user=2.38
>> sys=0.00, real=15.89 secs] ____
>>
>> 2014-04-09T12:58:34.661+0200: 77389.888:
>> [GC2014-04-09T12:59:06.390+0200: 77421.616: [ParNew:
>> 2623439K->2083K(3145728K), 0.0292270 secs]
>> 6997709K->4376490K(21495808K), 31.7578950 secs] [Times: user=0.34
>> sys=0.02, real=31.76 secs] ____
>>
>> which I can't explain at all.____
>>
>> The real time of 31.76sec equals a pause of 31.76secs, in which the
>> jvm does not respond to user requests, which is obviously bad.____
>>
>> The application is _/very/_ allocation heavy, so generally pauses of
>> 0.3sec are okay.____
>>
>> Our GC settings for this server are:____
>>
>> -Xms21g____
>>
>> -Xmx21g____
>>
>> -XX:ReservedCodeCacheSize=256m____
>>
>> -XX:PermSize=256m____
>>
>> -XX:MaxPermSize=768m____
>>
>> -server____
>>
>> -verbose:gc____
>>
>> -Xloggc:log.gc____
>>
>> -XX:+PrintGCDetails____
>>
>> -XX:+PrintGCDateStamps____
>>
>> -XX:+ExplicitGCInvokesConcurrent____
>>
>> -XX:NewRatio=5____
>>
>> -XX:SurvivorRatio=5____
>>
>> -XX:+UseConcMarkSweepGC____
>>
>> -XX:+UseParNewGC____
>>
>> -XX:+UseCMSInitiatingOccupancyOnly____
>>
>> -XX:CMSInitiatingOccupancyFraction=40____
>>
>> -XX:+CMSClassUnloadingEnabled____
>>
>> -XX:+CMSScavengeBeforeRemark____
>>
>> -Dsun.rmi.dgc.client.gcInterval=1209600000____
>>
>> -Dsun.rmi.dgc.server.gcInterval=1209600000____
>>
>> ____
>>
>> We run the sun jdk 7u51 on a current debian wheezy.____
>>
>> We previously had issues with long ParNew pauses, but back then, the
>> sys time was high, so we concluded that the server was swapping,____
>>
>> which we were able to prevent.____
>>
>> Do you have any idea or further hint at debugging options which
>> might help us in finding the issue?____
>>
>> ____
>>
>> Regards,____
>>
>> Cornelius Riemenschneider____
>>
>> --____
>>
>> ITscope GmbH____
>>
>> Ludwig-Erhard-Allee 20____
>>
>> 76131 Karlsruhe____
>>
>> Email: cornelius.riemenschneider at itscope.de
>> <mailto:cornelius.riemenschneider at itscope.de>____
>>
>> https://www.itscope.com____
>>
>> Handelsregister: AG Mannheim, HRB 232782____
>>
>> Sitz der Gesellschaft: Karlsruhe____
>>
>> Geschäftsführer: Alexander Münkel, Benjamin Mund, Stefan Reger____
>>
>> ____
>>
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net <mailto:hotspot-gc-use@
>> openjdk.java.net>
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use____
>>
>>
>>
>> _______________________________________________
>> hotspot-gc-use mailing list
>> hotspot-gc-use at openjdk.java.net
>> http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/hotspot-gc-use/attachments/20140409/f8d44b8c/attachment.html>
More information about the hotspot-gc-use
mailing list