Long safepoint pause directly after GC log file rotation in 1.7.0_80
Rainer Jung
rainer.jung at kippdata.de
Tue Nov 10 09:13:35 UTC 2015
Addition: the longest pause that was experienced was more than 2400
seconds ...
And: platform is Solaris Sparc (T4). But we don't know whether it is
platform dependent.
It also happens on test systems, so I'll write a script that calls
pstack when detection is detected to find out, what the threads are
doing or where they are hanging.
Regards,
Rainer
Am 09.11.2015 um 21:23 schrieb Rainer Jung:
> Hi,
>
> after upgrading from 1.7.0_76 to 1.7.0_80 we experience long pauses
> directly after a GC log rotation.
>
> The pause duration varies due to application and load but is in the
> range of 6 seconds to 60 seconds. There is no GC involved, i.e. no GC
> output is written related to these pauses.
>
> Example:
>
> Previous file ends with:
>
> 2015-11-09T01:28:36.832+0100: 38461,486: Application time: 5,2840810
> seconds
> {Heap before GC invocations=7366 (full 8):
> par new generation total 458752K, used 442678K [0xfffffffe00400000,
> 0xfffffffe20400000, 0xfffffffe20400000)
> eden space 393216K, 100% used [0xfffffffe00400000,
> 0xfffffffe18400000, 0xfffffffe18400000)
> from space 65536K, 75% used [0xfffffffe18400000, 0xfffffffe1b44d998,
> 0xfffffffe1c400000)
> to space 65536K, 0% used [0xfffffffe1c400000, 0xfffffffe1c400000,
> 0xfffffffe20400000)
> concurrent mark-sweep generation total 3670016K, used 1887085K
> [0xfffffffe20400000, 0xffffffff00400000, 0xffffffff00400000)
> concurrent-mark-sweep perm gen total 524288K, used 453862K
> [0xffffffff00400000, 0xffffffff20400000, 0xffffffff20400000)
> 2015-11-09T01:28:36.839+0100: 38461,493:
> [GC2015-11-09T01:28:36.840+0100: 38461,493: [ParNew
> Desired survivor size 33554432 bytes, new threshold 16 (max 31)
> - age 1: 2964800 bytes, 2964800 total
> - age 2: 2628048 bytes, 5592848 total
> - age 3: 1415792 bytes, 7008640 total
> - age 4: 1354008 bytes, 8362648 total
> - age 5: 1132056 bytes, 9494704 total
> - age 6: 1334072 bytes, 10828776 total
> - age 7: 1407336 bytes, 12236112 total
> - age 8: 3321304 bytes, 15557416 total
> - age 9: 1531064 bytes, 17088480 total
> - age 10: 2453024 bytes, 19541504 total
> - age 11: 2797616 bytes, 22339120 total
> - age 12: 1698584 bytes, 24037704 total
> - age 13: 1870064 bytes, 25907768 total
> - age 14: 2211528 bytes, 28119296 total
> - age 15: 3626888 bytes, 31746184 total
> : 442678K->37742K(458752K), 0,0802687 secs]
> 2329763K->1924827K(4128768K), 0,0812120 secs] [Times: user=0,90
> sys=0,03, real=0,08 secs]
> Heap after GC invocations=7367 (full 8):
> par new generation total 458752K, used 37742K [0xfffffffe00400000,
> 0xfffffffe20400000, 0xfffffffe20400000)
> eden space 393216K, 0% used [0xfffffffe00400000,
> 0xfffffffe00400000, 0xfffffffe18400000)
> from space 65536K, 57% used [0xfffffffe1c400000, 0xfffffffe1e8db9a0,
> 0xfffffffe20400000)
> to space 65536K, 0% used [0xfffffffe18400000, 0xfffffffe18400000,
> 0xfffffffe1c400000)
> concurrent mark-sweep generation total 3670016K, used 1887085K
> [0xfffffffe20400000, 0xffffffff00400000, 0xffffffff00400000)
> concurrent-mark-sweep perm gen total 524288K, used 453862K
> [0xffffffff00400000, 0xffffffff20400000, 0xffffffff20400000)
> }
> ....
> 2015-11-09T01:28:36.921+0100: 38461,575: Total time for which
> application threads were stopped: 0,0888232 seconds, Stopping threads
> took: 0,0005420 seconds
> 2015-11-09T01:28:59.821+0100: 38484,474: Application time: 0,0002954
> seconds
> 2015-11-09T01:28:59.823+0100: 38484,477: Total time for which
> application threads were stopped: 0,0026081 seconds, Stopping threads
> took: 0,0004146 seconds
> 2015-11-09T01:28:59.824+0100: 38484,477: Application time: 0,0003073
> seconds
> 2015-11-09T01:28:59.826+0100: 38484,480: Total time for which
> application threads were stopped: 0,0025411 seconds, Stopping threads
> took: 0,0004064 seconds
> 2015-11-09T01:28:59.827+0100: 38484,480: Application time: 0,0002885
> seconds
> 2015-11-09 01:28:59 GC log file has reached the maximum size. Saved as
> ./application/logs-a/mkb_gc.log.2
>
> This output looks normal. Last timestamp is 2015-11-09T01:28:59.827
>
> Now the next file begins:
>
> 2015-11-09 01:28:59 GC log file created ./application/logs-a/mkb_gc.log.3
> Java HotSpot(TM) 64-Bit Server VM (24.80-b11) for solaris-sparc JRE
> (1.7.0_80-b15), built on Apr 10 2015 18:47:18 by "" with Sun Studio 12u1
> Memory: 8k page, physical 133693440k(14956840k free)
> CommandLine flags: -XX:AllocateInstancePrefetchLines=2
> -XX:AllocatePrefetchInstr=1 -XX:AllocatePrefetchLines=6
> -XX:AllocatePrefetchStyle=3 -XX:+CMSClassUnloadingEnabled
> -XX:+ExplicitGCInvokesConcurrentAndUnloadsClasses
> -XX:GCLogFileSize=10485760 -XX:InitialHeapSize=4294967296
> -XX:MaxHeapSize=4294967296 -XX:MaxNewSize=536870912
> -XX:MaxPermSize=536870912 -XX:MaxTenuringThreshold=31
> -XX:NewSize=536870912 -XX:NumberOfGCLogFiles=10 -XX:OldPLABSize=16
> -XX:ParallelGCThreads=16 -XX:PermSize=536870912 -XX:+PrintGC
> -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime
> -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
> -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:SurvivorRatio=6
> -XX:-UseAdaptiveSizePolicy -XX:+UseCompressedOops
> -XX:+UseConcMarkSweepGC -XX:+UseGCLogFileRotation -XX:+UseParNewGC
> 2015-11-09T01:29:55.640+0100: 38540,292: Total time for which
> application threads were stopped: 55,8119519 seconds, Stopping threads
> took: 0,0003857 seconds
> 2015-11-09T01:29:55.648+0100: 38540,299: Application time: 0,0076173
> seconds
>
> Note the 55.8 seconds pause directly after printing the flags and the
> consistent timestamp jump from 01:28:59 to 01:29:55. There's no GC
> output, although verbose GC is active and works. For some other reason
> there is a very long safepoint. Note also, that the time is not due to
> waiting until the safepoint is reached. At least the log claims that
> reaching the safepoint only took 0.008 seconds. Also at that timeof day
> the servers are not very busy.
>
> Is there any idea, what happens here? Anything that rings a bell between
> 1.7.0_76 and 1.7.0_80? Why should there be a long safepoint directly
> after GC rotation opened a new file?
>
> I searched the bug parade, but didn't find a good hit. There was also
> nothing in the change for JDK-7164841 that seemed immediately
> responsible for a long pause.
>
> Unfortunately this happens on a production system and the first thing
> was to roll back to the old Java version.Not sure, how good this will be
> reproducible on a test system (will check tomorrow).
>
> Thanks for any hint,
>
> Rainer
More information about the hotspot-gc-use
mailing list