Bug in G1 logging? incorrect stop times?
charlie hunt
charlie.hunt at oracle.com
Thu Oct 2 13:07:23 UTC 2014
Hi Thomas,
Don't mean to jump in the middle of your discussion with Bengt ...
Since this is M5, have you tried setting the FX Solaris CPU scheduler, (based on the hypothesis this may be thread scheduling related)?
hths,
Charlie
Sent from my iPhone
> On Oct 2, 2014, at 5:35 AM, Thomas Viessmann <thomas.viessmann at oracle.com> wrote:
>
> Hi Bengt,
>
> Better late than never :-) .
> There are 2200 threads, about 10% are runnable.
> The application is the only one running on this M5.
>
> We observe this effect quite often with G1. Relatively short
> GC times and long application thread stop times.
> Interestingly I've never seen this with CMS. But
> CMS is not an option here due to fragmentation.
>
> Thanks and Regards
>
> Thomas
>
>
>
>
>
>
>
>
>> On 10/02/14 11:15, Bengt Rutisson wrote:
>>
>> Hi Thomas,
>>
>> A very late reply. I happened to find this in my inbox and have some comments.
>>
>>> On 2014-08-29 14:15, Thomas Viessmann wrote:
>>> Hi all,
>>>
>>>
>>> I suspect a bug in G1 logging. Before I file a (possibly incorrect) bug I would like to ask here.
>>> Is my understanding correct that the stw evacuation phase starts at 48448.931 and ends at 48454.034 ?
>>> This would mean a stop time of 5.103 s, which is clearly a mismatch to the reported 9.3426141 s
>>> If the stw phase in fact started at a later time e.g. at 48449.288 then the stop pause would have been even shorter.
>>>
>>> In addition I do not understand how all the relatively short evacuation phases sum up to several seconds.
>>> If I add all the MAX values 69.1 + 22.2 + 45 + 150.8 + 0.1 + 155.1 + 0.9 + 0.5 = 443.7ms
>>>
>>>
>>>
>>>
>>> 48443.380: Total time for which application threads were stopped: 0.0084701 seconds
>>> 48448.931: [GC pause (mixed) 48448.931: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 89458, predicted base time: 73.58 ms, remaining time: 176.42 ms, target pause time: 250.00 ms]
>>> 48448.931: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 33 regions, survivors: 5 regions, predicted young region time: 35.73 ms]
>>> 48448.933: [G1Ergonomics (CSet Construction) finish adding old regions to CSet, reason: reclaimable percentage not over threshold, old: 19 regions, max: 77 regions, reclaimable: 2570649712 bytes (9.98 %), threshold: 10.00 %]
>>> 48448.933: [G1Ergonomics (CSet Construction) added expensive regions to CSet, reason: old CSet region num not reached min, old: 19 regions, expensive: 7 regions, min: 26 regions, remaining time: 0.00 ms]
>>> 48448.933: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 33 regions, survivors: 5 regions, old: 19 regions, predicted pause time: 340.58 ms, target pause time: 250.00 ms]
>>> 48449.252: [SoftReference, 962 refs, 0.0064679 secs]48449.259: [WeakReference, 7411 refs, 0.0027722 secs]48449.261: [FinalReference, 272 refs, 0.0022754 secs]48449.264: [PhantomReference, 6 refs, 0.0024348 secs]48449.266: [JNI Weak Reference, 0.0000337 secs] 48449.288: [G1Ergonomics (Concurrent Cycles) do not request concurrent cycle initiation, reason: still doing mixed collections, occupancy: 14898167808 bytes, allocation request: 0 bytes, threshold: 11596411665 bytes (45.00 %), source: end of GC]
>>> 48449.288: [G1Ergonomics (Mixed GCs) do not continue mixed GCs, reason: reclaimable percentage not over threshold, candidate old regions: 156 regions, reclaimable: 2570649712 bytes (9.98 %), threshold: 10.00 %]
>>> , 0.3573636 secs]
>>> [Parallel Time: 314.9 ms, GC Workers: 40]
>>> [GC Worker Start (ms): Min: 48448933.6, Avg: 48448934.2, Max: 48448934.7, Diff: 1.1]
>>> [Ext Root Scanning (ms): Min: 15.1, Avg: 17.7, Max: 69.1, Diff: 54.0, Sum: 708.0]
>>> [Update RS (ms): Min: 0.0, Avg: 14.5, Max: 22.2, Diff: 22.2, Sum: 580.3]
>>> [Processed Buffers: Min: 0, Avg: 22.0, Max: 45, Diff: 45, Sum: 879]
>>> [Scan RS (ms): Min: 88.7, Avg: 142.6, Max: 150.8, Diff: 62.1, Sum: 5705.2]
>>> [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.1]
>>> [Object Copy (ms): Min: 131.4, Avg: 138.2, Max: 155.1, Diff: 23.8, Sum: 5526.6]
>>> [Termination (ms): Min: 0.0, Avg: 0.7, Max: 0.9, Diff: 0.9, Sum: 27.2]
>>> [GC Worker Other (ms): Min: 0.0, Avg: 0.3, Max: 0.5, Diff: 0.5, Sum: 10.0]
>>> [GC Worker Total (ms): Min: 313.2, Avg: 313.9, Max: 314.7, Diff: 1.5, Sum: 12557.5]
>>> [GC Worker End (ms): Min: 48449247.9, Avg: 48449248.1, Max: 48449248.4, Diff: 0.5]
>>> [Code Root Fixup: 2.2 ms]
>>> [Code Root Migration: 0.2 ms]
>>> [Clear CT: 2.0 ms]
>>> [Other: 38.2 ms]
>>> [Choose CSet: 1.7 ms]
>>> [Ref Proc: 15.6 ms]
>>> [Ref Enq: 1.2 ms]
>>> [Free CSet: 2.4 ms]
>>> [Eden: 1056.0M(1056.0M)->0.0B(1088.0M) Survivors: 160.0M->128.0M Heap: 14.5G(24.0G)->13.9G(24.0G)]
>>> [Times: user=12.45 sys=0.03, real=0.36 secs]
>>> 48454.034: Total time for which application threads were stopped: 9.3426141 seconds
>>>
>>> For completeness; here are the cmd line options:
>>>
>>>
>>> /usr/jdk/instances/jdk1.7.0/bin/sparcv9/java -Xms24g -Xmx24g -server -XX:StringTableSize=27001 -XX:PermSize=512m -XX:MaxPermSize=512m -Xss512k
>>> -XX:+UseG1GC -XX:SoftRefLRUPolicyMSPerMB=500 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/var/cacao/instances/oem-ec/logs/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10m -XX:+PrintGCApplicationStoppedTime -XX:MaxGCPauseMillis=250-XX:InitiatingHeapOccupancyPercent=45 -XX:+ParallelRefProcEnabled -XX:ParallelGCThreads=40 -XX:+PrintAdaptiveSizePolicy -XX:+PrintSafepointStatistics -XX:+PrintReferenceGC -XX:G1HeapRegionSize=32m -XX:HeapBaseMinAddress=2G
>>
>>
>> The GC starts at 48448.931 and reports that it lasts for 0.36 seconds, so it should have ended at 48449.291. The last log line that states that the threads were stopped for 9 seconds is from the PrintGCApplicationStoppedTime flag, which logs how long the safepoint was. i.e. how long the Java threads where stopped.
>>
>> This means that the safepoint lasted for 9 seconds but the GC only lasted for 0.36 seconds. So the rest of the time must be attributed to stopping and starting the Java threads. The safepoint was started at 48444.691 (48454.034 - 9.3426141), which means that it took 4.240 seconds (48448.931 - 48444.691) to stop all Java threads and 4.743 seconds (9.3426141 - 4.240 - 0.36) to start them again.
>>
>> I don't have a good explanation for why it takes so long to stop and start the threads. Are there many Java threads in the application? Are there many other processes running that could cause the Java threads to be scheduled out?
>>
>> Thanks,
>> Bengt
>>
>>>
>>>
>>>
>>> Thanks and Regards
>>>
>>> Thomas
>>>
>>> --
>>> <mime-attachment.gif>
>>> THOMAS VIESSMANN | Senior Principal Technical Support Engineer - Java
>>> Phone: +498914302496 | Mobile: +491743005467
>>> Oracle Customer Technical Support - Java
>>>
>>> ORACLE Deutschland B.V. & Co. KG | Riesstr.25 | D-80992 Muenchen
>>>
>>> ORACLE Deutschland B.V. & Co. KG
>>> Hauptverwaltung: Riesstr. 25, D-80992 Muenchen
>>> Registergericht: Amtsgericht Muenchen, HRA 95603
>>> Geschäftsführere: Juergen Kunz
>>>
>>> Komplementärin: ORACLE Deutschland Verwaltung B.V.
>>> Hertogswetering 163/167, 3543 AS Utrecht, Niederlande
>>> Handelsregister der Handelskammer Midden-Niederlande, Nr. 30143697
>>> Geschäftsführer: Alexander van der Ven, Astrid Kepper, Val Maher
>>>
>>> <mime-attachment.gif> Oracle is committed to developing practices and products that help protect the environment
>
> --
> <oracle_sig_logo.gif>
> THOMAS VIESSMANN | Senior Principal Technical Support Engineer - Java
> Phone: +498914302496 | Mobile: +491743005467
> Oracle Customer Technical Support - Java
>
> ORACLE Deutschland B.V. & Co. KG | Riesstr.25 | D-80992 Muenchen
>
> ORACLE Deutschland B.V. & Co. KG
> Hauptverwaltung: Riesstr. 25, D-80992 Muenchen
> Registergericht: Amtsgericht Muenchen, HRA 95603
> Geschäftsführere: Juergen Kunz
>
> Komplementärin: ORACLE Deutschland Verwaltung B.V.
> Hertogswetering 163/167, 3543 AS Utrecht, Niederlande
> Handelsregister der Handelskammer Midden-Niederlande, Nr. 30143697
> Geschäftsführer: Alexander van der Ven, Astrid Kepper, Val Maher
>
> <green-for-email-sig_0.gif> Oracle is committed to developing practices and products that help protect the environment
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20141002/854c51b3/attachment.htm>
More information about the hotspot-gc-dev
mailing list