From shade at redhat.com Fri Dec 1 10:56:31 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 1 Dec 2017 11:56:31 +0100 Subject: RFR: Region sampling should lock while gathering region data Message-ID: http://cr.openjdk.java.net/~shade/shenandoah/sampling-lock/webrev.01/ Another corner case discovered when testing humongous moves: when Full GC is in full swing, heap region states might be inconsistent. It fails the asserts in get_live(), which searches for the humongous start and cannot find it when Full GC moves the humongous object. Full GC does lock the heap when changing heap region states, and thus we can lock the heap sampler too, to always get a consistent view of the heap. Testing: hotspot_gc_shenandoah, Visualizer runs Thanks, -Aleksey From shade at redhat.com Fri Dec 1 11:47:12 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 1 Dec 2017 12:47:12 +0100 Subject: RFR: Full GC should compact humongous regions Message-ID: <2922e340-12ec-4227-a8cf-51cc41d767d6@redhat.com> http://cr.openjdk.java.net/~shade/shenandoah/humongous-moves/webrev.01/ This implements humongous region moves during last-ditch Full GC. This allows Shenandoah to survive humongous-fragmenting workloads, at the expense of a potentially longer Full GC, when the alternative is OOME. This mirrors the upstream G1 RFE [1]. The new jtreg test is an example of such a workload, and it OOMEs Shenandoah in seconds without this fix. Testing: hotspot_gc_shenandoah {fastdebug|release}, new failing test Thanks, -Aleksey [1] https://bugs.openjdk.java.net/browse/JDK-8191565 From shade at redhat.com Fri Dec 1 16:35:36 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 1 Dec 2017 17:35:36 +0100 Subject: RFR/RFC: Rework shared bool/enum flags with proper types and synchronization Message-ID: http://cr.openjdk.java.net/~shade/shenandoah/refactor-shared-flags/webrev.01/ Current shared flag handling is messy and ad-hoc: we use char, unsigned int, volatile jbyte, etc., and sometimes update the fields inconsistently. This fix commons operations on shared flags with a handy abstraction. Apart from doing the synchronization right, it also pads out the fields to eliminate false sharing against other heavily-mutated fields, since most of these fields are used in critical path checks. Current hotspot_gc_shenandoah testing fails some tests with: http://cr.openjdk.java.net/~shade/shenandoah/after-flag-partial-fail-hs_err.log ...which I believe is a pre-existing bug? Thanks, -Aleksey From rkennke at redhat.com Fri Dec 1 17:29:47 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 1 Dec 2017 18:29:47 +0100 Subject: RFR: Full GC should compact humongous regions In-Reply-To: <2922e340-12ec-4227-a8cf-51cc41d767d6@redhat.com> References: <2922e340-12ec-4227-a8cf-51cc41d767d6@redhat.com> Message-ID: > http://cr.openjdk.java.net/~shade/shenandoah/humongous-moves/webrev.01/ > > This implements humongous region moves during last-ditch Full GC. This allows Shenandoah to survive > humongous-fragmenting workloads, at the expense of a potentially longer Full GC, when the alternative > is OOME. This mirrors the upstream G1 RFE [1]. The new jtreg test is an example of such a workload, > and it OOMEs Shenandoah in seconds without this fix. > > Testing: hotspot_gc_shenandoah {fastdebug|release}, new failing test > > Thanks, > -Aleksey > > [1] https://bugs.openjdk.java.net/browse/JDK-8191565 > Looks good to me.
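To make the humongous-moves idea concrete without opening the webrev: in essence, Full GC gains the ability to slide a multi-region (humongous) object into free space at lower addresses instead of leaving it where it was allocated. Below is a self-contained toy model of that sliding, with regions as array slots; every name in it is invented for the demo, and the real code of course works on ShenandoahHeapRegions with proper copying and fwdptr updates:

#include <cstdio>
#include <vector>

// 0 = free slot; any other value = id of the "humongous object" occupying it.
// Find the lowest run of `len` free slots strictly below index `limit`.
static int lowest_fit(const std::vector<int>& heap, int len, int limit) {
  int run = 0;
  for (int i = 0; i < limit; i++) {
    run = (heap[i] == 0) ? run + 1 : 0;
    if (run == len) return i - len + 1;   // start of the free run
  }
  return -1;
}

int main() {
  //                       two free, obj 1, one free, obj 2, two free
  std::vector<int> heap = {0, 0, 1, 1, 0, 2, 2, 2, 0, 0};
  for (int i = 0; i < (int)heap.size(); i++) {
    if (heap[i] == 0 || (i > 0 && heap[i] == heap[i - 1])) continue; // not a start
    int id = heap[i], len = 0;
    while (i + len < (int)heap.size() && heap[i + len] == id) len++;
    int to = lowest_fit(heap, len, i);
    if (to >= 0) {  // slide down: copy the contents, then free the old slots
      for (int k = 0; k < len; k++) { heap[to + k] = id; heap[i + k] = 0; }
    }
  }
  for (int v : heap) printf("%d ", v);    // prints: 1 1 2 2 2 0 0 0 0 0
  printf("\n");                           // free space is contiguous again
}

A single first-fit pass like this does not always compact fully - an object cannot slide into a gap smaller than itself - which matches the stated trade-off: a potentially longer Full GC, but no OOME from fragmentation.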
From rkennke at redhat.com Fri Dec 1 17:46:14 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 1 Dec 2017 18:46:14 +0100 Subject: RFR/RFC: Rework shared bool/enum flags with proper types and synchronization In-Reply-To: References: Message-ID: <8c5993cd-c69f-4568-30e4-0b2fd79a21b7@redhat.com> Am 01.12.2017 um 17:35 schrieb Aleksey Shipilev: > http://cr.openjdk.java.net/~shade/shenandoah/refactor-shared-flags/webrev.01/ > > Current shared flag handling is messy and ad-hoc: we use char, unsigned int, volatile jbyte, etc., and > sometimes update the fields inconsistently. This fix commons operations on shared flags with > a handy abstraction. Apart from doing the synchronization right, it also pads out the fields to > eliminate false sharing against other heavily-mutated fields, since most of these fields are used in > critical path checks. > > Current hotspot_gc_shenandoah testing fails some tests with: > http://cr.openjdk.java.net/~shade/shenandoah/after-flag-partial-fail-hs_err.log > > ...which I believe is a pre-existing bug? > > Thanks, > -Aleksey > > This is very good stuff! Some comments: - some _addr() methods are defined in shenandoahHeap.cpp, some in shenandoahHeap.inline.hpp. I think it's good enough to define them all in shenandoahHeap.cpp - Likewise, all state accessors should probably follow the same pattern (either with is_ or without, but consistent) and all go into shenandoahHeap.inline.hpp - ShenandoahSharedEnumFlag::cmpxchg() takes expected and new in a different order than Atomic::cmpxchg(). I find this surprising. - In ShenandoahSharedFlag, declare constants or an enum for 0 and 1 as FALSE and TRUE? Couldn't find what would cause the new failure though... Roman From rkennke at redhat.com Fri Dec 1 17:54:34 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 1 Dec 2017 18:54:34 +0100 Subject: RFR: Region sampling should lock while gathering region data In-Reply-To: References: Message-ID: <168067ca-4fe4-04ed-5012-8c5ed3472cf0@redhat.com> > http://cr.openjdk.java.net/~shade/shenandoah/sampling-lock/webrev.01/ > > Another corner case discovered when testing humongous moves: when Full GC is in full swing, heap > region states might be inconsistent. It fails the asserts in get_live(), which searches for the humongous > start and cannot find it when Full GC moves the humongous object. Full GC does lock the heap when changing > heap region states, and thus we can lock the heap sampler too, to always get a consistent view of the > heap. > > Testing: hotspot_gc_shenandoah, Visualizer runs > > Thanks, > -Aleksey > > Ok From shade at redhat.com Fri Dec 1 18:14:05 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 1 Dec 2017 19:14:05 +0100 Subject: RFR/RFC: Rework shared bool/enum flags with proper types and synchronization In-Reply-To: <8c5993cd-c69f-4568-30e4-0b2fd79a21b7@redhat.com> References: <8c5993cd-c69f-4568-30e4-0b2fd79a21b7@redhat.com> Message-ID: On 12/01/2017 06:46 PM, Roman Kennke wrote: > Some comments: > - some _addr() methods are defined in shenandoahHeap.cpp, some in shenandoahHeap.inline.hpp. I think > it's good enough to define them all in shenandoahHeap.cpp Done. > - Likewise, all state accessors should probably follow the same pattern (either with is_ or without, > but consistent) and all go into shenandoahHeap.inline.hpp Done. > - ShenandoahSharedEnumFlag::cmpxchg() takes expected and new in a different order than > Atomic::cmpxchg(). I find this surprising. I thought it would be more understandable, but reverted to the old order.
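To illustrate the shape being discussed (a standalone sketch, not the actual shenandoahSharedVariables.hpp from the webrev): std::atomic stands in for HotSpot's Atomic/OrderAccess primitives, the 64-byte cache line size is an assumption, and the 0/1 constants get names, here UNSET/SET:

#include <atomic>
#include <cstdint>

struct SharedFlag {
  enum : uint8_t { UNSET = 0, SET = 1 };       // named states instead of bare 0/1

  char _pad0[64];                              // pad both sides so this hot flag
  std::atomic<uint8_t> value{UNSET};           // never false-shares a cache line
  char _pad1[64 - sizeof(std::atomic<uint8_t>)];

  void set()          { value.store(SET, std::memory_order_release); }
  void unset()        { value.store(UNSET, std::memory_order_release); }
  bool is_set() const { return value.load(std::memory_order_acquire) == SET; }

  // Keep the same argument order as Atomic::cmpxchg (new value first, expected
  // value second, previous value returned), per the review comment above.
  uint8_t cmpxchg(uint8_t new_value, uint8_t expected) {
    value.compare_exchange_strong(expected, new_value);
    return expected;  // on failure, the observed value is written back here
  }
};

The padding is the practical point: these flags sit on critical-path checks, so sharing a cache line with any heavily-mutated field would make every check pull a contended line.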
> - In ShenandoahSharedFlag, declare constants or an enum for 0 and 1 as FALSE and TRUE? Yes. New version: http://cr.openjdk.java.net/~shade/shenandoah/refactor-shared-flags/webrev.02/ -Aleksey From cflood at redhat.com Fri Dec 1 18:47:48 2017 From: cflood at redhat.com (Christine Flood) Date: Fri, 1 Dec 2017 13:47:48 -0500 Subject: RFR: Region sampling should lock while gathering region data In-Reply-To: References: Message-ID: This looks fine to me. On Fri, Dec 1, 2017 at 5:56 AM, Aleksey Shipilev wrote: > http://cr.openjdk.java.net/~shade/shenandoah/sampling-lock/webrev.01/ > > Another corner case discovered when testing humongous moves: when Full GC is in full swing, heap > region states might be inconsistent. It fails the asserts in get_live(), which searches for the humongous > start and cannot find it when Full GC moves the humongous object. Full GC does lock the heap when changing > heap region states, and thus we can lock the heap sampler too, to always get a consistent view of the > heap. > > Testing: hotspot_gc_shenandoah, Visualizer runs > > Thanks, > -Aleksey > From cflood at redhat.com Fri Dec 1 19:06:11 2017 From: cflood at redhat.com (Christine Flood) Date: Fri, 1 Dec 2017 14:06:11 -0500 Subject: RFR: Full GC should compact humongous regions In-Reply-To: <2922e340-12ec-4227-a8cf-51cc41d767d6@redhat.com> References: <2922e340-12ec-4227-a8cf-51cc41d767d6@redhat.com> Message-ID: This patch looks fine to me. I hate that we have experimental options which default to true, but since that seems to happen in several places it shouldn't hold up this patch. Christine On Fri, Dec 1, 2017 at 6:47 AM, Aleksey Shipilev wrote: > http://cr.openjdk.java.net/~shade/shenandoah/humongous-moves/webrev.01/ > > This implements humongous region moves during last-ditch Full GC. This allows Shenandoah to survive > humongous-fragmenting workloads, at the expense of a potentially longer Full GC, when the alternative > is OOME. This mirrors the upstream G1 RFE [1]. The new jtreg test is an example of such a workload, > and it OOMEs Shenandoah in seconds without this fix. > > Testing: hotspot_gc_shenandoah {fastdebug|release}, new failing test > > Thanks, > -Aleksey > > [1] https://bugs.openjdk.java.net/browse/JDK-8191565 > From shade at redhat.com Fri Dec 1 19:05:50 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 1 Dec 2017 20:05:50 +0100 Subject: RFR/RFC: Rework shared bool/enum flags with proper types and synchronization In-Reply-To: References: Message-ID: <6f345b8e-85bb-cd4d-f202-7aadc373899b@redhat.com> On 12/01/2017 05:35 PM, Aleksey Shipilev wrote: > http://cr.openjdk.java.net/~shade/shenandoah/refactor-shared-flags/webrev.01/ > > Current shared flag handling is messy and ad-hoc: we use char, unsigned int, volatile jbyte, etc., and > sometimes update the fields inconsistently. This fix commons operations on shared flags with > a handy abstraction. Apart from doing the synchronization right, it also pads out the fields to > eliminate false sharing against other heavily-mutated fields, since most of these fields are used in > critical path checks. > > Current hotspot_gc_shenandoah testing fails some tests with: > http://cr.openjdk.java.net/~shade/shenandoah/after-flag-partial-fail-hs_err.log > > ...which I believe is a pre-existing bug? Gaaaah!
I understand now, this is a C1 bug (T_CHAR is a 16-bit load, while the shared flags are byte-sized now), fixed by: - __ move(new LIR_Address(mark_in_prog_addr, T_CHAR), mark_in_prog); + __ move(new LIR_Address(mark_in_prog_addr, T_BYTE), mark_in_prog); Looked at other usages of the affected flags, and this seems to be the only case like this. New version: http://cr.openjdk.java.net/~shade/shenandoah/refactor-shared-flags/webrev.03/ This now passes hotspot_gc_shenandoah (fastdebug) Thanks, -Aleksey From rkennke at redhat.com Fri Dec 1 19:19:11 2017 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 1 Dec 2017 20:19:11 +0100 Subject: RFR/RFC: Rework shared bool/enum flags with proper types and synchronization In-Reply-To: <6f345b8e-85bb-cd4d-f202-7aadc373899b@redhat.com> References: <6f345b8e-85bb-cd4d-f202-7aadc373899b@redhat.com> Message-ID: <15fb2c9a-f40a-1f0b-d4e9-51c7a965321e@redhat.com> >> http://cr.openjdk.java.net/~shade/shenandoah/refactor-shared-flags/webrev.01/ >> >> Current shared flag handling is messy and ad-hoc: we use char, unsigned int, volatile jbyte, etc., and >> sometimes update the fields inconsistently. This fix commons operations on shared flags with >> a handy abstraction. Apart from doing the synchronization right, it also pads out the fields to >> eliminate false sharing against other heavily-mutated fields, since most of these fields are used in >> critical path checks. >> >> Current hotspot_gc_shenandoah testing fails some tests with: >> http://cr.openjdk.java.net/~shade/shenandoah/after-flag-partial-fail-hs_err.log >> >> ...which I believe is a pre-existing bug? > > Gaaaah! I understand now, this is a C1 bug (T_CHAR is a 16-bit load, while the shared flags are byte-sized now), fixed by: > > - __ move(new LIR_Address(mark_in_prog_addr, T_CHAR), mark_in_prog); > + __ move(new LIR_Address(mark_in_prog_addr, T_BYTE), mark_in_prog); > > Looked at other usages of the affected flags, and this seems to be the only case like this. > > New version: > http://cr.openjdk.java.net/~shade/shenandoah/refactor-shared-flags/webrev.03/ > > This now passes hotspot_gc_shenandoah (fastdebug) Good to go then! Roman From ashipile at redhat.com Fri Dec 1 19:26:41 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Fri, 01 Dec 2017 19:26:41 +0000 Subject: hg: shenandoah/jdk10: 3 new changesets Message-ID: <201712011926.vB1JQgOm026487@aojmv0008.oracle.com> Changeset: b067065f7bde Author: shade Date: 2017-12-01 11:57 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/b067065f7bde Region sampling should lock while gathering region data ! src/hotspot/share/gc/shenandoah/shenandoahHeapRegionCounters.cpp Changeset: e64c7ea17e9c Author: shade Date: 2017-12-01 12:44 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/e64c7ea17e9c Full GC should compact humongous regions ! src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.cpp ! src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.hpp ! src/hotspot/share/gc/shenandoah/shenandoahMarkCompact.cpp ! src/hotspot/share/gc/shenandoah/shenandoahMarkCompact.hpp ! src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.cpp ! src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.hpp ! src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp + test/hotspot/jtreg/gc/shenandoah/acceptance/AllocHumongousFragment.java Changeset: d54166ac952d Author: shade Date: 2017-12-01 19:42 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/d54166ac952d Rework shared bool/enum flags with proper types and synchronization ! src/hotspot/share/c1/c1_LIRGenerator.cpp ! src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.cpp !
src/hotspot/share/gc/shenandoah/shenandoahCodeRoots.cpp ! src/hotspot/share/gc/shenandoah/shenandoahCodeRoots.hpp ! src/hotspot/share/gc/shenandoah/shenandoahCollectorPolicy.cpp ! src/hotspot/share/gc/shenandoah/shenandoahCollectorPolicy.hpp ! src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp ! src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.hpp ! src/hotspot/share/gc/shenandoah/shenandoahConcurrentThread.cpp ! src/hotspot/share/gc/shenandoah/shenandoahConcurrentThread.hpp ! src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp ! src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp ! src/hotspot/share/gc/shenandoah/shenandoahHeap.inline.hpp ! src/hotspot/share/gc/shenandoah/shenandoahHeapRegionCounters.cpp ! src/hotspot/share/gc/shenandoah/shenandoahMarkCompact.cpp ! src/hotspot/share/gc/shenandoah/shenandoahPartialGC.cpp ! src/hotspot/share/gc/shenandoah/shenandoahPartialGC.hpp + src/hotspot/share/gc/shenandoah/shenandoahSharedVariables.hpp From shade at redhat.com Mon Dec 4 10:32:17 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 4 Dec 2017 11:32:17 +0100 Subject: RFR: Account trashed regions from coalesced CM-with-UR Message-ID: http://cr.openjdk.java.net/~shade/shenandoah/heuristics-coalesced-trash/webrev.01/ Adaptive CSet selection runs into trouble with coalesced CM-with-UR: final mark would trash the cset regions from the previous cycle, but that would not get accounted for in the current CSet sizing. In extreme cases, it would just make an empty CSet, relying solely on immediate garbage: [10.687s][info][gc ] GC(9) Pause Init Mark 1.173ms [10.919s][info][gc ] Cancelling concurrent GC: Allocation Failure [10.922s][info][gc ] GC(9) Concurrent marking 1867M->1985M(2048M) 234.823ms [10.922s][info][gc,ergo] GC(9) Adjusting free threshold to: 55% (1126M) [11.011s][info][gc,ergo] GC(9) Adaptive CSet selection: free target = 1228M, actual free = 0M; min cset = 0M, max cset = 0M [11.011s][info][gc,ergo] GC(9) Total Garbage: 355M [11.011s][info][gc,ergo] GC(9) Immediate Garbage: 0M, 0 regions (0% of total) [11.011s][info][gc,ergo] GC(9) Garbage to be collected: 0M (0% of total), 0 regions [11.011s][info][gc,ergo] GC(9) Live objects to be evacuated: 0M [11.011s][info][gc,ergo] GC(9) Live/garbage ratio in collected regions: 0% [11.011s][info][gc,ergo] GC(9) Free: 0M, 0 regions (0% of total) [11.011s][info][gc ] GC(9) Pause Final Mark 88.866ms [11.011s][info][gc ] GC(9) Concurrent cleanup 1985M->1075M(2048M) 0.183ms [11.012s][info][gc ] GC(9) Concurrent cleanup 1075M->1075M(2048M) 0.719ms Testing: hotspot_gc_shenandoah Thanks, -Aleksey From rkennke at redhat.com Mon Dec 4 12:00:57 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 4 Dec 2017 13:00:57 +0100 Subject: RFR: Account trashed regions from coalesced CM-with-UR In-Reply-To: References: Message-ID: Am 04.12.2017 um 11:32 schrieb Aleksey Shipilev: > http://cr.openjdk.java.net/~shade/shenandoah/heuristics-coalesced-trash/webrev.01/ > > Adaptive CSet selection runs into trouble with coalesced CM-with-UR: final mark would trash the cset > regions from the previous cycle, but that would not get accounted for in the current CSet sizing.
In > extreme cases, it would just make an empty CSet, relying solely on immediate garbage: > > [10.687s][info][gc ] GC(9) Pause Init Mark 1.173ms > [10.919s][info][gc ] Cancelling concurrent GC: Allocation Failure > [10.922s][info][gc ] GC(9) Concurrent marking 1867M->1985M(2048M) 234.823ms > [10.922s][info][gc,ergo] GC(9) Adjusting free threshold to: 55% (1126M) > [11.011s][info][gc,ergo] GC(9) Adaptive CSet selection: free target = 1228M, actual free = 0M; min > cset = 0M, max cset = 0M > [11.011s][info][gc,ergo] GC(9) Total Garbage: 355M > [11.011s][info][gc,ergo] GC(9) Immediate Garbage: 0M, 0 regions (0% of total) > [11.011s][info][gc,ergo] GC(9) Garbage to be collected: 0M (0% of total), 0 regions > [11.011s][info][gc,ergo] GC(9) Live objects to be evacuated: 0M > [11.011s][info][gc,ergo] GC(9) Live/garbage ratio in collected regions: 0% > [11.011s][info][gc,ergo] GC(9) Free: 0M, 0 regions (0% of total) > [11.011s][info][gc ] GC(9) Pause Final Mark 88.866ms > [11.011s][info][gc ] GC(9) Concurrent cleanup 1985M->1075M(2048M) 0.183ms > [11.012s][info][gc ] GC(9) Concurrent cleanup 1075M->1075M(2048M) 0.719ms > > Testing: hotspot_gc_shenandoah > > Thanks, > -Aleksey > Ok From shade at redhat.com Mon Dec 4 13:35:41 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 4 Dec 2017 14:35:41 +0100 Subject: RFR: ShenandoahVerifyOptoBarriers should not fail with disabled barriers Message-ID: http://cr.openjdk.java.net/~shade/shenandoah/c2-verify-barriers-disable/webrev.01/ This makes sure ShenandoahVerifyOptoBarriers does not fail with any combination of barriers. I tried to make it work with partially-enabled barriers, but it proves much more difficult than I anticipated. So, in this patch, we just disable verification if an unusual barrier combination is requested. Also does some related code touchups. Testing: hotspot_gc_shenandoah {fastdebug|release} Thanks, -Aleksey From ashipile at redhat.com Mon Dec 4 15:34:59 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Mon, 04 Dec 2017 15:34:59 +0000 Subject: hg: shenandoah/jdk10: Account trashed regions from coalesced CM-with-UR Message-ID: <201712041534.vB4FYxGO000287@aojmv0008.oracle.com> Changeset: c6e7e25780c1 Author: shade Date: 2017-12-04 11:28 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/c6e7e25780c1 Account trashed regions from coalesced CM-with-UR ! src/hotspot/share/gc/shenandoah/shenandoahCollectorPolicy.cpp From rkennke at redhat.com Mon Dec 4 19:24:22 2017 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 4 Dec 2017 20:24:22 +0100 Subject: RFR: ShenandoahVerifyOptoBarriers should not fail with disabled barriers In-Reply-To: References: Message-ID: <5cf38750-adf6-3afc-f3fb-eb518af4ccf9@redhat.com> Am 04.12.2017 um 14:35 schrieb Aleksey Shipilev: > http://cr.openjdk.java.net/~shade/shenandoah/c2-verify-barriers-disable/webrev.01/ > > This makes sure ShenandoahVerifyOptoBarriers does not fail with any combination of barriers. I tried > to make it work with partially-enabled barriers, but it proves much more difficult than I > anticipated. So, in this patch, we just disable verification if an unusual barrier combination is > requested. Also does some related code touchups. > > Testing: hotspot_gc_shenandoah {fastdebug|release} > > Thanks, > -Aleksey > Looks ok to me. Roman
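A side note on the "unusual combination" guard: the webrev is the authority here, but the decision it makes can be modeled in a few lines. In this standalone sketch, the five booleans are stand-ins for the selective -XX:+/-Shenandoah*Barrier flags (as exercised by TestSelectiveBarrierFlags), and which combinations count as "usual" is an assumption, not a quote of the patch:

#include <cstdio>

struct BarrierFlags {            // stand-ins for the Shenandoah*Barrier switches
  bool satb, read, write, cas, acmp;
};

static bool verifiable_combination(const BarrierFlags& f) {
  bool all_on  =  f.satb &&  f.read &&  f.write &&  f.cas &&  f.acmp;
  bool all_off = !f.satb && !f.read && !f.write && !f.cas && !f.acmp;
  return all_on || all_off;      // anything partial is "unusual": skip verification
}

int main() {
  BarrierFlags partial = {true, false, true, true, true};
  printf("verify? %s\n", verifiable_combination(partial) ? "yes" : "no");  // verify? no
}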
From ashipile at redhat.com Mon Dec 4 19:47:49 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Mon, 04 Dec 2017 19:47:49 +0000 Subject: hg: shenandoah/jdk10: ShenandoahVerifyOptoBarriers should not fail with disabled barriers Message-ID: <201712041947.vB4JlnYs014383@aojmv0008.oracle.com> Changeset: 9cafa38b1ef1 Author: shade Date: 2017-12-04 18:41 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/9cafa38b1ef1 ShenandoahVerifyOptoBarriers should not fail with disabled barriers ! src/hotspot/share/opto/compile.cpp ! src/hotspot/share/opto/shenandoahSupport.cpp ! src/hotspot/share/opto/shenandoahSupport.hpp ! src/hotspot/share/runtime/arguments.cpp ! test/hotspot/jtreg/gc/shenandoah/TestSelectiveBarrierFlags.java From rwestrel at redhat.com Mon Dec 4 19:48:38 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Mon, 04 Dec 2017 20:48:38 +0100 Subject: RFR: ShenandoahVerifyOptoBarriers should not fail with disabled barriers In-Reply-To: References: Message-ID: > http://cr.openjdk.java.net/~shade/shenandoah/c2-verify-barriers-disable/webrev.01/ That looks good to me. Roland. From shade at redhat.com Mon Dec 4 20:28:23 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 4 Dec 2017 21:28:23 +0100 Subject: RFR [9]: Bulk backport to sh/jdk9 Message-ID: http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk9-20171204/webrev.01/ Weekend runs for sh/jdk10 seem fine. Changes: rev 13746 : [backport] Assert Shenandoah-specific safepoints instead of generic ones rev 13747 : [backport] Generic verification is possible only at Shenandoah safepoints rev 13748 : [backport] C2 should use heapword-sized object math rev 13749 : [backport] Trim/expand test heap sizes to fit small heaps rev 13750 : Add missing TestShenandoahWithLogLevel test rev 13751 : [backport] Report illegal transitions verbosely, and remove some no-op transitions rev 13752 : [backport] Cleanup and refactor Full GC code rev 13753 : [backport] Humongous regions should support explicit pinning rev 13754 : [backport] Eagerly drop CSet state from regions during Full GC rev 13755 : [backport] Region sampling should lock while gathering region data rev 13756 : [backport] Full GC should compact humongous regions rev 13757 : [backport] Rework shared bool/enum flags with proper types and synchronization ----- weekend runs tested up to here ----- rev 13758 : [backport] Account trashed regions from coalesced CM-with-UR rev 13759 : [backport] ShenandoahVerifyOptoBarriers should not fail with disabled barriers ----- nightlies are testing now up to here ----- Testing: hotspot_gc_shenandoah {fastdebug|release} Thanks, -Aleksey From rkennke at redhat.com Tue Dec 5 10:50:10 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 5 Dec 2017 11:50:10 +0100 Subject: RFR: Check BS type in immByteMapBase predicate Message-ID: <66f43903-b1ed-a8e5-0283-91afc81a5222@redhat.com> In aarch64, we have an instruction in aarch64.ad that blindly casts the current BarrierSet to CardTableModRefBS, and uses this in the predicate to generate an immediate load if the operand matches the byte_map_base of the CTMRBS. However, when used with a GC that doesn't derive its BS from the CTMRBS, it reads some random trash and inserts the special instruction sequence (adrp+movk) on immediate loads that happen to match whatever is in the imaginary byte_map_base... This eventually leads to a corrupted heap.
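To see the hazard in miniature: the predicate is effectively comparing the operand against ((CardTableModRefBS*) barrier_set())->byte_map_base without ever establishing that the downcast is valid. Here is a standalone toy model of the guarded form; all names below are invented stand-ins for BarrierSet/CardTableModRefBS, and the real operand definition lives in aarch64.ad:

#include <cstdio>

struct BSModel { enum Kind { Other, CardTable }; Kind kind; };
struct CardTableBSModel : BSModel { char* byte_map_base; };

// Guarded predicate: establish the dynamic type before downcasting, so a GC
// whose barrier set is not card-table based can never match this operand.
static bool is_byte_map_base(BSModel* bs, char* ptr) {
  return bs->kind == BSModel::CardTable &&
         ptr == static_cast<CardTableBSModel*>(bs)->byte_map_base;
}

int main() {
  char map[16];
  CardTableBSModel ct; ct.kind = BSModel::CardTable; ct.byte_map_base = map;
  BSModel plain;       plain.kind = BSModel::Other;   // a Shenandoah-like BS
  printf("%d %d\n", is_byte_map_base(&ct, map), is_byte_map_base(&plain, map)); // 1 0
}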
The fix is to check the BS type in the predicate too: http://cr.openjdk.java.net/~rkennke/aarch64-ctmrbs/webrev.00/src/hotspot/cpu/aarch64/aarch64.ad.udiff.html Test: hotspot_gc_shenandoah on aarch64 I intend to push backports of this to 9 and 8 too. Do I need extra reviews for those? Ok? From shade at redhat.com Tue Dec 5 10:55:50 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 5 Dec 2017 11:55:50 +0100 Subject: RFR: Check BS type in immByteMapBase predicate In-Reply-To: <66f43903-b1ed-a8e5-0283-91afc81a5222@redhat.com> References: <66f43903-b1ed-a8e5-0283-91afc81a5222@redhat.com> Message-ID: On 12/05/2017 11:50 AM, Roman Kennke wrote: > In aarch64, we have an instruction in aarch64.ad that blindly casts the current BarrierSet to > CardTableModRefBS, and uses this in the predicate to generate an immediate load if the operand > matches the byte_map_base of the CTMRBS. However, when used with a GC that doesn't derive its BS > from the CTMRBS, it reads some random trash and inserts the special instruction sequence > (adrp+movk) on immediate loads that happen to match whatever is in the imaginary byte_map_base... > This eventually leads to a corrupted heap. The fix is to check the BS type in the predicate too: > > > http://cr.openjdk.java.net/~rkennke/aarch64-ctmrbs/webrev.00/src/hotspot/cpu/aarch64/aarch64.ad.udiff.html Um. Does this mean something is using the immByteMapBase() operand? What would happen if code uses that operand, but the new predicate mismatches it (e.g. in Shenandoah)? > I intend to push backports of this to 9 and 8 too. Do I need extra reviews for those? Since this is not 9- or 8u-specific, I think you just push to sh/jdk10, and then the regular backports process handles the propagation to sh/jdk9 and sh/jdk8u. Thanks, -Aleksey From roman at kennke.org Tue Dec 5 12:11:39 2017 From: roman at kennke.org (Roman Kennke) Date: Tue, 5 Dec 2017 13:11:39 +0100 Subject: RFR: Check BS type in immByteMapBase predicate In-Reply-To: References: <66f43903-b1ed-a8e5-0283-91afc81a5222@redhat.com> Message-ID: <5d41f45a-aca8-c74c-a4dd-37e327b586d3@kennke.org> Am 05.12.2017 um 11:55 schrieb Aleksey Shipilev: > On 12/05/2017 11:50 AM, Roman Kennke wrote: >> In aarch64, we have an instruction in aarch64.ad that blindly casts the current BarrierSet to >> CardTableModRefBS, and uses this in the predicate to generate an immediate load if the operand >> matches the byte_map_base of the CTMRBS. However, when used with a GC that doesn't derive its BS >> from the CTMRBS, it reads some random trash and inserts the special instruction sequence >> (adrp+movk) on immediate loads that happen to match whatever is in the imaginary byte_map_base... >> This eventually leads to a corrupted heap. The fix is to check the BS type in the predicate too: >> >> >> http://cr.openjdk.java.net/~rkennke/aarch64-ctmrbs/webrev.00/src/hotspot/cpu/aarch64/aarch64.ad.udiff.html > Um. Does this mean something is using the immByteMapBase() operand? Not in the Shenandoah case. But it happens to match a constant to whatever 'pointer' it finds in memory at the location, that's where the mysterious 0x7000 and similar values have come from. One thing that is not 100% clear to me is if this can match a constant that happens to have the same address as the byte_map_base, and what would happen if that is the case. > What would happen if code uses that > operand, but the new predicate mismatches it (e.g. in Shenandoah)? It cannot be used in Shenandoah because we don't use the CardTableModRefBS.
Checking for the BS type seems the safest way to prevent the bug. >> I intend to push backports of this to 9 and 8 too. Do I need extra reviews for those? > Since this is not 9- or 8u-specific, I think you just push to sh/jdk10, and then the regular backports > process handles the propagation to sh/jdk9 and sh/jdk8u. Ok. From shade at redhat.com Tue Dec 5 12:19:50 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 5 Dec 2017 13:19:50 +0100 Subject: RFR: Check BS type in immByteMapBase predicate In-Reply-To: <5d41f45a-aca8-c74c-a4dd-37e327b586d3@kennke.org> References: <66f43903-b1ed-a8e5-0283-91afc81a5222@redhat.com> <5d41f45a-aca8-c74c-a4dd-37e327b586d3@kennke.org> Message-ID: <07ece366-e8b1-96f5-2539-cbe07edd8a6d@redhat.com> On 12/05/2017 01:11 PM, Roman Kennke wrote: > Am 05.12.2017 um 11:55 schrieb Aleksey Shipilev: >> On 12/05/2017 11:50 AM, Roman Kennke wrote: >> What would happen if code uses that operand, but the new predicate mismatches it (e.g. in Shenandoah)? > It cannot be used in Shenandoah because we don't use the CardTableModRefBS. Checking for the BS > type seems the safest way to prevent the bug. Oh, okay. >>> I intend to push backports of this to 9 and 8 too. Do I need extra reviews for those? >> Since this is not 9- or 8u-specific, I think you just push to sh/jdk10, and then the regular backports >> process handles the propagation to sh/jdk9 and sh/jdk8u. > > Ok. This is okay to go to sh/jdk10. Can you give aarch64 maintainers a heads-up about this fix? It probably warrants the fix in upstream for other collectors' benefit, like Epsilon. Thanks, -Aleksey From roman at kennke.org Tue Dec 5 12:41:28 2017 From: roman at kennke.org (roman at kennke.org) Date: Tue, 05 Dec 2017 12:41:28 +0000 Subject: hg: shenandoah/jdk10: Check BS type in immByteMapBase predicate Message-ID: <201712051241.vB5CfSRk025079@aojmv0008.oracle.com> Changeset: 8954a7894ec1 Author: rkennke Date: 2017-12-05 12:37 +0000 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/8954a7894ec1 Check BS type in immByteMapBase predicate ! src/hotspot/cpu/aarch64/aarch64.ad From shade at redhat.com Tue Dec 5 14:01:51 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 5 Dec 2017 15:01:51 +0100 Subject: RFR [9]: Bulk backport to sh/jdk9 In-Reply-To: References: Message-ID: On 12/04/2017 09:28 PM, Aleksey Shipilev wrote: > http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk9-20171204/webrev.01/ > > Weekend runs for sh/jdk10 seem fine.
> > Changes: > > rev 13746 : [backport] Assert Shenandoah-specific safepoints instead of generic ones > rev 13747 : [backport] Generic verification is possible only at Shenandoah safepoints > rev 13748 : [backport] C2 should use heapword-sized object math > rev 13749 : [backport] Trim/expand test heap sizes to fit small heaps > rev 13750 : Add missing TestShenandoahWithLogLevel test > rev 13751 : [backport] Report illegal transitions verbosely, and remove some no-op transitions > rev 13752 : [backport] Cleanup and refactor Full GC code > rev 13753 : [backport] Humongous regions should support explicit pinning > rev 13754 : [backport] Eagerly drop CSet state from regions during Full GC > rev 13755 : [backport] Region sampling should lock while gathering region data > rev 13756 : [backport] Full GC should compact humongous regions > rev 13757 : [backport] Rework shared bool/enum flags with proper types and synchronization > ----- weekend runs tested up to here ----- > rev 13758 : [backport] Account trashed regions from coalesced CM-with-UR > rev 13759 : [backport] ShenandoahVerifyOptoBarriers should not fail with disabled barriers > ----- nightlies are testing now up to here ----- Nightlies are fine. When backporting to sh/jdk8u, I discovered a small omission in arguments.cpp/r13759 -- missing ShenandoahWriteBarrier check. Fixed here: http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk9-20171204/webrev.02/ -Aleksey From shade at redhat.com Tue Dec 5 14:02:07 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 5 Dec 2017 15:02:07 +0100 Subject: RFR: [8u] Bulk backports to sh/jdk8u Message-ID: http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk8u-20171205/webrev.01/ Changes: 46091fe1a0bc: [backport] Assert Shenandoah-specific safepoints instead of generic ones 2d0fb36d2bb3: [backport] Generic verification is possible only at Shenandoah safepoints 956b6e15ff46: [backport] C2 should use heapword-sized object math 8aa58b6572a3: [backport] Trim/expand test heap sizes to fit small heaps d8f6b4e791f5: [backport] Report illegal transitions verbosely, and remove some no-op transitions 1e9a0e68a087: [backport] Cleanup and refactor Full GC code 9beca79e01ca: [backport] Humongous regions should support explicit pinning efd9de15c656: [backport] Eagerly drop CSet state from regions during Full GC b067065f7bde: [backport] Region sampling should lock while gathering region data e64c7ea17e9c: [backport] Full GC should compact humongous regions d54166ac952d: [backport] Rework shared bool/enum flags with proper types and synchronization c6e7e25780c1: [backport] Account trashed regions from coalesced CM-with-UR 9cafa38b1ef1: [backport] ShenandoahVerifyOptoBarriers should not fail with disabled barriers Testing: hotspot_gc_shenandoah {fastdebug|release} Thanks, -Aleksey From roman at kennke.org Tue Dec 5 14:11:58 2017 From: roman at kennke.org (Roman Kennke) Date: Tue, 5 Dec 2017 15:11:58 +0100 Subject: RFR [9]: Bulk backport to sh/jdk9 In-Reply-To: References: Message-ID: <337d0d43-7529-fdc7-d174-f9252e344a88@kennke.org> Am 05.12.2017 um 15:01 schrieb Aleksey Shipilev: > On 12/04/2017 09:28 PM, Aleksey Shipilev wrote: >> http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk9-20171204/webrev.01/ >> >> Weekend runs for sh/jdk10 seem fine. 
>> >> Changes: >> >> rev 13746 : [backport] Assert Shenandoah-specific safepoints instead of generic ones >> rev 13747 : [backport] Generic verification is possible only at Shenandoah safepoints >> rev 13748 : [backport] C2 should use heapword-sized object math >> rev 13749 : [backport] Trim/expand test heap sizes to fit small heaps >> rev 13750 : Add missing TestShenandoahWithLogLevel test >> rev 13751 : [backport] Report illegal transitions verbosely, and remove some no-op transitions >> rev 13752 : [backport] Cleanup and refactor Full GC code >> rev 13753 : [backport] Humongous regions should support explicit pinning >> rev 13754 : [backport] Eagerly drop CSet state from regions during Full GC >> rev 13755 : [backport] Region sampling should lock while gathering region data >> rev 13756 : [backport] Full GC should compact humongous regions >> rev 13757 : [backport] Rework shared bool/enum flags with proper types and synchronization >> ----- weekend runs tested up to here ----- >> rev 13758 : [backport] Account trashed regions from coalesced CM-with-UR >> rev 13759 : [backport] ShenandoahVerifyOptoBarriers should not fail with disabled barriers >> ----- nightlies are testing now up to here ----- > Nightlies are fine. > > When backporting to sh/jdk8u, I discovered a small omission in arguments.cpp/r13759 -- missing > ShenandoahWriteBarrier check. Fixed here: > http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk9-20171204/webrev.02/ > > -Aleksey > > Ok by me. From shade at redhat.com Tue Dec 5 15:35:25 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 5 Dec 2017 16:35:25 +0100 Subject: RFR: Optimize oop/fwdptr/hr_index verification a bit Message-ID: <04d9c040-df79-6785-8363-03df2525fee5@redhat.com> http://cr.openjdk.java.net/~shade/shenandoah/verifier-perf-hr/webrev.01/ This slightly optimizes verification code, which should make fastdebug builds run a bit faster. Testing: hotspot_gc_shenandoah Thanks, -Aleksey From rkennke at redhat.com Tue Dec 5 15:38:50 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 5 Dec 2017 16:38:50 +0100 Subject: RFR: Optimize oop/fwdptr/hr_index verification a bit In-Reply-To: <04d9c040-df79-6785-8363-03df2525fee5@redhat.com> References: <04d9c040-df79-6785-8363-03df2525fee5@redhat.com> Message-ID: <7b58711a-cc66-17bf-0f79-2165f0e8288f@redhat.com> Am 05.12.2017 um 16:35 schrieb Aleksey Shipilev: > http://cr.openjdk.java.net/~shade/shenandoah/verifier-perf-hr/webrev.01/ > > This slightly optimizes verification code, which should make fastdebug builds run a bit faster. > > Testing: hotspot_gc_shenandoah > > Thanks, > -Aleksey > Tried it out. Seems to work. Please push! Roman From ashipile at redhat.com Tue Dec 5 15:49:25 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Tue, 05 Dec 2017 15:49:25 +0000 Subject: hg: shenandoah/jdk10: Optimize oop/fwdptr/hr_index verification a bit Message-ID: <201712051549.vB5FnPVV011934@aojmv0008.oracle.com> Changeset: b81c043a63cd Author: shade Date: 2017-12-05 16:38 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/b81c043a63cd Optimize oop/fwdptr/hr_index verification a bit ! src/hotspot/share/gc/shenandoah/brooksPointer.inline.hpp ! src/hotspot/share/gc/shenandoah/shenandoahHeap.inline.hpp ! src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp ! 
src/hotspot/share/gc/shenandoah/shenandoahVerifier.hpp From shade at redhat.com Tue Dec 5 16:01:11 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 5 Dec 2017 17:01:11 +0100 Subject: RFR: Optimize fwdptr region handling in ShenandoahVerifyOopClosure::verify_oop Message-ID: <06f1975c-e377-3e8f-8287-9a02b1ab63f1@redhat.com> http://cr.openjdk.java.net/~shade/shenandoah/verifier-perf-fhr/webrev.01/ This does the obvious optimization for fwdptr region handling: we should not call heap_region_containing() four times when we can call it once for non-forwarded objects. Testing: hotspot_gc_shenandoah Thanks, -Aleksey From rkennke at redhat.com Tue Dec 5 16:12:58 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 5 Dec 2017 17:12:58 +0100 Subject: RFR: Optimize fwdptr region handling in ShenandoahVerifyOopClosure::verify_oop In-Reply-To: <06f1975c-e377-3e8f-8287-9a02b1ab63f1@redhat.com> References: <06f1975c-e377-3e8f-8287-9a02b1ab63f1@redhat.com> Message-ID: <79fe9af9-6c7a-4f34-a359-e6f26ac435e0@redhat.com> Am 05.12.2017 um 17:01 schrieb Aleksey Shipilev: > http://cr.openjdk.java.net/~shade/shenandoah/verifier-perf-fhr/webrev.01/ > > This does the obvious optimization for fwdptr region handling: we should not call > heap_region_containing() four times when we can call it once for non-forwarded objects. > > Testing: hotspot_gc_shenandoah > > Thanks, > -Aleksey > Ok From ashipile at redhat.com Tue Dec 5 16:28:09 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Tue, 05 Dec 2017 16:28:09 +0000 Subject: hg: shenandoah/jdk10: Optimize fwdptr region handling in ShenandoahVerifyOopClosure::verify_oop Message-ID: <201712051628.vB5GS9Wx029008@aojmv0008.oracle.com> Changeset: feb16f72b64a Author: shade Date: 2017-12-05 16:59 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/feb16f72b64a Optimize fwdptr region handling in ShenandoahVerifyOopClosure::verify_oop ! src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp From shade at redhat.com Tue Dec 5 16:33:11 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 5 Dec 2017 17:33:11 +0100 Subject: RFR: SieveObjects test is too hostile to verification Message-ID: http://cr.openjdk.java.net/~shade/shenandoah/tests-sieve-verify/webrev.01 SieveObjects allocates a lot of objects, retained from a single array. The verifier does not do array chunking, to keep itself simple, and thus verification for this test takes a while. Since verification costs are dependent on the number of objects, but the test itself maintains a ragged array of decent footprint, the way out is to make the per-object footprint 10x larger, and thus trim the number of objects 10x too. Testing: SieveObjects Thanks, -Aleksey From rkennke at redhat.com Tue Dec 5 16:34:40 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 5 Dec 2017 17:34:40 +0100 Subject: RFR: SieveObjects test is too hostile to verification In-Reply-To: References: Message-ID: Am 05.12.2017 um 17:33 schrieb Aleksey Shipilev: > http://cr.openjdk.java.net/~shade/shenandoah/tests-sieve-verify/webrev.01 > > SieveObjects allocates a lot of objects, retained from a single array. The verifier does not do array > chunking, to keep itself simple, and thus verification for this test takes a while. Since > verification costs are dependent on the number of objects, but the test itself maintains a ragged > array of decent footprint, the way out is to make the per-object footprint 10x larger, and thus trim > the number of objects 10x too.
> > Testing: SieveObjects > > Thanks, > -Aleksey > Sounds reasonable. Go! From ashipile at redhat.com Tue Dec 5 16:40:03 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Tue, 05 Dec 2017 16:40:03 +0000 Subject: hg: shenandoah/jdk10: SieveObjects test is too hostile to verification Message-ID: <201712051640.vB5Ge3fb005010@aojmv0008.oracle.com> Changeset: 1302e41c55e9 Author: shade Date: 2017-12-05 17:31 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/1302e41c55e9 SieveObjects test is too hostile to verification ! test/hotspot/jtreg/gc/shenandoah/acceptance/SieveObjects.java From zgu at redhat.com Tue Dec 5 16:46:27 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 5 Dec 2017 11:46:27 -0500 Subject: RFR [9]: Bulk backport to sh/jdk9 In-Reply-To: <267a2d6c-219d-1530-162b-f0fdb4cf4c8a@kennke.org> References: <0f4704f6-28eb-f280-ab32-fd8175d1594c@redhat.com> <267a2d6c-219d-1530-162b-f0fdb4cf4c8a@kennke.org> Message-ID: <140d12f4-01d6-428b-68ce-fcaf3f93a014@redhat.com> Okay to me. -Zhengyu On 10/25/2017 04:00 PM, Roman Kennke wrote: > Am 25.10.2017 um 21:27 schrieb Aleksey Shipilev: >> http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk9-20171025/webrev.01/ >> >> >> Changes include: >> >> rev 13698 : [backport] Rewrite and fix >> ShenandoahHeap::marked_object_iterate >> rev 13699 : [backport] Eliminate string dedup cleanup phase and >> correct UR closure >> rev 13700 : [backport] barrier moved due to null checks needs to >> always fix memory edges >> rev 13701 : [backport] Incorrect constant folding with final field and >> -ShenandoahOptimizeFinals >> rev 13702 : [backport] >> AESCrypt.implEncryptBlock/AESCrypt.implDecryptBlock intrinsics assume non >> null inputs >> rev 13703 : [backport] keep read barriers for final instance/stable >> field accesses >> rev 13704 : [backport] Added diagnostic flag ShenandoahOOMDuringEvacALot >> rev 13705 : [backport] Rename dynamic heuristics to static >> rev 13706 : [backport] Static heuristics should use non-zero >> allocation threshold >> rev 13707 : [backport] Static heuristics should be really static and >> report decisions >> rev 13708 : [backport] missing must_be_not_null() for arguments to >> String compareTo*/equals* >> >> These are mostly compiler fixes, so I would like Roland to take a hard >> look at these. Zhengyu, we >> are backporting the String dedup fix, even though it still fails the >> string dedup stress intermittently. >> Roman, take a look at the remaining parts too? >> >> Testing: hotspot_gc_shenandoah {fastdebug|release} >> >> Thanks, >> -Aleksey >> >> > It all looks good to me. > From rwestrel at redhat.com Tue Dec 5 16:52:14 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 05 Dec 2017 17:52:14 +0100 Subject: RFR [9]: Bulk backport to sh/jdk9 In-Reply-To: References: Message-ID: > http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk9-20171204/webrev.01/ C2 changes look good to me. Roland. From rwestrel at redhat.com Tue Dec 5 16:53:00 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 05 Dec 2017 17:53:00 +0100 Subject: RFR: [8u] Bulk backports to sh/jdk8u In-Reply-To: References: Message-ID: > http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk8u-20171205/webrev.01/ C2 changes look good to me. Roland.
From rkennke at redhat.com Tue Dec 5 19:18:46 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 5 Dec 2017 20:18:46 +0100 Subject: RFR: [8u] Bulk backports to sh/jdk8u In-Reply-To: References: Message-ID: <068ba75c-8f98-1ea6-d9b6-e27d8f6c0b13@redhat.com> Am 05.12.2017 um 15:02 schrieb Aleksey Shipilev: > http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk8u-20171205/webrev.01/ > > Changes: > > 46091fe1a0bc: [backport] Assert Shenandoah-specific safepoints instead of generic ones > 2d0fb36d2bb3: [backport] Generic verification is possible only at Shenandoah safepoints > 956b6e15ff46: [backport] C2 should use heapword-sized object math > 8aa58b6572a3: [backport] Trim/expand test heap sizes to fit small heaps > d8f6b4e791f5: [backport] Report illegal transitions verbosely, and remove some no-op transitions > 1e9a0e68a087: [backport] Cleanup and refactor Full GC code > 9beca79e01ca: [backport] Humongous regions should support explicit pinning > efd9de15c656: [backport] Eagerly drop CSet state from regions during Full GC > b067065f7bde: [backport] Region sampling should lock while gathering region data > e64c7ea17e9c: [backport] Full GC should compact humongous regions > d54166ac952d: [backport] Rework shared bool/enum flags with proper types and synchronization > c6e7e25780c1: [backport] Account trashed regions from coalesced CM-with-UR > 9cafa38b1ef1: [backport] ShenandoahVerifyOptoBarriers should not fail with disabled barriers > > Testing: hotspot_gc_shenandoah {fastdebug|release} > > Thanks, > -Aleksey > Ok From ashipile at redhat.com Tue Dec 5 19:57:53 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Tue, 05 Dec 2017 19:57:53 +0000 Subject: hg: shenandoah/jdk9/hotspot: 14 new changesets Message-ID: <201712051957.vB5JvsIJ001578@aojmv0008.oracle.com> Changeset: ee4b295460ab Author: shade Date: 2017-12-05 14:28 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/ee4b295460ab [backport] Assert Shenandoah-specific safepoints instead of generic ones ! src/share/vm/gc/shenandoah/shenandoahCodeRoots.cpp ! src/share/vm/gc/shenandoah/shenandoahCollectionSet.cpp ! src/share/vm/gc/shenandoah/shenandoahConcurrentMark.cpp ! src/share/vm/gc/shenandoah/shenandoahHeap.cpp ! src/share/vm/gc/shenandoah/shenandoahMarkCompact.cpp ! src/share/vm/gc/shenandoah/shenandoahStringDedup.cpp ! src/share/vm/gc/shenandoah/shenandoahVerifier.cpp Changeset: 882aae3fdf3c Author: shade Date: 2017-11-30 16:33 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/882aae3fdf3c [backport] Generic verification is possible only at Shenandoah safepoints ! src/share/vm/gc/shenandoah/shenandoahHeap.cpp Changeset: 0a00aa87de86 Author: shade Date: 2017-11-30 10:13 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/0a00aa87de86 [backport] C2 should use heapword-sized object math ! src/share/vm/opto/macro.cpp Changeset: 083fd27d07c5 Author: shade Date: 2017-11-30 16:24 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/083fd27d07c5 [backport] Trim/expand test heap sizes to fit small heaps ! test/gc/TestHumongousReferenceObject.java ! test/gc/shenandoah/EvilSyncBug.java ! test/gc/shenandoah/HumongousThreshold.java ! test/gc/shenandoah/ShenandoahStrDedupStress.java ! test/gc/shenandoah/TestHeapAlloc.java ! test/gc/shenandoah/acceptance/AllocIntArrays.java ! test/gc/shenandoah/acceptance/AllocObjectArrays.java ! test/gc/shenandoah/acceptance/AllocObjects.java ! test/gc/shenandoah/acceptance/HeapUncommit.java ! 
test/gc/shenandoah/acceptance/ParallelRefprocSanity.java ! test/gc/shenandoah/acceptance/RetainObjects.java ! test/gc/shenandoah/acceptance/VerifyJCStressTest.java ! test/gc/shenandoah/options/AlwaysPreTouch.java ! test/gc/shenandoah/options/TestRegionSizeArgs.java ! test/gc/stress/gcbasher/TestGCBasherWithShenandoah.java Changeset: ace457c5a045 Author: shade Date: 2017-12-05 14:28 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/ace457c5a045 Add missing TestShenandoahWithLogLevel test + test/gc/shenandoah/TestShenandoahWithLogLevel.java Changeset: fea05c6a601d Author: shade Date: 2017-11-30 16:37 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/fea05c6a601d [backport] Report illegal transitions verbosely, and remove some no-op transitions ! src/share/vm/gc/shenandoah/shenandoahHeap.cpp ! src/share/vm/gc/shenandoah/shenandoahHeapRegion.cpp ! src/share/vm/gc/shenandoah/shenandoahHeapRegion.hpp Changeset: e63ab4bbbf4d Author: shade Date: 2017-11-30 16:38 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/e63ab4bbbf4d [backport] Cleanup and refactor Full GC code ! src/share/vm/gc/shenandoah/shenandoahHeap.cpp ! src/share/vm/gc/shenandoah/shenandoahHeap.hpp ! src/share/vm/gc/shenandoah/shenandoahHeapRegionSet.cpp ! src/share/vm/gc/shenandoah/shenandoahHeapRegionSet.hpp ! src/share/vm/gc/shenandoah/shenandoahMarkCompact.cpp ! src/share/vm/gc/shenandoah/shenandoahMarkCompact.hpp ! src/share/vm/gc/shenandoah/shenandoahUtils.cpp ! src/share/vm/gc/shenandoah/vm_operations_shenandoah.cpp Changeset: 4e8674a04a07 Author: shade Date: 2017-11-30 16:38 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/4e8674a04a07 [backport] Humongous regions should support explicit pinning ! src/share/vm/gc/shenandoah/shenandoahHeapRegion.cpp ! src/share/vm/gc/shenandoah/shenandoahHeapRegion.hpp ! src/share/vm/gc/shenandoah/shenandoahMarkCompact.cpp ! src/share/vm/gc/shenandoah/shenandoah_globals.hpp Changeset: 14daeae927da Author: shade Date: 2017-11-30 18:01 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/14daeae927da [backport] Eagerly drop CSet state from regions during Full GC ! src/share/vm/gc/shenandoah/shenandoahMarkCompact.cpp Changeset: 97124a983917 Author: shade Date: 2017-12-01 11:57 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/97124a983917 [backport] Region sampling should lock while gathering region data ! src/share/vm/gc/shenandoah/shenandoahHeapRegionCounters.cpp Changeset: 3a11513020bc Author: shade Date: 2017-12-01 12:44 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/3a11513020bc [backport] Full GC should compact humongous regions ! src/share/vm/gc/shenandoah/shenandoahHeapRegion.cpp ! src/share/vm/gc/shenandoah/shenandoahHeapRegion.hpp ! src/share/vm/gc/shenandoah/shenandoahMarkCompact.cpp ! src/share/vm/gc/shenandoah/shenandoahMarkCompact.hpp ! src/share/vm/gc/shenandoah/shenandoahPhaseTimings.cpp ! src/share/vm/gc/shenandoah/shenandoahPhaseTimings.hpp ! src/share/vm/gc/shenandoah/shenandoah_globals.hpp + test/gc/shenandoah/acceptance/AllocHumongousFragment.java Changeset: b3a3e5e2dc05 Author: shade Date: 2017-12-01 19:42 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/b3a3e5e2dc05 [backport] Rework shared bool/enum flags with proper types and synchronization ! src/share/vm/gc/shenandoah/shenandoahBarrierSet.cpp ! src/share/vm/gc/shenandoah/shenandoahCodeRoots.cpp ! src/share/vm/gc/shenandoah/shenandoahCodeRoots.hpp ! 
src/share/vm/gc/shenandoah/shenandoahCollectorPolicy.cpp ! src/share/vm/gc/shenandoah/shenandoahCollectorPolicy.hpp ! src/share/vm/gc/shenandoah/shenandoahConcurrentMark.cpp ! src/share/vm/gc/shenandoah/shenandoahConcurrentMark.hpp ! src/share/vm/gc/shenandoah/shenandoahConcurrentThread.cpp ! src/share/vm/gc/shenandoah/shenandoahConcurrentThread.hpp ! src/share/vm/gc/shenandoah/shenandoahHeap.cpp ! src/share/vm/gc/shenandoah/shenandoahHeap.hpp ! src/share/vm/gc/shenandoah/shenandoahHeap.inline.hpp ! src/share/vm/gc/shenandoah/shenandoahHeapRegionCounters.cpp ! src/share/vm/gc/shenandoah/shenandoahMarkCompact.cpp + src/share/vm/gc/shenandoah/shenandoahSharedVariables.hpp Changeset: 7632192570fc Author: shade Date: 2017-12-04 11:28 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/7632192570fc [backport] Account trashed regions from coalesced CM-with-UR ! src/share/vm/gc/shenandoah/shenandoahCollectorPolicy.cpp Changeset: abf47c017713 Author: shade Date: 2017-12-04 18:41 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/abf47c017713 [backport] ShenandoahVerifyOptoBarriers should not fail with disabled barriers ! src/share/vm/opto/compile.cpp ! src/share/vm/opto/shenandoahSupport.cpp ! src/share/vm/opto/shenandoahSupport.hpp ! src/share/vm/runtime/arguments.cpp ! test/gc/shenandoah/TestSelectiveBarrierFlags.java From ashipile at redhat.com Tue Dec 5 20:08:43 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Tue, 05 Dec 2017 20:08:43 +0000 Subject: hg: shenandoah/jdk8u/hotspot: 13 new changesets Message-ID: <201712052008.vB5K8hk8006300@aojmv0008.oracle.com> Changeset: fb364ee7f069 Author: shade Date: 2017-12-05 11:13 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/fb364ee7f069 [backport] Assert Shenandoah-specific safepoints instead of generic ones ! src/share/vm/gc_implementation/shenandoah/shenandoahCodeRoots.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahCollectionSet.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahConcurrentMark.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahHeap.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahMarkCompact.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahUtils.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahVerifier.cpp Changeset: 839b518d139a Author: shade Date: 2017-11-30 16:33 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/839b518d139a [backport] Generic verification is possible only at Shenandoah safepoints ! src/share/vm/gc_implementation/shenandoah/shenandoahHeap.cpp Changeset: e8f3b38913fd Author: shade Date: 2017-11-30 10:13 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/e8f3b38913fd [backport] C2 should use heapword-sized object math ! src/share/vm/opto/macro.cpp Changeset: f3370e98d9e1 Author: shade Date: 2017-11-30 16:24 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/f3370e98d9e1 [backport] Trim/expand test heap sizes to fit small heaps ! test/gc/shenandoah/EvilSyncBug.java ! test/gc/shenandoah/HumongousThreshold.java ! test/gc/shenandoah/TestHeapAlloc.java ! test/gc/shenandoah/TestShenandoahWithLogLevel.java ! test/gc/shenandoah/acceptance/AllocIntArrays.java ! test/gc/shenandoah/acceptance/AllocObjectArrays.java ! test/gc/shenandoah/acceptance/AllocObjects.java ! test/gc/shenandoah/acceptance/HeapUncommit.java ! test/gc/shenandoah/acceptance/ParallelRefprocSanity.java ! test/gc/shenandoah/acceptance/RetainObjects.java ! 
test/gc/shenandoah/acceptance/VerifyJCStressTest.java ! test/gc/shenandoah/options/AlwaysPreTouch.java ! test/gc/shenandoah/options/TestRegionSizeArgs.java Changeset: f4e55bcf7189 Author: shade Date: 2017-11-30 16:37 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/f4e55bcf7189 [backport] Report illegal transitions verbosely, and remove some no-op transitions ! src/share/vm/gc_implementation/shenandoah/shenandoahHeap.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahHeapRegion.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahHeapRegion.hpp Changeset: 4d6d19f32598 Author: shade Date: 2017-11-30 16:38 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/4d6d19f32598 [backport] Cleanup and refactor Full GC code ! src/share/vm/gc_implementation/shenandoah/shenandoahHeap.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahHeap.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahHeapRegionSet.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahHeapRegionSet.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahMarkCompact.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahMarkCompact.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahUtils.cpp ! src/share/vm/gc_implementation/shenandoah/vm_operations_shenandoah.cpp Changeset: 1298c7072652 Author: shade Date: 2017-11-30 16:38 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/1298c7072652 [backport] Humongous regions should support explicit pinning ! src/share/vm/gc_implementation/shenandoah/shenandoahHeapRegion.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahHeapRegion.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahMarkCompact.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoah_globals.hpp Changeset: f91092a7acd3 Author: shade Date: 2017-11-30 18:01 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/f91092a7acd3 [backport] Eagerly drop CSet state from regions during Full GC ! src/share/vm/gc_implementation/shenandoah/shenandoahMarkCompact.cpp Changeset: 1b6a6fbe141a Author: shade Date: 2017-12-01 11:57 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/1b6a6fbe141a [backport] Region sampling should lock while gathering region data ! src/share/vm/gc_implementation/shenandoah/shenandoahHeapRegionCounters.cpp Changeset: c1f80351ad51 Author: shade Date: 2017-12-01 12:44 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/c1f80351ad51 [backport] Full GC should compact humongous regions ! src/share/vm/gc_implementation/shenandoah/shenandoahHeapRegion.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahHeapRegion.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahMarkCompact.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahMarkCompact.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahPhaseTimings.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahPhaseTimings.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoah_globals.hpp + test/gc/shenandoah/acceptance/AllocHumongousFragment.java Changeset: 508fc61b6ffb Author: shade Date: 2017-12-01 19:42 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/508fc61b6ffb [backport] Rework shared bool/enum flags with proper types and synchronization ! src/share/vm/gc_implementation/shenandoah/shenandoahBarrierSet.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahCodeRoots.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahCodeRoots.hpp ! 
src/share/vm/gc_implementation/shenandoah/shenandoahCollectorPolicy.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahCollectorPolicy.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahConcurrentMark.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahConcurrentMark.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahConcurrentThread.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahConcurrentThread.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahHeap.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahHeap.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahHeap.inline.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahHeapRegionCounters.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahMarkCompact.cpp + src/share/vm/gc_implementation/shenandoah/shenandoahSharedVariables.hpp

Changeset: 1422ae507ae8 Author: shade Date: 2017-12-04 11:28 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/1422ae507ae8 [backport] Account trashed regions from coalesced CM-with-UR ! src/share/vm/gc_implementation/shenandoah/shenandoahCollectorPolicy.cpp

Changeset: f40e911070e2 Author: shade Date: 2017-12-04 18:41 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/f40e911070e2 [backport] ShenandoahVerifyOptoBarriers should not fail with disabled barriers ! src/share/vm/opto/compile.cpp ! src/share/vm/opto/shenandoahSupport.cpp ! src/share/vm/opto/shenandoahSupport.hpp ! src/share/vm/runtime/arguments.cpp ! test/gc/shenandoah/TestSelectiveBarrierFlags.java

From zgu at redhat.com Wed Dec 6 17:15:11 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 6 Dec 2017 12:15:11 -0500
Subject: RFR: Shenandoah string deduplication
Message-ID: <98c59eb5-3052-44a0-44ae-a6899ab89f70@redhat.com>

The implementation takes a different approach from G1's implementation.

* It treats string dedup queue/table as weak roots.

* It identifies deduplication candidates during concurrent marking and during full gc, to reduce impact on pause time.

* To properly identify the candidates, it manipulates string's age field concurrently, to prevent processing a string too early or deduplicating the same string over and over again. While it is undesirable, it reduces the risk by only manipulating *regular* strings - when they do not have displaced headers. Locked objects are relatively rare, especially for immutable objects, such as Strings.

* Deduplicate routine is implemented lock-free, in case we have to embed it in barriers.

* Reverted all early changes to G1's implementation

Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/sh_strdedup/webrev.00/

Test:
  hotspot_gc_shenandoah (fastdebug and release)
  specJBB               (fastdebug and release)

Thanks,
-Zhengyu

From rkennke at redhat.com Wed Dec 6 19:48:57 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 6 Dec 2017 20:48:57 +0100
Subject: RFR: Shenandoah string deduplication
In-Reply-To: <98c59eb5-3052-44a0-44ae-a6899ab89f70@redhat.com>
References: <98c59eb5-3052-44a0-44ae-a6899ab89f70@redhat.com>
Message-ID: <7f680d35-6b13-ab3f-a4e9-8b129020d8ca@redhat.com>

Am 06.12.2017 um 18:15 schrieb Zhengyu Gu:
> The implementation takes a different approach from G1's implementation.
>
> * It treats string dedup queue/table as weak roots.
>
> * It identifies deduplication candidates during concurrent marking and
> during full gc, to reduce impact on pause time.
>
> * To properly identify the candidates, it manipulates string's age field
> concurrently, to prevent processing a string too early or
> deduplicating the same string over and over again. While it is
> undesirable, it reduces the risk by only manipulating *regular* strings
> - when they do not have displaced headers. Locked objects are relatively
> rare, especially for immutable objects, such as Strings.
>
> * Deduplicate routine is implemented lock-free, in case we have to embed
> it in barriers.
>
> * Reverted all early changes to G1's implementation
>
> Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/sh_strdedup/webrev.00/
>
> Test:
>   hotspot_gc_shenandoah (fastdebug and release)
>   specJBB               (fastdebug and release)
>
> Thanks,
> -Zhengyu

src/hotspot/share/classfile/stringTable.cpp:
/me makes a note that this warrants a GC interface

Other than that, I like it! :-)

Probably warrants another pair of eyes.

Roman

From rkennke at redhat.com Thu Dec 7 13:18:26 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 7 Dec 2017 14:18:26 +0100
Subject: RFR: Missing enter/leave around keep_alive_barrier in AArch64
Message-ID: <19b41b84-c0bc-7ca8-ba95-2553fe5f0aad@redhat.com>

I've been missing enter/leave calls around the SATB pre barrier call in MacroAssembler::keep_alive_barrier() for Shenandoah. This has been sending EvilSyncBug (and possibly some other tests) into endless loops.

The cleanest place to have them is in the (only) user of it in generate_Reference_get():

http://cr.openjdk.java.net/~rkennke/aarch64-enter-leave/webrev.00/

Test: EvilSyncBug terminates now (aarch64). Running other tests right now.

Ok?

From shade at redhat.com Thu Dec 7 13:20:03 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 7 Dec 2017 14:20:03 +0100
Subject: RFR: Missing enter/leave around keep_alive_barrier in AArch64
In-Reply-To: <19b41b84-c0bc-7ca8-ba95-2553fe5f0aad@redhat.com>
References: <19b41b84-c0bc-7ca8-ba95-2553fe5f0aad@redhat.com>
Message-ID: <77d21cb8-2b64-e2a7-eebf-ba0a54cd8a32@redhat.com>

On 12/07/2017 02:18 PM, Roman Kennke wrote:
> I've been missing enter/leave calls around the SATB pre barrier call in
> MacroAssembler::keep_alive_barrier() for Shenandoah. This has been sending EvilSyncBug (and possibly
> some other tests) into endless loops.
>
> The cleanest place to have them is in the (only) user of it in generate_Reference_get():
>
> http://cr.openjdk.java.net/~rkennke/aarch64-enter-leave/webrev.00/

Looks good to me.

-Aleksey

From roman at kennke.org Thu Dec 7 13:24:48 2017
From: roman at kennke.org (roman at kennke.org)
Date: Thu, 07 Dec 2017 13:24:48 +0000
Subject: hg: shenandoah/jdk10: AArch64: Fix missing enter/leave around keep_alive_barrier.
Message-ID: <201712071324.vB7DOmF2029803@aojmv0008.oracle.com>

Changeset: 053d35b758f1 Author: rkennke Date: 2017-12-07 13:20 +0000 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/053d35b758f1 AArch64: Fix missing enter/leave around keep_alive_barrier. ! src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp ! src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp

From zgu at redhat.com Thu Dec 7 14:06:28 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 7 Dec 2017 09:06:28 -0500
Subject: RFR: Shenandoah string deduplication
In-Reply-To: <7f680d35-6b13-ab3f-a4e9-8b129020d8ca@redhat.com>
References: <98c59eb5-3052-44a0-44ae-a6899ab89f70@redhat.com> <7f680d35-6b13-ab3f-a4e9-8b129020d8ca@redhat.com>
Message-ID: <38ab37ea-69cc-fb53-9f5e-93990ff752c7@redhat.com>

A minor update: removed creation of unused StringDedupTable_lock:

--- old/src/hotspot/share/runtime/mutexLocker.cpp    2017-12-07 08:57:03.732839267 -0500
+++ new/src/hotspot/share/runtime/mutexLocker.cpp    2017-12-07 08:57:03.383839085 -0500
@@ -202,9 +202,6 @@
     def(SATB_Q_FL_lock             , PaddedMutex  , access,      true,  Monitor::_safepoint_check_never);
     def(SATB_Q_CBL_mon             , PaddedMonitor, access,      true,  Monitor::_safepoint_check_never);
     def(Shared_SATB_Q_lock         , PaddedMutex  , access + 1,  true,  Monitor::_safepoint_check_never);
-    // Shenandoah needs (special-1) rank of the lock, because write barrier can evacuate objects while
-    // thread holding other locks, such as CodeCache_lock, etc.
-    def(StringDedupTable_lock      , PaddedMutex  , special-1,   true,  Monitor::_safepoint_check_never);
   }
   def(ParGCRareEvent_lock          , PaddedMutex  , leaf     ,   true,  Monitor::_safepoint_check_sometimes);
   def(DerivedPointerTableGC_lock   , PaddedMutex  , leaf,        true,  Monitor::_safepoint_check_never);

Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/sh_strdedup/webrev.01/

Thanks,
-Zhengyu

On 12/06/2017 02:48 PM, Roman Kennke wrote:
> Am 06.12.2017 um 18:15 schrieb Zhengyu Gu:
>> The implementation takes a different approach from G1's implementation.
>>
>> * It treats string dedup queue/table as weak roots.
>>
>> * It identifies deduplication candidates during concurrent marking and
>> during full gc, to reduce impact on pause time.
>>
>> * To properly identify the candidates, it manipulates string's age
>> field concurrently, to prevent processing a string too early or
>> deduplicating the same string over and over again. While it is
>> undesirable, it reduces the risk by only manipulating *regular*
>> strings - when they do not have displaced headers. Locked objects are
>> relatively rare, especially for immutable objects, such as Strings.
>>
>> * Deduplicate routine is implemented lock-free, in case we have to
>> embed it in barriers.
>>
>> * Reverted all early changes to G1's implementation
>>
>> Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/sh_strdedup/webrev.00/
>>
>> Test:
>>   hotspot_gc_shenandoah (fastdebug and release)
>>   specJBB               (fastdebug and release)
>>
>> Thanks,
>> -Zhengyu
>
> src/hotspot/share/classfile/stringTable.cpp:
> /me makes a note that this warrants a GC interface
>
> Other than that, I like it! :-)
>
> Probably warrants another pair of eyes.
>
> Roman

From rkennke at redhat.com Thu Dec 7 16:16:31 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 7 Dec 2017 17:16:31 +0100
Subject: RFR: Shenandoah string deduplication
In-Reply-To: <38ab37ea-69cc-fb53-9f5e-93990ff752c7@redhat.com>
References: <98c59eb5-3052-44a0-44ae-a6899ab89f70@redhat.com> <7f680d35-6b13-ab3f-a4e9-8b129020d8ca@redhat.com> <38ab37ea-69cc-fb53-9f5e-93990ff752c7@redhat.com>
Message-ID: 

This is excellent. Still ok with me.

Roman

> A minor update: removed creation of unused StringDedupTable_lock:
>
> --- old/src/hotspot/share/runtime/mutexLocker.cpp    2017-12-07 08:57:03.732839267 -0500
> +++ new/src/hotspot/share/runtime/mutexLocker.cpp    2017-12-07 08:57:03.383839085 -0500
> @@ -202,9 +202,6 @@
>     def(SATB_Q_FL_lock             , PaddedMutex  , access,      true,  Monitor::_safepoint_check_never);
>     def(SATB_Q_CBL_mon             , PaddedMonitor, access,      true,  Monitor::_safepoint_check_never);
>     def(Shared_SATB_Q_lock         , PaddedMutex  , access + 1,  true,  Monitor::_safepoint_check_never);
> -    // Shenandoah needs (special-1) rank of the lock, because write barrier can evacuate objects while
> -    // thread holding other locks, such as CodeCache_lock, etc.
> -    def(StringDedupTable_lock      , PaddedMutex  , special-1,   true,  Monitor::_safepoint_check_never);
>   }
>   def(ParGCRareEvent_lock          , PaddedMutex  , leaf     ,   true,  Monitor::_safepoint_check_sometimes);
>   def(DerivedPointerTableGC_lock   , PaddedMutex  , leaf,        true,  Monitor::_safepoint_check_never);
>
> Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/sh_strdedup/webrev.01/
>
> Thanks,
>
> -Zhengyu
>
> On 12/06/2017 02:48 PM, Roman Kennke wrote:
>> Am 06.12.2017 um 18:15 schrieb Zhengyu Gu:
>>> The implementation takes a different approach from G1's implementation.
>>>
>>> * It treats string dedup queue/table as weak roots.
>>>
>>> * It identifies deduplication candidates during concurrent marking
>>> and during full gc, to reduce impact on pause time.
>>>
>>> * To properly identify the candidates, it manipulates string's age
>>> field concurrently, to prevent processing a string too early or
>>> deduplicating the same string over and over again. While it is
>>> undesirable, it reduces the risk by only manipulating *regular*
>>> strings - when they do not have displaced headers. Locked objects are
>>> relatively rare, especially for immutable objects, such as Strings.
>>>
>>> * Deduplicate routine is implemented lock-free, in case we have to
>>> embed it in barriers.
>>>
>>> * Reverted all early changes to G1's implementation
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~zgu/shenandoah/sh_strdedup/webrev.00/
>>>
>>> Test:
>>>   hotspot_gc_shenandoah (fastdebug and release)
>>>   specJBB               (fastdebug and release)
>>>
>>> Thanks,
>>>
>>> -Zhengyu
>>
>> src/hotspot/share/classfile/stringTable.cpp:
>> /me makes a note that this warrants a GC interface
>>
>> Other than that, I like it! :-)
>>
>> Probably warrants another pair of eyes.
>>
>> Roman
>>

From rkennke at redhat.com Thu Dec 7 16:32:43 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 7 Dec 2017 17:32:43 +0100
Subject: RFR: Cherry-pick fix: 8193133: Assertion failure because 0xDEADDEAD can be in-heap
Message-ID: <1be2b887-d21c-1a4f-3f7b-0f84b6d42711@redhat.com>

Applies to aarch64-port/jdk8u only:

Bug:
https://bugs.openjdk.java.net/browse/JDK-8193133

Webrev:
http://cr.openjdk.java.net/~rkennke/aarch64-jdk8u-movoops/webrev.01/

Pushed in:
http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/97df73117bbe

I want to cherry-pick it into shenandoah/jdk8u. Ok?
Roman From shade at redhat.com Thu Dec 7 16:35:10 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 7 Dec 2017 17:35:10 +0100 Subject: RFR: Cherry-pick fix: 8193133: Assertion failure because 0xDEADDEAD can be in-heap In-Reply-To: <1be2b887-d21c-1a4f-3f7b-0f84b6d42711@redhat.com> References: <1be2b887-d21c-1a4f-3f7b-0f84b6d42711@redhat.com> Message-ID: On 12/07/2017 05:32 PM, Roman Kennke wrote: > Applies to aarch64-port/jdk8u only: > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8193133 > > Webrev: > http://cr.openjdk.java.net/~rkennke/aarch64-jdk8u-movoops/webrev.01/ > > Pushed in: > http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/97df73117bbe > > I want to cherry-pick it into shenandoah/jdk8u. Ok? Yes, go. -Aleksey From shade at redhat.com Thu Dec 7 16:36:56 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 7 Dec 2017 17:36:56 +0100 Subject: RFR: Shenandoah string deduplication In-Reply-To: <38ab37ea-69cc-fb53-9f5e-93990ff752c7@redhat.com> References: <98c59eb5-3052-44a0-44ae-a6899ab89f70@redhat.com> <7f680d35-6b13-ab3f-a4e9-8b129020d8ca@redhat.com> <38ab37ea-69cc-fb53-9f5e-93990ff752c7@redhat.com> Message-ID: On 12/07/2017 03:06 PM, Zhengyu Gu wrote: > Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/sh_strdedup/webrev.01/ This looks good. Nits: *) Weird template overload here, should be only the first one? 48 template 49 void work(T *p); 50 51 template 52 void work(T *p); *) I'd much rather prefer not to introduce intermediate ShenandoahMarkRefsMetadataClosureImpl and ShenandoahMarkResolveRefsClosureImpl, and instead make two additional copies that handle dedup: {ShenandoahMarkRefsMetadataClosure, ShenandoahMarkRefsMetadataDedupClosure, ShenandoahMarkResolveRefsClosure, ShenandoahMarkResolveRefsDedupClosure}. That would duplicate some code, but it would provide better textual structure. Thanks, -Aleksey From roman at kennke.org Thu Dec 7 16:41:40 2017 From: roman at kennke.org (roman at kennke.org) Date: Thu, 07 Dec 2017 16:41:40 +0000 Subject: hg: shenandoah/jdk8u/hotspot: 2 new changesets Message-ID: <201712071641.vB7GfetL021351@aojmv0008.oracle.com> Changeset: 97df73117bbe Author: rkennke Date: 2017-12-07 17:23 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/97df73117bbe 8193133: Assertion failure because 0xDEADDEAD can be in-heap Reviewed-by: aph, adinn ! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp Changeset: 7f3509e44acc Author: rkennke Date: 2017-12-07 17:36 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/7f3509e44acc Merge ! src/cpu/aarch64/vm/sharedRuntime_aarch64.cpp From shade at redhat.com Thu Dec 7 17:17:04 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 7 Dec 2017 18:17:04 +0100 Subject: RFR: [8u] Bulk backports to sh/jdk8u Message-ID: <41464c1a-cc91-4f39-2db0-01f4587189eb@redhat.com> http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk8u-20171207/webrev.01/ This backports current work to make AArch64 perform well. 
[backport] Check BS type in immByteMapBase predicate [backport] Optimize oop/fwdptr/hr_index verification a bit [backport] Optimize fwdptr region handling in ShenandoahVerifyOopClosure::verify_oop [backport] SieveObjects test is too hostile to verification Testing: hotspot_gc_shenandoah {fastdebug|release} Thanks, -Aleksey From shade at redhat.com Thu Dec 7 17:16:58 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 7 Dec 2017 18:16:58 +0100 Subject: RFR: [9] Bulk backports to sh/jdk9 Message-ID: http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk9-20171207/webrev.01/ This backports current work to make AArch64 perform well. [backport] Check BS type in immByteMapBase predicate [backport] Optimize oop/fwdptr/hr_index verification a bit [backport] Optimize fwdptr region handling in ShenandoahVerifyOopClosure::verify_oop [backport] SieveObjects test is too hostile to verification Testing: hotspot_gc_shenandoah {fastdebug|release} Thanks, -Aleksey From rkennke at redhat.com Thu Dec 7 17:28:54 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 7 Dec 2017 18:28:54 +0100 Subject: RFR: [9] Bulk backports to sh/jdk9 In-Reply-To: References: Message-ID: <16207499-63eb-c9a1-39b7-fa07470b5672@redhat.com> Am 07.12.2017 um 18:16 schrieb Aleksey Shipilev: > http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk9-20171207/webrev.01/ > > This backports current work to make AArch64 perform well. > > [backport] Check BS type in immByteMapBase predicate > [backport] Optimize oop/fwdptr/hr_index verification a bit > [backport] Optimize fwdptr region handling in ShenandoahVerifyOopClosure::verify_oop > [backport] SieveObjects test is too hostile to verification > > Testing: hotspot_gc_shenandoah {fastdebug|release} > > Thanks, > -Aleksey > Patches look good. I built it on aarch64. All green. Roman From rkennke at redhat.com Thu Dec 7 17:39:02 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 7 Dec 2017 18:39:02 +0100 Subject: RFR: [8u] Bulk backports to sh/jdk8u In-Reply-To: <41464c1a-cc91-4f39-2db0-01f4587189eb@redhat.com> References: <41464c1a-cc91-4f39-2db0-01f4587189eb@redhat.com> Message-ID: <73154ac0-cc0d-d1a9-be13-51ea7194f039@redhat.com> Am 07.12.2017 um 18:17 schrieb Aleksey Shipilev: > http://cr.openjdk.java.net/~shade/shenandoah/backports/jdk8u-20171207/webrev.01/ > > This backports current work to make AArch64 perform well. > > [backport] Check BS type in immByteMapBase predicate > [backport] Optimize oop/fwdptr/hr_index verification a bit > [backport] Optimize fwdptr region handling in ShenandoahVerifyOopClosure::verify_oop > [backport] SieveObjects test is too hostile to verification > > Testing: hotspot_gc_shenandoah {fastdebug|release} > > Thanks, > -Aleksey > Patches look good. I built it on aarch64 and looks good too. Go! Roman From ashipile at redhat.com Thu Dec 7 17:46:09 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Thu, 07 Dec 2017 17:46:09 +0000 Subject: hg: shenandoah/jdk9/hotspot: 4 new changesets Message-ID: <201712071746.vB7Hk98H018727@aojmv0008.oracle.com> Changeset: 5fca18fb4a42 Author: rkennke Date: 2017-12-05 12:37 +0000 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/5fca18fb4a42 [backport] Check BS type in immByteMapBase predicate ! src/cpu/aarch64/vm/aarch64.ad Changeset: b56c807ecff1 Author: shade Date: 2017-12-05 16:38 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/b56c807ecff1 [backport] Optimize oop/fwdptr/hr_index verification a bit ! 
src/share/vm/gc/shenandoah/brooksPointer.inline.hpp ! src/share/vm/gc/shenandoah/shenandoahHeap.inline.hpp ! src/share/vm/gc/shenandoah/shenandoahVerifier.cpp ! src/share/vm/gc/shenandoah/shenandoahVerifier.hpp Changeset: f85c74370956 Author: shade Date: 2017-12-05 16:59 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/f85c74370956 [backport] Optimize fwdptr region handling in ShenandoahVerifyOopClosure::verify_oop ! src/share/vm/gc/shenandoah/shenandoahVerifier.cpp Changeset: 83a69ba46054 Author: shade Date: 2017-12-05 17:31 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/83a69ba46054 [backport] SieveObjects test is too hostile to verification ! test/gc/shenandoah/acceptance/SieveObjects.java From ashipile at redhat.com Thu Dec 7 17:46:46 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Thu, 07 Dec 2017 17:46:46 +0000 Subject: hg: shenandoah/jdk8u/hotspot: 4 new changesets Message-ID: <201712071746.vB7Hkkun019090@aojmv0008.oracle.com> Changeset: 544155158bd5 Author: rkennke Date: 2017-12-05 12:37 +0000 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/544155158bd5 [backport] Check BS type in immByteMapBase predicate ! src/cpu/aarch64/vm/aarch64.ad Changeset: 57431ac7c030 Author: shade Date: 2017-12-05 16:38 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/57431ac7c030 [backport] Optimize oop/fwdptr/hr_index verification a bit ! src/share/vm/gc_implementation/shenandoah/brooksPointer.inline.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahHeap.inline.hpp ! src/share/vm/gc_implementation/shenandoah/shenandoahVerifier.cpp ! src/share/vm/gc_implementation/shenandoah/shenandoahVerifier.hpp Changeset: 5a04129c5363 Author: shade Date: 2017-12-05 16:59 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/5a04129c5363 [backport] Optimize fwdptr region handling in ShenandoahVerifyOopClosure::verify_oop ! src/share/vm/gc_implementation/shenandoah/shenandoahVerifier.cpp Changeset: 716da20aa2d6 Author: shade Date: 2017-12-05 17:31 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/716da20aa2d6 [backport] SieveObjects test is too hostile to verification ! test/gc/shenandoah/acceptance/SieveObjects.java From rkennke at redhat.com Thu Dec 7 18:36:24 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 7 Dec 2017 19:36:24 +0100 Subject: RFR: Increase test timeouts Message-ID: <087b10cf-9cf9-bee6-6104-5f37dddc0f4b@redhat.com> I had to increase timeouts for EvilSyncBug and TestHeapDump to make them pass on aph's aarch64 box, which is not fast by itself, but has many cores which means jtreg test runner spawns many processes, which are probably slowing each other down. http://cr.openjdk.java.net/~rkennke/testtimeouts/webrev.00/ I've chosen the values by experimenting and increasing them in minute-increments until all tests passed. Ok? Roman From zgu at redhat.com Thu Dec 7 19:11:09 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 7 Dec 2017 14:11:09 -0500 Subject: RFR: Increase test timeouts In-Reply-To: <087b10cf-9cf9-bee6-6104-5f37dddc0f4b@redhat.com> References: <087b10cf-9cf9-bee6-6104-5f37dddc0f4b@redhat.com> Message-ID: <1aa90601-ec08-fd5c-ce9d-347ff9757551@redhat.com> Okay. -Zhengyu On 12/07/2017 01:36 PM, Roman Kennke wrote: > I had to increase timeouts for EvilSyncBug and TestHeapDump to make them > pass on aph's aarch64 box, which is not fast by itself, but has many > cores which means jtreg test runner spawns many processes, which are > probably slowing each other down. 
> > http://cr.openjdk.java.net/~rkennke/testtimeouts/webrev.00/ > > I've chosen the values by experimenting and increasing them in > minute-increments until all tests passed. > > Ok? > Roman From shade at redhat.com Thu Dec 7 19:18:52 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 7 Dec 2017 20:18:52 +0100 Subject: RFR: Increase test timeouts In-Reply-To: <087b10cf-9cf9-bee6-6104-5f37dddc0f4b@redhat.com> References: <087b10cf-9cf9-bee6-6104-5f37dddc0f4b@redhat.com> Message-ID: <20cabb2e-f2ad-5fa8-c96c-4971585ba5b4@redhat.com> On 12/07/2017 07:36 PM, Roman Kennke wrote: > I had to increase timeouts for EvilSyncBug and TestHeapDump to make them pass on aph's aarch64 box, > which is not fast by itself, but has many cores which means jtreg test runner spawns many processes, > which are probably slowing each other down. > > http://cr.openjdk.java.net/~rkennke/testtimeouts/webrev.00/ Okay! -Aleksey From roman at kennke.org Thu Dec 7 19:34:24 2017 From: roman at kennke.org (roman at kennke.org) Date: Thu, 07 Dec 2017 19:34:24 +0000 Subject: hg: shenandoah/jdk10: Increase test timeouts Message-ID: <201712071934.vB7JYOI8003778@aojmv0008.oracle.com> Changeset: 317e2201fbc4 Author: rkennke Date: 2017-12-07 19:30 +0000 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/317e2201fbc4 Increase test timeouts ! test/hotspot/jtreg/gc/shenandoah/EvilSyncBug.java ! test/hotspot/jtreg/gc/shenandoah/jvmti/TestHeapDump.java From zgu at redhat.com Thu Dec 7 19:34:25 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 7 Dec 2017 14:34:25 -0500 Subject: RFR: Shenandoah string deduplication In-Reply-To: References: <98c59eb5-3052-44a0-44ae-a6899ab89f70@redhat.com> <7f680d35-6b13-ab3f-a4e9-8b129020d8ca@redhat.com> <38ab37ea-69cc-fb53-9f5e-93990ff752c7@redhat.com> Message-ID: <5a2ab1e0-fd82-aee2-77df-4bbaf8e77fed@redhat.com> On 12/07/2017 11:36 AM, Aleksey Shipilev wrote: > On 12/07/2017 03:06 PM, Zhengyu Gu wrote: >> Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/sh_strdedup/webrev.01/ > > This looks good. > > Nits: > > *) Weird template overload here, should be only the first one? > > 48 template > 49 void work(T *p); > 50 > 51 template > 52 void work(T *p); > > *) I'd much rather prefer not to introduce intermediate ShenandoahMarkRefsMetadataClosureImpl and > ShenandoahMarkResolveRefsClosureImpl, and instead make two additional copies that handle dedup: > {ShenandoahMarkRefsMetadataClosure, ShenandoahMarkRefsMetadataDedupClosure, > ShenandoahMarkResolveRefsClosure, ShenandoahMarkResolveRefsDedupClosure}. That would duplicate some > code, but it would provide better textual structure. Changed accordingly. Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/sh_strdedup/webrev.02/ Reran hotspot_gc_shenandoah tests (fastdebug + release) Thanks, -Zhengyu > > Thanks, > -Aleksey > From shade at redhat.com Thu Dec 7 19:41:17 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 7 Dec 2017 20:41:17 +0100 Subject: RFR: Shenandoah string deduplication In-Reply-To: <5a2ab1e0-fd82-aee2-77df-4bbaf8e77fed@redhat.com> References: <98c59eb5-3052-44a0-44ae-a6899ab89f70@redhat.com> <7f680d35-6b13-ab3f-a4e9-8b129020d8ca@redhat.com> <38ab37ea-69cc-fb53-9f5e-93990ff752c7@redhat.com> <5a2ab1e0-fd82-aee2-77df-4bbaf8e77fed@redhat.com> Message-ID: <7956e30c-bb9b-a669-6ce8-7a274c7b6770@redhat.com> On 12/07/2017 08:34 PM, Zhengyu Gu wrote: > Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/sh_strdedup/webrev.02/ Looks good, thanks! 
Please run SPECjvm with fastdebug and aggressive options [1] to make sure we don't have trivial bugs. If it passes, we are all good. -Aleksey [1] -foe true -f 1 -wi 5 -i 5 -t 1 -w 300ms -r 300ms --jvmArgs "-Xmx1g -Xms1g -XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=aggressive -XX:+VerifyStrictOopOperations -XX:+ShenandoahStoreCheck -XX:+ShenandoahVerifyOptoBarriers -XX:+ShenandoahOOMDuringEvacALot" From zgu at redhat.com Thu Dec 7 20:43:55 2017 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 7 Dec 2017 15:43:55 -0500 Subject: RFR: Shenandoah string deduplication In-Reply-To: <7956e30c-bb9b-a669-6ce8-7a274c7b6770@redhat.com> References: <98c59eb5-3052-44a0-44ae-a6899ab89f70@redhat.com> <7f680d35-6b13-ab3f-a4e9-8b129020d8ca@redhat.com> <38ab37ea-69cc-fb53-9f5e-93990ff752c7@redhat.com> <5a2ab1e0-fd82-aee2-77df-4bbaf8e77fed@redhat.com> <7956e30c-bb9b-a669-6ce8-7a274c7b6770@redhat.com> Message-ID: On 12/07/2017 02:41 PM, Aleksey Shipilev wrote: > On 12/07/2017 08:34 PM, Zhengyu Gu wrote: >> Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/sh_strdedup/webrev.02/ > > Looks good, thanks! > > Please run SPECjvm with fastdebug and aggressive options [1] to make sure we don't have trivial > bugs. If it passes, we are all good. > > -Aleksey > > [1] -foe true -f 1 -wi 5 -i 5 -t 1 -w 300ms -r 300ms --jvmArgs "-Xmx1g -Xms1g -XX:+UseShenandoahGC > -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=aggressive -XX:+VerifyStrictOopOperations > -XX:+ShenandoahStoreCheck -XX:+ShenandoahVerifyOptoBarriers -XX:+ShenandoahOOMDuringEvacALot" All clean! Can I push now? Thanks, -Zhengyu > > From shade at redhat.com Thu Dec 7 20:47:12 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 7 Dec 2017 21:47:12 +0100 Subject: RFR: Shenandoah string deduplication In-Reply-To: References: <98c59eb5-3052-44a0-44ae-a6899ab89f70@redhat.com> <7f680d35-6b13-ab3f-a4e9-8b129020d8ca@redhat.com> <38ab37ea-69cc-fb53-9f5e-93990ff752c7@redhat.com> <5a2ab1e0-fd82-aee2-77df-4bbaf8e77fed@redhat.com> <7956e30c-bb9b-a669-6ce8-7a274c7b6770@redhat.com> Message-ID: <52fc2b8d-f599-914b-3605-1e9c93489841@redhat.com> On 12/07/2017 09:43 PM, Zhengyu Gu wrote: > > > On 12/07/2017 02:41 PM, Aleksey Shipilev wrote: >> On 12/07/2017 08:34 PM, Zhengyu Gu wrote: >>> Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/sh_strdedup/webrev.02/ >> >> Looks good, thanks! >> >> Please run SPECjvm with fastdebug and aggressive options [1] to make sure we don't have trivial >> bugs. If it passes, we are all good. >> >> -Aleksey >> >> [1] -foe true -f 1 -wi 5 -i 5 -t 1 -w 300ms -r 300ms --jvmArgs "-Xmx1g -Xms1g -XX:+UseShenandoahGC >> -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=aggressive -XX:+VerifyStrictOopOperations >> -XX:+ShenandoahStoreCheck -XX:+ShenandoahVerifyOptoBarriers -XX:+ShenandoahOOMDuringEvacALot" > > All clean! Can I push now? Yes, I think so. -Aleksey From zgu at redhat.com Thu Dec 7 21:22:15 2017 From: zgu at redhat.com (zgu at redhat.com) Date: Thu, 07 Dec 2017 21:22:15 +0000 Subject: hg: shenandoah/jdk10: Shenandoah string deduplication support Message-ID: <201712072122.vB7LMFWK019173@aojmv0008.oracle.com> Changeset: 870847e12029 Author: zgu Date: 2017-12-07 16:18 -0500 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/870847e12029 Shenandoah string deduplication support ! src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp ! src/hotspot/cpu/x86/stubGenerator_x86_64.cpp ! src/hotspot/share/classfile/stringTable.cpp ! src/hotspot/share/gc/g1/g1StringDedup.hpp ! 
src/hotspot/share/gc/g1/g1StringDedupQueue.cpp ! src/hotspot/share/gc/g1/g1StringDedupQueue.hpp ! src/hotspot/share/gc/g1/g1StringDedupTable.cpp ! src/hotspot/share/gc/g1/g1StringDedupThread.cpp ! src/hotspot/share/gc/g1/g1StringDedupThread.hpp ! src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp ! src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.hpp ! src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.inline.hpp ! src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp ! src/hotspot/share/gc/shenandoah/shenandoahHeap.inline.hpp ! src/hotspot/share/gc/shenandoah/shenandoahMarkCompact.cpp ! src/hotspot/share/gc/shenandoah/shenandoahOopClosures.hpp ! src/hotspot/share/gc/shenandoah/shenandoahOopClosures.inline.hpp ! src/hotspot/share/gc/shenandoah/shenandoahPartialGC.cpp ! src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.cpp ! src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.hpp ! src/hotspot/share/gc/shenandoah/shenandoahRootProcessor.cpp + src/hotspot/share/gc/shenandoah/shenandoahStrDedupQueue.cpp + src/hotspot/share/gc/shenandoah/shenandoahStrDedupQueue.hpp + src/hotspot/share/gc/shenandoah/shenandoahStrDedupQueue.inline.hpp + src/hotspot/share/gc/shenandoah/shenandoahStrDedupTable.cpp + src/hotspot/share/gc/shenandoah/shenandoahStrDedupTable.hpp + src/hotspot/share/gc/shenandoah/shenandoahStrDedupThread.cpp + src/hotspot/share/gc/shenandoah/shenandoahStrDedupThread.hpp ! src/hotspot/share/gc/shenandoah/shenandoahStringDedup.cpp ! src/hotspot/share/gc/shenandoah/shenandoahStringDedup.hpp ! src/hotspot/share/runtime/arguments.cpp ! src/hotspot/share/runtime/mutexLocker.cpp ! test/hotspot/jtreg/gc/shenandoah/ShenandoahStrDedupStress.java ! test/hotspot/jtreg/gc/shenandoah/TestShenandoahStrDedup.java

From rkennke at redhat.com Fri Dec 8 16:07:17 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 8 Dec 2017 17:07:17 +0100
Subject: RFR: Don't assert Shenandoah safepoint in verifier
Message-ID: 

During testing with jdk9 and jdk8, I hit the assert for Shenandoah safepoint in the verifier at least once, presumably during verify-before-exit. This change relaxes back to asserting any safepoint in verifier. Ok?

Testing: hotspot_gc_shenandoah

http://cr.openjdk.java.net/~rkennke/no-sh-sp/webrev.00/

Roman

From shade at redhat.com Sat Dec 9 00:07:44 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Sat, 9 Dec 2017 01:07:44 +0100
Subject: RFR: Don't assert Shenandoah safepoint in verifier
In-Reply-To: References: Message-ID: 

On 12/08/2017 05:07 PM, Roman Kennke wrote:
> During testing with jdk9 and jdk8, I hit the assert for Shenandoah safepoint in the verifier at
> least once, presumably during verify-before-exit. This change relaxes back to asserting any safepoint
> in verifier. Ok?
>
> Testing: hotspot_gc_shenandoah
>
> http://cr.openjdk.java.net/~rkennke/no-sh-sp/webrev.00/

Not really. We should instead remove "|| !UseTLAB" in ShenandoahHeap::verify_generic, if that is the path we are going in.

-Aleksey

From shade at redhat.com Mon Dec 11 11:42:00 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 11 Dec 2017 12:42:00 +0100
Subject: Shenandoah WB and XMM spills?
Message-ID: <9540614c-2d1d-6098-72c5-b345dc9c2635@redhat.com>

Following up on some performance experiments, it seems that enabling Shenandoah WB disables XMM spills, and that makes more L1 load/stores, which seems to be responsible for the performance difference.
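The effect should show up on any kernel that keeps FP state live across reference stores. A minimal JMH-style sketch of that shape (hypothetical code, NOT the actual Serial benchmark; all names below are made up for illustration):

import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
public class WBSpillShape {
    // Hypothetical workload: several doubles stay live across a reference
    // store. If the write barrier call is modeled as killing all FP
    // registers, the allocator has to spill them around every iteration.
    static final class Holder { Object ref; }

    Holder[] holders = new Holder[1024];
    Object payload = new Object();
    double s0 = 1, s1 = 2, s2 = 3, s3 = 4;

    @Setup
    public void setup() {
        for (int i = 0; i < holders.length; i++) holders[i] = new Holder();
    }

    @Benchmark
    public double kernel() {
        double a = s0, b = s1, c = s2, d = s3;
        for (int i = 0; i < holders.length; i++) {
            a = a * 1.0001 + b;
            b = b * 0.9999 + c;
            c = c * 1.0002 + d;
            d = d * 0.9998 + a;
            holders[i].ref = payload;  // reference store, hits the Shenandoah WB
        }
        return a + b + c + d;
    }
}

Here is what I measured on the actual benchmark: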
$ java -jar target/benchmarks.jar Serial --jvmArgs "-Xmx16g -Xms16g -XX:+AlwaysPreTouch -XX:-TieredCompilation -XX:+DisableExplicitGC -XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=passive" -f 1 -t 1 -prof perfasm

# -XX:-ShenandoahWriteBarrier
2484.287 ± 2.759 ops/s
http://cr.openjdk.java.net/~shade/shenandoah/wtf-wb-xmm/wb-disabled.perfasm
(I see %xmm-based spills here)

# -XX:+ShenandoahWriteBarrier
2303.283 ± 4.912 ops/s
http://cr.openjdk.java.net/~shade/shenandoah/wtf-wb-xmm/wb-enabled.perfasm
(I see %rsp-based spills here)

Thanks,
-Aleksey

From shade at redhat.com Mon Dec 11 17:12:41 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 11 Dec 2017 18:12:41 +0100
Subject: RFR: Report fwdptr size in JNI GetObjectSize
Message-ID: <170956e5-48c4-4c73-e053-0f946f7d9ca0@redhat.com>

http://cr.openjdk.java.net/~shade/shenandoah/objsize-fwdptr/webrev.01/

To make cross-GC comparisons more convenient, it makes sense to let Shenandoah report JNI GetObjectSize added up with the forwarding pointer. This seems innocuous, because it is specified to return an implementation-specific value ("This size is an implementation-specific approximation of the amount of storage consumed by this object"), and it does not touch the "actual" oopDesc::size() which is used everywhere else in critical VM code.

Testing: hotspot_gc_shenandoah, eyeballing JOL output for different GCs

# -XX:+UseParallelGC

java.lang.String object internals:
 OFFSET  SIZE    TYPE DESCRIPTION                    VALUE
      0     4         (object header)                05 00 00 00
      4     4         (object header)                00 00 00 00
      8     4         (object header)                18 16 00 00
     12     4  byte[] String.value                   []
     16     4     int String.hash                    0
     20     1    byte String.coder                   0
     21     3         (loss due to the next object alignment)
Instance size: 24 bytes
Space losses: 0 bytes internal + 3 bytes external = 3 bytes total

# -XX:+UseShenandoahGC

java.lang.String object internals:
 OFFSET  SIZE    TYPE DESCRIPTION                    VALUE
      0     4         (object header)                05 00 00 00
      4     4         (object header)                00 00 00 00
      8     4         (object header)                18 16 00 00
     12     4  byte[] String.value                   []
     16     4     int String.hash                    0
     20     1    byte String.coder                   0
     21    11         (loss due to the next object alignment)
Instance size: 32 bytes
Space losses: 0 bytes internal + 11 bytes external = 11 bytes total

Thanks,
-Aleksey

From rkennke at redhat.com Mon Dec 11 17:14:41 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Mon, 11 Dec 2017 18:14:41 +0100
Subject: RFR: Report fwdptr size in JNI GetObjectSize
In-Reply-To: <170956e5-48c4-4c73-e053-0f946f7d9ca0@redhat.com>
References: <170956e5-48c4-4c73-e053-0f946f7d9ca0@redhat.com>
Message-ID: <1c9dd78e-b4e1-bd4a-20f7-4765642247fa@redhat.com>

Am 11.12.2017 um 18:12 schrieb Aleksey Shipilev:
> http://cr.openjdk.java.net/~shade/shenandoah/objsize-fwdptr/webrev.01/
>
> To make cross-GC comparisons more convenient, it makes sense to let Shenandoah report JNI
> GetObjectSize added up with the forwarding pointer. This seems innocuous, because it is specified to
> return an implementation-specific value ("This size is an implementation-specific approximation of the
> amount of storage consumed by this object"), and it does not touch the "actual" oopDesc::size() which is
> used everywhere else in critical VM code.
>
> Testing: hotspot_gc_shenandoah, eyeballing JOL output for different GCs
>
> # -XX:+UseParallelGC
>
> java.lang.String object internals:
>  OFFSET  SIZE    TYPE DESCRIPTION                    VALUE
>       0     4         (object header)                05 00 00 00
>       4     4         (object header)                00 00 00 00
>       8     4         (object header)                18 16 00 00
>      12     4  byte[] String.value                   []
>      16     4     int String.hash                    0
>      20     1    byte String.coder                   0
>      21     3         (loss due to the next object alignment)
> Instance size: 24 bytes
> Space losses: 0 bytes internal + 3 bytes external = 3 bytes total
>
> # -XX:+UseShenandoahGC
>
> java.lang.String object internals:
>  OFFSET  SIZE    TYPE DESCRIPTION                    VALUE
>       0     4         (object header)                05 00 00 00
>       4     4         (object header)                00 00 00 00
>       8     4         (object header)                18 16 00 00
>      12     4  byte[] String.value                   []
>      16     4     int String.hash                    0
>      20     1    byte String.coder                   0
>      21    11         (loss due to the next object alignment)
> Instance size: 32 bytes
> Space losses: 0 bytes internal + 11 bytes external = 11 bytes total
>
>
> Thanks,
> -Aleksey
>

Yes that sounds reasonable.

Roman

From ashipile at redhat.com Tue Dec 12 11:22:48 2017
From: ashipile at redhat.com (ashipile at redhat.com)
Date: Tue, 12 Dec 2017 11:22:48 +0000
Subject: hg: shenandoah/jdk10: Report fwdptr size in JNI GetObjectSize
Message-ID: <201712121122.vBCBMmfw020126@aojmv0008.oracle.com>

Changeset: 81b6c6bd635b Author: shade Date: 2017-12-12 11:51 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/81b6c6bd635b Report fwdptr size in JNI GetObjectSize ! src/hotspot/share/prims/jvmtiEnv.cpp ! src/hotspot/share/prims/whitebox.cpp

From rkennke at redhat.com Tue Dec 12 14:26:22 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 12 Dec 2017 15:26:22 +0100
Subject: RFR: Don't assert Shenandoah safepoint in verifier
In-Reply-To: References: Message-ID: 

Am 09.12.2017 um 01:07 schrieb Aleksey Shipilev:
> On 12/08/2017 05:07 PM, Roman Kennke wrote:
>> During testing with jdk9 and jdk8, I hit the assert for Shenandoah safepoint in the verifier at
>> least once, presumably during verify-before-exit. This change relaxes back to asserting any safepoint
>> in verifier. Ok?
>>
>> Testing: hotspot_gc_shenandoah
>>
>> http://cr.openjdk.java.net/~rkennke/no-sh-sp/webrev.00/
>
> Not really. We should instead remove "|| !UseTLAB" in ShenandoahHeap::verify_generic, if that is the
> path we are going in.
>
> -Aleksey
>
>

Right. Like this:

http://cr.openjdk.java.net/~rkennke/no-sh-sp/webrev.01/

Still passes tests.

Roman

From shade at redhat.com Tue Dec 12 14:26:58 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 12 Dec 2017 15:26:58 +0100
Subject: RFR: Don't assert Shenandoah safepoint in verifier
In-Reply-To: References: Message-ID: 

On 12/12/2017 03:26 PM, Roman Kennke wrote:
> Am 09.12.2017 um 01:07 schrieb Aleksey Shipilev:
>> On 12/08/2017 05:07 PM, Roman Kennke wrote:
>>> During testing with jdk9 and jdk8, I hit the assert for Shenandoah safepoint in the verifier at
>>> least once, presumably during verify-before-exit. This change relaxes back to asserting any safepoint
>>> in verifier. Ok?
>>>
>>> Testing: hotspot_gc_shenandoah
>>>
>>> http://cr.openjdk.java.net/~rkennke/no-sh-sp/webrev.00/
>>
>> Not really. We should instead remove "|| !UseTLAB" in ShenandoahHeap::verify_generic, if that is the
>> path we are going in.
>>
>> -Aleksey
>>
>>
>
> Right. Like this:
>
> http://cr.openjdk.java.net/~rkennke/no-sh-sp/webrev.01/

Exactly. My fault for not doing that in my last changeset.
-Aleksey From rwestrel at redhat.com Tue Dec 12 15:00:15 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 12 Dec 2017 16:00:15 +0100 Subject: Shenandoah WB and XMM spills? In-Reply-To: <9540614c-2d1d-6098-72c5-b345dc9c2635@redhat.com> References: <9540614c-2d1d-6098-72c5-b345dc9c2635@redhat.com> Message-ID: Can you try the attached patch? Roland. -------------- next part -------------- A non-text attachment was scrubbed... Name: xmm.patch Type: text/x-patch Size: 1705 bytes Desc: not available URL: From shade at redhat.com Tue Dec 12 15:38:46 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 12 Dec 2017 16:38:46 +0100 Subject: Shenandoah WB and XMM spills? In-Reply-To: References: <9540614c-2d1d-6098-72c5-b345dc9c2635@redhat.com> Message-ID: On 12/12/2017 04:00 PM, Roland Westrelin wrote: > Can you try the attached patch? Thank you, that works! Serial: XMM spills are now there. L1-dcache-stores with +WB are now the same as with -WB (-15% reduction) L1-dcache-loads are also reduced around -15% 1% throughput improvement. -Aleksey From rwestrel at redhat.com Tue Dec 12 15:59:05 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 12 Dec 2017 16:59:05 +0100 Subject: Shenandoah WB and XMM spills? In-Reply-To: References: <9540614c-2d1d-6098-72c5-b345dc9c2635@redhat.com> Message-ID: How safe do we think that change really is? The C compiler or libc could use fp registers under the hood for bulk copies for instance. Roland. From rkennke at redhat.com Tue Dec 12 16:01:25 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 12 Dec 2017 17:01:25 +0100 Subject: Shenandoah WB and XMM spills? In-Reply-To: References: <9540614c-2d1d-6098-72c5-b345dc9c2635@redhat.com> Message-ID: Am 12.12.2017 um 16:59 schrieb Roland Westrelin: > > How safe do we think that change really is? The C compiler or libc could > use fp registers under the hood for bulk copies for instance. > > Roland. > And it does. The thing is that we first call into our WB stub, which we know does *not* use FP regs. Only if that one fails, we call into the runtime, which uses memcpy, which will use FP regs. However, the stub also saves/restores FP regs before doing that. Should be safe, no? Roman From rwestrel at redhat.com Tue Dec 12 16:07:57 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 12 Dec 2017 17:07:57 +0100 Subject: Shenandoah WB and XMM spills? In-Reply-To: References: <9540614c-2d1d-6098-72c5-b345dc9c2635@redhat.com> Message-ID: > The thing is that we first call into our WB stub, which we know does > *not* use FP regs. Only if that one fails, we call into the runtime, > which uses memcpy, which will use FP regs. However, the stub also > saves/restores FP regs before doing that. Should be safe, no? Right. Roland. From roman at kennke.org Tue Dec 12 21:30:48 2017 From: roman at kennke.org (roman at kennke.org) Date: Tue, 12 Dec 2017 21:30:48 +0000 Subject: hg: shenandoah/jdk10: Disable verification from non-Shenandoah VMOps. Message-ID: <201712122130.vBCLUmPg012060@aojmv0008.oracle.com> Changeset: 30d6eb7c2df9 Author: rkennke Date: 2017-12-12 22:26 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/30d6eb7c2df9 Disable verification from non-Shenandoah VMOps. ! 
src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp From shade at redhat.com Wed Dec 13 18:10:09 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 13 Dec 2017 19:10:09 +0100 Subject: RFR: [9] Remove .jcheck from hotspot Message-ID: <4822cf72-b3be-ea5b-763f-1598a2b74613@redhat.com> Remove .jcheck from hotspot repo, otherwise this prevents checking out sh/jdk9 when jcheck hook is enabled. $ hg diff diff -r 83a69ba46054 .jcheck/conf --- a/.jcheck/conf Tue Dec 05 17:31:55 2017 +0100 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,2 +0,0 @@ -project=jdk9 -bugids=dup sh/jdk8 got that from aarch64/jdk8u seed: http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/4c3f7e682e48 http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/4c3f7e682e48 sh/jdk10 would stay with jcheck for a time being Thanks, -Aleksey From shade at redhat.com Wed Dec 13 20:22:03 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 13 Dec 2017 21:22:03 +0100 Subject: RFR: Cleanup reset_{next|complete}_mark_bitmap Message-ID: <5d6e409e-7bfe-dcc3-dc9c-647a20283236@redhat.com> http://cr.openjdk.java.net/~shade/shenandoah/cleanups-7/webrev.01/ Trivial cleanup: these methods are always called with workers(), we might as well inline. Testing: hotspot_fast_gc_shenandoah Thanks, -Aleksey From rkennke at redhat.com Wed Dec 13 20:48:41 2017 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 13 Dec 2017 21:48:41 +0100 Subject: RFR: Cleanup reset_{next|complete}_mark_bitmap In-Reply-To: <5d6e409e-7bfe-dcc3-dc9c-647a20283236@redhat.com> References: <5d6e409e-7bfe-dcc3-dc9c-647a20283236@redhat.com> Message-ID: Am 13.12.2017 um 21:22 schrieb Aleksey Shipilev: > http://cr.openjdk.java.net/~shade/shenandoah/cleanups-7/webrev.01/ > > Trivial cleanup: these methods are always called with workers(), we might as well inline. > > Testing: hotspot_fast_gc_shenandoah > > Thanks, > -Aleksey > Yes! From rkennke at redhat.com Wed Dec 13 21:33:45 2017 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 13 Dec 2017 22:33:45 +0100 Subject: RFR: [9] Remove .jcheck from hotspot In-Reply-To: <4822cf72-b3be-ea5b-763f-1598a2b74613@redhat.com> References: <4822cf72-b3be-ea5b-763f-1598a2b74613@redhat.com> Message-ID: Am 13.12.2017 um 19:10 schrieb Aleksey Shipilev: > Remove .jcheck from hotspot repo, otherwise this prevents checking out sh/jdk9 when jcheck hook is > enabled. > > $ hg diff > diff -r 83a69ba46054 .jcheck/conf > --- a/.jcheck/conf Tue Dec 05 17:31:55 2017 +0100 > +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 > @@ -1,2 +0,0 @@ > -project=jdk9 > -bugids=dup > > > sh/jdk8 got that from aarch64/jdk8u seed: > http://hg.openjdk.java.net/aarch64-port/jdk8u/hotspot/rev/4c3f7e682e48 > http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/4c3f7e682e48 > > sh/jdk10 would stay with jcheck for a time being > > Thanks, > -Aleksey > Yes sure. From ashipile at redhat.com Wed Dec 13 21:56:44 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Wed, 13 Dec 2017 21:56:44 +0000 Subject: hg: shenandoah/jdk10: Cleanup reset_{next|complete}_mark_bitmap Message-ID: <201712132156.vBDLuitb018412@aojmv0008.oracle.com> Changeset: 7ddb5f33c0a4 Author: shade Date: 2017-12-13 21:24 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/7ddb5f33c0a4 Cleanup reset_{next|complete}_mark_bitmap ! src/hotspot/share/gc/shenandoah/shenandoahConcurrentThread.cpp ! src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp ! src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp ! 
src/hotspot/share/gc/shenandoah/shenandoahMarkCompact.cpp From ashipile at redhat.com Wed Dec 13 21:57:19 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Wed, 13 Dec 2017 21:57:19 +0000 Subject: hg: shenandoah/jdk9/hotspot: Remove .jcheck from hotspot Message-ID: <201712132157.vBDLvJo8018820@aojmv0008.oracle.com> Changeset: 3aacd37b032f Author: shade Date: 2017-12-13 22:53 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/3aacd37b032f Remove .jcheck from hotspot - .jcheck/conf From shade at redhat.com Thu Dec 14 09:52:27 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 14 Dec 2017 10:52:27 +0100 Subject: RFR: Verifier should check klass pointers before attempting to reach for object size Message-ID: <3862bde7-ad8e-264c-28c1-04d7b17d4cff@redhat.com> http://cr.openjdk.java.net/~shade/shenandoah/verifier-klass-checks/webrev.01/ Debug a new feature -> corrupt the heap -> crash the Verifier. Verifier should be more resilient when Klass pointers are NULL, in which case oopDesc::size() is guaranteed to fail. Testing: hotspot_gc_shenandoah Thanks, -Aleksey From rkennke at redhat.com Thu Dec 14 10:53:39 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 14 Dec 2017 11:53:39 +0100 Subject: RFR: Verifier should check klass pointers before attempting to reach for object size In-Reply-To: <3862bde7-ad8e-264c-28c1-04d7b17d4cff@redhat.com> References: <3862bde7-ad8e-264c-28c1-04d7b17d4cff@redhat.com> Message-ID: Am 14.12.2017 um 10:52 schrieb Aleksey Shipilev: > http://cr.openjdk.java.net/~shade/shenandoah/verifier-klass-checks/webrev.01/ > > Debug a new feature -> corrupt the heap -> crash the Verifier. Verifier should be more resilient > when Klass pointers are NULL, in which case oopDesc::size() is guaranteed to fail. > > Testing: hotspot_gc_shenandoah > > Thanks, > -Aleksey > Very good. Go! From ashipile at redhat.com Thu Dec 14 11:41:46 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Thu, 14 Dec 2017 11:41:46 +0000 Subject: hg: shenandoah/jdk10: Verifier should check klass pointers before attempting to reach for object size Message-ID: <201712141141.vBEBfkjU012511@aojmv0008.oracle.com> Changeset: 90f120de8a35 Author: shade Date: 2017-12-14 12:02 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/90f120de8a35 Verifier should check klass pointers before attempting to reach for object size ! src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp From shade at redhat.com Thu Dec 14 12:23:49 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 14 Dec 2017 13:23:49 +0100 Subject: RFR: TestSelectiveBarrierFlags times out due to too aggressive compilation mode Message-ID: TestSelectiveBarrierFlags uses "-Xcomp" to make sure we actually reach compiler paths during "Hello World" invocation. Unfortunately, that takes up to 3s for run. Using a more relaxed "-Xbatch -XX:CompileThreshold=..." makes the test still reach the compilation path and improves test run time around 10x. The improvement is because we don't compile overly cold methods, and test still works, because we compile warmer ones. 
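For the record, -Xbatch (that is, -XX:-BackgroundCompilation) makes the requesting thread wait for compilation to finish, and -XX:CompileThreshold=100 drops the invocation threshold, so the warm methods still reach the compilers, just without -Xcomp's compile-everything behavior. A quick way to see which methods still get compiled (hypothetical invocation, any trivial main class would do):

$ java -Xbatch -XX:CompileThreshold=100 -XX:+PrintCompilation Hello 2>&1 | grep "Hello::"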
Patch: diff -r 90f120de8a35 -r e6ce6500a167 test/hotspot/jtreg/gc/shenandoah/TestSelectiveBarrierFlags.java --- a/test/hotspot/jtreg/gc/shenandoah/TestSelectiveBarrierFlags.java Thu Dec 14 12:02:33 2017 +0100 +++ b/test/hotspot/jtreg/gc/shenandoah/TestSelectiveBarrierFlags.java Thu Dec 14 13:16:54 2017 +0100 @@ -26,8 +26,8 @@ * of barrier flags * @library /test/lib * @run main/othervm TestSelectiveBarrierFlags -Xint - * @run main/othervm TestSelectiveBarrierFlags -Xcomp -XX:TieredStopAtLevel=1 - * @run main/othervm TestSelectiveBarrierFlags -Xcomp -XX:-TieredCompilation -XX:+IgnoreUnrecognizedVMOptions -XX:+ShenandoahVerifyOptoBarriers + * @run main/othervm TestSelectiveBarrierFlags -Xbatch -XX:CompileThreshold=100 -XX:TieredStopAtLevel=1 + * @run main/othervm TestSelectiveBarrierFlags -Xbatch -XX:CompileThreshold=100 -XX:-TieredCompilation -XX:+IgnoreUnrecognizedVMOptions -XX:+ShenandoahVerifyOptoBarriers */ import java.util.*; Testing: TestSelectiveBarrierFlags (fastdebug|release) Thanks, -Aleksey From rkennke at redhat.com Thu Dec 14 12:51:49 2017 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 14 Dec 2017 13:51:49 +0100 Subject: RFR: TestSelectiveBarrierFlags times out due to too aggressive compilation mode In-Reply-To: References: Message-ID: <28e07205-1316-3b40-94bf-a9a839806a80@redhat.com> Am 14.12.2017 um 13:23 schrieb Aleksey Shipilev: > TestSelectiveBarrierFlags uses "-Xcomp" to make sure we actually reach compiler paths during "Hello > World" invocation. Unfortunately, that takes up to 3s for run. Using a more relaxed "-Xbatch > -XX:CompileThreshold=..." makes the test still reach the compilation path and improves test run time > around 10x. The improvement is because we don't compile overly cold methods, and test still works, > because we compile warmer ones. > > Patch: > > diff -r 90f120de8a35 -r e6ce6500a167 test/hotspot/jtreg/gc/shenandoah/TestSelectiveBarrierFlags.java > --- a/test/hotspot/jtreg/gc/shenandoah/TestSelectiveBarrierFlags.java Thu Dec 14 12:02:33 2017 +0100 > +++ b/test/hotspot/jtreg/gc/shenandoah/TestSelectiveBarrierFlags.java Thu Dec 14 13:16:54 2017 +0100 > @@ -26,8 +26,8 @@ > * of barrier flags > * @library /test/lib > * @run main/othervm TestSelectiveBarrierFlags -Xint > - * @run main/othervm TestSelectiveBarrierFlags -Xcomp -XX:TieredStopAtLevel=1 > - * @run main/othervm TestSelectiveBarrierFlags -Xcomp -XX:-TieredCompilation > -XX:+IgnoreUnrecognizedVMOptions -XX:+ShenandoahVerifyOptoBarriers > + * @run main/othervm TestSelectiveBarrierFlags -Xbatch -XX:CompileThreshold=100 -XX:TieredStopAtLevel=1 > + * @run main/othervm TestSelectiveBarrierFlags -Xbatch -XX:CompileThreshold=100 > -XX:-TieredCompilation -XX:+IgnoreUnrecognizedVMOptions -XX:+ShenandoahVerifyOptoBarriers > */ > > import java.util.*; > > > Testing: TestSelectiveBarrierFlags (fastdebug|release) > > Thanks, > -Aleksey > Yes, that solves the problem for me. Please push it. Roman From ashipile at redhat.com Thu Dec 14 12:56:55 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Thu, 14 Dec 2017 12:56:55 +0000 Subject: hg: shenandoah/jdk10: TestSelectiveBarrierFlags times out due to too aggressive compilation mode Message-ID: <201712141256.vBECut2S014598@aojmv0008.oracle.com> Changeset: ec350905c939 Author: shade Date: 2017-12-14 13:18 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/ec350905c939 TestSelectiveBarrierFlags times out due to too aggressive compilation mode ! 
test/hotspot/jtreg/gc/shenandoah/TestSelectiveBarrierFlags.java

From shade at redhat.com Thu Dec 14 16:36:00 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 14 Dec 2017 17:36:00 +0100
Subject: RFR: Rehash VMOperations and cycle driver mechanics for consistency"
Message-ID: 

http://cr.openjdk.java.net/~shade/shenandoah/operations-cleanup/webrev.01/

This is the preparation cleanup for Degenerate GC. The changes in this webrev are not about functionality, but about harmonizing the code for future changes.

Brief tour:

a) Three groups of methods are now in ShenandoahHeap: entry-points with the safepoint, entry-points without the safepoints, and private group that does the actual operations.

b) vmop_entry_* do all the needed setup, including capturing the gross GC times (In future, we may report both gross and net times there, to capture these in GC logs!). These entry-points would initiate the safepoint and call into entry_* methods. VM_Shenandoah* operations are now the simple trampolines back to entry_*.

c) entry_* do all the rest of needed setup (assuming safepoint or not), including figuring out the worker counts, recording net times, and calling into op_* methods.

d) op_* is where we do the actual thing for each phase. (Spoiler alert: Degenerate GC would just call op_* methods in correct order, entering via single VMOp)

e) Minor corrections in GCMark and stats for Full GC code

f) Minor typo changes

Testing: hotspot_gc_shenandoah {fastdebug|release}

Thanks,
-Aleksey

From rwestrel at redhat.com Thu Dec 14 16:42:42 2017
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 14 Dec 2017 17:42:42 +0100
Subject: Shenandoah WB and XMM spills?
In-Reply-To: References: <9540614c-2d1d-6098-72c5-b345dc9c2635@redhat.com>
Message-ID: 

I'd like to push that patch as is. Ok?

Roland.

From shade at redhat.com Thu Dec 14 16:53:21 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 14 Dec 2017 17:53:21 +0100
Subject: Shenandoah WB and XMM spills?
In-Reply-To: References: <9540614c-2d1d-6098-72c5-b345dc9c2635@redhat.com>
Message-ID: <56e65ce1-f648-e46e-2a5f-d552c2e07cf1@redhat.com>

On 12/14/2017 05:42 PM, Roland Westrelin wrote:
> I'd like to push that patch as is. Ok?

We should protect things with UseShenandoahGC, I guess, because we would need to backport this eventually?

Also, do we want to additionally check the wb stub is only called with CallLeafNoFP? E.g.:

if (UseShenandoahGC && mcall->entry_point() == StubRoutines::shenandoah_wb_C()) {
  assert(op == Op_CallLeafNoFP, "shenandoah_wb_C should be called with Op_CallLeafNoFP");
  add_call_kills(proj, regs, save_policy, exclude_soe, true);
} else {
  add_call_kills(proj, regs, save_policy, exclude_soe, false);
}

-Aleksey

From kirill at korins.ky Thu Dec 14 17:52:11 2017
From: kirill at korins.ky (Kirill A. Korinsky)
Date: Thu, 14 Dec 2017 17:52:11 +0000
Subject: Strange bug inside jetty at shenandoah/jdk8u
Message-ID: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky>

Good day!

I've tried using Shenandoah on a real application and found strange behaviour.

The application is a usual java-http-server (jgroup, spring, jetty, etc) that listens on a port and responds to requests.

If we remove all the `-D[application settings]` from the arguments, the start command looks like:

> java -server -Duser.timezone=UTC -XX:-OmitStackTraceInFastThrow -Xmx4096 -Xms4096 -server ${GC_OPTIONS} -Djava.net.preferIPv4Stack=true -Djava.security.auth.login.config=/opt/server/conf/jaas.conf -jar /path/to/shade.jar

It runs inside a Docker container; I tried shipilev/openjdk:8-shenandoah and openjdk-8 from fedora:27, and both show the same behaviour.

When I ran tests over ab, I got a strange error inside Jetty:

> 2017-12-14 11:30:08 WARN HttpParser: - parse exception: java.lang.IndexOutOfBoundsException: 32 for HttpChannelOverHttp@779b4be{r=152,c=false,a=IDLE,uri=null}
> java.lang.IndexOutOfBoundsException: 32
> at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139) ~[?:1.8.0_151]
> at org.eclipse.jetty.http.HttpParser.parseLine(HttpParser.java:785) ~[jetty-http-9.3.22.v20171030.jar:9.3.22.v20171030]
> at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1328) ~[jetty-http-9.3.22.v20171030.jar:9.3.22.v20171030]
> at org.eclipse.jetty.server.HttpConnection.parseRequestBuffer(HttpConnection.java:351) ~[jetty-server-9.3.22.v20171030.jar:9.3.22.v20171030]
> at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:234) ~[jetty-server-9.3.22.v20171030.jar:9.3.22.v20171030]

The code near this get() is below:

> if (version!=null)
> {
>     int pos = buffer.position()+version.asString().length()-1;
>     if (pos < buffer.limit())
>     {
>         byte n=buffer.get(pos);
>         if (n==HttpTokens.CARRIAGE_RETURN)

It crashes at the line `buffer.get(pos)`.

If we check the `buffer.get(...)` code we see:

> public byte get(int i) {
>     return hb[ix(checkIndex(i))];
> }

and if we check `checkIndex()`:

> final int checkIndex(int i) { // package-private
>     if ((i < 0) || (i >= limit))
>         throw new IndexOutOfBoundsException();
>     return i;
> }

but the `IndexOutOfBoundsException` is thrown at `get()`, not at `checkIndex()` :(

Anyway, if I remove `-XX:+UseShenandoahGC` everything works well.

I can't run it at shipilev/openjdk:8-shenandoah-fastdebug because the JVM crashed without logs.

I attached the GC log that I got when running this application with the options `-XX:+UseShenandoahGC -XX:+ShenandoahVerify -XX:ShenandoahGCHeuristics=aggressive -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps`; it didn't start properly, just worked for some time and crashed without any exception or error at the application level.

--
wbr, Kirill

From shade at redhat.com Thu Dec 14 18:02:14 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 14 Dec 2017 19:02:14 +0100
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky>
References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky>
Message-ID: <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com>

On 12/14/2017 06:52 PM, Kirill A. Korinsky wrote:
> Good day!
>
> I've tried using Shenandoah on a real application and found strange behaviour.

Thanks for reporting this!

> The application is a usual java-http-server (jgroup, spring, jetty, etc) that listens on a port and responds to requests.
>
> If we remove all the `-D[application settings]` from the arguments, the start command looks like:
>> java -server -Duser.timezone=UTC -XX:-OmitStackTraceInFastThrow -Xmx4096 -Xms4096 -server ${GC_OPTIONS} -Djava.net.preferIPv4Stack=true -Djava.security.auth.login.config=/opt/server/conf/jaas.conf -jar /path/to/shade.jar

Any chance you can whip up a simple reproducer for this? Something like a Maven project that produces the JAR in question.
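Even a single-file sketch in that spirit would help. Something of this shape (hypothetical code, just to show what I mean: an embedded Jetty handler that we can hammer with ab under the same GC options):

import java.io.IOException;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.eclipse.jetty.server.Request;
import org.eclipse.jetty.server.Server;
import org.eclipse.jetty.server.handler.AbstractHandler;

public class Repro {
    public static void main(String[] args) throws Exception {
        Server server = new Server(8080);
        server.setHandler(new AbstractHandler() {
            @Override
            public void handle(String target, Request baseRequest,
                               HttpServletRequest request, HttpServletResponse response)
                    throws IOException, ServletException {
                // Trivial handler: the interesting part is Jetty's HttpParser
                // churning through request buffers while the GC moves objects.
                response.setStatus(HttpServletResponse.SC_OK);
                response.getWriter().println("ok");
                baseRequest.setHandled(true);
            }
        });
        server.start();
        server.join();
    }
}

Then running something like "ab -n 100000 -c 64 http://127.0.0.1:8080/" against it with your exact GC options would tell us whether Jetty alone is enough to trigger this.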
> I can't run it at shipilev/openjdk:8-shenandoah-fastdebug because JVM crashed without logs

Crashing without logs is weird. Can you run the non-Docker nightly JDK from here?
https://builds.shipilev.net/openjdk-shenandoah-jdk8/

Our wiki outlines some steps to better dissect this:
https://wiki.openjdk.java.net/display/shenandoah/Main#Main-FunctionalDiagnostics

Thanks,
-Aleksey

From shade at redhat.com Thu Dec 14 18:06:59 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 14 Dec 2017 19:06:59 +0100
Subject: RFR: Make degenerated update-refs use region-set cursor to hand over work
Message-ID: <1a915cfc-4f78-242d-e528-3ce6b0729a1c@redhat.com>

http://cr.openjdk.java.net/~shade/shenandoah/ur-degen-cursor/webrev.01/

This is based on the previous RFR that cleans up operations. For Degenerate GC to work, we want to drop the cancellation flag right away, and do init-update-refs, followed by final-update-refs to finish the update-refs work. But final-update-refs would not finish the work once cancellation is cleared.

Since work handover is tracked by the regions cursor anyway, why don't we use that to signal available work? This also handles the case where cancellation arrives when all threads have already processed all regions during conc-update-refs, and reacted on cancellation at the end of the phase. The current code would make a futile attempt to whip up workers during final-update-refs, when we know there is no work left.
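For illustration, a minimal standalone sketch of the cursor-as-work-signal idea (simplified; this is not the actual Shenandoah region-set code):

#include <atomic>
#include <cstddef>

// Workers claim region indices from a shared atomic cursor. "Work is available"
// is simply "the cursor has not run past the last region", so final-update-refs
// can tell whether anything remains without consulting the cancellation flag.
struct RegionCursor {
  std::atomic<size_t> next{0};
  size_t limit;

  explicit RegionCursor(size_t num_regions) : limit(num_regions) {}

  // Claim the next region to process; returns false when all are handed out.
  bool claim(size_t& index) {
    index = next.fetch_add(1, std::memory_order_relaxed);
    return index < limit;
  }

  // True iff some regions have not been handed out yet.
  bool has_remaining_work() const {
    return next.load(std::memory_order_relaxed) < limit;
  }
};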
Testing: hotspot_gc_shenandoah

Thanks,
-Aleksey

From zgu at redhat.com Thu Dec 14 20:00:02 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 14 Dec 2017 15:00:02 -0500
Subject: RFR: Shenandoah SA implementation
Message-ID: <9f013c6b-9aa8-b692-8923-df9fc1599e06@redhat.com>

Please review Shenandoah Serviceability agent implementation.

Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/shenandoah-sa/webrev.00/index.html

Test:
Manual test: jhsdb {hsdb, jstack, jmap, jinfo, jsnap}

Thanks,
-Zhengyu

From shade at redhat.com Thu Dec 14 20:26:57 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 14 Dec 2017 21:26:57 +0100
Subject: RFR: Shenandoah SA implementation
In-Reply-To: <9f013c6b-9aa8-b692-8923-df9fc1599e06@redhat.com>
References: <9f013c6b-9aa8-b692-8923-df9fc1599e06@redhat.com>
Message-ID: <8241f47f-03d3-6218-5b74-5fc04d541e47@redhat.com>

On 12/14/2017 09:00 PM, Zhengyu Gu wrote:
> Please review Shenandoah Serviceability agent implementation.
>
> Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/shenandoah-sa/webrev.00/index.html

Very good!

Minor nits:

*) Excess fully-qualified name here in HeapSummary:

149 } else if (heap instanceof sun.jvm.hotspot.gc.shenandoah.ShenandoahHeap) {

*) Excess leading white-space in String here in HeapSummary?

153 System.out.println(" regions = " + num_regions);

*) Double spaces here after "Address" in ShenandoahHeapRegionSet:

62 Address arrayAddr = regionsField.getValue(addr);
63 Address regAddr = arrayAddr.getAddressAt(index * regionPtrFieldSize);

No need for re-review.

Thanks,
-Aleksey

From zgu at redhat.com Thu Dec 14 20:44:53 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 14 Dec 2017 15:44:53 -0500
Subject: RFR: Shenandoah SA implementation
In-Reply-To: <8241f47f-03d3-6218-5b74-5fc04d541e47@redhat.com>
References: <9f013c6b-9aa8-b692-8923-df9fc1599e06@redhat.com> <8241f47f-03d3-6218-5b74-5fc04d541e47@redhat.com>
Message-ID: <6709778a-079e-ee93-eb44-43ce44f9006c@redhat.com>

On 12/14/2017 03:26 PM, Aleksey Shipilev wrote:
> On 12/14/2017 09:00 PM, Zhengyu Gu wrote:
>> Please review Shenandoah Serviceability agent implementation.
>>
>> Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/shenandoah-sa/webrev.00/index.html
>
> Very good!
>
> Minor nits:
>
> *) Excess fully-qualified name here in HeapSummary:
>
> 149 } else if (heap instanceof sun.jvm.hotspot.gc.shenandoah.ShenandoahHeap) {

Fixed.

> *) Excess leading white-space in String here in HeapSummary?
>
> 153 System.out.println(" regions = " + num_regions);

It lines up; printValMB() has leading spaces.

> *) Double spaces here after "Address" in ShenandoahHeapRegionSet:
>
> 62 Address arrayAddr = regionsField.getValue(addr);
> 63 Address regAddr = arrayAddr.getAddressAt(index * regionPtrFieldSize);

Fixed.

Thanks,
-Zhengyu

> No need for re-review.
>
> Thanks,
> -Aleksey

From zgu at redhat.com Thu Dec 14 20:52:40 2017
From: zgu at redhat.com (zgu at redhat.com)
Date: Thu, 14 Dec 2017 20:52:40 +0000
Subject: hg: shenandoah/jdk10: Shenandoah SA implementation
Message-ID: <201712142052.vBEKqeAl027586@aojmv0008.oracle.com>

Changeset: 42d652a258a7
Author: zgu
Date: 2017-12-14 15:48 -0500
URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/42d652a258a7

Shenandoah SA implementation

! src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp
! src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.hpp
! src/hotspot/share/gc/shenandoah/shenandoahHeapRegionSet.hpp
! src/hotspot/share/runtime/vmStructs.cpp
! src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/shared/CollectedHeap.java
! src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/shared/CollectedHeapName.java
+ src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/shenandoah/ShenandoahHeap.java
+ src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/shenandoah/ShenandoahHeapRegion.java
+ src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/gc/shenandoah/ShenandoahHeapRegionSet.java
! src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/memory/Universe.java
! src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ObjectHeap.java
! src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/tools/HeapSummary.java

From rkennke at redhat.com Thu Dec 14 21:24:56 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 14 Dec 2017 22:24:56 +0100
Subject: RFR: Rehash VMOperations and cycle driver mechanics for consistency
In-Reply-To: References: Message-ID: <3dfe92f1-e9dc-0bb1-d4b8-73d42c285f29@redhat.com>

Am 14.12.2017 um 17:36 schrieb Aleksey Shipilev:
> http://cr.openjdk.java.net/~shade/shenandoah/operations-cleanup/webrev.01/
>
> This is the preparation cleanup for Degenerate GC. The changes in this webrev are not about
> functionality, but about harmonizing the code for future changes.
>
> Brief tour:
>
> a) Three groups of methods are now in ShenandoahHeap: entry-points with the safepoint, entry-points
> without the safepoints, and a private group that does the actual operations.
>
> b) vmop_entry_* do all the needed setup, including capturing the gross GC times (In future, we may
> report both gross and net times there, to capture these in GC logs!). These entry-points would
> initiate the safepoint and call into entry_* methods. VM_Shenandoah* operations are now simple
> trampolines back to entry_*.
>
> c) entry_* do all the rest of the needed setup (assuming safepoint or not), including figuring out the
> worker counts, recording net times, and calling into op_* methods.
>
> d) op_* is where we do the actual thing for each phase. (Spoiler alert: Degenerate GC would just
> call op_* methods in the correct order, entering via a single VMOp.)
>
> e) Minor corrections in GCMark and stats for Full GC code
>
> f) Minor typo changes
>
> Testing: hotspot_gc_shenandoah {fastdebug|release}
>
> Thanks,
> -Aleksey

Yes, this seems cleaner and more consistent. Thanks!

Roman

From rkennke at redhat.com Thu Dec 14 21:49:22 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 14 Dec 2017 22:49:22 +0100
Subject: RFR: Make degenerated update-refs use region-set cursor to hand over work
In-Reply-To: <1a915cfc-4f78-242d-e528-3ce6b0729a1c@redhat.com>
References: <1a915cfc-4f78-242d-e528-3ce6b0729a1c@redhat.com>
Message-ID: <73df3b1a-5926-1b7a-2194-cb3649bdf456@redhat.com>

Am 14.12.2017 um 19:06 schrieb Aleksey Shipilev:
> http://cr.openjdk.java.net/~shade/shenandoah/ur-degen-cursor/webrev.01/
>
> This is based on the previous RFR that cleans up operations. For Degenerate GC to work, we want to drop
> the cancellation flag right away, and do init-update-refs, followed by final-update-refs to finish the
> update-refs work. But final-update-refs would not finish the work once cancellation is cleared.
>
> Since work handover is tracked by the regions cursor anyway, why don't we use that to signal available
> work? This also handles the case where cancellation arrives when all threads have already processed all
> regions during conc-update-refs, and reacted on cancellation at the end of the phase. The current code
> would make a futile attempt to whip up workers during final-update-refs, when we know there is no
> work left.

Ok

From kirill at korins.ky Thu Dec 14 22:29:57 2017
From: kirill at korins.ky (Kirill A. Korinsky)
Date: Thu, 14 Dec 2017 22:29:57 +0000
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com>
References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky> <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com>
Message-ID: <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky>

Hey,

Looks like I found the issue that creates the strange behaviour and breaks starting the application on your fastdebug image:

? ~ docker run --rm -ti shipilev/openjdk:8-shenandoah-fastdebug bash
root at 27a293be90b2:/# java -Xmx4096m -Xms4096m -version
Killed
root at 27a293be90b2:/# java -Xmx4096m -version
openjdk version "1.8.0-internal-fastdebug"
OpenJDK Runtime Environment (build 1.8.0-internal-fastdebug-jenkins_2017_11_15_05_12-b00)
OpenJDK 64-Bit Server VM (build 25.71-b00-fastdebug, mixed mode)
root at 27a293be90b2:/# exit
exit
? ~ docker run --rm -ti shipilev/openjdk:8-shenandoah bash
root at 10aa56f25cb7:/# java -Xmx4096m -Xms4096m -version
openjdk version "1.8.0-internal"
OpenJDK Runtime Environment (build 1.8.0-internal-jenkins_2017_11_15_03_20-b00)
OpenJDK 64-Bit Server VM (build 25.71-b00, mixed mode)
root at 10aa56f25cb7:/# exit
exit
? ~ docker images | grep shenandoah
shipilev/openjdk 8-shenandoah-fastdebug ec13bdd01380 4 weeks ago 383MB
shipilev/openjdk 8-shenandoah cb7dbc6f6fb9 4 weeks ago 385MB
? ~

Yes, I just removed `-Xms` from the arguments and it helps:
- it starts on the fastdebug image;
- it doesn't crash during the ab test.

Anyway, I will keep testing, and if it crashes I'll let you know.

--
wbr, Kirill

> On 14 Dec 2017, at 22:02, Aleksey Shipilev wrote:
>
> On 12/14/2017 06:52 PM, Kirill A. Korinsky wrote:
>> Good day!
>>
>> I've tried use shenandoah on real application and found strange behaviour.
>
> Thanks for reporting this!
> >> The application is usual java-http-server (jgroup, spring, jetty, etc) that listen to port and responses to request. >> >> If remove from arguments all `-D[application settings]` the start command looks like: >>> java -server -Duser.timezone=UTC -XX:-OmitStackTraceInFastThrow -Xmx4096 -Xms4096 -server ${GC_OPTIONS} -Djava.net.preferIPv4Stack=true -Djava.security.auth.login.config=/opt/server/conf/jaas.conf -jar /path/to/shade.jar > > Any chance you can whip up a simple reproducer for this? Something like Maven project that produces > a JAR in question. > >> I can't run it at shipilev/openjdk:8-shenandoah-fastdebug because JVM crashed without logs > > Crashing without logs is weird. Can you run the non-Docker nightly JDK from here? > https://builds.shipilev.net/openjdk-shenandoah-jdk8/ > > Our wiki outlines some steps to better dissect this: > https://wiki.openjdk.java.net/display/shenandoah/Main#Main-FunctionalDiagnostics > > Thanks, > -Aleksey > > From zgu at redhat.com Thu Dec 14 23:04:04 2017 From: zgu at redhat.com (zgu at redhat.com) Date: Thu, 14 Dec 2017 23:04:04 +0000 Subject: hg: shenandoah/jdk10: Added missing file for Shenandoah SA Message-ID: <201712142304.vBEN44uO006387@aojmv0008.oracle.com> Changeset: b1ff99e9ee87 Author: zgu Date: 2017-12-14 17:59 -0500 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/b1ff99e9ee87 Added missing file for Shenandoah SA + src/hotspot/share/gc/shenandoah/vmStructs_shenandoah.hpp From rwestrel at redhat.com Fri Dec 15 08:18:20 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Fri, 15 Dec 2017 09:18:20 +0100 Subject: Shenandoah WB and XMM spills? In-Reply-To: <56e65ce1-f648-e46e-2a5f-d552c2e07cf1@redhat.com> References: <9540614c-2d1d-6098-72c5-b345dc9c2635@redhat.com> <56e65ce1-f648-e46e-2a5f-d552c2e07cf1@redhat.com> Message-ID: > if (UseShenandoahGC && mcall->entry_point() == StubRoutines::shenandoah_wb_C()) { > assert(op == Op_CallLeafNoFP, "shenandoah_wb_C should be called with Op_CallLeafNoFP"); > add_call_kills(proj, regs, save_policy, exclude_soe, true); > } else { > add_call_kills(proj, regs, save_policy, exclude_soe, false); > } Thanks for the suggestion. I will go with that. Roland. From rwestrel at redhat.com Fri Dec 15 08:32:05 2017 From: rwestrel at redhat.com (rwestrel at redhat.com) Date: Fri, 15 Dec 2017 08:32:05 +0000 Subject: hg: shenandoah/jdk10: Allow use of fp spills around write barrier Message-ID: <201712150832.vBF8W5VE009514@aojmv0008.oracle.com> Changeset: 842e412a3f86 Author: roland Date: 2017-12-15 09:27 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/842e412a3f86 Allow use of fp spills around write barrier ! src/hotspot/share/opto/lcm.cpp From ashipile at redhat.com Fri Dec 15 09:48:10 2017 From: ashipile at redhat.com (ashipile at redhat.com) Date: Fri, 15 Dec 2017 09:48:10 +0000 Subject: hg: shenandoah/jdk10: Rehash VMOperations and cycle driver mechanics for consistency Message-ID: <201712150948.vBF9mAFP001937@aojmv0008.oracle.com> Changeset: dc57d33678b7 Author: shade Date: 2017-12-15 10:13 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/dc57d33678b7 Rehash VMOperations and cycle driver mechanics for consistency ! src/hotspot/share/gc/shenandoah/shenandoahConcurrentThread.cpp ! src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp ! src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp ! src/hotspot/share/gc/shenandoah/shenandoahMarkCompact.cpp ! src/hotspot/share/gc/shenandoah/shenandoahPartialGC.cpp ! src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.cpp ! 
src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.hpp
! src/hotspot/share/gc/shenandoah/shenandoahUtils.cpp
! src/hotspot/share/gc/shenandoah/shenandoahUtils.hpp
! src/hotspot/share/gc/shenandoah/shenandoahWorkerPolicy.cpp
! src/hotspot/share/gc/shenandoah/shenandoahWorkerPolicy.hpp
! src/hotspot/share/gc/shenandoah/vm_operations_shenandoah.cpp

From zgu at redhat.com Mon Dec 18 16:43:36 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 18 Dec 2017 11:43:36 -0500
Subject: RFR: Fixed compilation error of libTestHeapDump.c on Windows with VS2010
Message-ID:

This patch fixes a compilation error reported by Michal Vala [1].

VS2010 compiling C files uses the C89 standard, which requires that all local variables be declared at the beginning of a code block.
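For illustration, the kind of shape C89 rejects versus the accepted one (a generic example, not the actual test code):

/* Rejected by VS2010 when compiling a .c file: C89 forbids declarations
   after the first statement in a block. */
static int broken(int n) {
    int total = 0;
    total += n;        /* a statement... */
    int extra = 2 * n; /* ...then a declaration: compile error under C89 */
    return total + extra;
}

/* C89-compatible shape: all locals declared at the beginning of the block. */
static int fixed(int n) {
    int total = 0;
    int extra;         /* declared up front */
    total += n;
    extra = 2 * n;
    return total + extra;
}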
Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/vs2010_c2275/webrev.00/

[1] https://post-office.corp.redhat.com/mailman/private/java-team/2017-December/msg00164.html

Thanks,
-Zhengyu

From shade at redhat.com Mon Dec 18 16:48:05 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 18 Dec 2017 17:48:05 +0100
Subject: RFR: Fixed compilation error of libTestHeapDump.c on Windows with VS2010
In-Reply-To: References: Message-ID:

On 12/18/2017 05:43 PM, Zhengyu Gu wrote:
> This patch fixes a compilation error reported by Michal Vala [1].
>
> VS2010 compiling C files uses the C89 standard, which requires that all local variables be declared at the
> beginning of a code block.
>
> Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/vs2010_c2275/webrev.00/

Looks good. Fix up the triple space here?

73 jvmtiCapabilities capabilities;

This may go directly to sh/jdk10, and to sh/jdk9 and sh/jdk8u (with "[backport]").

Thanks,
-Aleksey

From zgu at redhat.com Mon Dec 18 16:55:36 2017
From: zgu at redhat.com (zgu at redhat.com)
Date: Mon, 18 Dec 2017 16:55:36 +0000
Subject: hg: shenandoah/jdk10: Fixed compilation error of libTestHeapDump.c on Windows with VS2010
Message-ID: <201712181655.vBIGtbe0000043@aojmv0008.oracle.com>

Changeset: 115f2ad14216
Author: zgu
Date: 2017-12-18 11:51 -0500
URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/115f2ad14216

Fixed compilation error of libTestHeapDump.c on Windows with VS2010

! test/hotspot/jtreg/gc/shenandoah/jvmti/libTestHeapDump.c

From zgu at redhat.com Mon Dec 18 17:20:05 2017
From: zgu at redhat.com (zgu at redhat.com)
Date: Mon, 18 Dec 2017 17:20:05 +0000
Subject: hg: shenandoah/jdk9/hotspot: [backport] Fixed compilation error of libTestHeapDump.c on Windows with VS2010
Message-ID: <201712181720.vBIHK5Qv008766@aojmv0008.oracle.com>

Changeset: c68fed8072ea
Author: zgu
Date: 2017-12-18 12:02 -0500
URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/c68fed8072ea

[backport] Fixed compilation error of libTestHeapDump.c on Windows with VS2010

! test/gc/shenandoah/jvmti/libTestHeapDump.c

From zgu at redhat.com Mon Dec 18 17:27:04 2017
From: zgu at redhat.com (zgu at redhat.com)
Date: Mon, 18 Dec 2017 17:27:04 +0000
Subject: hg: shenandoah/jdk8u/hotspot: [backport] Fixed compilation error of libTestHeapDump.c on Windows with VS2010
Message-ID: <201712181727.vBIHR4r5011446@aojmv0008.oracle.com>

Changeset: 278b3f069a4a
Author: zgu
Date: 2017-12-18 12:23 -0500
URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/278b3f069a4a

[backport] Fixed compilation error of libTestHeapDump.c on Windows with VS2010

! test/gc/shenandoah/jvmti/libTestHeapDump.c

From shade at redhat.com Tue Dec 19 12:54:08 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 19 Dec 2017 13:54:08 +0100
Subject: Shenandoah WB fastpath and optimizations
Message-ID:

Comparing the Shenandoah performance on XmlValidation with barriers enabled and disabled reveals an odd story. Here is the accurate perfnorm profiling that normalizes the CPU counters to benchmark operations:

Benchmark Mode Cnt Score Error Units

# passive
XV.test thrpt 10 236 ± 1 ops/min
XV.test:CPI thrpt 10 0.417 ± 0 #/op
XV.test:L1-dcache-load-misses thrpt 10 11605037 ± 191196 #/op
XV.test:L1-dcache-loads thrpt 10 520038766 ± 6177479 #/op
XV.test:L1-dcache-stores thrpt 10 198131386 ± 2044458 #/op
XV.test:L1-icache-load-misses thrpt 10 4058561 ± 157045 #/op
XV.test:LLC-load-misses thrpt 10 481808 ± 17320 #/op
XV.test:LLC-loads thrpt 10 3478116 ± 78461 #/op
XV.test:LLC-store-misses thrpt 10 51686 ± 2262 #/op
XV.test:LLC-stores thrpt 10 262209 ± 15420 #/op
XV.test:branch-misses thrpt 10 954476 ± 20287 #/op
XV.test:branches thrpt 10 320735964 ± 1510799 #/op
XV.test:cycles thrpt 10 691694314 ± 4159603 #/op
XV.test:dTLB-load-misses thrpt 10 52266 ± 10707 #/op
XV.test:dTLB-loads thrpt 10 515487335 ± 5540964 #/op
XV.test:dTLB-store-misses thrpt 10 1692 ± 547 #/op
XV.test:dTLB-stores thrpt 10 197639464 ± 2675693 #/op
XV.test:iTLB-load-misses thrpt 10 10636 ± 5019 #/op
XV.test:iTLB-loads thrpt 10 878417 ± 106475 #/op
XV.test:instructions thrpt 10 1659286537 ± 8661844 #/op

# passive, +ShenandoahWriteBarrier
XV.test thrpt 10 206 ± 2.905 ops/min -14%
XV.test:CPI thrpt 10 0.417 ± 0.004 #/op
XV.test:L1-dcache-load-misses thrpt 10 12126323 ± 464131 #/op
XV.test:L1-dcache-loads thrpt 10 609183240 ± 5857280 #/op +77..101M +17%
XV.test:L1-dcache-stores thrpt 10 216852068 ± 2586890 #/op +14..23M +9%
XV.test:L1-icache-load-misses thrpt 10 4600468 ± 252047 #/op
XV.test:LLC-load-misses thrpt 10 504257 ± 28641 #/op
XV.test:LLC-loads thrpt 10 3696029 ± 105743 #/op
XV.test:LLC-store-misses thrpt 10 52340 ± 2107 #/op
XV.test:LLC-stores thrpt 10 245865 ± 15167 #/op
XV.test:branch-misses thrpt 10 1080985 ± 29069 #/op
XV.test:branches thrpt 10 361296218 ± 2117561 #/op +36..44M +12%
XV.test:cycles thrpt 10 790992629 ± 9312064 #/op
XV.test:dTLB-load-misses thrpt 10 72138 ± 8381 #/op
XV.test:dTLB-loads thrpt 10 606335138 ± 4969218 #/op
XV.test:dTLB-store-misses thrpt 10 3452 ± 2327 #/op
XV.test:dTLB-stores thrpt 10 216814757 ± 2316964 #/op
XV.test:iTLB-load-misses thrpt 10 16967 ± 14388 #/op
XV.test:iTLB-loads thrpt 10 1006270 ± 153479 #/op
XV.test:instructions thrpt 10 1897746787 ± 10418938 #/op +220..257M +14%

There are a few interesting observations here:

*) Enabling Shenandoah WB on this workload is responsible for a ~14% throughput hit. This is the impact of the WB fastpath, because the workload runs with "passive" that does not do any concurrent cycles, and thus never reaches the slowpath.

Shenandoah WB fastpath is basically four instructions:

movzbl 0x3d8(%rTLS), %rScratch ; read evac-in-progress
test %rScratch, %rScratch
jne EVAC-ENABLED-SLOW-PATH
mov -0x8(%rObj), %rObj ; read barrier

*) CPI numbers agree in both configurations, and the number of instructions has also grown +14%. This means the impact is due to the larger code path, not some backend effect (like cache misses or such).

*) If we treat the number of additional branches as the number of WBs for the workload, then we have around 40M WB fastpaths for each benchmark op. This means we should have around 80M L1-dcache-loads coming from WB (one for reading the TLS flag, and another for the RB), and that seems to agree with the data, given the quite large error bounds.

*) What is weird is that we have ~18M excess *stores*, which are completely unaccounted for by WBs.

*) ...and to add insult to injury, 4 insns per WB should add up to 160M excess insns, but instead we have around 240M.

The profile is too flat to pinpoint the exact code shape where we lose some of these instructions. But this collateral evidence seems to imply that WBs make some stores more probable (e.g. by breaking some optimizations?), and that is the cause for the inflated insn and L1 store counts?

Thoughts?

Thanks,
-Aleksey

P.S. Looking at ShenandoahWriteBarrierNode::test_evacuation_in_progress, I see there is Op_MemBarAcquire node that is attached to control projection for both CmpI and Bool nodes from WB. Are these limiting the optimizations? Why do we need acquire there? This had originated from Roland's change rewrite that introduced shenandoah_pin_and_expand_barriers:
http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/978d7601df14#l20.1137

From rkennke at redhat.com Tue Dec 19 12:58:49 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 19 Dec 2017 13:58:49 +0100
Subject: Shenandoah WB fastpath and optimizations
In-Reply-To: References: Message-ID:

> P.S. Looking at ShenandoahWriteBarrierNode::test_evacuation_in_progress, I see there is
> Op_MemBarAcquire node that is attached to control projection for both CmpI and Bool nodes from WB.
> Are these limiting the optimizations? Why do we need acquire there? This had originated from
> Roland's change rewrite that introduced shenandoah_pin_and_expand_barriers:
> http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/978d7601df14#l20.1137

This is the membar that we need between loading the evac_in_progress flag and the load of the brooks ptr. If we don't have it, we cannot turn off evac-in-progress concurrently. We don't strictly need Acquire, we 'only' need a LoadLoad membar; on x86 this means that the instructions need to be emitted in the correct order.
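In code terms, roughly (a sketch; the helper names here are made up for illustration):

// The evac-in-progress load must not be reordered after the brooks-pointer load.
oop write_barrier_fastpath(JavaThread* thread, oop obj) {
  bool evac = thread->evacuation_in_progress();  // load 1: the TLS flag
  OrderAccess::loadload();                       // LoadLoad: keeps load 1 before load 2;
                                                 // on x86 this only constrains emission order
  oop fwd = BrooksPointer::forwardee(obj);       // load 2: forwarding pointer at obj - 8
  return evac ? write_barrier_slow(obj)          // flag set: take the slow path
              : fwd;                             // flag clear: read-barrier result
}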
I cannot say if this possibly prevents optimizations.

Roman

From rkennke at redhat.com Tue Dec 19 13:03:25 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 19 Dec 2017 14:03:25 +0100
Subject: Shenandoah WB fastpath and optimizations
In-Reply-To: References: Message-ID:

Without going deeper, maybe it's worth doing an optimization like I've outlined in the Traversal GC thread? I.e. fold evac_in_progress checks in blocks without safepoints? And generate WB-less blocks if possible?

Roman

> Comparing the Shenandoah performance on XmlValidation and disabled barriers, reveals an odd story.
> The accurate perfnorm profiling that normalizes the CPU counters to benchmark operations:
>
> Benchmark Mode Cnt Score Error Units
>
> # passive
> XV.test thrpt 10 236 ? 1 ops/min
> XV.test:CPI thrpt 10 0.417 ? 0 #/op
> XV.test:L1-dcache-load-misses thrpt 10 11605037 ? 191196 #/op
> XV.test:L1-dcache-loads thrpt 10 520038766 ? 6177479 #/op
> XV.test:L1-dcache-stores thrpt 10 198131386 ? 2044458 #/op
> XV.test:L1-icache-load-misses thrpt 10 4058561 ? 157045 #/op
> XV.test:LLC-load-misses thrpt 10 481808 ? 17320 #/op
> XV.test:LLC-loads thrpt 10 3478116 ? 78461 #/op
> XV.test:LLC-store-misses thrpt 10 51686 ? 2262 #/op
> XV.test:LLC-stores thrpt 10 262209 ? 15420 #/op
> XV.test:branch-misses thrpt 10 954476 ? 20287 #/op
> XV.test:branches thrpt 10 320735964 ? 1510799 #/op
> XV.test:cycles thrpt 10 691694314 ?
4159603 #/op > XV.test:dTLB-load-misses thrpt 10 52266 ? 10707 #/op > XV.test:dTLB-loads thrpt 10 515487335 ? 5540964 #/op > XV.test:dTLB-store-misses thrpt 10 1692 ? 547 #/op > XV.test:dTLB-stores thrpt 10 197639464 ? 2675693 #/op > XV.test:iTLB-load-misses thrpt 10 10636 ? 5019 #/op > XV.test:iTLB-loads thrpt 10 878417 ? 106475 #/op > XV.test:instructions thrpt 10 1659286537 ? 8661844 #/op > > # passive, +ShenandoahWriteBarrier > XV.test thrpt 10 206 ? 2.905 ops/min -14% > XV.test:CPI thrpt 10 0.417 ? 0.004 #/op > XV.test:L1-dcache-load-misses thrpt 10 12126323 ? 464131 #/op > XV.test:L1-dcache-loads thrpt 10 609183240 ? 5857280 #/op +77..101M +17% > XV.test:L1-dcache-stores thrpt 10 216852068 ? 2586890 #/op +14..23M +9% > XV.test:L1-icache-load-misses thrpt 10 4600468 ? 252047 #/op > XV.test:LLC-load-misses thrpt 10 504257 ? 28641 #/op > XV.test:LLC-loads thrpt 10 3696029 ? 105743 #/op > XV.test:LLC-store-misses thrpt 10 52340 ? 2107 #/op > XV.test:LLC-stores thrpt 10 245865 ? 15167 #/op > XV.test:branch-misses thrpt 10 1080985 ? 29069 #/op > XV.test:branches thrpt 10 361296218 ? 2117561 #/op +36..44M +12% > XV.test:cycles thrpt 10 790992629 ? 9312064 #/op > XV.test:dTLB-load-misses thrpt 10 72138 ? 8381 #/op > XV.test:dTLB-loads thrpt 10 606335138 ? 4969218 #/op > XV.test:dTLB-store-misses thrpt 10 3452 ? 2327 #/op > XV.test:dTLB-stores thrpt 10 216814757 ? 2316964 #/op > XV.test:iTLB-load-misses thrpt 10 16967 ? 14388 #/op > XV.test:iTLB-loads thrpt 10 1006270 ? 153479 #/op > XV.test:instructions thrpt 10 1897746787 ? 10418938 #/op +220..257M +14% > > > There are a few interesting observations here: > > *) Enabling Shenandoah WB on this workload is responsible for ~14% throughput hit. This is the > impact of the WB fastpath, because the workload runs with "passive" that does not do any concurrent > cycles, and thus never reaches the slowpath. > > Shenandoah WB fastpath is basically four instructions: > > movzbl 0x3d8(%rTLS), %rScratch ; read evac-in-progress > test %rScratch, %rScratch > jne EVAC-ENABLED-SLOW-PATH > mov -0x8(%rObj), %rObj ; read barrier > > *) CPI numbers agree in both configurations, and the number of instructions had also grown +14%. > This means the impact is due to larger code path, not some backend effect (like cache misses or such). > > *) If we treat the number of of additional branches as the number of WBs for the workload, then we > have around 40M WB fastpaths for each benchmark op. This means we should have around 80M > L1-dcache-loads coming from WB (one for reading TLS flag, and another for RB), and that seems to > agree with the data, given quite large error bounds. > > *) What is weird is that we have ~18M excess *stores*, which are completely unaccounted by WBs. > > *) ...and to add the insult to injury, 4 insn per WB should add up to 160M excess insns, but > instead we have around 240M. > > The profile is too flat to pinpoint the exact code shape where we lose some of these instructions. > But this collateral evidence seems to imply WBs make some stores more probable (e.g. by breaking > some optimizations?), and that is the cause for inflated insn and L1 store counts? > > Thoughts? > > Thanks, > -Aleksey > > P.S. Looking at ShenandoahWriteBarrierNode::test_evacuation_in_progress, I see there is > Op_MemBarAcquire node that is attached to control projection for both CmpI and Bool nodes from WB. > Are these limiting the optimizations? Why do we need acquire there? 
This had originated from
> Roland's change rewrite that introduced shenandoah_pin_and_expand_barriers:
> http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/978d7601df14#l20.1137
>

From rwestrel at redhat.com Tue Dec 19 14:14:34 2017
From: rwestrel at redhat.com (Roland Westrelin)
Date: Tue, 19 Dec 2017 15:14:34 +0100
Subject: Shenandoah WB fastpath and optimizations
In-Reply-To: References: Message-ID:

> There are a few interesting observations here:
>
> *) Enabling Shenandoah WB on this workload is responsible for a ~14% throughput hit. This is the
> impact of the WB fastpath, because the workload runs with "passive" that does not do any concurrent
> cycles, and thus never reaches the slowpath.
>
> Shenandoah WB fastpath is basically four instructions:
>
> movzbl 0x3d8(%rTLS), %rScratch ; read evac-in-progress
> test %rScratch, %rScratch
> jne EVAC-ENABLED-SLOW-PATH
> mov -0x8(%rObj), %rObj ; read barrier
>
> *) CPI numbers agree in both configurations, and the number of instructions has also grown +14%.
> This means the impact is due to the larger code path, not some backend effect (like cache misses or such).
>
> *) If we treat the number of additional branches as the number of WBs for the workload, then we
> have around 40M WB fastpaths for each benchmark op. This means we should have around 80M
> L1-dcache-loads coming from WB (one for reading the TLS flag, and another for the RB), and that seems to
> agree with the data, given the quite large error bounds.
>
> *) What is weird is that we have ~18M excess *stores*, which are completely unaccounted for by WBs.
>
> *) ...and to add insult to injury, 4 insns per WB should add up to 160M excess insns, but
> instead we have around 240M.
>
> The profile is too flat to pinpoint the exact code shape where we lose some of these instructions.
> But this collateral evidence seems to imply that WBs make some stores more probable (e.g. by breaking
> some optimizations?), and that is the cause for the inflated insn and L1 store counts?
>
> Thoughts?

Could the 18M stores be spills, with their counterpart spill loads somewhere in the 77..101M extra loads? The WB needs at least one extra register, and there's also the possibility that the WB slow path messes up the register allocator heuristics (as we've seen with the XMM spills).

Roland.

From shade at redhat.com Tue Dec 19 14:19:33 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 19 Dec 2017 15:19:33 +0100
Subject: Shenandoah WB fastpath and optimizations
In-Reply-To: References: Message-ID: <32e9db01-2e7c-3cff-a56f-0cfd60e78a17@redhat.com>

On 12/19/2017 03:14 PM, Roland Westrelin wrote:
>> Thoughts?
>
> Could the 18M stores be spills, with their counterpart spill loads somewhere
> in the 77..101M extra loads? The WB needs at least one extra register, and
> there's also the possibility that the WB slow path messes up the register
> allocator heuristics (as we've seen with the XMM spills).

Could be, and it was my base theory at some point. But I'd expect more loads to manifest more reliably. As such, we seem to be very well within the L1-load budget to account for WB loads.

In fact, I wanted to ask you: what would it take to teach C2 to emit the C1-style check, e.g. instead of:

movzbl 0x3d8(%rTLS), %rScratch ; read evac-in-progress
test %rScratch, %rScratch
jne EVAC-ENABLED-SLOW-PATH
mov -0x8(%rObj), %rObj ; read barrier

...do:

cmpb 0x3d8(%TLS), 0 ; read evac-in-progress
jne EVAC-ENABLED-SLOW-PATH
mov -0x8(%rObj), %rObj ; read barrier

...thus freeing up the register?
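For illustration, the two shapes at MacroAssembler level (sketch only; evac_in_progress_offset() is a made-up stand-in for the real TLS offset accessor):

#define __ masm->

// Current C2 shape: burns a scratch register for the flag.
void emit_wb_fastpath_c2_style(MacroAssembler* masm, Register obj, Register scratch, Label& slow_path) {
  __ movzbl(scratch, Address(r15_thread, in_bytes(JavaThread::evac_in_progress_offset())));
  __ testl(scratch, scratch);             // flag != 0?
  __ jcc(Assembler::notZero, slow_path);  // evac running -> slow path
  __ movptr(obj, Address(obj, -8));       // read barrier: forwarding pointer at obj - 8
}

// C1-style shape: compares the flag in memory directly, no scratch register.
void emit_wb_fastpath_c1_style(MacroAssembler* masm, Register obj, Label& slow_path) {
  __ cmpb(Address(r15_thread, in_bytes(JavaThread::evac_in_progress_offset())), 0);
  __ jcc(Assembler::notEqual, slow_path);
  __ movptr(obj, Address(obj, -8));
}

#undef __

That would leave the register allocator one more register to work with around every barrier.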
Thanks,
-Aleksey

From rwestrel at redhat.com Tue Dec 19 14:28:41 2017
From: rwestrel at redhat.com (Roland Westrelin)
Date: Tue, 19 Dec 2017 15:28:41 +0100
Subject: Shenandoah WB fastpath and optimizations
In-Reply-To: <32e9db01-2e7c-3cff-a56f-0cfd60e78a17@redhat.com>
References: <32e9db01-2e7c-3cff-a56f-0cfd60e78a17@redhat.com>
Message-ID:

> Could be, and it was my base theory at some point. But I'd expect more loads to manifest more
> reliably. As such, we seem to be very well within the L1-load budget to account for WB loads.
>
> In fact, I wanted to ask you: what would it take to teach C2 to emit the C1-style check, e.g. instead of:
>
> movzbl 0x3d8(%rTLS), %rScratch ; read evac-in-progress
> test %rScratch, %rScratch
> jne EVAC-ENABLED-SLOW-PATH
> mov -0x8(%rObj), %rObj ; read barrier
>
> ...do:
>
> cmpb 0x3d8(%TLS), 0 ; read evac-in-progress
> jne EVAC-ENABLED-SLOW-PATH
> mov -0x8(%rObj), %rObj ; read barrier
>
> ...thus freeing up the register?

I will look into that. Running with an empty slow path could be an interesting experiment too.

Roland.

From kirill at korins.ky Tue Dec 19 14:57:00 2017
From: kirill at korins.ky (Kirill A. Korinsky)
Date: Tue, 19 Dec 2017 14:57:00 +0000
Subject: Rare very big pause
Message-ID:

Good day!

I'm trying to use Shenandoah GC from Fedora 27 at jdk8u151b12, and I have a rare (once or twice per hour) very big pause. Up to 0.5 seconds.

You can find a GC log below (it is the log for the 10 minutes where this sort of pause happened).

The pause happened at 2017-12-19T12:11:03.965+0000 and took 446.259 ms.

Java runs with the arguments:
-server -XX:-OmitStackTraceInFastThrow -Xmx6144m -XX:+UseShenandoahGC -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+DisableExplicitGC -XX:+UseTransparentHugePages -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps

The log:
Concurrent marking triggered. Free: 552M, Free Threshold: 552M; Allocated: 552M, Alloc Threshold: 0M
2017-12-19T12:05:18.585+0000: 11876.106: [Pause Init Mark, 5.275 ms]
2017-12-19T12:05:18.590+0000: 11876.109: [Concurrent marking 5582M->5610M(6144M), 1093.063 ms]
2017-12-19T12:05:19.684+0000: 11877.203: [Pause Final MarkTotal Garbage: 4518M
Immediate Garbage: 1200M, 600 regions (28% of total)
Garbage to be collected: 2872M (63% of total), 1512 regions
Live objects to be evacuated: 147M
Live/garbage ratio in collected regions: 5%
5610M->4412M(6144M), 38.723 ms]
2017-12-19T12:05:19.723+0000: 11877.243: [Concurrent evacuation 4413M->4577M(6144M), 130.382 ms]
2017-12-19T12:05:19.854+0000: 11877.373: [Pause Init Update Refs, 0.114 ms]
2017-12-19T12:05:19.854+0000: 11877.374: [Concurrent update references 4577M->4599M(6144M), 669.178 ms]
2017-12-19T12:05:20.524+0000: 11878.043: [Pause Final Update Refs 4599M->1578M(6144M), 2.454 ms]
2017-12-19T12:05:20.527+0000: 11878.046: [Concurrent reset bitmaps 1578M->1578M(6144M), 3.767 ms]
Capacity: 6144M, Peak Occupancy: 5610M, Lowest Free: 533M, Free Threshold: 184M
Adjusting free threshold to: 4% (245M)
Concurrent marking triggered.
Free: 244M, Free Threshold: 245M; Allocated: 244M, Alloc Threshold: 0M 2017-12-19T12:06:26.485+0000: 11944.004: [Pause Init Mark, 3.089 ms] 2017-12-19T12:06:26.488+0000: 11944.007: [Concurrent marking 5892M->6134M(6144M), 956.447 ms] 2017-12-19T12:06:27.444+0000: 11944.964: [Concurrent precleaning 6134M->6136M(6144M), 1.746 ms] 2017-12-19T12:06:27.447+0000: 11944.966: [Pause Final MarkTotal Garbage: 5133M Immediate Garbage: 2282M, 1141 regions (46% of total) Garbage to be collected: 2535M (49% of total), 1311 regions Live objects to be evacuated: 83M Live/garbage ratio in collected regions: 3% 6136M->3856M(6144M), 13.938 ms] 2017-12-19T12:06:27.465+0000: 11944.984: [Concurrent evacuation 3864M->4004M(6144M), 117.100 ms] 2017-12-19T12:06:27.583+0000: 11945.102: [Pause Init Update Refs, 0.106 ms] 2017-12-19T12:06:27.583+0000: 11945.103: [Concurrent update references 4006M->4392M(6144M), 807.411 ms] 2017-12-19T12:06:28.392+0000: 11945.911: [Pause Final Update Refs 4392M->1773M(6144M), 2.480 ms] 2017-12-19T12:06:28.394+0000: 11945.913: [Concurrent reset bitmaps 1773M->1775M(6144M), 5.338 ms] Capacity: 6144M, Peak Occupancy: 6136M, Lowest Free: 7M, Free Threshold: 184M Adjusting free threshold to: 7% (430M) Concurrent marking triggered. Free: 428M, Free Threshold: 430M; Allocated: 428M, Alloc Threshold: 0M 2017-12-19T12:06:57.712+0000: 11975.232: [Pause Init Mark, 2.998 ms] 2017-12-19T12:06:57.716+0000: 11975.235: [Concurrent marking 5710M->5824M(6144M), 1137.936 ms] 2017-12-19T12:06:58.854+0000: 11976.374: [Pause Final MarkTotal Garbage: 4656M Immediate Garbage: 1890M, 945 regions (43% of total) Garbage to be collected: 2329M (50% of total), 1235 regions Live objects to be evacuated: 139M Live/garbage ratio in collected regions: 5% 5824M->3936M(6144M), 12.464 ms] 2017-12-19T12:06:58.868+0000: 11976.387: [Concurrent evacuation 3943M->4086M(6144M), 110.411 ms] 2017-12-19T12:06:58.979+0000: 11976.498: [Pause Init Update Refs, 0.106 ms] 2017-12-19T12:06:58.979+0000: 11976.498: [Concurrent update references 4086M->4125M(6144M), 711.303 ms] 2017-12-19T12:06:59.691+0000: 11977.211: [Pause Final Update Refs 4125M->1656M(6144M), 2.317 ms] 2017-12-19T12:06:59.694+0000: 11977.213: [Concurrent reset bitmaps 1656M->1656M(6144M), 2.713 ms] Capacity: 6144M, Peak Occupancy: 5824M, Lowest Free: 319M, Free Threshold: 184M Concurrent marking triggered. Free: 430M, Free Threshold: 430M; Allocated: 430M, Alloc Threshold: 0M 2017-12-19T12:07:28.999+0000: 12006.518: [Pause Init Mark, 2.983 ms] 2017-12-19T12:07:29.002+0000: 12006.521: [Concurrent marking 5713M->5890M(6144M), 1230.274 ms] 2017-12-19T12:07:30.233+0000: 12007.753: [Pause Final MarkTotal Garbage: 4660M Immediate Garbage: 1962M, 981 regions (45% of total) Garbage to be collected: 2257M (48% of total), 1194 regions Live objects to be evacuated: 128M Live/garbage ratio in collected regions: 5% 5890M->3930M(6144M), 11.268 ms] 2017-12-19T12:07:30.245+0000: 12007.764: [Concurrent evacuation 3932M->4066M(6144M), 111.081 ms] 2017-12-19T12:07:30.357+0000: 12007.876: [Pause Init Update Refs, 0.101 ms] 2017-12-19T12:07:30.357+0000: 12007.876: [Concurrent update references 4066M->4080M(6144M), 686.184 ms] 2017-12-19T12:07:31.044+0000: 12008.563: [Pause Final Update Refs 4080M->1694M(6144M), 2.311 ms] 2017-12-19T12:07:31.046+0000: 12008.565: [Concurrent reset bitmaps 1694M->1694M(6144M), 3.733 ms] Capacity: 6144M, Peak Occupancy: 5890M, Lowest Free: 253M, Free Threshold: 184M Concurrent marking triggered. 
Free: 429M, Free Threshold: 430M; Allocated: 429M, Alloc Threshold: 0M 2017-12-19T12:09:15.602+0000: 12113.121: [Pause Init Mark, 2.988 ms] 2017-12-19T12:09:15.605+0000: 12113.124: [Concurrent marking 5704M->5721M(6144M), 713.760 ms] 2017-12-19T12:09:16.319+0000: 12113.839: [Pause Final MarkTotal Garbage: 4948M Immediate Garbage: 1667M, 834 regions (35% of total) Garbage to be collected: 2964M (59% of total), 1525 regions Live objects to be evacuated: 79M Live/garbage ratio in collected regions: 2% 5721M->4055M(6144M), 11.784 ms] 2017-12-19T12:09:16.332+0000: 12113.851: [Concurrent evacuation 4061M->4145M(6144M), 66.259 ms] 2017-12-19T12:09:16.399+0000: 12113.918: [Pause Init Update Refs, 0.104 ms] 2017-12-19T12:09:16.399+0000: 12113.918: [Concurrent update references 4145M->4159M(6144M), 409.740 ms] 2017-12-19T12:09:16.810+0000: 12114.329: [Pause Final Update Refs 4159M->1115M(6144M), 2.491 ms] 2017-12-19T12:09:16.812+0000: 12114.332: [Concurrent reset bitmaps 1115M->1115M(6144M), 4.700 ms] Capacity: 6144M, Peak Occupancy: 5721M, Lowest Free: 422M, Free Threshold: 184M Adjusting free threshold to: 4% (245M) Concurrent marking triggered. Free: 238M, Free Threshold: 245M; Allocated: 238M, Alloc Threshold: 0M 2017-12-19T12:09:47.005+0000: 12144.524: [Pause Init Mark, 2.417 ms] 2017-12-19T12:09:47.008+0000: 12144.527: [Concurrent marking 5898M->5948M(6144M), 985.550 ms] 2017-12-19T12:09:47.994+0000: 12145.513: [Pause Final MarkTotal Garbage: 4900M Immediate Garbage: 2342M, 1171 regions (51% of total) Garbage to be collected: 2116M (43% of total), 1098 regions Live objects to be evacuated: 78M Live/garbage ratio in collected regions: 3% 5948M->3608M(6144M), 37.561 ms] 2017-12-19T12:09:48.032+0000: 12145.551: [Concurrent evacuation 3609M->3696M(6144M), 68.148 ms] 2017-12-19T12:09:48.101+0000: 12145.620: [Pause Init Update Refs, 0.102 ms] 2017-12-19T12:09:48.101+0000: 12145.620: [Concurrent update references 3696M->3724M(6144M), 584.008 ms] 2017-12-19T12:09:48.686+0000: 12146.205: [Pause Final Update Refs 3724M->1530M(6144M), 2.221 ms] 2017-12-19T12:09:48.688+0000: 12146.207: [Concurrent reset bitmaps 1530M->1530M(6144M), 2.919 ms] Capacity: 6144M, Peak Occupancy: 5948M, Lowest Free: 195M, Free Threshold: 184M Concurrent marking triggered. Free: 245M, Free Threshold: 245M; Allocated: 245M, Alloc Threshold: 0M 2017-12-19T12:11:03.960+0000: 12221.481: [Pause Init Mark, 5.081 ms] 2017-12-19T12:11:03.965+0000: 12221.485: [Concurrent markingCancelling concurrent GC: Allocation Failure 5887M->6130M(6144M), 653.310 ms] Adjusting free threshold to: 14% (860M) 2017-12-19T12:11:04.620+0000: 12222.139: [Pause Final MarkTotal Garbage: 4889M Immediate Garbage: 2018M, 1009 regions (44% of total) Garbage to be collected: 2435M (49% of total), 1264 regions Live objects to be evacuated: 87M Live/garbage ratio in collected regions: 3% 6130M->4114M(6144M), 446.259 ms] 2017-12-19T12:11:05.069+0000: 12222.594: [Concurrent evacuation 4118M->4251M(6144M), 116.424 ms] 2017-12-19T12:11:05.187+0000: 12222.706: [Pause Init Update Refs, 0.105 ms] 2017-12-19T12:11:05.187+0000: 12222.706: [Concurrent update references 4251M->4614M(6144M), 992.439 ms] 2017-12-19T12:11:06.180+0000: 12223.699: [Pause Final Update Refs 4614M->2092M(6144M), 2.380 ms] 2017-12-19T12:11:06.183+0000: 12223.702: [Concurrent reset bitmaps 2092M->2092M(6144M), 4.246 ms] Capacity: 6144M, Peak Occupancy: 6130M, Lowest Free: 13M, Free Threshold: 184M Adjusting free threshold to: 17% (1044M) Concurrent marking triggered. 
Free: 1043M, Free Threshold: 1044M; Allocated: 1043M, Alloc Threshold: 0M 2017-12-19T12:11:23.829+0000: 12241.348: [Pause Init Mark, 2.962 ms] 2017-12-19T12:11:23.832+0000: 12241.351: [Concurrent marking 5094M->5122M(6144M), 700.227 ms] 2017-12-19T12:11:24.533+0000: 12242.052: [Pause Final MarkTotal Garbage: 4338M Immediate Garbage: 2252M, 1126 regions (54% of total) Garbage to be collected: 1771M (40% of total), 923 regions Live objects to be evacuated: 73M Live/garbage ratio in collected regions: 4% 5122M->2872M(6144M), 12.196 ms] 2017-12-19T12:11:24.545+0000: 12242.065: [Concurrent evacuation 2876M->2958M(6144M), 63.041 ms] 2017-12-19T12:11:24.609+0000: 12242.128: [Pause Init Update Refs, 0.099 ms] 2017-12-19T12:11:24.609+0000: 12242.129: [Concurrent update references 2962M->2968M(6144M), 402.969 ms] 2017-12-19T12:11:25.013+0000: 12242.533: [Pause Final Update Refs 2968M->1124M(6144M), 2.222 ms] 2017-12-19T12:11:25.016+0000: 12242.535: [Concurrent reset bitmaps 1124M->1124M(6144M), 3.644 ms] Capacity: 6144M, Peak Occupancy: 5122M, Lowest Free: 1021M, Free Threshold: 184M Concurrent marking triggered. Free: 1043M, Free Threshold: 1044M; Allocated: 1043M, Alloc Threshold: 0M 2017-12-19T12:12:09.424+0000: 12286.943: [Pause Init Mark, 3.066 ms] 2017-12-19T12:12:09.427+0000: 12286.947: [Concurrent marking 5091M->5105M(6144M), 694.640 ms] 2017-12-19T12:12:10.123+0000: 12287.642: [Pause Final MarkTotal Garbage: 4338M Immediate Garbage: 1980M, 990 regions (48% of total) Garbage to be collected: 2044M (47% of total), 1062 regions Live objects to be evacuated: 76M Live/garbage ratio in collected regions: 3% 5105M->3127M(6144M), 11.334 ms] 2017-12-19T12:12:10.134+0000: 12287.654: [Concurrent evacuation 3129M->3212M(6144M), 64.445 ms] 2017-12-19T12:12:10.200+0000: 12287.719: [Pause Init Update Refs, 0.106 ms] 2017-12-19T12:12:10.200+0000: 12287.719: [Concurrent update references 3216M->3224M(6144M), 405.975 ms] 2017-12-19T12:12:10.607+0000: 12288.126: [Pause Final Update Refs 3224M->1103M(6144M), 2.868 ms] 2017-12-19T12:12:10.610+0000: 12288.129: [Concurrent reset bitmaps 1103M->1103M(6144M), 4.004 ms] Capacity: 6144M, Peak Occupancy: 5105M, Lowest Free: 1038M, Free Threshold: 184M Concurrent marking triggered. Free: 1035M, Free Threshold: 1044M; Allocated: 1035M, Alloc Threshold: 0M 2017-12-19T12:13:10.975+0000: 12348.494: [Pause Init Mark, 3.064 ms] 2017-12-19T12:13:10.978+0000: 12348.497: [Concurrent marking 5099M->5257M(6144M), 1174.055 ms] 2017-12-19T12:13:12.153+0000: 12349.672: [Pause Final MarkTotal Garbage: 4058M Immediate Garbage: 1164M, 583 regions (31% of total) Garbage to be collected: 2456M (60% of total), 1294 regions Live objects to be evacuated: 129M Live/garbage ratio in collected regions: 5% 5257M->4095M(6144M), 13.076 ms] 2017-12-19T12:13:12.167+0000: 12349.686: [Concurrent evacuation 4099M->4237M(6144M), 106.069 ms] 2017-12-19T12:13:12.274+0000: 12349.793: [Pause Init Update Refs, 0.104 ms] 2017-12-19T12:13:12.274+0000: 12349.793: [Concurrent update references 4237M->4254M(6144M), 687.437 ms] 2017-12-19T12:13:12.962+0000: 12350.481: [Pause Final Update Refs 4254M->1668M(6144M), 2.407 ms] 2017-12-19T12:13:12.965+0000: 12350.484: [Concurrent reset bitmaps 1668M->1668M(6144M), 3.446 ms] Capacity: 6144M, Peak Occupancy: 5257M, Lowest Free: 886M, Free Threshold: 184M Concurrent marking triggered. 
Free: 1035M, Free Threshold: 1044M; Allocated: 1035M, Alloc Threshold: 0M 2017-12-19T12:14:10.870+0000: 12408.389: [Pause Init Mark, 2.398 ms] 2017-12-19T12:14:10.872+0000: 12408.391: [Concurrent marking 5098M->5293M(6144M), 1229.373 ms] 2017-12-19T12:14:12.102+0000: 12409.622: [Pause Final MarkTotal Garbage: 4061M Immediate Garbage: 1206M, 603 regions (32% of total) Garbage to be collected: 2409M (59% of total), 1268 regions Live objects to be evacuated: 123M Live/garbage ratio in collected regions: 5% 5293M->4089M(6144M), 35.800 ms] 2017-12-19T12:14:12.139+0000: 12409.659: [Concurrent evacuation 4098M->4226M(6144M), 100.445 ms] 2017-12-19T12:14:12.241+0000: 12409.760: [Pause Init Update Refs, 0.100 ms] 2017-12-19T12:14:12.241+0000: 12409.760: [Concurrent update references 4226M->4245M(6144M), 697.900 ms] 2017-12-19T12:14:12.940+0000: 12410.459: [Pause Final Update Refs 4245M->1711M(6144M), 2.315 ms] 2017-12-19T12:14:12.942+0000: 12410.461: [Concurrent reset bitmaps 1711M->1711M(6144M), 3.816 ms] Capacity: 6144M, Peak Occupancy: 5293M, Lowest Free: 850M, Free Threshold: 184M Adjusting free threshold to: 12% (737M) Concurrent marking triggered. Free: 736M, Free Threshold: 737M; Allocated: 736M, Alloc Threshold: 0M 2017-12-19T12:14:37.225+0000: 12434.744: [Pause Init Mark, 2.908 ms] 2017-12-19T12:14:37.228+0000: 12434.747: [Concurrent marking 5401M->5851M(6144M), 1192.213 ms] 2017-12-19T12:14:38.420+0000: 12435.939: [Concurrent precleaning 5851M->5851M(6144M), 0.398 ms] 2017-12-19T12:14:38.421+0000: 12435.940: [Pause Final MarkTotal Garbage: 4561M Immediate Garbage: 2298M, 1149 regions (53% of total) Garbage to be collected: 1898M (41% of total), 989 regions Live objects to be evacuated: 77M Live/garbage ratio in collected regions: 4% 5851M->3555M(6144M), 16.073 ms] 2017-12-19T12:14:38.437+0000: 12435.957: [Concurrent evacuation 3559M->3685M(6144M), 103.447 ms] 2017-12-19T12:14:38.542+0000: 12436.061: [Pause Init Update Refs, 0.104 ms] 2017-12-19T12:14:38.542+0000: 12436.061: [Concurrent update references 3685M->4019M(6144M), 935.649 ms] 2017-12-19T12:14:39.482+0000: 12437.001: [Pause Final Update Refs 4019M->2043M(6144M), 2.179 ms] 2017-12-19T12:14:39.484+0000: 12437.004: [Concurrent reset bitmaps 2043M->2045M(6144M), 4.248 ms] Capacity: 6144M, Peak Occupancy: 5851M, Lowest Free: 292M, Free Threshold: 184M Concurrent marking triggered. 
Free: 732M, Free Threshold: 737M; Allocated: 732M, Alloc Threshold: 0M
2017-12-19T12:15:10.995+0000: 12468.518: [Pause Init Mark, 6.252 ms]
2017-12-19T12:15:11.002+0000: 12468.521: [Concurrent marking 5408M->5852M(6144M), 1412.717 ms]
2017-12-19T12:15:12.415+0000: 12469.935: [Pause Final MarkTotal Garbage: 4407M
Immediate Garbage: 1828M, 914 regions (45% of total)
Garbage to be collected: 2136M (48% of total), 1113 regions
Live objects to be evacuated: 87M
Live/garbage ratio in collected regions: 4%
5852M->4026M(6144M), 11.499 ms]
2017-12-19T12:15:12.427+0000: 12469.946: [Concurrent evacuation 4028M->4121M(6144M), 77.189 ms]
2017-12-19T12:15:12.505+0000: 12470.024: [Pause Init Update Refs, 0.104 ms]
2017-12-19T12:15:12.505+0000: 12470.025: [Concurrent update references 4121M->4157M(6144M), 763.012 ms]
2017-12-19T12:15:13.269+0000: 12470.788: [Pause Final Update Refs 4157M->1932M(6144M), 2.305 ms]
2017-12-19T12:15:13.272+0000: 12470.791: [Concurrent reset bitmaps 1932M->1932M(6144M), 3.267 ms]
Capacity: 6144M, Peak Occupancy: 5852M, Lowest Free: 291M, Free Threshold: 184M

--
wbr, Kirill

From shade at redhat.com Tue Dec 19 15:05:38 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 19 Dec 2017 16:05:38 +0100
Subject: Rare very big pause
In-Reply-To: References: Message-ID: <69178fbf-e0b4-eeda-9467-3e5fe087556b@redhat.com>

On 12/19/2017 03:57 PM, Kirill A. Korinsky wrote:
> 2017-12-19T12:11:03.960+0000: 12221.481: [Pause Init Mark, 5.081 ms]
> 2017-12-19T12:11:03.965+0000: 12221.485: [Concurrent markingCancelling concurrent GC: Allocation Failure
> 5887M->6130M(6144M), 653.310 ms]
> Adjusting free threshold to: 14% (860M)
> 2017-12-19T12:11:04.620+0000: 12222.139: [Pause Final MarkTotal Garbage: 4889M
> Immediate Garbage: 2018M, 1009 regions (44% of total)
> Garbage to be collected: 2435M (49% of total), 1264 regions
> Live objects to be evacuated: 87M
> Live/garbage ratio in collected regions: 3%
> 6130M->4114M(6144M), 446.259 ms]

This is a degenerated mark: the collector ran out of memory during concurrent mark (or, in other words, the application had allocated too much), and Shenandoah dived into Final Mark right away, where it completed the marking phase. The heuristics are supposed to find the optimal spot to start the cycle and avoid this, but transient hiccups like this might still happen early in the application lifecycle. Or, you might want to tell the heuristics how much free space to maintain to absorb allocations; see our Wiki about that.

Thanks,
-Aleksey

From rkennke at redhat.com Tue Dec 19 15:07:35 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 19 Dec 2017 16:07:35 +0100
Subject: Rare very big pause
In-Reply-To: References: Message-ID: <067d1484-2a93-ed14-5b87-c54a47312c44@redhat.com>

Hi Kirill,

none of the pauses in that GC log is longer than a few ms. You might be hitting a non-GC related pause. For example, there are safepoint cleanups happening which sometimes take long (see [0] or [1]), and there are non-GC safepoints too (e.g. deoptimization or biased locking revocation). You are not using biased locking, so this is not it. -XX:+PrintSafepointStatistics should provide some more insights.

[0] https://bugs.openjdk.java.net/browse/JDK-8132849
[1] https://bugs.openjdk.java.net/browse/JDK-8153224

Best regards,
Roman

> Good day!
>
> I'm trying to use Shenandoah GC from Fedora 27 at jdk8u151b12, and I have rare (one-two time per our) very big pause. Up to 0.5 seconds.
>
> You can find a GC log bellow (it is log for 10 minutes where this sort of pause happened).
> > The pause had happened at 2017-12-19T12:11:03.965+0000 and was 446.259 ms > > java runs with arguments: > -server -XX:-OmitStackTraceInFastThrow -Xmx6144m -XX:+UseShenandoahGC -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+DisableExplicitGC -XX:+UseTransparentHugePages -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps > > The log: > Concurrent marking triggered. Free: 552M, Free Threshold: 552M; Allocated: 552M, Alloc Threshold: 0M > 2017-12-19T12:05:18.585+0000: 11876.106: [Pause Init Mark, 5.275 ms] > 2017-12-19T12:05:18.590+0000: 11876.109: [Concurrent marking 5582M->5610M(6144M), 1093.063 ms] > 2017-12-19T12:05:19.684+0000: 11877.203: [Pause Final MarkTotal Garbage: 4518M > Immediate Garbage: 1200M, 600 regions (28% of total) > Garbage to be collected: 2872M (63% of total), 1512 regions > Live objects to be evacuated: 147M > Live/garbage ratio in collected regions: 5% > 5610M->4412M(6144M), 38.723 ms] > 2017-12-19T12:05:19.723+0000: 11877.243: [Concurrent evacuation 4413M->4577M(6144M), 130.382 ms] > 2017-12-19T12:05:19.854+0000: 11877.373: [Pause Init Update Refs, 0.114 ms] > 2017-12-19T12:05:19.854+0000: 11877.374: [Concurrent update references 4577M->4599M(6144M), 669.178 ms] > 2017-12-19T12:05:20.524+0000: 11878.043: [Pause Final Update Refs 4599M->1578M(6144M), 2.454 ms] > 2017-12-19T12:05:20.527+0000: 11878.046: [Concurrent reset bitmaps 1578M->1578M(6144M), 3.767 ms] > Capacity: 6144M, Peak Occupancy: 5610M, Lowest Free: 533M, Free Threshold: 184M > Adjusting free threshold to: 4% (245M) > Concurrent marking triggered. Free: 244M, Free Threshold: 245M; Allocated: 244M, Alloc Threshold: 0M > 2017-12-19T12:06:26.485+0000: 11944.004: [Pause Init Mark, 3.089 ms] > 2017-12-19T12:06:26.488+0000: 11944.007: [Concurrent marking 5892M->6134M(6144M), 956.447 ms] > 2017-12-19T12:06:27.444+0000: 11944.964: [Concurrent precleaning 6134M->6136M(6144M), 1.746 ms] > 2017-12-19T12:06:27.447+0000: 11944.966: [Pause Final MarkTotal Garbage: 5133M > Immediate Garbage: 2282M, 1141 regions (46% of total) > Garbage to be collected: 2535M (49% of total), 1311 regions > Live objects to be evacuated: 83M > Live/garbage ratio in collected regions: 3% > 6136M->3856M(6144M), 13.938 ms] > 2017-12-19T12:06:27.465+0000: 11944.984: [Concurrent evacuation 3864M->4004M(6144M), 117.100 ms] > 2017-12-19T12:06:27.583+0000: 11945.102: [Pause Init Update Refs, 0.106 ms] > 2017-12-19T12:06:27.583+0000: 11945.103: [Concurrent update references 4006M->4392M(6144M), 807.411 ms] > 2017-12-19T12:06:28.392+0000: 11945.911: [Pause Final Update Refs 4392M->1773M(6144M), 2.480 ms] > 2017-12-19T12:06:28.394+0000: 11945.913: [Concurrent reset bitmaps 1773M->1775M(6144M), 5.338 ms] > Capacity: 6144M, Peak Occupancy: 6136M, Lowest Free: 7M, Free Threshold: 184M > Adjusting free threshold to: 7% (430M) > Concurrent marking triggered. 
Free: 428M, Free Threshold: 430M; Allocated: 428M, Alloc Threshold: 0M > 2017-12-19T12:06:57.712+0000: 11975.232: [Pause Init Mark, 2.998 ms] > 2017-12-19T12:06:57.716+0000: 11975.235: [Concurrent marking 5710M->5824M(6144M), 1137.936 ms] > 2017-12-19T12:06:58.854+0000: 11976.374: [Pause Final MarkTotal Garbage: 4656M > Immediate Garbage: 1890M, 945 regions (43% of total) > Garbage to be collected: 2329M (50% of total), 1235 regions > Live objects to be evacuated: 139M > Live/garbage ratio in collected regions: 5% > 5824M->3936M(6144M), 12.464 ms] > 2017-12-19T12:06:58.868+0000: 11976.387: [Concurrent evacuation 3943M->4086M(6144M), 110.411 ms] > 2017-12-19T12:06:58.979+0000: 11976.498: [Pause Init Update Refs, 0.106 ms] > 2017-12-19T12:06:58.979+0000: 11976.498: [Concurrent update references 4086M->4125M(6144M), 711.303 ms] > 2017-12-19T12:06:59.691+0000: 11977.211: [Pause Final Update Refs 4125M->1656M(6144M), 2.317 ms] > 2017-12-19T12:06:59.694+0000: 11977.213: [Concurrent reset bitmaps 1656M->1656M(6144M), 2.713 ms] > Capacity: 6144M, Peak Occupancy: 5824M, Lowest Free: 319M, Free Threshold: 184M > Concurrent marking triggered. Free: 430M, Free Threshold: 430M; Allocated: 430M, Alloc Threshold: 0M > 2017-12-19T12:07:28.999+0000: 12006.518: [Pause Init Mark, 2.983 ms] > 2017-12-19T12:07:29.002+0000: 12006.521: [Concurrent marking 5713M->5890M(6144M), 1230.274 ms] > 2017-12-19T12:07:30.233+0000: 12007.753: [Pause Final MarkTotal Garbage: 4660M > Immediate Garbage: 1962M, 981 regions (45% of total) > Garbage to be collected: 2257M (48% of total), 1194 regions > Live objects to be evacuated: 128M > Live/garbage ratio in collected regions: 5% > 5890M->3930M(6144M), 11.268 ms] > 2017-12-19T12:07:30.245+0000: 12007.764: [Concurrent evacuation 3932M->4066M(6144M), 111.081 ms] > 2017-12-19T12:07:30.357+0000: 12007.876: [Pause Init Update Refs, 0.101 ms] > 2017-12-19T12:07:30.357+0000: 12007.876: [Concurrent update references 4066M->4080M(6144M), 686.184 ms] > 2017-12-19T12:07:31.044+0000: 12008.563: [Pause Final Update Refs 4080M->1694M(6144M), 2.311 ms] > 2017-12-19T12:07:31.046+0000: 12008.565: [Concurrent reset bitmaps 1694M->1694M(6144M), 3.733 ms] > Capacity: 6144M, Peak Occupancy: 5890M, Lowest Free: 253M, Free Threshold: 184M > Concurrent marking triggered. Free: 429M, Free Threshold: 430M; Allocated: 429M, Alloc Threshold: 0M > 2017-12-19T12:09:15.602+0000: 12113.121: [Pause Init Mark, 2.988 ms] > 2017-12-19T12:09:15.605+0000: 12113.124: [Concurrent marking 5704M->5721M(6144M), 713.760 ms] > 2017-12-19T12:09:16.319+0000: 12113.839: [Pause Final MarkTotal Garbage: 4948M > Immediate Garbage: 1667M, 834 regions (35% of total) > Garbage to be collected: 2964M (59% of total), 1525 regions > Live objects to be evacuated: 79M > Live/garbage ratio in collected regions: 2% > 5721M->4055M(6144M), 11.784 ms] > 2017-12-19T12:09:16.332+0000: 12113.851: [Concurrent evacuation 4061M->4145M(6144M), 66.259 ms] > 2017-12-19T12:09:16.399+0000: 12113.918: [Pause Init Update Refs, 0.104 ms] > 2017-12-19T12:09:16.399+0000: 12113.918: [Concurrent update references 4145M->4159M(6144M), 409.740 ms] > 2017-12-19T12:09:16.810+0000: 12114.329: [Pause Final Update Refs 4159M->1115M(6144M), 2.491 ms] > 2017-12-19T12:09:16.812+0000: 12114.332: [Concurrent reset bitmaps 1115M->1115M(6144M), 4.700 ms] > Capacity: 6144M, Peak Occupancy: 5721M, Lowest Free: 422M, Free Threshold: 184M > Adjusting free threshold to: 4% (245M) > Concurrent marking triggered. 
Free: 238M, Free Threshold: 245M; Allocated: 238M, Alloc Threshold: 0M > 2017-12-19T12:09:47.005+0000: 12144.524: [Pause Init Mark, 2.417 ms] > 2017-12-19T12:09:47.008+0000: 12144.527: [Concurrent marking 5898M->5948M(6144M), 985.550 ms] > 2017-12-19T12:09:47.994+0000: 12145.513: [Pause Final MarkTotal Garbage: 4900M > Immediate Garbage: 2342M, 1171 regions (51% of total) > Garbage to be collected: 2116M (43% of total), 1098 regions > Live objects to be evacuated: 78M > Live/garbage ratio in collected regions: 3% > 5948M->3608M(6144M), 37.561 ms] > 2017-12-19T12:09:48.032+0000: 12145.551: [Concurrent evacuation 3609M->3696M(6144M), 68.148 ms] > 2017-12-19T12:09:48.101+0000: 12145.620: [Pause Init Update Refs, 0.102 ms] > 2017-12-19T12:09:48.101+0000: 12145.620: [Concurrent update references 3696M->3724M(6144M), 584.008 ms] > 2017-12-19T12:09:48.686+0000: 12146.205: [Pause Final Update Refs 3724M->1530M(6144M), 2.221 ms] > 2017-12-19T12:09:48.688+0000: 12146.207: [Concurrent reset bitmaps 1530M->1530M(6144M), 2.919 ms] > Capacity: 6144M, Peak Occupancy: 5948M, Lowest Free: 195M, Free Threshold: 184M > Concurrent marking triggered. Free: 245M, Free Threshold: 245M; Allocated: 245M, Alloc Threshold: 0M > 2017-12-19T12:11:03.960+0000: 12221.481: [Pause Init Mark, 5.081 ms] > 2017-12-19T12:11:03.965+0000: 12221.485: [Concurrent markingCancelling concurrent GC: Allocation Failure > 5887M->6130M(6144M), 653.310 ms] > Adjusting free threshold to: 14% (860M) > 2017-12-19T12:11:04.620+0000: 12222.139: [Pause Final MarkTotal Garbage: 4889M > Immediate Garbage: 2018M, 1009 regions (44% of total) > Garbage to be collected: 2435M (49% of total), 1264 regions > Live objects to be evacuated: 87M > Live/garbage ratio in collected regions: 3% > 6130M->4114M(6144M), 446.259 ms] > 2017-12-19T12:11:05.069+0000: 12222.594: [Concurrent evacuation 4118M->4251M(6144M), 116.424 ms] > 2017-12-19T12:11:05.187+0000: 12222.706: [Pause Init Update Refs, 0.105 ms] > 2017-12-19T12:11:05.187+0000: 12222.706: [Concurrent update references 4251M->4614M(6144M), 992.439 ms] > 2017-12-19T12:11:06.180+0000: 12223.699: [Pause Final Update Refs 4614M->2092M(6144M), 2.380 ms] > 2017-12-19T12:11:06.183+0000: 12223.702: [Concurrent reset bitmaps 2092M->2092M(6144M), 4.246 ms] > Capacity: 6144M, Peak Occupancy: 6130M, Lowest Free: 13M, Free Threshold: 184M > Adjusting free threshold to: 17% (1044M) > Concurrent marking triggered. 
Free: 1043M, Free Threshold: 1044M; Allocated: 1043M, Alloc Threshold: 0M > 2017-12-19T12:11:23.829+0000: 12241.348: [Pause Init Mark, 2.962 ms] > 2017-12-19T12:11:23.832+0000: 12241.351: [Concurrent marking 5094M->5122M(6144M), 700.227 ms] > 2017-12-19T12:11:24.533+0000: 12242.052: [Pause Final MarkTotal Garbage: 4338M > Immediate Garbage: 2252M, 1126 regions (54% of total) > Garbage to be collected: 1771M (40% of total), 923 regions > Live objects to be evacuated: 73M > Live/garbage ratio in collected regions: 4% > 5122M->2872M(6144M), 12.196 ms] > 2017-12-19T12:11:24.545+0000: 12242.065: [Concurrent evacuation 2876M->2958M(6144M), 63.041 ms] > 2017-12-19T12:11:24.609+0000: 12242.128: [Pause Init Update Refs, 0.099 ms] > 2017-12-19T12:11:24.609+0000: 12242.129: [Concurrent update references 2962M->2968M(6144M), 402.969 ms] > 2017-12-19T12:11:25.013+0000: 12242.533: [Pause Final Update Refs 2968M->1124M(6144M), 2.222 ms] > 2017-12-19T12:11:25.016+0000: 12242.535: [Concurrent reset bitmaps 1124M->1124M(6144M), 3.644 ms] > Capacity: 6144M, Peak Occupancy: 5122M, Lowest Free: 1021M, Free Threshold: 184M > Concurrent marking triggered. Free: 1043M, Free Threshold: 1044M; Allocated: 1043M, Alloc Threshold: 0M > 2017-12-19T12:12:09.424+0000: 12286.943: [Pause Init Mark, 3.066 ms] > 2017-12-19T12:12:09.427+0000: 12286.947: [Concurrent marking 5091M->5105M(6144M), 694.640 ms] > 2017-12-19T12:12:10.123+0000: 12287.642: [Pause Final MarkTotal Garbage: 4338M > Immediate Garbage: 1980M, 990 regions (48% of total) > Garbage to be collected: 2044M (47% of total), 1062 regions > Live objects to be evacuated: 76M > Live/garbage ratio in collected regions: 3% > 5105M->3127M(6144M), 11.334 ms] > 2017-12-19T12:12:10.134+0000: 12287.654: [Concurrent evacuation 3129M->3212M(6144M), 64.445 ms] > 2017-12-19T12:12:10.200+0000: 12287.719: [Pause Init Update Refs, 0.106 ms] > 2017-12-19T12:12:10.200+0000: 12287.719: [Concurrent update references 3216M->3224M(6144M), 405.975 ms] > 2017-12-19T12:12:10.607+0000: 12288.126: [Pause Final Update Refs 3224M->1103M(6144M), 2.868 ms] > 2017-12-19T12:12:10.610+0000: 12288.129: [Concurrent reset bitmaps 1103M->1103M(6144M), 4.004 ms] > Capacity: 6144M, Peak Occupancy: 5105M, Lowest Free: 1038M, Free Threshold: 184M > Concurrent marking triggered. Free: 1035M, Free Threshold: 1044M; Allocated: 1035M, Alloc Threshold: 0M > 2017-12-19T12:13:10.975+0000: 12348.494: [Pause Init Mark, 3.064 ms] > 2017-12-19T12:13:10.978+0000: 12348.497: [Concurrent marking 5099M->5257M(6144M), 1174.055 ms] > 2017-12-19T12:13:12.153+0000: 12349.672: [Pause Final MarkTotal Garbage: 4058M > Immediate Garbage: 1164M, 583 regions (31% of total) > Garbage to be collected: 2456M (60% of total), 1294 regions > Live objects to be evacuated: 129M > Live/garbage ratio in collected regions: 5% > 5257M->4095M(6144M), 13.076 ms] > 2017-12-19T12:13:12.167+0000: 12349.686: [Concurrent evacuation 4099M->4237M(6144M), 106.069 ms] > 2017-12-19T12:13:12.274+0000: 12349.793: [Pause Init Update Refs, 0.104 ms] > 2017-12-19T12:13:12.274+0000: 12349.793: [Concurrent update references 4237M->4254M(6144M), 687.437 ms] > 2017-12-19T12:13:12.962+0000: 12350.481: [Pause Final Update Refs 4254M->1668M(6144M), 2.407 ms] > 2017-12-19T12:13:12.965+0000: 12350.484: [Concurrent reset bitmaps 1668M->1668M(6144M), 3.446 ms] > Capacity: 6144M, Peak Occupancy: 5257M, Lowest Free: 886M, Free Threshold: 184M > Concurrent marking triggered. 
Free: 1035M, Free Threshold: 1044M; Allocated: 1035M, Alloc Threshold: 0M > 2017-12-19T12:14:10.870+0000: 12408.389: [Pause Init Mark, 2.398 ms] > 2017-12-19T12:14:10.872+0000: 12408.391: [Concurrent marking 5098M->5293M(6144M), 1229.373 ms] > 2017-12-19T12:14:12.102+0000: 12409.622: [Pause Final MarkTotal Garbage: 4061M > Immediate Garbage: 1206M, 603 regions (32% of total) > Garbage to be collected: 2409M (59% of total), 1268 regions > Live objects to be evacuated: 123M > Live/garbage ratio in collected regions: 5% > 5293M->4089M(6144M), 35.800 ms] > 2017-12-19T12:14:12.139+0000: 12409.659: [Concurrent evacuation 4098M->4226M(6144M), 100.445 ms] > 2017-12-19T12:14:12.241+0000: 12409.760: [Pause Init Update Refs, 0.100 ms] > 2017-12-19T12:14:12.241+0000: 12409.760: [Concurrent update references 4226M->4245M(6144M), 697.900 ms] > 2017-12-19T12:14:12.940+0000: 12410.459: [Pause Final Update Refs 4245M->1711M(6144M), 2.315 ms] > 2017-12-19T12:14:12.942+0000: 12410.461: [Concurrent reset bitmaps 1711M->1711M(6144M), 3.816 ms] > Capacity: 6144M, Peak Occupancy: 5293M, Lowest Free: 850M, Free Threshold: 184M > Adjusting free threshold to: 12% (737M) > Concurrent marking triggered. Free: 736M, Free Threshold: 737M; Allocated: 736M, Alloc Threshold: 0M > 2017-12-19T12:14:37.225+0000: 12434.744: [Pause Init Mark, 2.908 ms] > 2017-12-19T12:14:37.228+0000: 12434.747: [Concurrent marking 5401M->5851M(6144M), 1192.213 ms] > 2017-12-19T12:14:38.420+0000: 12435.939: [Concurrent precleaning 5851M->5851M(6144M), 0.398 ms] > 2017-12-19T12:14:38.421+0000: 12435.940: [Pause Final MarkTotal Garbage: 4561M > Immediate Garbage: 2298M, 1149 regions (53% of total) > Garbage to be collected: 1898M (41% of total), 989 regions > Live objects to be evacuated: 77M > Live/garbage ratio in collected regions: 4% > 5851M->3555M(6144M), 16.073 ms] > 2017-12-19T12:14:38.437+0000: 12435.957: [Concurrent evacuation 3559M->3685M(6144M), 103.447 ms] > 2017-12-19T12:14:38.542+0000: 12436.061: [Pause Init Update Refs, 0.104 ms] > 2017-12-19T12:14:38.542+0000: 12436.061: [Concurrent update references 3685M->4019M(6144M), 935.649 ms] > 2017-12-19T12:14:39.482+0000: 12437.001: [Pause Final Update Refs 4019M->2043M(6144M), 2.179 ms] > 2017-12-19T12:14:39.484+0000: 12437.004: [Concurrent reset bitmaps 2043M->2045M(6144M), 4.248 ms] > Capacity: 6144M, Peak Occupancy: 5851M, Lowest Free: 292M, Free Threshold: 184M > Concurrent marking triggered. 
Free: 732M, Free Threshold: 737M; Allocated: 732M, Alloc Threshold: 0M > 2017-12-19T12:15:10.995+0000: 12468.518: [Pause Init Mark, 6.252 ms] > 2017-12-19T12:15:11.002+0000: 12468.521: [Concurrent marking 5408M->5852M(6144M), 1412.717 ms] > 2017-12-19T12:15:12.415+0000: 12469.935: [Pause Final MarkTotal Garbage: 4407M > Immediate Garbage: 1828M, 914 regions (45% of total) > Garbage to be collected: 2136M (48% of total), 1113 regions > Live objects to be evacuated: 87M > Live/garbage ratio in collected regions: 4% > 5852M->4026M(6144M), 11.499 ms] > 2017-12-19T12:15:12.427+0000: 12469.946: [Concurrent evacuation 4028M->4121M(6144M), 77.189 ms] > 2017-12-19T12:15:12.505+0000: 12470.024: [Pause Init Update Refs, 0.104 ms] > 2017-12-19T12:15:12.505+0000: 12470.025: [Concurrent update references 4121M->4157M(6144M), 763.012 ms] > 2017-12-19T12:15:13.269+0000: 12470.788: [Pause Final Update Refs 4157M->1932M(6144M), 2.305 ms] > 2017-12-19T12:15:13.272+0000: 12470.791: [Concurrent reset bitmaps 1932M->1932M(6144M), 3.267 ms] > Capacity: 6144M, Peak Occupancy: 5852M, Lowest Free: 291M, Free Threshold: 184M > > > -- > wbr, Kirill > > From kirill at korins.ky Tue Dec 19 15:10:23 2017 From: kirill at korins.ky (Kirill A. Korinsky) Date: Tue, 19 Dec 2017 15:10:23 +0000 Subject: Rare very big pause In-Reply-To: <067d1484-2a93-ed14-5b87-c54a47312c44@redhat.com> References: <067d1484-2a93-ed14-5b87-c54a47312c44@redhat.com> Message-ID: Hi Roman, I mean this pice of log > Concurrent marking triggered. Free: 245M, Free Threshold: 245M; Allocated: 245M, Alloc Threshold: 0M > 2017-12-19T12:11:03.960+0000: 12221.481: [Pause Init Mark, 5.081 ms] > 2017-12-19T12:11:03.965+0000: 12221.485: [Concurrent markingCancelling concurrent GC: Allocation Failure > 5887M->6130M(6144M), 653.310 ms] > Adjusting free threshold to: 14% (860M) > 2017-12-19T12:11:04.620+0000: 12222.139: [Pause Final MarkTotal Garbage: 4889M > Immediate Garbage: 2018M, 1009 regions (44% of total) > Garbage to be collected: 2435M (49% of total), 1264 regions > Live objects to be evacuated: 87M > Live/garbage ratio in collected regions: 3% > 6130M->4114M(6144M), 446.259 ms] > 2017-12-19T12:11:05.069+0000: 12222.594: [Concurrent evacuation 4118M->4251M(6144M), 116.424 ms] > 2017-12-19T12:11:05.187+0000: 12222.706: [Pause Init Update Refs, 0.105 ms] > 2017-12-19T12:11:05.187+0000: 12222.706: [Concurrent update references 4251M->4614M(6144M), 992.439 ms] > 2017-12-19T12:11:06.180+0000: 12223.699: [Pause Final Update Refs 4614M->2092M(6144M), 2.380 ms] > 2017-12-19T12:11:06.183+0000: 12223.702: [Concurrent reset bitmaps 2092M->2092M(6144M), 4.246 ms] > Capacity: 6144M, Peak Occupancy: 6130M, Lowest Free: 13M, Free Threshold: 184M > Adjusting free threshold to: 17% (1044M) If I right understand it means that pause was 446.259 ms. Am I wrong? Thanks for links. I'm reading it. -- wbr, Kirill > On 19 Dec 2017, at 19:07, Roman Kennke wrote: > > Hi Kirill, > > none of the pauses in that GC log is longer than a few ms. > > You might be hitting a non-GC related pause. For example, there's safepoint cleanups happening which sometimes take long (see [0] or [1]), and there is non-GC safepoints too (e.g. deoptimization or biased locking revokation). You are not using biased locking, so this is not it. > > -XX:+PrintSafepointStatistics should provide some more insights. > > [0] https://bugs.openjdk.java.net/browse/JDK-8132849 > [1] https://bugs.openjdk.java.net/browse/JDK-8153224 > > Best regards, > Roman > >> Good day! 
>> I'm trying to use Shenandoah GC from Fedora 27 at jdk8u151b12, and I have rare (one-two time per our) very big pause. Up to 0.5 seconds. >> You can find a GC log bellow (it is log for 10 minutes where this sort of pause happened). >> The pause had happened at 2017-12-19T12:11:03.965+0000 and was 446.259 ms >> java runs with arguments: >> -server -XX:-OmitStackTraceInFastThrow -Xmx6144m -XX:+UseShenandoahGC -XX:+AlwaysPreTouch -XX:-UseBiasedLocking -XX:+DisableExplicitGC -XX:+UseTransparentHugePages -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps >> The log: >> Concurrent marking triggered. Free: 552M, Free Threshold: 552M; Allocated: 552M, Alloc Threshold: 0M >> 2017-12-19T12:05:18.585+0000: 11876.106: [Pause Init Mark, 5.275 ms] >> 2017-12-19T12:05:18.590+0000: 11876.109: [Concurrent marking 5582M->5610M(6144M), 1093.063 ms] >> 2017-12-19T12:05:19.684+0000: 11877.203: [Pause Final MarkTotal Garbage: 4518M >> Immediate Garbage: 1200M, 600 regions (28% of total) >> Garbage to be collected: 2872M (63% of total), 1512 regions >> Live objects to be evacuated: 147M >> Live/garbage ratio in collected regions: 5% >> 5610M->4412M(6144M), 38.723 ms] >> 2017-12-19T12:05:19.723+0000: 11877.243: [Concurrent evacuation 4413M->4577M(6144M), 130.382 ms] >> 2017-12-19T12:05:19.854+0000: 11877.373: [Pause Init Update Refs, 0.114 ms] >> 2017-12-19T12:05:19.854+0000: 11877.374: [Concurrent update references 4577M->4599M(6144M), 669.178 ms] >> 2017-12-19T12:05:20.524+0000: 11878.043: [Pause Final Update Refs 4599M->1578M(6144M), 2.454 ms] >> 2017-12-19T12:05:20.527+0000: 11878.046: [Concurrent reset bitmaps 1578M->1578M(6144M), 3.767 ms] >> Capacity: 6144M, Peak Occupancy: 5610M, Lowest Free: 533M, Free Threshold: 184M >> Adjusting free threshold to: 4% (245M) >> Concurrent marking triggered. Free: 244M, Free Threshold: 245M; Allocated: 244M, Alloc Threshold: 0M >> 2017-12-19T12:06:26.485+0000: 11944.004: [Pause Init Mark, 3.089 ms] >> 2017-12-19T12:06:26.488+0000: 11944.007: [Concurrent marking 5892M->6134M(6144M), 956.447 ms] >> 2017-12-19T12:06:27.444+0000: 11944.964: [Concurrent precleaning 6134M->6136M(6144M), 1.746 ms] >> 2017-12-19T12:06:27.447+0000: 11944.966: [Pause Final MarkTotal Garbage: 5133M >> Immediate Garbage: 2282M, 1141 regions (46% of total) >> Garbage to be collected: 2535M (49% of total), 1311 regions >> Live objects to be evacuated: 83M >> Live/garbage ratio in collected regions: 3% >> 6136M->3856M(6144M), 13.938 ms] >> 2017-12-19T12:06:27.465+0000: 11944.984: [Concurrent evacuation 3864M->4004M(6144M), 117.100 ms] >> 2017-12-19T12:06:27.583+0000: 11945.102: [Pause Init Update Refs, 0.106 ms] >> 2017-12-19T12:06:27.583+0000: 11945.103: [Concurrent update references 4006M->4392M(6144M), 807.411 ms] >> 2017-12-19T12:06:28.392+0000: 11945.911: [Pause Final Update Refs 4392M->1773M(6144M), 2.480 ms] >> 2017-12-19T12:06:28.394+0000: 11945.913: [Concurrent reset bitmaps 1773M->1775M(6144M), 5.338 ms] >> Capacity: 6144M, Peak Occupancy: 6136M, Lowest Free: 7M, Free Threshold: 184M >> Adjusting free threshold to: 7% (430M) >> Concurrent marking triggered. 
Free: 428M, Free Threshold: 430M; Allocated: 428M, Alloc Threshold: 0M >> 2017-12-19T12:06:57.712+0000: 11975.232: [Pause Init Mark, 2.998 ms] >> 2017-12-19T12:06:57.716+0000: 11975.235: [Concurrent marking 5710M->5824M(6144M), 1137.936 ms] >> 2017-12-19T12:06:58.854+0000: 11976.374: [Pause Final MarkTotal Garbage: 4656M >> Immediate Garbage: 1890M, 945 regions (43% of total) >> Garbage to be collected: 2329M (50% of total), 1235 regions >> Live objects to be evacuated: 139M >> Live/garbage ratio in collected regions: 5% >> 5824M->3936M(6144M), 12.464 ms] >> 2017-12-19T12:06:58.868+0000: 11976.387: [Concurrent evacuation 3943M->4086M(6144M), 110.411 ms] >> 2017-12-19T12:06:58.979+0000: 11976.498: [Pause Init Update Refs, 0.106 ms] >> 2017-12-19T12:06:58.979+0000: 11976.498: [Concurrent update references 4086M->4125M(6144M), 711.303 ms] >> 2017-12-19T12:06:59.691+0000: 11977.211: [Pause Final Update Refs 4125M->1656M(6144M), 2.317 ms] >> 2017-12-19T12:06:59.694+0000: 11977.213: [Concurrent reset bitmaps 1656M->1656M(6144M), 2.713 ms] >> Capacity: 6144M, Peak Occupancy: 5824M, Lowest Free: 319M, Free Threshold: 184M >> Concurrent marking triggered. Free: 430M, Free Threshold: 430M; Allocated: 430M, Alloc Threshold: 0M >> 2017-12-19T12:07:28.999+0000: 12006.518: [Pause Init Mark, 2.983 ms] >> 2017-12-19T12:07:29.002+0000: 12006.521: [Concurrent marking 5713M->5890M(6144M), 1230.274 ms] >> 2017-12-19T12:07:30.233+0000: 12007.753: [Pause Final MarkTotal Garbage: 4660M >> Immediate Garbage: 1962M, 981 regions (45% of total) >> Garbage to be collected: 2257M (48% of total), 1194 regions >> Live objects to be evacuated: 128M >> Live/garbage ratio in collected regions: 5% >> 5890M->3930M(6144M), 11.268 ms] >> 2017-12-19T12:07:30.245+0000: 12007.764: [Concurrent evacuation 3932M->4066M(6144M), 111.081 ms] >> 2017-12-19T12:07:30.357+0000: 12007.876: [Pause Init Update Refs, 0.101 ms] >> 2017-12-19T12:07:30.357+0000: 12007.876: [Concurrent update references 4066M->4080M(6144M), 686.184 ms] >> 2017-12-19T12:07:31.044+0000: 12008.563: [Pause Final Update Refs 4080M->1694M(6144M), 2.311 ms] >> 2017-12-19T12:07:31.046+0000: 12008.565: [Concurrent reset bitmaps 1694M->1694M(6144M), 3.733 ms] >> Capacity: 6144M, Peak Occupancy: 5890M, Lowest Free: 253M, Free Threshold: 184M >> Concurrent marking triggered. Free: 429M, Free Threshold: 430M; Allocated: 429M, Alloc Threshold: 0M >> 2017-12-19T12:09:15.602+0000: 12113.121: [Pause Init Mark, 2.988 ms] >> 2017-12-19T12:09:15.605+0000: 12113.124: [Concurrent marking 5704M->5721M(6144M), 713.760 ms] >> 2017-12-19T12:09:16.319+0000: 12113.839: [Pause Final MarkTotal Garbage: 4948M >> Immediate Garbage: 1667M, 834 regions (35% of total) >> Garbage to be collected: 2964M (59% of total), 1525 regions >> Live objects to be evacuated: 79M >> Live/garbage ratio in collected regions: 2% >> 5721M->4055M(6144M), 11.784 ms] >> 2017-12-19T12:09:16.332+0000: 12113.851: [Concurrent evacuation 4061M->4145M(6144M), 66.259 ms] >> 2017-12-19T12:09:16.399+0000: 12113.918: [Pause Init Update Refs, 0.104 ms] >> 2017-12-19T12:09:16.399+0000: 12113.918: [Concurrent update references 4145M->4159M(6144M), 409.740 ms] >> 2017-12-19T12:09:16.810+0000: 12114.329: [Pause Final Update Refs 4159M->1115M(6144M), 2.491 ms] >> 2017-12-19T12:09:16.812+0000: 12114.332: [Concurrent reset bitmaps 1115M->1115M(6144M), 4.700 ms] >> Capacity: 6144M, Peak Occupancy: 5721M, Lowest Free: 422M, Free Threshold: 184M >> Adjusting free threshold to: 4% (245M) >> Concurrent marking triggered. 
Free: 238M, Free Threshold: 245M; Allocated: 238M, Alloc Threshold: 0M >> 2017-12-19T12:09:47.005+0000: 12144.524: [Pause Init Mark, 2.417 ms] >> 2017-12-19T12:09:47.008+0000: 12144.527: [Concurrent marking 5898M->5948M(6144M), 985.550 ms] >> 2017-12-19T12:09:47.994+0000: 12145.513: [Pause Final MarkTotal Garbage: 4900M >> Immediate Garbage: 2342M, 1171 regions (51% of total) >> Garbage to be collected: 2116M (43% of total), 1098 regions >> Live objects to be evacuated: 78M >> Live/garbage ratio in collected regions: 3% >> 5948M->3608M(6144M), 37.561 ms] >> 2017-12-19T12:09:48.032+0000: 12145.551: [Concurrent evacuation 3609M->3696M(6144M), 68.148 ms] >> 2017-12-19T12:09:48.101+0000: 12145.620: [Pause Init Update Refs, 0.102 ms] >> 2017-12-19T12:09:48.101+0000: 12145.620: [Concurrent update references 3696M->3724M(6144M), 584.008 ms] >> 2017-12-19T12:09:48.686+0000: 12146.205: [Pause Final Update Refs 3724M->1530M(6144M), 2.221 ms] >> 2017-12-19T12:09:48.688+0000: 12146.207: [Concurrent reset bitmaps 1530M->1530M(6144M), 2.919 ms] >> Capacity: 6144M, Peak Occupancy: 5948M, Lowest Free: 195M, Free Threshold: 184M >> Concurrent marking triggered. Free: 245M, Free Threshold: 245M; Allocated: 245M, Alloc Threshold: 0M >> 2017-12-19T12:11:03.960+0000: 12221.481: [Pause Init Mark, 5.081 ms] >> 2017-12-19T12:11:03.965+0000: 12221.485: [Concurrent markingCancelling concurrent GC: Allocation Failure >> 5887M->6130M(6144M), 653.310 ms] >> Adjusting free threshold to: 14% (860M) >> 2017-12-19T12:11:04.620+0000: 12222.139: [Pause Final MarkTotal Garbage: 4889M >> Immediate Garbage: 2018M, 1009 regions (44% of total) >> Garbage to be collected: 2435M (49% of total), 1264 regions >> Live objects to be evacuated: 87M >> Live/garbage ratio in collected regions: 3% >> 6130M->4114M(6144M), 446.259 ms] >> 2017-12-19T12:11:05.069+0000: 12222.594: [Concurrent evacuation 4118M->4251M(6144M), 116.424 ms] >> 2017-12-19T12:11:05.187+0000: 12222.706: [Pause Init Update Refs, 0.105 ms] >> 2017-12-19T12:11:05.187+0000: 12222.706: [Concurrent update references 4251M->4614M(6144M), 992.439 ms] >> 2017-12-19T12:11:06.180+0000: 12223.699: [Pause Final Update Refs 4614M->2092M(6144M), 2.380 ms] >> 2017-12-19T12:11:06.183+0000: 12223.702: [Concurrent reset bitmaps 2092M->2092M(6144M), 4.246 ms] >> Capacity: 6144M, Peak Occupancy: 6130M, Lowest Free: 13M, Free Threshold: 184M >> Adjusting free threshold to: 17% (1044M) >> Concurrent marking triggered. 
Free: 1043M, Free Threshold: 1044M; Allocated: 1043M, Alloc Threshold: 0M >> 2017-12-19T12:11:23.829+0000: 12241.348: [Pause Init Mark, 2.962 ms] >> 2017-12-19T12:11:23.832+0000: 12241.351: [Concurrent marking 5094M->5122M(6144M), 700.227 ms] >> 2017-12-19T12:11:24.533+0000: 12242.052: [Pause Final MarkTotal Garbage: 4338M >> Immediate Garbage: 2252M, 1126 regions (54% of total) >> Garbage to be collected: 1771M (40% of total), 923 regions >> Live objects to be evacuated: 73M >> Live/garbage ratio in collected regions: 4% >> 5122M->2872M(6144M), 12.196 ms] >> 2017-12-19T12:11:24.545+0000: 12242.065: [Concurrent evacuation 2876M->2958M(6144M), 63.041 ms] >> 2017-12-19T12:11:24.609+0000: 12242.128: [Pause Init Update Refs, 0.099 ms] >> 2017-12-19T12:11:24.609+0000: 12242.129: [Concurrent update references 2962M->2968M(6144M), 402.969 ms] >> 2017-12-19T12:11:25.013+0000: 12242.533: [Pause Final Update Refs 2968M->1124M(6144M), 2.222 ms] >> 2017-12-19T12:11:25.016+0000: 12242.535: [Concurrent reset bitmaps 1124M->1124M(6144M), 3.644 ms] >> Capacity: 6144M, Peak Occupancy: 5122M, Lowest Free: 1021M, Free Threshold: 184M >> Concurrent marking triggered. Free: 1043M, Free Threshold: 1044M; Allocated: 1043M, Alloc Threshold: 0M >> 2017-12-19T12:12:09.424+0000: 12286.943: [Pause Init Mark, 3.066 ms] >> 2017-12-19T12:12:09.427+0000: 12286.947: [Concurrent marking 5091M->5105M(6144M), 694.640 ms] >> 2017-12-19T12:12:10.123+0000: 12287.642: [Pause Final MarkTotal Garbage: 4338M >> Immediate Garbage: 1980M, 990 regions (48% of total) >> Garbage to be collected: 2044M (47% of total), 1062 regions >> Live objects to be evacuated: 76M >> Live/garbage ratio in collected regions: 3% >> 5105M->3127M(6144M), 11.334 ms] >> 2017-12-19T12:12:10.134+0000: 12287.654: [Concurrent evacuation 3129M->3212M(6144M), 64.445 ms] >> 2017-12-19T12:12:10.200+0000: 12287.719: [Pause Init Update Refs, 0.106 ms] >> 2017-12-19T12:12:10.200+0000: 12287.719: [Concurrent update references 3216M->3224M(6144M), 405.975 ms] >> 2017-12-19T12:12:10.607+0000: 12288.126: [Pause Final Update Refs 3224M->1103M(6144M), 2.868 ms] >> 2017-12-19T12:12:10.610+0000: 12288.129: [Concurrent reset bitmaps 1103M->1103M(6144M), 4.004 ms] >> Capacity: 6144M, Peak Occupancy: 5105M, Lowest Free: 1038M, Free Threshold: 184M >> Concurrent marking triggered. Free: 1035M, Free Threshold: 1044M; Allocated: 1035M, Alloc Threshold: 0M >> 2017-12-19T12:13:10.975+0000: 12348.494: [Pause Init Mark, 3.064 ms] >> 2017-12-19T12:13:10.978+0000: 12348.497: [Concurrent marking 5099M->5257M(6144M), 1174.055 ms] >> 2017-12-19T12:13:12.153+0000: 12349.672: [Pause Final MarkTotal Garbage: 4058M >> Immediate Garbage: 1164M, 583 regions (31% of total) >> Garbage to be collected: 2456M (60% of total), 1294 regions >> Live objects to be evacuated: 129M >> Live/garbage ratio in collected regions: 5% >> 5257M->4095M(6144M), 13.076 ms] >> 2017-12-19T12:13:12.167+0000: 12349.686: [Concurrent evacuation 4099M->4237M(6144M), 106.069 ms] >> 2017-12-19T12:13:12.274+0000: 12349.793: [Pause Init Update Refs, 0.104 ms] >> 2017-12-19T12:13:12.274+0000: 12349.793: [Concurrent update references 4237M->4254M(6144M), 687.437 ms] >> 2017-12-19T12:13:12.962+0000: 12350.481: [Pause Final Update Refs 4254M->1668M(6144M), 2.407 ms] >> 2017-12-19T12:13:12.965+0000: 12350.484: [Concurrent reset bitmaps 1668M->1668M(6144M), 3.446 ms] >> Capacity: 6144M, Peak Occupancy: 5257M, Lowest Free: 886M, Free Threshold: 184M >> Concurrent marking triggered. 
Free: 1035M, Free Threshold: 1044M; Allocated: 1035M, Alloc Threshold: 0M >> 2017-12-19T12:14:10.870+0000: 12408.389: [Pause Init Mark, 2.398 ms] >> 2017-12-19T12:14:10.872+0000: 12408.391: [Concurrent marking 5098M->5293M(6144M), 1229.373 ms] >> 2017-12-19T12:14:12.102+0000: 12409.622: [Pause Final MarkTotal Garbage: 4061M >> Immediate Garbage: 1206M, 603 regions (32% of total) >> Garbage to be collected: 2409M (59% of total), 1268 regions >> Live objects to be evacuated: 123M >> Live/garbage ratio in collected regions: 5% >> 5293M->4089M(6144M), 35.800 ms] >> 2017-12-19T12:14:12.139+0000: 12409.659: [Concurrent evacuation 4098M->4226M(6144M), 100.445 ms] >> 2017-12-19T12:14:12.241+0000: 12409.760: [Pause Init Update Refs, 0.100 ms] >> 2017-12-19T12:14:12.241+0000: 12409.760: [Concurrent update references 4226M->4245M(6144M), 697.900 ms] >> 2017-12-19T12:14:12.940+0000: 12410.459: [Pause Final Update Refs 4245M->1711M(6144M), 2.315 ms] >> 2017-12-19T12:14:12.942+0000: 12410.461: [Concurrent reset bitmaps 1711M->1711M(6144M), 3.816 ms] >> Capacity: 6144M, Peak Occupancy: 5293M, Lowest Free: 850M, Free Threshold: 184M >> Adjusting free threshold to: 12% (737M) >> Concurrent marking triggered. Free: 736M, Free Threshold: 737M; Allocated: 736M, Alloc Threshold: 0M >> 2017-12-19T12:14:37.225+0000: 12434.744: [Pause Init Mark, 2.908 ms] >> 2017-12-19T12:14:37.228+0000: 12434.747: [Concurrent marking 5401M->5851M(6144M), 1192.213 ms] >> 2017-12-19T12:14:38.420+0000: 12435.939: [Concurrent precleaning 5851M->5851M(6144M), 0.398 ms] >> 2017-12-19T12:14:38.421+0000: 12435.940: [Pause Final MarkTotal Garbage: 4561M >> Immediate Garbage: 2298M, 1149 regions (53% of total) >> Garbage to be collected: 1898M (41% of total), 989 regions >> Live objects to be evacuated: 77M >> Live/garbage ratio in collected regions: 4% >> 5851M->3555M(6144M), 16.073 ms] >> 2017-12-19T12:14:38.437+0000: 12435.957: [Concurrent evacuation 3559M->3685M(6144M), 103.447 ms] >> 2017-12-19T12:14:38.542+0000: 12436.061: [Pause Init Update Refs, 0.104 ms] >> 2017-12-19T12:14:38.542+0000: 12436.061: [Concurrent update references 3685M->4019M(6144M), 935.649 ms] >> 2017-12-19T12:14:39.482+0000: 12437.001: [Pause Final Update Refs 4019M->2043M(6144M), 2.179 ms] >> 2017-12-19T12:14:39.484+0000: 12437.004: [Concurrent reset bitmaps 2043M->2045M(6144M), 4.248 ms] >> Capacity: 6144M, Peak Occupancy: 5851M, Lowest Free: 292M, Free Threshold: 184M >> Concurrent marking triggered. 
Free: 732M, Free Threshold: 737M; Allocated: 732M, Alloc Threshold: 0M >> 2017-12-19T12:15:10.995+0000: 12468.518: [Pause Init Mark, 6.252 ms] >> 2017-12-19T12:15:11.002+0000: 12468.521: [Concurrent marking 5408M->5852M(6144M), 1412.717 ms] >> 2017-12-19T12:15:12.415+0000: 12469.935: [Pause Final MarkTotal Garbage: 4407M >> Immediate Garbage: 1828M, 914 regions (45% of total) >> Garbage to be collected: 2136M (48% of total), 1113 regions >> Live objects to be evacuated: 87M >> Live/garbage ratio in collected regions: 4% >> 5852M->4026M(6144M), 11.499 ms] >> 2017-12-19T12:15:12.427+0000: 12469.946: [Concurrent evacuation 4028M->4121M(6144M), 77.189 ms] >> 2017-12-19T12:15:12.505+0000: 12470.024: [Pause Init Update Refs, 0.104 ms] >> 2017-12-19T12:15:12.505+0000: 12470.025: [Concurrent update references 4121M->4157M(6144M), 763.012 ms] >> 2017-12-19T12:15:13.269+0000: 12470.788: [Pause Final Update Refs 4157M->1932M(6144M), 2.305 ms] >> 2017-12-19T12:15:13.272+0000: 12470.791: [Concurrent reset bitmaps 1932M->1932M(6144M), 3.267 ms] >> Capacity: 6144M, Peak Occupancy: 5852M, Lowest Free: 291M, Free Threshold: 184M >> -- >> wbr, Kirill > From kirill at korins.ky Tue Dec 19 15:12:35 2017 From: kirill at korins.ky (Kirill A. Korinsky) Date: Tue, 19 Dec 2017 15:12:35 +0000 Subject: Rare very big pause In-Reply-To: <69178fbf-e0b4-eeda-9467-3e5fe087556b@redhat.com> References: <69178fbf-e0b4-eeda-9467-3e5fe087556b@redhat.com> Message-ID: <2C3C4693-1910-4009-80FE-1D0BA62292C3@korins.ky> Thanks! Good point. May you suggest any way to understand where and how much it try to allocate? Because this application shouldn't allocate so big memory and how I can see it from line before it had 245Mb free memory when GC cycle happened. > Concurrent marking triggered. Free: 245M, Free Threshold: 245M; Allocated: 245M, Alloc Threshold: 0M -- wbr, Kirill > On 19 Dec 2017, at 19:05, Aleksey Shipilev wrote: > > On 12/19/2017 03:57 PM, Kirill A. Korinsky wrote: >> 2017-12-19T12:11:03.960+0000: 12221.481: [Pause Init Mark, 5.081 ms] >> 2017-12-19T12:11:03.965+0000: 12221.485: [Concurrent markingCancelling concurrent GC: Allocation Failure >> 5887M->6130M(6144M), 653.310 ms] >> Adjusting free threshold to: 14% (860M) >> 2017-12-19T12:11:04.620+0000: 12222.139: [Pause Final MarkTotal Garbage: 4889M >> Immediate Garbage: 2018M, 1009 regions (44% of total) >> Garbage to be collected: 2435M (49% of total), 1264 regions >> Live objects to be evacuated: 87M >> Live/garbage ratio in collected regions: 3% >> 6130M->4114M(6144M), 446.259 ms] > > This is degenerated mark: collector ran out of memory during concurrent mark (or, in other words, > application had allocated too much), and Shenandoah dived into Final Mark right away, where it > completed the marking phase. > > Heuristics is supposed to find the optimal spot when to start the cycle to avoid this, but transient > hiccups like this might still happen early in the application lifecycle. Or, you might want to tell > heuristics how much free space to maintain to absorb allocations, see our Wiki about that. > > Thanks, > -Aleksey > From rkennke at redhat.com Tue Dec 19 15:13:16 2017 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 19 Dec 2017 16:13:16 +0100 Subject: Rare very big pause In-Reply-To: References: <067d1484-2a93-ed14-5b87-c54a47312c44@redhat.com> Message-ID: Am 19.12.2017 um 16:10 schrieb Kirill A. Korinsky: > Hi Roman, > > I mean this pice of log > >> Concurrent marking triggered. 
Free: 245M, Free Threshold: 245M; Allocated: 245M, Alloc Threshold: 0M >> 2017-12-19T12:11:03.960+0000: 12221.481: [Pause Init Mark, 5.081 ms] >> 2017-12-19T12:11:03.965+0000: 12221.485: [Concurrent markingCancelling concurrent GC: Allocation Failure >> 5887M->6130M(6144M), 653.310 ms] >> Adjusting free threshold to: 14% (860M) >> 2017-12-19T12:11:04.620+0000: 12222.139: [Pause Final MarkTotal Garbage: 4889M >> Immediate Garbage: 2018M, 1009 regions (44% of total) >> Garbage to be collected: 2435M (49% of total), 1264 regions >> Live objects to be evacuated: 87M >> Live/garbage ratio in collected regions: 3% >> 6130M->4114M(6144M), 446.259 ms] >> 2017-12-19T12:11:05.069+0000: 12222.594: [Concurrent evacuation 4118M->4251M(6144M), 116.424 ms] >> 2017-12-19T12:11:05.187+0000: 12222.706: [Pause Init Update Refs, 0.105 ms] >> 2017-12-19T12:11:05.187+0000: 12222.706: [Concurrent update references 4251M->4614M(6144M), 992.439 ms] >> 2017-12-19T12:11:06.180+0000: 12223.699: [Pause Final Update Refs 4614M->2092M(6144M), 2.380 ms] >> 2017-12-19T12:11:06.183+0000: 12223.702: [Concurrent reset bitmaps 2092M->2092M(6144M), 4.246 ms] >> Capacity: 6144M, Peak Occupancy: 6130M, Lowest Free: 13M, Free Threshold: 184M >> Adjusting free threshold to: 17% (1044M) > If I right understand it means that pause was 446.259 ms. > > Am I wrong? > > Thanks for links. I'm reading it. > Ah right. Then what Aleksey said is correct. Roman From shade at redhat.com Tue Dec 19 15:25:19 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 19 Dec 2017 16:25:19 +0100 Subject: Rare very big pause In-Reply-To: <2C3C4693-1910-4009-80FE-1D0BA62292C3@korins.ky> References: <69178fbf-e0b4-eeda-9467-3e5fe087556b@redhat.com> <2C3C4693-1910-4009-80FE-1D0BA62292C3@korins.ky> Message-ID: <2c1845ae-6e18-a3e7-2389-b520e49da78c@redhat.com> On 12/19/2017 04:12 PM, Kirill A. Korinsky wrote: > May you suggest any way to understand where and how much it try to allocate? > > Because this application shouldn't allocate so big memory and how I can see it from line before it had 245Mb free memory when GC cycle happened. >> Concurrent marking triggered. Free: 245M, Free Threshold: 245M; Allocated: 245M, Alloc Threshold: 0M I don't think this is application's fault. Your config runs with 4 GB heap. Default settings for "adaptive" heuristics target to maintain 3% of free heap at all times during the cycle. That means 120M. Single application thread may allocate this much in around 20 ms. And concurrent mark takes around a second. So, we have only a razor-thin room for error here. With a small heap like that, even a trivial allocation spike may blow up with Allocation Failure. I guess Shenandoah's heuristics needs to reserve more on smaller heaps, so that some absolute amount of space was always available. We have patches in queue to make this better. Meanwhile, you may want to adjust these: experimental(uintx, ShenandoahInitFreeThreshold, 30, \ "Initial remaining free threshold for adaptive heuristics") \ range(0,100) \ \ experimental(uintx, ShenandoahMinFreeThreshold, 3, \ "Minimum remaining free threshold for adaptive heuristics") \ range(0,100) \ \ experimental(uintx, ShenandoahMaxFreeThreshold, 70, \ "Maximum remaining free threshold for adaptive heuristics") \ range(0,100) \ \ Say, -XX:ShenandoahMinFreeThreshold=20 for your case. 
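To put numbers on that (a sketch; only the two Shenandoah-specific flags are new, the rest of the
original command line is assumed to stay as-is): on the -Xmx6144m heap from the log, 3% works out
to about 184M -- the "Free Threshold: 184M" lines above -- while 20% holds back about 1228M:

  java -server -Xmx6144m -XX:+UseShenandoahGC \
       -XX:+UnlockExperimentalVMOptions \
       -XX:ShenandoahMinFreeThreshold=20 \
       <other flags as before>

Note these are experimental flags, so they need -XX:+UnlockExperimentalVMOptions to be accepted.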
Thanks,
-Aleksey

From shade at redhat.com Tue Dec 19 15:30:48 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 19 Dec 2017 16:30:48 +0100
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky>
References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky>
 <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com>
 <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky>
Message-ID: <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com>

On 12/14/2017 11:29 PM, Kirill A. Korinsky wrote:
> Looks like I found issue that creates strange behaviour and broke starting an application at your
> fastdebug image
>
> $ docker run --rm -ti shipilev/openjdk:8-shenandoah-fastdebug bash
> root at 27a293be90b2:/# java -Xmx4096m -Xms4096m -version
> Killed
> root at 27a293be90b2:/# java -Xmx4096m -version
> openjdk version "1.8.0-internal-fastdebug"
> OpenJDK Runtime Environment (build 1.8.0-internal-fastdebug-jenkins_2017_11_15_05_12-b00)
> OpenJDK 64-Bit Server VM (build 25.71-b00-fastdebug, mixed mode)
> root at 27a293be90b2:/# exit
> exit
> $ docker run --rm -ti shipilev/openjdk:8-shenandoah bash
> root at 10aa56f25cb7:/# java -Xmx4096m -Xms4096m -version
> openjdk version "1.8.0-internal"
> OpenJDK Runtime Environment (build 1.8.0-internal-jenkins_2017_11_15_03_20-b00)
> OpenJDK 64-Bit Server VM (build 25.71-b00, mixed mode)
> root at 10aa56f25cb7:/# exit
> exit
> $ docker images | grep shenandoah
> shipilev/openjdk         8-shenandoah-fastdebug   ec13bdd01380        4 weeks ago         383MB
> shipilev/openjdk         8-shenandoah             cb7dbc6f6fb9        4 weeks ago         385MB
> $
>
> Yes, I just removed `-Xms` from arguments and it helps:
> - it starts at fastdebug image;
> - it doesn't crash at ab test.

This is weird, can't reproduce it. I guess this means JVM is blowing per-container memory limits,
and then gets killed by OOM killer. It explains why only fastdebug builds are failing: they zap the
heap with magic values and thus commit all the memory in. When -Xms4g is set, it overflows the limit
and gets killed. When -Xms is not set, we commit less, and do not blow up. With release bits, we
"commit" the heap with -Xms4g, but Linux memory subsystem does not actually allocate pages until we
write into them.

If that is true, then -XX:+AlwaysPreTouch with release builds would also fail.

Thanks,
-Aleksey

From kirill at korins.ky Tue Dec 19 15:35:23 2017
From: kirill at korins.ky (Kirill A. Korinsky)
Date: Tue, 19 Dec 2017 15:35:23 +0000
Subject: Rare very big pause
In-Reply-To: <2c1845ae-6e18-a3e7-2389-b520e49da78c@redhat.com>
References: <69178fbf-e0b4-eeda-9467-3e5fe087556b@redhat.com>
 <2C3C4693-1910-4009-80FE-1D0BA62292C3@korins.ky>
 <2c1845ae-6e18-a3e7-2389-b520e49da78c@redhat.com>
Message-ID: <92B95534-7024-4B15-8EED-241CA0C07A5A@korins.ky>

Thanks for advice.

Trying.

--
wbr, Kirill

> On 19 Dec 2017, at 19:25, Aleksey Shipilev wrote:
>
> On 12/19/2017 04:12 PM, Kirill A. Korinsky wrote:
>> May you suggest any way to understand where and how much it try to allocate?
>>
>> Because this application shouldn't allocate so big memory and how I can see it from line before it had 245Mb free memory when GC cycle happened.
>>> Concurrent marking triggered. Free: 245M, Free Threshold: 245M; Allocated: 245M, Alloc Threshold: 0M
>
> I don't think this is application's fault. Your config runs with 4 GB heap.
> Default settings for "adaptive" heuristics target to maintain 3% of free heap at all times during
> the cycle. That means 120M. Single application thread may allocate this much in around 20 ms. And
> concurrent mark takes around a second. So, we have only a razor-thin room for error here. With a
> small heap like that, even a trivial allocation spike may blow up with Allocation Failure.
>
> I guess Shenandoah's heuristics needs to reserve more on smaller heaps, so that some absolute
> amount of space was always available. We have patches in queue to make this better. Meanwhile,
> you may want to adjust these:
>
>   experimental(uintx, ShenandoahInitFreeThreshold, 30,                    \
>           "Initial remaining free threshold for adaptive heuristics")     \
>           range(0,100)                                                    \
>                                                                           \
>   experimental(uintx, ShenandoahMinFreeThreshold, 3,                      \
>           "Minimum remaining free threshold for adaptive heuristics")     \
>           range(0,100)                                                    \
>                                                                           \
>   experimental(uintx, ShenandoahMaxFreeThreshold, 70,                     \
>           "Maximum remaining free threshold for adaptive heuristics")     \
>           range(0,100)                                                    \
>                                                                           \
>
> Say, -XX:ShenandoahMinFreeThreshold=20 for your case.
>
> Thanks,
> -Aleksey
>

From kirill at korins.ky Tue Dec 19 15:37:47 2017
From: kirill at korins.ky (Kirill A. Korinsky)
Date: Tue, 19 Dec 2017 15:37:47 +0000
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com>
References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky>
 <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com>
 <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky>
 <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com>
Message-ID: <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky>

Yes, this machine has a limit for docker container. But it is 6Gb.

Anyway, original bug is existed but I can't reproduce it yet on simpler case :(

I can only hide it (or make it rare?) by increasing heap.

--
wbr, Kirill

> On 19 Dec 2017, at 19:30, Aleksey Shipilev wrote:
>
> On 12/14/2017 11:29 PM, Kirill A. Korinsky wrote:
>> Looks like I found issue that creates strange behaviour and broke starting an application at your
>> fastdebug image
>>
>> $ docker run --rm -ti shipilev/openjdk:8-shenandoah-fastdebug bash
>> root at 27a293be90b2:/# java -Xmx4096m -Xms4096m -version
>> Killed
>> root at 27a293be90b2:/# java -Xmx4096m -version
>> openjdk version "1.8.0-internal-fastdebug"
>> OpenJDK Runtime Environment (build 1.8.0-internal-fastdebug-jenkins_2017_11_15_05_12-b00)
>> OpenJDK 64-Bit Server VM (build 25.71-b00-fastdebug, mixed mode)
>> root at 27a293be90b2:/# exit
>> exit
>> $ docker run --rm -ti shipilev/openjdk:8-shenandoah bash
>> root at 10aa56f25cb7:/# java -Xmx4096m -Xms4096m -version
>> openjdk version "1.8.0-internal"
>> OpenJDK Runtime Environment (build 1.8.0-internal-jenkins_2017_11_15_03_20-b00)
>> OpenJDK 64-Bit Server VM (build 25.71-b00, mixed mode)
>> root at 10aa56f25cb7:/# exit
>> exit
>> $ docker images | grep shenandoah
>> shipilev/openjdk         8-shenandoah-fastdebug   ec13bdd01380        4 weeks ago         383MB
>> shipilev/openjdk         8-shenandoah             cb7dbc6f6fb9        4 weeks ago         385MB
>> $
>>
>> Yes, I just removed `-Xms` from arguments and it helps:
>> - it starts at fastdebug image;
>> - it doesn't crash at ab test.
> This is weird, can't reproduce it. I guess this means JVM is blowing per-container memory limits,
> and then gets killed by OOM killer. It explains why only fastdebug builds are failing: they zap the
> heap with magic values and thus commit all the memory in. When -Xms4g is set, it overflows the limit
> and gets killed.
When -Xms is not set, we commit less, and do not blow up. With release bits, we > "commit" the heap with -Xms4g, but Linux memory subsystem does not actually allocate pages until we > write into them. > > If that is true, then -XX:+AlwaysPreTouch with release builds would also fail. > > Thanks, > -Aleksey > From shade at redhat.com Tue Dec 19 15:40:03 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 19 Dec 2017 16:40:03 +0100 Subject: Strange bug inside jetty at shenandoah/jdk8u In-Reply-To: <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky> References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky> <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com> <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky> <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com> <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky> Message-ID: <8cb74d52-1542-f990-4dfa-5036a1e8cabc@redhat.com> On 12/19/2017 04:37 PM, Kirill A. Korinsky wrote: > Yes, this machine has a limit for docker container. But it is 6Gb. -Xmx4g does not mean JVM RSS is exactly 4 GB -- native things are also there, and it might explain what we see. > Anyway, original bug is existed but I can't reproduce it yet on simpler case :( > > I can only hide it (or make it rare?) by increasing heap. Ok, try this on simpler case: -XX:ShenandoahGCHeuristics=aggressive Thanks, -Aleksey From rwestrel at redhat.com Tue Dec 19 15:52:50 2017 From: rwestrel at redhat.com (Roland Westrelin) Date: Tue, 19 Dec 2017 16:52:50 +0100 Subject: Shenandoah WB fastpath and optimizations In-Reply-To: <32e9db01-2e7c-3cff-a56f-0cfd60e78a17@redhat.com> References: <32e9db01-2e7c-3cff-a56f-0cfd60e78a17@redhat.com> Message-ID: > In fact, I wanted to ask you what would it take to teach C2 to emit C1-style check, e.g. instead of: > > movzbl 0x3d8(%rTLS), %rScratch ; read evac-in-progress > test %rScratch, %rScratch > jne EVAC-ENABLED-SLOW-PATH > mov -0x8(%rObj), %rObj ; read barrier > > ...do: > > cmpb 0x3d8(%TLS), 0 ; read evac-in-progress > jne EVAC-ENABLED-SLOW-PATH > mov -0x8(%rObj), %rObj ; read barrier > > ...thus freeing up the register? This? diff --git a/src/hotspot/cpu/x86/x86_64.ad b/src/hotspot/cpu/x86/x86_64.ad --- a/src/hotspot/cpu/x86/x86_64.ad +++ b/src/hotspot/cpu/x86/x86_64.ad @@ -2966,6 +2966,16 @@ interface(CONST_INTER); %} +operand immU8() +%{ + predicate(0 <= n->get_int() && (n->get_int() <= 255)); + match(ConI); + + op_cost(5); + format %{ %} + interface(CONST_INTER); +%} + operand immI16() %{ predicate((-32768 <= n->get_int()) && (n->get_int() <= 32767)); @@ -11729,6 +11739,18 @@ ins_pipe(ialu_cr_reg_imm); %} +instruct compUB_mem_imm(rFlagsReg cr, memory op1, immU8 op2) +%{ + match(Set cr (CmpI (LoadUB op1) op2)); + + ins_cost(125); + format %{ "cmpb $op1, $op2" %} + ins_encode %{ + __ cmpb($op1$$Address, $op2$$constant); + %} + ins_pipe(ialu_cr_reg_mem); +%} + //----------Max and Min-------------------------------------------------------- // Min Instructions From shade at redhat.com Tue Dec 19 16:58:53 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 19 Dec 2017 17:58:53 +0100 Subject: Shenandoah WB fastpath and optimizations In-Reply-To: References: <32e9db01-2e7c-3cff-a56f-0cfd60e78a17@redhat.com> Message-ID: <6b6dc357-9e76-0e13-1364-3bf16f60c341@redhat.com> On 12/19/2017 04:52 PM, Roland Westrelin wrote: > >> In fact, I wanted to ask you what would it take to teach C2 to emit C1-style check, e.g. 
instead of: >> >> movzbl 0x3d8(%rTLS), %rScratch ; read evac-in-progress >> test %rScratch, %rScratch >> jne EVAC-ENABLED-SLOW-PATH >> mov -0x8(%rObj), %rObj ; read barrier >> >> ...do: >> >> cmpb 0x3d8(%TLS), 0 ; read evac-in-progress >> jne EVAC-ENABLED-SLOW-PATH >> mov -0x8(%rObj), %rObj ; read barrier >> >> ...thus freeing up the register? > > This? > > diff --git a/src/hotspot/cpu/x86/x86_64.ad b/src/hotspot/cpu/x86/x86_64.ad > --- a/src/hotspot/cpu/x86/x86_64.ad > +++ b/src/hotspot/cpu/x86/x86_64.ad > @@ -2966,6 +2966,16 @@ > interface(CONST_INTER); > %} > > +operand immU8() > +%{ > + predicate(0 <= n->get_int() && (n->get_int() <= 255)); > + match(ConI); > + > + op_cost(5); > + format %{ %} > + interface(CONST_INTER); > +%} > + > operand immI16() > %{ > predicate((-32768 <= n->get_int()) && (n->get_int() <= 32767)); > @@ -11729,6 +11739,18 @@ > ins_pipe(ialu_cr_reg_imm); > %} > > +instruct compUB_mem_imm(rFlagsReg cr, memory op1, immU8 op2) > +%{ > + match(Set cr (CmpI (LoadUB op1) op2)); > + > + ins_cost(125); > + format %{ "cmpb $op1, $op2" %} > + ins_encode %{ > + __ cmpb($op1$$Address, $op2$$constant); > + %} > + ins_pipe(ialu_cr_reg_mem); > +%} > + > //----------Max and Min-------------------------------------------------------- > // Min Instructions Yup, it codegens what we want. It does not help performance though, but I have another idea... -Aleksey From kirill at korins.ky Tue Dec 19 17:43:19 2017 From: kirill at korins.ky (Kirill A. Korinsky) Date: Tue, 19 Dec 2017 17:43:19 +0000 Subject: Rare very big pause In-Reply-To: <92B95534-7024-4B15-8EED-241CA0C07A5A@korins.ky> References: <69178fbf-e0b4-eeda-9467-3e5fe087556b@redhat.com> <2C3C4693-1910-4009-80FE-1D0BA62292C3@korins.ky> <2c1845ae-6e18-a3e7-2389-b520e49da78c@redhat.com> <92B95534-7024-4B15-8EED-241CA0C07A5A@korins.ky> Message-ID: <5FCDF925-ADC2-4807-8D31-EBE0A25674E8@korins.ky> Well, it helps decrease pauses, but it is still about 150-200 ms. For example: Concurrent marking triggered. Free: 1718M, Free Threshold: 1720M; Allocated: 1718M, Alloc Threshold: 0M 2017-12-19T16:59:02.785+0000: 3055.494: [Pause Init Mark, 2.553 ms] 2017-12-19T16:59:02.788+0000: 3055.496: [Concurrent marking 4418M->4723M(6144M), 916.628 ms] 2017-12-19T16:59:03.710+0000: 3056.419: [Pause Final MarkTotal Garbage: 3555M Immediate Garbage: 1374M, 688 regions (41% of total) Garbage to be collected: 1849M (52% of total), 965 regions Live objects to be evacuated: 80M Live/garbage ratio in collected regions: 4% 4723M->3351M(6144M), 175.951 ms] 2017-12-19T16:59:03.886+0000: 3056.595: [Concurrent evacuation 3352M->3453M(6144M), 71.376 ms] 2017-12-19T16:59:03.959+0000: 3056.667: [Pause Init Update Refs, 0.099 ms] 2017-12-19T16:59:03.959+0000: 3056.667: [Concurrent update references 3453M->3489M(6144M), 578.340 ms] 2017-12-19T16:59:04.538+0000: 3057.246: [Pause Final Update Refs 3489M->1560M(6144M), 2.241 ms] 2017-12-19T16:59:04.540+0000: 3057.249: [Concurrent reset bitmaps 1560M->1560M(6144M), 3.516 ms] Capacity: 6144M, Peak Occupancy: 4723M, Lowest Free: 1420M, Free Threshold: 1228M Adjusting free threshold to: 25% (1536M) -- wbr, Kirill > On 19 Dec 2017, at 19:35, Kirill A. Korinsky wrote: > > Thanks for advice. > > Trying. > > -- > wbr, Kirill > > >> On 19 Dec 2017, at 19:25, Aleksey Shipilev wrote: >> >> On 12/19/2017 04:12 PM, Kirill A. Korinsky wrote: >>> May you suggest any way to understand where and how much it try to allocate? 
>>> >>> Because this application shouldn't allocate so big memory and how I can see it from line before it had 245Mb free memory when GC cycle happened. >>>> Concurrent marking triggered. Free: 245M, Free Threshold: 245M; Allocated: 245M, Alloc Threshold: 0M >> >> I don't think this is application's fault. Your config runs with 4 GB heap. Default settings for >> "adaptive" heuristics target to maintain 3% of free heap at all times during the cycle. That means >> 120M. Single application thread may allocate this much in around 20 ms. And concurrent mark takes >> around a second. So, we have only a razor-thin room for error here. With a small heap like that, >> even a trivial allocation spike may blow up with Allocation Failure. >> >> I guess Shenandoah's heuristics needs to reserve more on smaller heaps, so that some absolute amount >> of space was always available. We have patches in queue to make this better. Meanwhile, you may want >> to adjust these: >> >> experimental(uintx, ShenandoahInitFreeThreshold, 30, \ >> "Initial remaining free threshold for adaptive heuristics") \ >> range(0,100) \ >> \ >> experimental(uintx, ShenandoahMinFreeThreshold, 3, \ >> "Minimum remaining free threshold for adaptive heuristics") \ >> range(0,100) \ >> \ >> experimental(uintx, ShenandoahMaxFreeThreshold, 70, \ >> "Maximum remaining free threshold for adaptive heuristics") \ >> range(0,100) \ >> \ >> >> Say, -XX:ShenandoahMinFreeThreshold=20 for your case. >> >> Thanks, >> -Aleksey >> > From shade at redhat.com Tue Dec 19 17:51:27 2017 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 19 Dec 2017 18:51:27 +0100 Subject: Rare very big pause In-Reply-To: <5FCDF925-ADC2-4807-8D31-EBE0A25674E8@korins.ky> References: <69178fbf-e0b4-eeda-9467-3e5fe087556b@redhat.com> <2C3C4693-1910-4009-80FE-1D0BA62292C3@korins.ky> <2c1845ae-6e18-a3e7-2389-b520e49da78c@redhat.com> <92B95534-7024-4B15-8EED-241CA0C07A5A@korins.ky> <5FCDF925-ADC2-4807-8D31-EBE0A25674E8@korins.ky> Message-ID: On 12/19/2017 06:43 PM, Kirill A. Korinsky wrote: > Well, it helps decrease pauses, but it is still about 150-200 ms. > > For example: > > Concurrent marking triggered. Free: 1718M, Free Threshold: 1720M; Allocated: 1718M, Alloc Threshold: 0M > 2017-12-19T16:59:02.785+0000: 3055.494: [Pause Init Mark, 2.553 ms] > 2017-12-19T16:59:02.788+0000: 3055.496: [Concurrent marking 4418M->4723M(6144M), 916.628 ms] > 2017-12-19T16:59:03.710+0000: 3056.419: [Pause Final MarkTotal Garbage: 3555M > Immediate Garbage: 1374M, 688 regions (41% of total) > Garbage to be collected: 1849M (52% of total), 965 regions > Live objects to be evacuated: 80M > Live/garbage ratio in collected regions: 4% > 4723M->3351M(6144M), 175.951 ms] > 2017-12-19T16:59:03.886+0000: 3056.595: [Concurrent evacuation 3352M->3453M(6144M), 71.376 ms] > 2017-12-19T16:59:03.959+0000: 3056.667: [Pause Init Update Refs, 0.099 ms] > 2017-12-19T16:59:03.959+0000: 3056.667: [Concurrent update references 3453M->3489M(6144M), 578.340 ms] > 2017-12-19T16:59:04.538+0000: 3057.246: [Pause Final Update Refs 3489M->1560M(6144M), 2.241 ms] > 2017-12-19T16:59:04.540+0000: 3057.249: [Concurrent reset bitmaps 1560M->1560M(6144M), 3.516 ms] > Capacity: 6144M, Peak Occupancy: 4723M, Lowest Free: 1420M, Free Threshold: 1228M > Adjusting free threshold to: 25% (1536M) This one looks like legit non-degraded Final Mark pause. Seeing 175ms for that pause seems odd, unless you are running fastdebug builds (in which case you will spend some time zapping the memory). 
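(A quick way to tell which bits you are on is the version banner; fastdebug builds identify
themselves there. For example, from the docker session in the other thread:

  $ java -version
  openjdk version "1.8.0-internal-fastdebug"
  OpenJDK Runtime Environment (build 1.8.0-internal-fastdebug-jenkins_2017_11_15_05_12-b00)
  OpenJDK 64-Bit Server VM (build 25.71-b00-fastdebug, mixed mode)

The exact build string will differ, but the "-fastdebug" suffix is the tell.)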
With -verbose:gc, there would be the GC stats table after JVM exits, which will dissect that pause
much better.

-Aleksey

From shade at redhat.com Tue Dec 19 18:11:00 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 19 Dec 2017 19:11:00 +0100
Subject: Shenandoah WB and tableswitch
Message-ID: <480da1e0-646b-d8fa-341b-5bc70c0b177c@redhat.com>

I think I have zeroed in on at least one issue with WBs. Successively dissecting the problematic
workloads first yields a workload like this, derived from UTF-8 decoders in JDK:
  http://icedtea.classpath.org/hg/gc-bench/file/d04b4bbbc39f/src/main/java/org/openjdk/gcbench/wip/WriteBarrierUTF8Scan.java

...and then a minimal version of the same:
  http://icedtea.classpath.org/hg/gc-bench/file/d04b4bbbc39f/src/main/java/org/openjdk/gcbench/wip/WriteBarrierTableSwitch.java

Now, running it with current sh/jdk10 yields interesting results. First, running with C1:

------------------------------------------------------------------------------
Benchmark                        (size)  Mode  Cnt     Score    Error  Units

# Parallel, -XX:TieredStopAtLevel=1
WriteBarrierTableSwitch.common     1000  avgt   15  2137.543 ±  9.084  ns/op
WriteBarrierTableSwitch.separate   1000  avgt   15  2260.783 ±  6.355  ns/op

# Shenandoah passive, -XX:TieredStopAtLevel=1
WriteBarrierTableSwitch.common     1000  avgt   15  2144.273 ±  7.565  ns/op
WriteBarrierTableSwitch.separate   1000  avgt   15  2270.335 ±  6.433  ns/op

# Shenandoah passive, -XX:TieredStopAtLevel=1, -XX:+ShenandoahWriteBarrier
WriteBarrierTableSwitch.common     1000  avgt   15  2613.767 ± 29.567  ns/op
WriteBarrierTableSwitch.separate   1000  avgt   15  2670.697 ±  8.822  ns/op
------------------------------------------------------------------------------

Everything seems to be in order: passive Shenandoah is as fast as Parallel, and enabling WBs makes
everything consistently slower, because there are writes to cbuf array all the time.

With C2 the picture gets murkier:

------------------------------------------------------------------------------
Benchmark                        (size)  Mode  Cnt     Score    Error  Units

# Parallel, -XX:-TieredCompilation
WriteBarrierTableSwitch.common     1000  avgt   15  1518.773 ±  3.962  ns/op
WriteBarrierTableSwitch.separate   1000  avgt   15  2302.127 ± 49.734  ns/op

# Shenandoah passive, -XX:-TieredCompilation
WriteBarrierTableSwitch.common     1000  avgt   15  1575.086 ±  4.616  ns/op
WriteBarrierTableSwitch.separate   1000  avgt   15  2832.982 ± 70.375  ns/op

# Shenandoah passive, -XX:-TieredCompilation, -XX:+ShenandoahWriteBarrier
WriteBarrierTableSwitch.common     1000  avgt   15  1499.475 ± 38.896  ns/op
WriteBarrierTableSwitch.separate   1000  avgt   15  3135.664 ± 11.811  ns/op
--------------------------------------------------------------------------------

First of all, why does Shenandoah passive perform worse than Parallel even without barriers? That
one is explained by interaction with counted loop safepoints / loop strip mining, see:

------------------------------------------------------------------------------
Benchmark                        (size)  Mode  Cnt     Score    Error  Units

# Shenandoah passive, -XX:-TieredCompilation, -XX:-UseCountedLoopSafepoints
WriteBarrierTableSwitch.common     1000  avgt   15  1526.821 ±  7.644  ns/op
WriteBarrierTableSwitch.separate   1000  avgt   15  2327.750 ± 73.020  ns/op
------------------------------------------------------------------------------

It is still weird to see CLS/LSM pessimize this case so much.

Then, why does "separate" regress when WB is enabled, and "common" does not regress? Perfasm
suggests that in "common" case we are able to hoist the WB out of the loop, and this is why there
is no +WB impact.
We failed to do the same with "separate", for some reason. Disabling CLS/LSM helps just a little:

------------------------------------------------------------------------------
Benchmark                        (size)  Mode  Cnt     Score    Error  Units
WriteBarrierTableSwitch.common     1000  avgt   15  1535.884 ± 21.498  ns/op
WriteBarrierTableSwitch.separate   1000  avgt   15  2876.315 ± 43.569  ns/op
------------------------------------------------------------------------------

This pinpoints at least one problem with WBs that impacts Stringy/UTF-8-y code we have in
benchmarks.

Thanks,
-Aleksey

From zgu at redhat.com Tue Dec 19 20:14:20 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 19 Dec 2017 15:14:20 -0500
Subject: RFR: Fix compilation errors on Windows
Message-ID: 

Fixed some compilation errors on Windows with VS2013. The compilation errors are due to recent
atomic implementation changes.

For example:

c:\Users\zhengyu\workspace\shenandoah-jdk10\src\hotspot\shar\gc/shenandoah/shenandoahConcurrentMark.inline.hpp(77)
: error C2668: 'ShenandoahHeapRegion::increase_live_data_words' : ambiguous call to overloaded function

c:\Users\zhengyu\workspace\shenandoah-jdk10\src\hotspot\share\gc/shenandoah/shenandoahHeapRegion.hpp(327):
could be 'void ShenandoahHeapRegion::increase_live_data_words(jint)'

c:\Users\zhengyu\workspace\shenandoah-jdk10\src\hotspot\share\gc/shenandoah/shenandoahHeapRegion.hpp(326):
or       'void ShenandoahHeapRegion::increase_live_data_words(size_t)'
         while trying to match the argument list '(int)'

Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/win_atomic_fix/webrev.00/

Test: hotspot_gc_shenandoah on Windows x64 (release + fastdebug)

Thanks,

-Zhengyu

From shade at redhat.com Tue Dec 19 20:28:24 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 19 Dec 2017 21:28:24 +0100
Subject: RFR: Fix compilation errors on Windows
In-Reply-To: 
References: 
Message-ID: <269360f1-0155-11b4-cf6c-fd2838a6018c@redhat.com>

On 12/19/2017 09:14 PM, Zhengyu Gu wrote:
> Fixed some compilation errors on Windows with VS2013. The compilation errors are due to recent
> atomic implementation changes.
>
> For example:
>
> c:\Users\zhengyu\workspace\shenandoah-jdk10\src\hotspot\shar\gc/shenandoah/shenandoahConcurrentMark.inline.hpp(77)
> : error C2668: 'ShenandoahHeapRegion::increase_live_data_words' : ambiguous call to overloaded function
>
> c:\Users\zhengyu\workspace\shenandoah-jdk10\src\hotspot\share\gc/shenandoah/shenandoahHeapRegion.hpp(327):
> could be 'void ShenandoahHeapRegion::increase_live_data_words(jint)'
>
> c:\Users\zhengyu\workspace\shenandoah-jdk10\src\hotspot\share\gc/shenandoah/shenandoahHeapRegion.hpp(326):
> or       'void ShenandoahHeapRegion::increase_live_data_words(size_t)'
>          while trying to match the argument list '(int)'
>
>
> Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/win_atomic_fix/webrev.00/

*) Please check it still builds on Linux?

*) Indentation is wrong:

 107   volatile int _clean_klass_tree_claimed;
 ...
 127   volatile int _resolved_method_task_claimed;

*) I know there are j-type removals coming from upstream:
  http://hg.openjdk.java.net/jdk/hs/rev/7cc7de9bf4a4

...maybe we should use intX_t to match?
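For reference, a minimal standalone sketch of the C2668 ambiguity above (stand-in declarations,
not the actual Shenandoah sources): on Windows, jni_md.h makes jint a long, so a plain int argument
has no exact-match overload and converts equally well to both candidates.

  #include <cstddef>

  typedef long jint;  // mimics Windows jni_md.h; on Linux jint is int

  void increase_live_data_words(jint v)   { (void) v; }
  void increase_live_data_words(size_t v) { (void) v; }

  int main() {
    int words = 1;
    // increase_live_data_words(words);       // error C2668: int -> jint (long) and
    //                                        // int -> size_t tie in overload ranking
    increase_live_data_words((size_t) words); // explicit cast resolves the ambiguity
    return 0;
  }

Casting at the call site, or declaring a single overload with one explicit width like int32_t
(an exact match for int on common platforms), removes the tie; on Linux jint is int, which is why
the same code built fine there.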
Thanks,
-Aleksey

From zgu at redhat.com Tue Dec 19 21:00:49 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 19 Dec 2017 16:00:49 -0500
Subject: RFR: Fix compilation errors on Windows
In-Reply-To: <269360f1-0155-11b4-cf6c-fd2838a6018c@redhat.com>
References: <269360f1-0155-11b4-cf6c-fd2838a6018c@redhat.com>
Message-ID: <60962cab-3370-8e66-0cc1-90294b47bf3e@redhat.com>

On 12/19/2017 03:28 PM, Aleksey Shipilev wrote:
> On 12/19/2017 09:14 PM, Zhengyu Gu wrote:
>> Fixed some compilation errors on Windows with VS2013. The compilation errors are due to recent
>> atomic implementation changes.
>>
>> For example:
>>
>> c:\Users\zhengyu\workspace\shenandoah-jdk10\src\hotspot\share\gc/shenandoah/shenandoahConcurrentMark.inline.hpp(77)
>> : error C2668: 'ShenandoahHeapRegion::increase_live_data_words' : ambiguous call to overloaded function
>>
>> c:\Users\zhengyu\workspace\shenandoah-jdk10\src\hotspot\share\gc/shenandoah/shenandoahHeapRegion.hpp(327):
>> could be 'void ShenandoahHeapRegion::increase_live_data_words(jint)'
>>
>> c:\Users\zhengyu\workspace\shenandoah-jdk10\src\hotspot\share\gc/shenandoah/shenandoahHeapRegion.hpp(326):
>> or 'void ShenandoahHeapRegion::increase_live_data_words(size_t)'
>> while trying to match the argument list '(int)'
>>
>>
>> Webrev: http://cr.openjdk.java.net/~zgu/shenandoah/win_atomic_fix/webrev.00/
>
> *) Please check it still builds on Linux?

Forgot to mention, I did check Linux builds.

>
> *) Indentation is wrong:
>
> 107 volatile int _clean_klass_tree_claimed;
>
> ...
>
> 127 volatile int _resolved_method_task_claimed;
>

Fixed

>
> *) I know there are j-type removals coming from upstream:
> http://hg.openjdk.java.net/jdk/hs/rev/7cc7de9bf4a4
>
> ...maybe we should use intX_t to match?
>

Those match what is in jdk10's G1 implementation.

Updated webrev: http://cr.openjdk.java.net/~zgu/shenandoah/win_atomic_fix/webrev.01/index.html

Thanks,
-Zhengyu

> Thanks,
> -Aleksey
>

From kirill at korins.ky Tue Dec 19 23:12:02 2017
From: kirill at korins.ky (Kirill A. Korinsky)
Date: Tue, 19 Dec 2017 23:12:02 +0000
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: <8cb74d52-1542-f990-4dfa-5036a1e8cabc@redhat.com>
References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky> <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com> <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky> <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com> <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky> <8cb74d52-1542-f990-4dfa-5036a1e8cabc@redhat.com>
Message-ID:

Well,

A simpler case involves decreasing the heap to 1Gb.

I attached two files:
- errors.log with the jetty errors
- gc.log with the gc log

It runs on Fedora 27 openjdk-1.8.0 without fastdebug.

The same setting, but without Shenandoah, does not reproduce this.

If I increase the heap size, the issue becomes rarer. With a 6Gb heap I can't reproduce it at all. Maybe that helps.

-- 
wbr, Kirill

-------------- next part --------------

> On 19 Dec 2017, at 19:40, Aleksey Shipilev wrote:
>
> On 12/19/2017 04:37 PM, Kirill A. Korinsky wrote:
>> Yes, this machine has a limit for the docker container. But it is 6Gb.
>
> -Xmx4g does not mean JVM RSS is exactly 4 GB -- native things are also there, and it might explain
> what we see.
>
>> Anyway, the original bug still exists, but I can't reproduce it yet in a simpler case :(
>>
>> I can only hide it (or make it rare?) by increasing the heap.
>
> Ok, try this on the simpler case: -XX:ShenandoahGCHeuristics=aggressive
>
> Thanks,
> -Aleksey
>

From shade at redhat.com Wed Dec 20 07:58:47 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 20 Dec 2017 08:58:47 +0100
Subject: RFR: Fix compilation errors on Windows
In-Reply-To: <60962cab-3370-8e66-0cc1-90294b47bf3e@redhat.com>
References: <269360f1-0155-11b4-cf6c-fd2838a6018c@redhat.com> <60962cab-3370-8e66-0cc1-90294b47bf3e@redhat.com>
Message-ID: <4bfcdb9f-7d8d-088f-085b-f5a52bec9463@redhat.com>

On 12/19/2017 10:00 PM, Zhengyu Gu wrote:
> Updated webrev: http://cr.openjdk.java.net/~zgu/shenandoah/win_atomic_fix/webrev.01/index.html

These guys still use jint:

  64   assert (s <= (size_t)max_jint, "sanity");
  65   increase_live_data_words((jint)s);

Otherwise good.

Thanks,
-Aleksey

From shade at redhat.com Wed Dec 20 08:27:08 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 20 Dec 2017 09:27:08 +0100
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky> <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com> <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky> <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com> <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky> <8cb74d52-1542-f990-4dfa-5036a1e8cabc@redhat.com>
Message-ID: <2a0604fa-7535-1e7c-c878-a81335c89653@redhat.com>

On 12/20/2017 12:12 AM, Kirill A. Korinsky wrote:
> Well,
>
> A simpler case involves decreasing the heap to 1Gb.
>
> I attached two files:
> - errors.log with the jetty errors
> - gc.log with the gc log
>
> It runs on Fedora 27 openjdk-1.8.0 without fastdebug.
>
> The same setting, but without Shenandoah, does not reproduce this.
>
> If I increase the heap size, the issue becomes rarer. With a 6Gb heap I can't reproduce it at all. Maybe that helps.

GC logs and application logs are useless for debugging a failure like this.

We need the actual VM error or other kind of hs_err-generating smoking gun. Or, at least an MCVE that fails and is debuggable, so we can take a deeper look. I assume something like a stripped-down embedded Jetty with a trivial response handler fails?
I tried to do this [1], and Shenandoah/8u runs fine -- what is missing there? Since you seem to have an application that fails reliably, could you strip it down to the bare essentials that still fail?

Thanks,
-Aleksey

[1] https://github.com/shipilev/jetty-test

From kirill at korins.ky Wed Dec 20 16:16:01 2017
From: kirill at korins.ky (Kirill A. Korinsky)
Date: Wed, 20 Dec 2017 16:16:01 +0000
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: <2a0604fa-7535-1e7c-c878-a81335c89653@redhat.com>
References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky> <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com> <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky> <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com> <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky> <8cb74d52-1542-f990-4dfa-5036a1e8cabc@redhat.com> <2a0604fa-7535-1e7c-c878-a81335c89653@redhat.com>
Message-ID: <61428EB7-243D-4771-98B8-EDF8C315F2D3@korins.ky>

Hey,

I made a very simple test case: https://github.com/catap/jetty-shenandoah-error

It is dirty, but it is good enough.

How to reproduce:

The first terminal:
> mvn clean package
> docker-compose up --build

The second terminal:
> ab -k -c 1 -n 10000 -p request_body http://localhost:8080/test/

and you should get a lot of errors like:
> server_1 | 2017-12-20 16:09:56.287:WARN:oejh.HttpParser:qtp1706377736-41: parse exception: java.lang.IndexOutOfBoundsException: 20 for HttpChannelOverHttp at 1138214c{r=4,c=false,a=IDLE,uri=null}
> server_1 | java.lang.IndexOutOfBoundsException: 20
> server_1 | at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139)
> server_1 | at org.eclipse.jetty.http.HttpParser.parseLine(HttpParser.java:767)
> server_1 | at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1310)
> server_1 | at org.eclipse.jetty.server.HttpConnection.parseRequestBuffer(HttpConnection.java:353)
> server_1 | at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:236)
> server_1 | at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> server_1 | at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> server_1 | at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> server_1 | at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
> server_1 | at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
> server_1 | at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
> server_1 | at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
> server_1 | at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
> server_1 | at java.lang.Thread.run(Thread.java:748)

If you remove -XX:+UseShenandoahGC from the java arguments (in the Dockerfile), it works fine. If you remove -XX:ShenandoahGCHeuristics=aggressive, it takes a lot more requests to reproduce it.

-- 
wbr, Kirill

> On 20 Dec 2017, at 12:27, Aleksey Shipilev wrote:
>
> On 12/20/2017 12:12 AM, Kirill A. Korinsky wrote:
>> Well,
>>
>> A simpler case involves decreasing the heap to 1Gb.
>>
>> I attached two files:
>> - errors.log with the jetty errors
>> - gc.log with the gc log
>>
>> It runs on Fedora 27 openjdk-1.8.0 without fastdebug.
>>
>> The same setting, but without Shenandoah, does not reproduce this.
>>
>> If I increase the heap size, the issue becomes rarer. With a 6Gb heap I can't reproduce it at all. Maybe that helps.
>
> GC logs and application logs are useless for debugging a failure like this.
>
> We need the actual VM error or other kind of hs_err-generating smoking gun. Or, at least an MCVE
> that fails and is debuggable, so we can take a deeper look. I assume something like a stripped-down
> embedded Jetty with a trivial response handler fails?
> I tried to do this [1], and Shenandoah/8u runs
> fine -- what is missing there? Since you seem to have an application that fails reliably, could you
> strip it down to the bare essentials that still fail?
>
> Thanks,
> -Aleksey
>
> [1] https://github.com/shipilev/jetty-test
>

From zgu at redhat.com Wed Dec 20 17:15:57 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 20 Dec 2017 12:15:57 -0500
Subject: RFR: Fix assertion failure of matrix size due to vm allocation granularity on Win x64
Message-ID: <7d8f4d4c-6e8e-2e86-30da-3802be27bc9b@redhat.com>

This patch fixes an assertion failure in TestGCThreadGroups on Windows x64.

  Internal Error (c:/Users/zhengyu/workspace/shenandoah-jdk10/src/hotspot/share/memory/virtualspace.cpp:103), pid=6404, tid=4448
  assert((size & (granularity - 1)) == 0) failed: size not aligned to os::vm_allocation_granularity()

On Windows x64, vm_allocation_granularity is 64K, instead of 4K.

Webrev: file:///home/zgu/webrevs/shenandoah/matrix_size/webrev.00/index.html

I will look into other vm allocations, and fix them separately if I find any.

Test: hotspot_gc_shenandoah: fastdebug on Linux x64 and Windows x64

Thanks,
-Zhengyu

From rkennke at redhat.com Wed Dec 20 18:10:04 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 20 Dec 2017 19:10:04 +0100
Subject: RFR: Fix assertion failure of matrix size due to vm allocation granularity on Win x64
In-Reply-To: <7d8f4d4c-6e8e-2e86-30da-3802be27bc9b@redhat.com>
References: <7d8f4d4c-6e8e-2e86-30da-3802be27bc9b@redhat.com>
Message-ID:

On 20.12.2017 at 18:15, Zhengyu Gu wrote:
> This patch fixes an assertion failure in TestGCThreadGroups on Windows x64.
>
>   Internal Error
> (c:/Users/zhengyu/workspace/shenandoah-jdk10/src/hotspot/share/memory/virtualspace.cpp:103),
> pid=6404, tid=4448
>   assert((size & (granularity - 1)) == 0) failed: size not aligned to
> os::vm_allocation_granularity()
>
> On Windows x64, vm_allocation_granularity is 64K, instead of 4K.
>
>
> Webrev:
> file:///home/zgu/webrevs/shenandoah/matrix_size/webrev.00/index.html
>

You posted a link to a local webrev :-)

From zgu at redhat.com Wed Dec 20 18:11:57 2017
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 20 Dec 2017 13:11:57 -0500
Subject: RFR: Fix assertion failure of matrix size due to vm allocation granularity on Win x64
In-Reply-To: References: <7d8f4d4c-6e8e-2e86-30da-3802be27bc9b@redhat.com>
Message-ID:

Oops.

http://cr.openjdk.java.net/~zgu/shenandoah/matrix_size/webrev.00/

-Zhengyu

On 12/20/2017 01:10 PM, Roman Kennke wrote:
> On 20.12.2017 at 18:15, Zhengyu Gu wrote:
>> This patch fixes an assertion failure in TestGCThreadGroups on Windows x64.
>>
>>   Internal Error
>> (c:/Users/zhengyu/workspace/shenandoah-jdk10/src/hotspot/share/memory/virtualspace.cpp:103),
>> pid=6404, tid=4448
>>   assert((size & (granularity - 1)) == 0) failed: size not aligned to
>> os::vm_allocation_granularity()
>>
>> On Windows x64, vm_allocation_granularity is 64K, instead of 4K.
>>
>>
>> Webrev:
>> file:///home/zgu/webrevs/shenandoah/matrix_size/webrev.00/index.html
>>
>
> You posted a link to a local webrev :-)
>

From rkennke at redhat.com Wed Dec 20 18:14:06 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 20 Dec 2017 19:14:06 +0100
Subject: RFR: Fix assertion failure of matrix size due to vm allocation granularity on Win x64
In-Reply-To: References: <7d8f4d4c-6e8e-2e86-30da-3802be27bc9b@redhat.com>
Message-ID: <361c8680-a3ba-476f-892b-d84a3d7fd943@redhat.com>

That looks good to me.

Thanks, Roman

> Oops.
>
> http://cr.openjdk.java.net/~zgu/shenandoah/matrix_size/webrev.00/
>
> -Zhengyu
>
> On 12/20/2017 01:10 PM, Roman Kennke wrote:
>> On 20.12.2017 at 18:15, Zhengyu Gu wrote:
>>> This patch fixes an assertion failure in TestGCThreadGroups on Windows x64.
>>>
>>>   Internal Error
>>> (c:/Users/zhengyu/workspace/shenandoah-jdk10/src/hotspot/share/memory/virtualspace.cpp:103),
>>> pid=6404, tid=4448
>>>   assert((size & (granularity - 1)) == 0) failed: size not aligned
>>> to os::vm_allocation_granularity()
>>>
>>> On Windows x64, vm_allocation_granularity is 64K, instead of 4K.
>>>
>>>
>>> Webrev:
>>> file:///home/zgu/webrevs/shenandoah/matrix_size/webrev.00/index.html
>>>
>>
>> You posted a link to a local webrev :-)
>>

From zgu at redhat.com Wed Dec 20 19:00:56 2017
From: zgu at redhat.com (zgu at redhat.com)
Date: Wed, 20 Dec 2017 19:00:56 +0000
Subject: hg: shenandoah/jdk10: Fix assertion failure of matrix size due to vm allocation granularity on Win x64
Message-ID: <201712201900.vBKJ0v8T019286@aojmv0008.oracle.com>

Changeset: 991f2adcddaa
Author: zgu
Date: 2017-12-20 13:57 -0500
URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/991f2adcddaa

Fix assertion failure of matrix size due to vm allocation granularity on Win x64

! src/hotspot/share/gc/shenandoah/shenandoahConnectionMatrix.cpp

From kirill at korins.ky Wed Dec 20 19:14:52 2017
From: kirill at korins.ky (Kirill A. Korinsky)
Date: Wed, 20 Dec 2017 19:14:52 +0000
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: <61428EB7-243D-4771-98B8-EDF8C315F2D3@korins.ky>
References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky> <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com> <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky> <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com> <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky> <8cb74d52-1542-f990-4dfa-5036a1e8cabc@redhat.com> <2a0604fa-7535-1e7c-c878-a81335c89653@redhat.com> <61428EB7-243D-4771-98B8-EDF8C315F2D3@korins.ky>
Message-ID:

Hey,

I simplified my test case to

> public class DemoServer {
>     public static void main(String[] args) throws Exception {
>         Server server = new Server(80);
>         server.start();
>         server.join();
>     }
> }

and it generates 3-4 errors for 500 requests.

This code on the same JVM without Shenandoah processed 1 million requests without this error.

-- 
wbr, Kirill

> On 20 Dec 2017, at 20:16, Kirill A. Korinsky wrote:
>
> Hey,
>
> I made a very simple test case: https://github.com/catap/jetty-shenandoah-error
>
> It is dirty, but it is good enough.
>
> How to reproduce:
>
> The first terminal:
>> mvn clean package
>> docker-compose up --build
> The second terminal:
>> ab -k -c 1 -n 10000 -p request_body http://localhost:8080/test/
>
> and you should get a lot of errors like:
>> server_1 | 2017-12-20 16:09:56.287:WARN:oejh.HttpParser:qtp1706377736-41: parse exception: java.lang.IndexOutOfBoundsException: 20 for HttpChannelOverHttp at 1138214c{r=4,c=false,a=IDLE,uri=null}
>> server_1 | java.lang.IndexOutOfBoundsException: 20
>> server_1 | at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:139)
>> server_1 | at org.eclipse.jetty.http.HttpParser.parseLine(HttpParser.java:767)
>> server_1 | at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:1310)
>> server_1 | at org.eclipse.jetty.server.HttpConnection.parseRequestBuffer(HttpConnection.java:353)
>> server_1 | at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:236)
>> server_1 | at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
>> server_1 | at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
>> server_1 | at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>> server_1 | at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>> server_1 | at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>> server_1 | at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>> server_1 | at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>> server_1 | at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>> server_1 | at java.lang.Thread.run(Thread.java:748)
>
> If you remove -XX:+UseShenandoahGC from the java arguments (in the Dockerfile), it works fine. If you remove -XX:ShenandoahGCHeuristics=aggressive, it takes a lot more requests to reproduce it.
>
> --
> wbr, Kirill
>
>
>> On 20 Dec 2017, at 12:27, Aleksey Shipilev wrote:
>>
>> On 12/20/2017 12:12 AM, Kirill A. Korinsky wrote:
>>> Well,
>>>
>>> A simpler case involves decreasing the heap to 1Gb.
>>>
>>> I attached two files:
>>> - errors.log with the jetty errors
>>> - gc.log with the gc log
>>>
>>> It runs on Fedora 27 openjdk-1.8.0 without fastdebug.
>>>
>>> The same setting, but without Shenandoah, does not reproduce this.
>>>
>>> If I increase the heap size, the issue becomes rarer. With a 6Gb heap I can't reproduce it at all. Maybe that helps.
>>
>> GC logs and application logs are useless for debugging a failure like this.
>>
>> We need the actual VM error or other kind of hs_err-generating smoking gun. Or, at least an MCVE
>> that fails and is debuggable, so we can take a deeper look. I assume something like a stripped-down
>> embedded Jetty with a trivial response handler fails? I tried to do this [1], and Shenandoah/8u runs
>> fine -- what is missing there? Since you seem to have an application that fails reliably, could you
>> strip it down to the bare essentials that still fail?
>>
>> Thanks,
>> -Aleksey
>>
>> [1] https://github.com/shipilev/jetty-test
>>
>

From shade at redhat.com Wed Dec 20 21:10:41 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 20 Dec 2017 22:10:41 +0100
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky> <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com> <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky> <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com> <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky> <8cb74d52-1542-f990-4dfa-5036a1e8cabc@redhat.com> <2a0604fa-7535-1e7c-c878-a81335c89653@redhat.com> <61428EB7-243D-4771-98B8-EDF8C315F2D3@korins.ky>
Message-ID:

On 12/20/2017 08:14 PM, Kirill A. Korinsky wrote:
>> On 20 Dec 2017, at 20:16, Kirill A. Korinsky wrote:
>> Hey,
>>
>> I made a very simple test case: https://github.com/catap/jetty-shenandoah-error
>>
>> It is dirty, but it is good enough.
>>
>> How to reproduce:
>> The first terminal:
>>> mvn clean package
>>> docker-compose up --build
>> The second terminal:
>>> ab -k -c 1 -n 10000 -p request_body http://localhost:8080/test/
>>
>> and you should get a lot of errors like:
>>> server_1 | 2017-12-20 16:09:56.287:WARN:oejh.HttpParser:qtp1706377736-41: parse exception:
>>> java.lang.IndexOutOfBoundsException: 20 for HttpChannelOverHttp at 1138214c{r=4,c=false,a=IDLE,uri=null}

Many thanks! I have successfully reproduced the failure locally with your docker-compose thing that brings in the Fedora 27 OpenJDK build.

It seems the exact Jetty version is important.
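For anyone reproducing without docker: the essential shape is just an embedded Jetty server plus a keep-alive request loop. A hedged sketch follows -- names are illustrative, the stock Jetty 9 embedded API is assumed, and the client is driven in-process instead of with ab to keep it self-contained; the actual minimized code lives in the repo linked next:

    import java.io.IOException;
    import java.io.InputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import org.eclipse.jetty.server.Request;
    import org.eclipse.jetty.server.Server;
    import org.eclipse.jetty.server.handler.AbstractHandler;

    public class JettyStress {
        public static void main(String[] args) throws Exception {
            Server server = new Server(8080);
            server.setHandler(new AbstractHandler() {
                @Override
                public void handle(String target, Request baseRequest,
                                   HttpServletRequest request, HttpServletResponse response)
                        throws IOException {
                    // Trivial response handler: always answer 200 "ok".
                    response.setStatus(200);
                    response.getWriter().println("ok");
                    baseRequest.setHandled(true);
                }
            });
            server.start();

            // Hammer the server over keep-alive connections; with the bug,
            // the HTTP parser sporadically throws IndexOutOfBoundsException.
            URL url = new URL("http://localhost:8080/");
            for (int i = 0; i < 100_000; i++) {
                HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                try (InputStream in = conn.getInputStream()) {
                    while (in.read() != -1) ; // drain so the connection is reused
                }
            }
            server.stop();
        }
    }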
Based on your example, minimized this further to a non-dockerized version:
https://github.com/shipilev/jetty-test

It seems to be a bug in Shenandoah+C1, because -Xint works fine, and -XX:-TieredCompilation works fine. All recent versions: sh/jdk8u, sh/jdk9, sh/jdk10 fail -- which is a good sign, and it simplifies the bug hunt.

Ah, forgot to point out that -XX:-TieredCompilation is the workaround for this: it goes straight to C2, bypassing C1.

Thanks,
-Aleksey

From shade at redhat.com Thu Dec 21 10:23:23 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 21 Dec 2017 11:23:23 +0100
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky> <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com> <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky> <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com> <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky> <8cb74d52-1542-f990-4dfa-5036a1e8cabc@redhat.com> <2a0604fa-7535-1e7c-c878-a81335c89653@redhat.com> <61428EB7-243D-4771-98B8-EDF8C315F2D3@korins.ky>
Message-ID: <858a08a6-50fc-d010-2321-5af77a5ecc62@redhat.com>

On 12/20/2017 10:10 PM, Aleksey Shipilev wrote:
> On 12/20/2017 08:14 PM, Kirill A. Korinsky wrote:
>>> On 20 Dec 2017, at 20:16, Kirill A. Korinsky wrote:
>>> Hey,
>>>
>>> I made a very simple test case: https://github.com/catap/jetty-shenandoah-error
>>>
>>> It is dirty, but it is good enough.
>>>
>>> How to reproduce:
>>> The first terminal:
>>>> mvn clean package
>>>> docker-compose up --build
>>> The second terminal:
>>>> ab -k -c 1 -n 10000 -p request_body http://localhost:8080/test/
>>>
>>> and you should get a lot of errors like:
>>>> server_1 | 2017-12-20 16:09:56.287:WARN:oejh.HttpParser:qtp1706377736-41: parse exception:
>>>> java.lang.IndexOutOfBoundsException: 20 for HttpChannelOverHttp at 1138214c{r=4,c=false,a=IDLE,uri=null}
>
> Many thanks! I have successfully reproduced the failure locally with your docker-compose thing that
> brings in the Fedora 27 OpenJDK build.
>
> It seems the exact Jetty version is important. Based on your example, minimized this further to
> a non-dockerized version:
> https://github.com/shipilev/jetty-test
>
> It seems to be a bug in Shenandoah+C1, because -Xint works fine, and -XX:-TieredCompilation works
> fine. All recent versions: sh/jdk8u, sh/jdk9, sh/jdk10 fail -- which is a good sign, and it
> simplifies the bug hunt.

Smoking gun: the failure is here in HeapByteBuffer:

    public byte get(int i) {
        return hb[ix(checkIndex(i))];   <--- !
    }

And -XX:DisableIntrinsic=_checkIndex helps. Something is wrong in LIRGenerator::do_NIOCheckIndex? I am on the go, and my build/debug capabilities are slim. Does anyone see something completely off there? Don't we need an RB on the buf read?
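If someone wants to poke at this in the meantime, the hot path reduces to roughly the following. This is a hedged sketch with illustrative names: the class name, loop counts and flags are assumptions, and on its own it likely needs real GC activity -- e.g. running with -XX:TieredStopAtLevel=1 -XX:+UseShenandoahGC -XX:ShenandoahGCHeuristics=aggressive -- to move the buffer under the miscompiled intrinsic at the wrong moment:

    import java.nio.ByteBuffer;

    public class CheckIndexLoop {
        static byte sink;

        public static void main(String[] args) {
            ByteBuffer buf = ByteBuffer.allocate(1024); // heap buffer -> HeapByteBuffer
            for (int iter = 0; iter < 1_000_000; iter++) {
                byte s = 0;
                for (int i = 0; i < buf.limit(); i++) {
                    s ^= buf.get(i); // absolute get(int) funnels through the checkIndex intrinsic
                }
                sink = s;
                if ((iter & 0xFFF) == 0) {
                    // churn the heap so regions get evacuated while the loop runs
                    byte[] garbage = new byte[4096];
                    sink ^= (byte) garbage.length;
                }
            }
        }
    }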
-Aleksey

From rkennke at redhat.com Thu Dec 21 10:28:57 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 21 Dec 2017 11:28:57 +0100
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: <858a08a6-50fc-d010-2321-5af77a5ecc62@redhat.com>
References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky> <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com> <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky> <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com> <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky> <8cb74d52-1542-f990-4dfa-5036a1e8cabc@redhat.com> <2a0604fa-7535-1e7c-c878-a81335c89653@redhat.com> <61428EB7-243D-4771-98B8-EDF8C315F2D3@korins.ky> <858a08a6-50fc-d010-2321-5af77a5ecc62@redhat.com>
Message-ID: <36030879-92D7-4EA9-9928-FA527619C60E@redhat.com>

I can look this afternoon. Yes, I believe we are lacking a read barrier there.

Roman

On 21 December 2017 11:23:23 CET, Aleksey Shipilev wrote:
>On 12/20/2017 10:10 PM, Aleksey Shipilev wrote:
>> On 12/20/2017 08:14 PM, Kirill A. Korinsky wrote:
>>>> On 20 Dec 2017, at 20:16, Kirill A. Korinsky wrote:
>>>> Hey,
>>>>
>>>> I made a very simple test case: https://github.com/catap/jetty-shenandoah-error
>>>>
>>>> It is dirty, but it is good enough.
>>>>
>>>> How to reproduce:
>>>> The first terminal:
>>>>> mvn clean package
>>>>> docker-compose up --build
>>>> The second terminal:
>>>>> ab -k -c 1 -n 10000 -p request_body http://localhost:8080/test/
>>>>
>>>> and you should get a lot of errors like:
>>>>> server_1 | 2017-12-20 16:09:56.287:WARN:oejh.HttpParser:qtp1706377736-41: parse exception:
>>>>> java.lang.IndexOutOfBoundsException: 20 for HttpChannelOverHttp at 1138214c{r=4,c=false,a=IDLE,uri=null}
>>
>> Many thanks! I have successfully reproduced the failure locally with your docker-compose thing that
>> brings in the Fedora 27 OpenJDK build.
>>
>> It seems the exact Jetty version is important. Based on your example, minimized this further to
>> a non-dockerized version:
>> https://github.com/shipilev/jetty-test
>>
>> It seems to be a bug in Shenandoah+C1, because -Xint works fine, and -XX:-TieredCompilation works
>> fine. All recent versions: sh/jdk8u, sh/jdk9, sh/jdk10 fail -- which is a good sign, and it
>> simplifies the bug hunt.
>
>Smoking gun: the failure is here in HeapByteBuffer:
>
>    public byte get(int i) {
>        return hb[ix(checkIndex(i))];   <--- !
>    }
>
>And -XX:DisableIntrinsic=_checkIndex helps. Something is wrong in LIRGenerator::do_NIOCheckIndex? I
>am on the go, and my build/debug capabilities are slim. Does anyone see something completely off there?
>Don't we need an RB on the buf read?
>
>-Aleksey

-- 
This message was sent from my Android device with K-9 Mail.
From kirill at korins.ky Thu Dec 21 10:49:59 2017
From: kirill at korins.ky (Kirill A. Korinsky)
Date: Thu, 21 Dec 2017 10:49:59 +0000
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: <858a08a6-50fc-d010-2321-5af77a5ecc62@redhat.com>
References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky> <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com> <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky> <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com> <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky> <8cb74d52-1542-f990-4dfa-5036a1e8cabc@redhat.com> <2a0604fa-7535-1e7c-c878-a81335c89653@redhat.com> <61428EB7-243D-4771-98B8-EDF8C315F2D3@korins.ky> <858a08a6-50fc-d010-2321-5af77a5ecc62@redhat.com>
Message-ID:

> On 21 Dec 2017, at 14:23, Aleksey Shipilev wrote:
>
> Smoking gun: the failure is here in HeapByteBuffer:
>
>    public byte get(int i) {
>        return hb[ix(checkIndex(i))];   <--- !
>    }

I mentioned it in the original email ;)

-- 
wbr, Kirill

From rwestrel at redhat.com Thu Dec 21 12:46:38 2017
From: rwestrel at redhat.com (Roland Westrelin)
Date: Thu, 21 Dec 2017 13:46:38 +0100
Subject: Shenandoah WB and tableswitch
In-Reply-To: <480da1e0-646b-d8fa-341b-5bc70c0b177c@redhat.com>
References: <480da1e0-646b-d8fa-341b-5bc70c0b177c@redhat.com>
Message-ID:

> http://icedtea.classpath.org/hg/gc-bench/file/d04b4bbbc39f/src/main/java/org/openjdk/gcbench/wip/WriteBarrierTableSwitch.java

Both issues seem to boil down to the lack of profiling for tableswitch in C2:

- the strip mining issue is caused by the scheduling of instructions in the inner loop when they should be in the outer loop. C2 gives all tableswitch branches the same frequency and so sees that exiting the loop is quite common, which in turn means putting stuff at the end of the inner loop is cheaper than in the outer loop.

- the write barrier is not hoisted because it depends on a null check that is itself not hoisted. The null check is not hoisted because it's on a branch that C2 sees as not always taken. UseProfiledLoopPredicate is supposed to help with that by moving predicates on common branches out of loops, except it needs profiling, which it doesn't have for tableswitch.

Making C2 use tableswitch profiling leads to:

WriteBarrierTableSwitch.common    1000  avgt   15  1041.808 ± 17.481  ns/op
WriteBarrierTableSwitch.separate  1000  avgt   15  1104.196 ±  0.726  ns/op

with:

-XX:-TieredCompilation -XX:+UseShenandoahGC -XX:+UnlockDiagnosticVMOptions -XX:ShenandoahGCHeuristics=passive -XX:+ShenandoahWriteBarrier -XX:+UseCountedLoopSafepoints -XX:LoopStripMiningIter=1000

For comparison, current shenandoah tip with parallel gc on the same machine:

WriteBarrierTableSwitch.common    1000  avgt   15  1067.409 ±  1.154  ns/op
WriteBarrierTableSwitch.separate  1000  avgt   15  1595.011 ±  1.448  ns/op

I'll work on cleaning up the patch so you can give it a try.

Roland.

From rkennke at redhat.com Thu Dec 21 15:09:48 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 21 Dec 2017 16:09:48 +0100
Subject: Add missing barrier in C1 NIOCheckIndex intrinsic
Message-ID: <0a317bca-9f98-bafc-38ab-59a9b0d75444@redhat.com>

We're missing a read-barrier in a C1 intrinsic for NIO checkIndex.

This is the root cause for:
http://mail.openjdk.java.net/pipermail/shenandoah-dev/2017-December/004495.html

This is the fix:
http://cr.openjdk.java.net/~rkennke/c1-missing-barrier/webrev.00/

Ok?
Roman

From rkennke at redhat.com Thu Dec 21 15:11:12 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 21 Dec 2017 16:11:12 +0100
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: <858a08a6-50fc-d010-2321-5af77a5ecc62@redhat.com>
References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky> <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com> <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky> <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com> <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky> <8cb74d52-1542-f990-4dfa-5036a1e8cabc@redhat.com> <2a0604fa-7535-1e7c-c878-a81335c89653@redhat.com> <61428EB7-243D-4771-98B8-EDF8C315F2D3@korins.ky> <858a08a6-50fc-d010-2321-5af77a5ecc62@redhat.com>
Message-ID:

On 21.12.2017 at 11:23, Aleksey Shipilev wrote:
> On 12/20/2017 10:10 PM, Aleksey Shipilev wrote:
>> On 12/20/2017 08:14 PM, Kirill A. Korinsky wrote:
>>>> On 20 Dec 2017, at 20:16, Kirill A. Korinsky wrote:
>>>> Hey,
>>>>
>>>> I made a very simple test case: https://github.com/catap/jetty-shenandoah-error
>>>>
>>>> It is dirty, but it is good enough.
>>>>
>>>> How to reproduce:
>>>> The first terminal:
>>>>> mvn clean package
>>>>> docker-compose up --build
>>>> The second terminal:
>>>>> ab -k -c 1 -n 10000 -p request_body http://localhost:8080/test/
>>>>
>>>> and you should get a lot of errors like:
>>>>> server_1 | 2017-12-20 16:09:56.287:WARN:oejh.HttpParser:qtp1706377736-41: parse exception:
>>>>> java.lang.IndexOutOfBoundsException: 20 for HttpChannelOverHttp at 1138214c{r=4,c=false,a=IDLE,uri=null}
>>
>> Many thanks! I have successfully reproduced the failure locally with your docker-compose thing that
>> brings in the Fedora 27 OpenJDK build.
>>
>> It seems the exact Jetty version is important. Based on your example, minimized this further to
>> a non-dockerized version:
>> https://github.com/shipilev/jetty-test
>>
>> It seems to be a bug in Shenandoah+C1, because -Xint works fine, and -XX:-TieredCompilation works
>> fine. All recent versions: sh/jdk8u, sh/jdk9, sh/jdk10 fail -- which is a good sign, and it
>> simplifies the bug hunt.
>
> Smoking gun: the failure is here in HeapByteBuffer:
>
>    public byte get(int i) {
>        return hb[ix(checkIndex(i))];   <--- !
>    }
>
> And -XX:DisableIntrinsic=_checkIndex helps. Something is wrong in LIRGenerator::do_NIOCheckIndex? I
> am on the go, and my build/debug capabilities are slim. Does anyone see something completely off there?
> Don't we need an RB on the buf read?
>
> -Aleksey

This fixes it:
http://cr.openjdk.java.net/~rkennke/c1-missing-barrier/webrev.00/

Proposed for review.

Roman

From shade at redhat.com Thu Dec 21 15:13:00 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 21 Dec 2017 16:13:00 +0100
Subject: Add missing barrier in C1 NIOCheckIndex intrinsic
In-Reply-To: <0a317bca-9f98-bafc-38ab-59a9b0d75444@redhat.com>
References: <0a317bca-9f98-bafc-38ab-59a9b0d75444@redhat.com>
Message-ID: <0eebbe6d-cc4d-55e8-9361-be4e926a30cc@redhat.com>

On 12/21/2017 04:09 PM, Roman Kennke wrote:
> We're missing a read-barrier in a C1 intrinsic for NIO checkIndex.
>
> This is the root cause for:
> http://mail.openjdk.java.net/pipermail/shenandoah-dev/2017-December/004495.html
>
> This is the fix:
> http://cr.openjdk.java.net/~rkennke/c1-missing-barrier/webrev.00/

This makes sense.

I just got home, and was meaning to write the regression test for it. Give me an hour?
-Aleksey

From rkennke at redhat.com Thu Dec 21 15:13:44 2017
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 21 Dec 2017 16:13:44 +0100
Subject: Add missing barrier in C1 NIOCheckIndex intrinsic
In-Reply-To: <0eebbe6d-cc4d-55e8-9361-be4e926a30cc@redhat.com>
References: <0a317bca-9f98-bafc-38ab-59a9b0d75444@redhat.com> <0eebbe6d-cc4d-55e8-9361-be4e926a30cc@redhat.com>
Message-ID: <7426da4c-ad8b-267e-3699-fa8a4fb2b19f@redhat.com>

On 21.12.2017 at 16:13, Aleksey Shipilev wrote:
> On 12/21/2017 04:09 PM, Roman Kennke wrote:
>> We're missing a read-barrier in a C1 intrinsic for NIO checkIndex.
>>
>> This is the root cause for:
>> http://mail.openjdk.java.net/pipermail/shenandoah-dev/2017-December/004495.html
>>
>> This is the fix:
>> http://cr.openjdk.java.net/~rkennke/c1-missing-barrier/webrev.00/
>
> This makes sense.
>
> I just got home, and was meaning to write the regression test for it. Give me an hour?
>

Oh yes, that would be perfect!

Thanks, Roman

From shade at redhat.com Thu Dec 21 16:28:14 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 21 Dec 2017 17:28:14 +0100
Subject: Add missing barrier in C1 NIOCheckIndex intrinsic
In-Reply-To: <7426da4c-ad8b-267e-3699-fa8a4fb2b19f@redhat.com>
References: <0a317bca-9f98-bafc-38ab-59a9b0d75444@redhat.com> <0eebbe6d-cc4d-55e8-9361-be4e926a30cc@redhat.com> <7426da4c-ad8b-267e-3699-fa8a4fb2b19f@redhat.com>
Message-ID:

On 12/21/2017 04:13 PM, Roman Kennke wrote:
>> On 21.12.2017 at 16:13, Aleksey Shipilev wrote:
>>> On 12/21/2017 04:09 PM, Roman Kennke wrote:
>>>> We're missing a read-barrier in a C1 intrinsic for NIO checkIndex.
>>>>
>>>> This is the root cause for:
>>>> http://mail.openjdk.java.net/pipermail/shenandoah-dev/2017-December/004495.html
>>>>
>>>> This is the fix:
>>>> http://cr.openjdk.java.net/~rkennke/c1-missing-barrier/webrev.00/
>>>
>>> This makes sense.
>>>
>>> I just got home, and was meaning to write the regression test for it. Give me an hour?
>>>
>>
>> Oh yes, that would be perfect!

Coming up with a regression test for this proves much more difficult than I anticipated: we seem to need a particular sequence of ByteBuffer calls to arrive at the bug. I'll figure it out eventually. Meanwhile, I tested that this fixes the Jetty issue, and indeed it does. Push the fix! We can add the regression test later.

...and maybe backport it right away to sh/jdk9 and sh/jdk8u? The fix is trivial, and the impact is potentially huge.

Thanks,
-Aleksey

From shade at redhat.com Thu Dec 21 17:55:43 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 21 Dec 2017 18:55:43 +0100
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky> <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com> <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky> <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com> <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky> <8cb74d52-1542-f990-4dfa-5036a1e8cabc@redhat.com> <2a0604fa-7535-1e7c-c878-a81335c89653@redhat.com> <61428EB7-243D-4771-98B8-EDF8C315F2D3@korins.ky> <858a08a6-50fc-d010-2321-5af77a5ecc62@redhat.com>
Message-ID: <3689f0ed-4726-4bb9-1fc6-91d2a6d8649a@redhat.com>

On 12/21/2017 11:49 AM, Kirill A. Korinsky wrote:
>> On 21 Dec 2017, at 14:23, Aleksey Shipilev wrote:
>>
>> Smoking gun: the failure is here in HeapByteBuffer:
>>
>>    public byte get(int i) {
>>        return hb[ix(checkIndex(i))];   <--- !
>>    }
>
> I mentioned it in the original email ;)

Indeed you have :) My fault for not seeing/remembering right away that checkIndex is the intrinsic. It is a testament to our Wiki page on "Functional Diagnostics", which mentions how to dissect compiler-based issues if one is suspected. The mere fact it fails only with one particular compiler, and that compiler is baseline C1, is one big smell. If that was a simple Java method failing -- and I originally thought it was -- it would be a harder deal, because we are supposed to make the barriers right for plain heap accesses, otherwise *everything* should crash and burn, and not just one test.

Your effort is appreciated, and it helped to nail the bug!

Roman pushed the fix to sh/jdk10. sh/jdk9, sh/jdk8u and Fedora RPMs would be updated in due course, but since NY holidays are upon us, I would expect RPMs to happen sometime in January. You may want to test with our nightly builds, or disable C1 with -XX:-TieredCompilation if working with Fedora-shipped Shenandoah for the time being.

Cheers,
-Aleksey

From roman at kennke.org Thu Dec 21 16:33:10 2017
From: roman at kennke.org (roman at kennke.org)
Date: Thu, 21 Dec 2017 16:33:10 +0000
Subject: hg: shenandoah/jdk10: Add missing barrier in C1 NIOCheckIndex intrinsic
Message-ID: <201712211633.vBLGXALb010347@aojmv0008.oracle.com>

Changeset: 440068c24134
Author: rkennke
Date: 2017-12-21 16:06 +0100
URL: http://hg.openjdk.java.net/shenandoah/jdk10/rev/440068c24134

Add missing barrier in C1 NIOCheckIndex intrinsic

! src/hotspot/share/c1/c1_LIRGenerator.cpp

From shade at redhat.com Thu Dec 21 16:32:18 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 21 Dec 2017 17:32:18 +0100
Subject: Add missing barrier in C1 NIOCheckIndex intrinsic
In-Reply-To: References: <0a317bca-9f98-bafc-38ab-59a9b0d75444@redhat.com> <0eebbe6d-cc4d-55e8-9361-be4e926a30cc@redhat.com> <7426da4c-ad8b-267e-3699-fa8a4fb2b19f@redhat.com>
Message-ID: <1fa18719-55a8-cb78-ed18-6129a545e401@redhat.com>

On 12/21/2017 05:28 PM, Aleksey Shipilev wrote:
> On 12/21/2017 04:13 PM, Roman Kennke wrote:
>> On 21.12.2017 at 16:13, Aleksey Shipilev wrote:
>>> On 12/21/2017 04:09 PM, Roman Kennke wrote:
>>>> We're missing a read-barrier in a C1 intrinsic for NIO checkIndex.

From roman at kennke.org Thu Dec 21 18:18:19 2017
From: roman at kennke.org (roman at kennke.org)
Date: Thu, 21 Dec 2017 18:18:19 +0000
Subject: hg: shenandoah/jdk9/hotspot: [backport] Add missing barrier in C1 NIOCheckIndex intrinsic
Message-ID: <201712211818.vBLIIJGb020595@aojmv0008.oracle.com>

Changeset: e5862f4bb621
Author: rkennke
Date: 2017-12-21 19:14 +0100
URL: http://hg.openjdk.java.net/shenandoah/jdk9/hotspot/rev/e5862f4bb621

[backport] Add missing barrier in C1 NIOCheckIndex intrinsic

!
src/share/vm/c1/c1_LIRGenerator.cpp From roman at kennke.org Thu Dec 21 18:33:46 2017 From: roman at kennke.org (roman at kennke.org) Date: Thu, 21 Dec 2017 18:33:46 +0000 Subject: hg: shenandoah/jdk8u/hotspot: [backport] Add missing barrier in C1 NIOCheckIndex intrinsic Message-ID: <201712211833.vBLIXk2P026353@aojmv0008.oracle.com> Changeset: d3b954dae9fa Author: rkennke Date: 2017-12-21 19:29 +0100 URL: http://hg.openjdk.java.net/shenandoah/jdk8u/hotspot/rev/d3b954dae9fa [backport] Add missing barrier in C1 NIOCheckIndex intrinsic ! src/share/vm/c1/c1_LIRGenerator.cpp From kirill at korins.ky Thu Dec 21 22:30:08 2017 From: kirill at korins.ky (Kirill A. Korinsky) Date: Thu, 21 Dec 2017 22:30:08 +0000 Subject: Rare very big pause In-Reply-To: References: <69178fbf-e0b4-eeda-9467-3e5fe087556b@redhat.com> <2C3C4693-1910-4009-80FE-1D0BA62292C3@korins.ky> <2c1845ae-6e18-a3e7-2389-b520e49da78c@redhat.com> <92B95534-7024-4B15-8EED-241CA0C07A5A@korins.ky> <5FCDF925-ADC2-4807-8D31-EBE0A25674E8@korins.ky> Message-ID: <30B090C9-C5F4-423B-833E-D8B9392748EF@korins.ky> Hey, You can find below an output of verbose:gc. Short summary: > Pause Final Mark (G) = 4.59 s (a = 9911 us) (n = 463) (lvls, us = 3633, 5215, 6230, 10547, 329910) > Pause Final Mark (N) = 3.74 s (a = 8082 us) (n = 463) (lvls, us = 2969, 4102, 4766, 6504, 328861) > Finish Queues = 0.90 s (a = 1935 us) (n = 463) (lvls, us = 275, 799, 975, 1348, 325530) I have no idea why :( > GC STATISTICS: > "(G)" (gross) pauses include VM time: time to notify and block threads, do the pre- > and post-safepoint housekeeping. Use -XX:+PrintSafepointStatistics to dissect. > "(N)" (net) pauses are the times spent in the actual GC code. > "a" is average time for each phase, look at levels to see if average makes sense. > "lvls" are quantiles: 0% (minimum), 25%, 50% (median), 75%, 100% (maximum). 
> > Total Pauses (G) = 8.66 s (a = 4678 us) (n = 1852) (lvls, us = 721, 2891, 3516, 5156, 329914) > Total Pauses (N) = 6.15 s (a = 3319 us) (n = 1852) (lvls, us = 87, 170, 2363, 3711, 328864) > Pause Init Mark (G) = 2.09 s (a = 4509 us) (n = 463) (lvls, us = 3184, 3828, 3945, 4473, 18191) > Pause Init Mark (N) = 1.33 s (a = 2880 us) (n = 463) (lvls, us = 2285, 2832, 2891, 2930, 9342) > Accumulate Stats = 0.02 s (a = 53 us) (n = 463) (lvls, us = 47, 49, 50, 51, 154) > Make Parsable = 0.03 s (a = 73 us) (n = 463) (lvls, us = 62, 69, 71, 74, 143) > Clear Liveness = 0.08 s (a = 181 us) (n = 463) (lvls, us = 145, 172, 176, 180, 1377) > Scan Roots = 1.15 s (a = 2474 us) (n = 463) (lvls, us = 1914, 2441, 2480, 2539, 8814) > S: Thread Roots = 0.30 s (a = 639 us) (n = 463) (lvls, us = 471, 496, 510, 553, 2176) > S: String Table Roots = 0.33 s (a = 713 us) (n = 463) (lvls, us = 0, 850, 865, 891, 2207) > S: Universe Roots = 0.00 s (a = 2 us) (n = 463) (lvls, us = 2, 2, 2, 2, 61) > S: JNI Roots = 0.00 s (a = 6 us) (n = 463) (lvls, us = 4, 5, 5, 6, 8) > S: JNI Weak Roots = 0.02 s (a = 35 us) (n = 463) (lvls, us = 0, 40, 41, 43, 236) > S: Synchronizer Roots = 0.02 s (a = 43 us) (n = 463) (lvls, us = 12, 36, 40, 46, 646) > S: Flat Profiler Roots = 0.00 s (a = 10 us) (n = 463) (lvls, us = 6, 10, 10, 10, 24) > S: Management Roots = 0.00 s (a = 3 us) (n = 463) (lvls, us = 2, 2, 3, 3, 50) > S: System Dict Roots = 0.05 s (a = 104 us) (n = 463) (lvls, us = 15, 16, 17, 21, 501) > S: CLDG Roots = 0.38 s (a = 825 us) (n = 463) (lvls, us = 201, 912, 926, 947, 4637) > S: JVMTI Roots = 0.00 s (a = 1 us) (n = 463) (lvls, us = 1, 1, 1, 1, 2) > Resize TLABs = 0.02 s (a = 34 us) (n = 463) (lvls, us = 29, 31, 33, 35, 134) > Pause Final Mark (G) = 4.59 s (a = 9911 us) (n = 463) (lvls, us = 3633, 5215, 6230, 10547, 329910) > Pause Final Mark (N) = 3.74 s (a = 8082 us) (n = 463) (lvls, us = 2969, 4102, 4766, 6504, 328861) > Finish Queues = 0.90 s (a = 1935 us) (n = 463) (lvls, us = 275, 799, 975, 1348, 325530) > Weak References = 0.14 s (a = 1549 us) (n = 92) (lvls, us = 908, 1152, 1289, 1504, 7248) > Process = 0.13 s (a = 1379 us) (n = 92) (lvls, us = 783, 994, 1113, 1367, 7108) > Enqueue = 0.02 s (a = 166 us) (n = 92) (lvls, us = 104, 146, 160, 174, 577) > System Purge = 1.21 s (a = 13126 us) (n = 92) (lvls, us = 12500, 12891, 12891, 13086, 15472) > Unload Classes = 0.03 s (a = 378 us) (n = 92) (lvls, us = 334, 344, 348, 357, 1531) > Parallel Cleanup = 1.17 s (a = 12738 us) (n = 92) (lvls, us = 11719, 12500, 12695, 12695, 14853) > Code Cache = 0.22 s (a = 2411 us) (n = 92) (lvls, us = 1836, 2324, 2383, 2461, 4410) > String/Symbol Tables = 0.55 s (a = 6027 us) (n = 92) (lvls, us = 5898, 5977, 5996, 6035, 6708) > Clean Classes = 0.39 s (a = 4202 us) (n = 92) (lvls, us = 3750, 4160, 4199, 4219, 4368) > CLDG = 0.00 s (a = 6 us) (n = 92) (lvls, us = 4, 4, 5, 5, 84) > Prepare Evacuation = 0.21 s (a = 456 us) (n = 463) (lvls, us = 270, 414, 451, 500, 722) > Initial Evacuation = 1.25 s (a = 2710 us) (n = 463) (lvls, us = 1973, 2598, 2676, 2734, 8715) > E: Thread Roots = 0.23 s (a = 492 us) (n = 463) (lvls, us = 166, 457, 471, 500, 2367) > E: Code Cache Roots = 0.92 s (a = 1993 us) (n = 463) (lvls, us = 957, 1914, 1973, 2051, 4452) > Pause Init Update Refs (G) = 0.51 s (a = 1092 us) (n = 463) (lvls, us = 719, 865, 891, 949, 7584) > Pause Init Update Refs (N) = 0.04 s (a = 96 us) (n = 463) (lvls, us = 85, 90, 92, 95, 170) > Pause Final Update Refs (G) = 1.48 s (a = 3192 us) (n = 463) (lvls, us = 2500, 3008, 3066, 3145, 9424) > 
Pause Final Update Refs (N) = 1.03 s (a = 2214 us) (n = 463) (lvls, us = 1992, 2109, 2148, 2207, 7563) > Update Roots = 0.75 s (a = 1627 us) (n = 463) (lvls, us = 1484, 1543, 1562, 1602, 6927) > UR: Thread Roots = 0.21 s (a = 443 us) (n = 463) (lvls, us = 271, 420, 432, 445, 1406) > UR: String Table Roots = 0.22 s (a = 465 us) (n = 463) (lvls, us = 301, 451, 457, 475, 937) > UR: Universe Roots = 0.00 s (a = 1 us) (n = 463) (lvls, us = 1, 1, 1, 1, 15) > UR: JNI Roots = 0.00 s (a = 3 us) (n = 463) (lvls, us = 2, 3, 3, 3, 35) > UR: JNI Weak Roots = 0.01 s (a = 17 us) (n = 463) (lvls, us = 12, 15, 16, 16, 91) > UR: Synchronizer Roots = 0.02 s (a = 38 us) (n = 463) (lvls, us = 7, 37, 38, 39, 71) > UR: Flat Profiler Roots = 0.00 s (a = 10 us) (n = 463) (lvls, us = 2, 9, 10, 10, 118) > UR: Management Roots = 0.00 s (a = 2 us) (n = 463) (lvls, us = 1, 2, 2, 2, 32) > UR: System Dict Roots = 0.01 s (a = 13 us) (n = 463) (lvls, us = 9, 12, 12, 13, 77) > UR: CLDG Roots = 0.24 s (a = 508 us) (n = 463) (lvls, us = 422, 482, 492, 506, 4392) > UR: JVMTI Roots = 0.00 s (a = 1 us) (n = 463) (lvls, us = 1, 1, 1, 1, 2) > Recycle = 0.25 s (a = 549 us) (n = 463) (lvls, us = 432, 508, 539, 574, 895) > Concurrent Marking = 546.57 s (a = 1180492 us) (n = 463) (lvls, us = 474609, 964844, 1230469, 1425781, 2008314) > Concurrent Precleaning = 0.07 s (a = 800 us) (n = 92) (lvls, us = 174, 613, 807, 932, 2517) > Concurrent Evacuation = 45.53 s (a = 98326 us) (n = 463) (lvls, us = 56445, 76562, 98828, 115234, 272469) > Concurrent Update Refs = 334.94 s (a = 723406 us) (n = 463) (lvls, us = 304688, 625000, 757812, 847656, 1095144) > Concurrent Reset Bitmaps = 1.83 s (a = 3961 us) (n = 463) (lvls, us = 107, 3184, 3809, 4336, 13480) > > 0 allocation failure and 0 user requested GCs > 463 successful and 0 degenerated concurrent markings > 463 successful and 0 degenerated update references -- wbr, Kirill > On 19 Dec 2017, at 21:51, Aleksey Shipilev wrote: > > On 12/19/2017 06:43 PM, Kirill A. Korinsky wrote: >> Well, it helps decrease pauses, but it is still about 150-200 ms. >> >> For example: >> >> Concurrent marking triggered. Free: 1718M, Free Threshold: 1720M; Allocated: 1718M, Alloc Threshold: 0M >> 2017-12-19T16:59:02.785+0000: 3055.494: [Pause Init Mark, 2.553 ms] >> 2017-12-19T16:59:02.788+0000: 3055.496: [Concurrent marking 4418M->4723M(6144M), 916.628 ms] >> 2017-12-19T16:59:03.710+0000: 3056.419: [Pause Final MarkTotal Garbage: 3555M >> Immediate Garbage: 1374M, 688 regions (41% of total) >> Garbage to be collected: 1849M (52% of total), 965 regions >> Live objects to be evacuated: 80M >> Live/garbage ratio in collected regions: 4% >> 4723M->3351M(6144M), 175.951 ms] >> 2017-12-19T16:59:03.886+0000: 3056.595: [Concurrent evacuation 3352M->3453M(6144M), 71.376 ms] >> 2017-12-19T16:59:03.959+0000: 3056.667: [Pause Init Update Refs, 0.099 ms] >> 2017-12-19T16:59:03.959+0000: 3056.667: [Concurrent update references 3453M->3489M(6144M), 578.340 ms] >> 2017-12-19T16:59:04.538+0000: 3057.246: [Pause Final Update Refs 3489M->1560M(6144M), 2.241 ms] >> 2017-12-19T16:59:04.540+0000: 3057.249: [Concurrent reset bitmaps 1560M->1560M(6144M), 3.516 ms] >> Capacity: 6144M, Peak Occupancy: 4723M, Lowest Free: 1420M, Free Threshold: 1228M >> Adjusting free threshold to: 25% (1536M) > > This one looks like legit non-degraded Final Mark pause. Seeing 175ms for that pause seems odd, > unless you are running fastdebug builds (in which case you will spend some time zapping the memory). 
> With -verbose:gc, there will be a GC stats table after the JVM exits, which will dissect that pause
> much better.
>
> -Aleksey
>

From kirill at korins.ky Fri Dec 22 10:58:08 2017
From: kirill at korins.ky (Kirill A. Korinsky)
Date: Fri, 22 Dec 2017 10:58:08 +0000
Subject: Strange bug inside jetty at shenandoah/jdk8u
In-Reply-To: <3689f0ed-4726-4bb9-1fc6-91d2a6d8649a@redhat.com>
References: <2F36163B-520A-4CA8-A616-B45011FFD37A@korins.ky> <9a44cc76-fa80-506e-d50b-f305b44b218e@redhat.com> <10B4809F-C9C9-430E-8114-729B5D0409E2@korins.ky> <5bb1abe4-1396-afe4-0d0f-c7ae45701e79@redhat.com> <5F8332D4-CB44-4C64-8B4E-84C299DA087D@korins.ky> <8cb74d52-1542-f990-4dfa-5036a1e8cabc@redhat.com> <2a0604fa-7535-1e7c-c878-a81335c89653@redhat.com> <61428EB7-243D-4771-98B8-EDF8C315F2D3@korins.ky> <858a08a6-50fc-d010-2321-5af77a5ecc62@redhat.com> <3689f0ed-4726-4bb9-1fc6-91d2a6d8649a@redhat.com>
Message-ID: <765D93FA-41C4-4D4E-9036-2B92BE38F7A4@korins.ky>

Thanks!

-- 
wbr, Kirill

> On 21 Dec 2017, at 21:55, Aleksey Shipilev wrote:
>
> On 12/21/2017 11:49 AM, Kirill A. Korinsky wrote:
>>> On 21 Dec 2017, at 14:23, Aleksey Shipilev wrote:
>>>
>>> Smoking gun: the failure is here in HeapByteBuffer:
>>>
>>>    public byte get(int i) {
>>>        return hb[ix(checkIndex(i))];   <--- !
>>>    }
>>
>> I mentioned it in the original email ;)
>
> Indeed you have :) My fault for not seeing/remembering right away that checkIndex is the intrinsic.
> It is a testament to our Wiki page on "Functional Diagnostics", which mentions how to dissect
> compiler-based issues if one is suspected. The mere fact it fails only with one particular
> compiler, and that compiler is baseline C1, is one big smell. If that was a simple Java method
> failing -- and I originally thought it was -- it would be a harder deal, because we are supposed to
> make the barriers right for plain heap accesses, otherwise *everything* should crash and burn, and
> not just one test.
>
> Your effort is appreciated, and it helped to nail the bug!
>
> Roman pushed the fix to sh/jdk10. sh/jdk9, sh/jdk8u and Fedora RPMs would be updated in due course,
> but since NY holidays are upon us, I would expect RPMs to happen sometime in January. You may want to
> test with our nightly builds, or disable C1 with -XX:-TieredCompilation if working with
> Fedora-shipped Shenandoah for the time being.
>
> Cheers,
> -Aleksey
>

From rwestrel at redhat.com Fri Dec 22 12:44:30 2017
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 22 Dec 2017 13:44:30 +0100
Subject: Shenandoah WB and tableswitch
In-Reply-To: References: <480da1e0-646b-d8fa-341b-5bc70c0b177c@redhat.com>
Message-ID:

Here is a WIP patch if you want to give it a try:

http://cr.openjdk.java.net/~roland/shenandoah/predicate%2btableswitch.patch

Roland.

From shade at redhat.com Fri Dec 22 13:29:55 2017
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 22 Dec 2017 14:29:55 +0100
Subject: Shenandoah WB and tableswitch
In-Reply-To: References: <480da1e0-646b-d8fa-341b-5bc70c0b177c@redhat.com>
Message-ID:

On 12/22/2017 01:44 PM, Roland Westrelin wrote:
>
> Here is a WIP patch if you want to give it a try:
>
> http://cr.openjdk.java.net/~roland/shenandoah/predicate%2btableswitch.patch

Very good! Our most targeted benchmark:

# WriteBarrierTableSwitch.separate
Parallel:                  1227 ±  9 ns/op
Shenandoah baseline, -WB:  1663 ±  1 ns/op
Shenandoah patched,  -WB:  1081 ±  5 ns/op
Shenandoah baseline, +WB:  1839 ±  1 ns/op
Shenandoah patched,  +WB:  1169 ±
1 ns/op  // +57% better

...which was the minimized version of:

# ObjectInputStream read test
Parallel:                  10981 ±  32 ns/op
Shenandoah baseline, -WB:  10256 ± 206 ns/op
Shenandoah patched,  -WB:  10589 ± 247 ns/op
Shenandoah baseline, +WB:  12890 ± 414 ns/op
Shenandoah patched,  +WB:  11153 ± 209 ns/op  // +15% better

...which was the minimized version of:

# Serial
Parallel:                  2646 ±  2 ops/s
Shenandoah baseline, -WB:  2591 ± 32 ops/s
Shenandoah patched,  -WB:  2553 ± 33 ops/s
Shenandoah baseline, +WB:  2269 ± 30 ops/s
Shenandoah patched,  +WB:  2355 ± 35 ops/s  // +4% better

-Aleksey

From roman at kennke.org Fri Dec 22 15:08:01 2017
From: roman at kennke.org (Roman Kennke)
Date: Fri, 22 Dec 2017 16:08:01 +0100
Subject: Shenandoah WB and tableswitch
In-Reply-To: References: <480da1e0-646b-d8fa-341b-5bc70c0b177c@redhat.com>
Message-ID: <2a9de2fc-2bcd-7bbb-f2ec-57534b9b7d18@kennke.org>

On 22.12.2017 at 14:29, Aleksey Shipilev wrote:
> On 12/22/2017 01:44 PM, Roland Westrelin wrote:
>> Here is a WIP patch if you want to give it a try:
>>
>> http://cr.openjdk.java.net/~roland/shenandoah/predicate%2btableswitch.patch
> Very good! Our most targeted benchmark:
>
> # WriteBarrierTableSwitch.separate
> Parallel:                  1227 ±  9 ns/op
> Shenandoah baseline, -WB:  1663 ±  1 ns/op
> Shenandoah patched,  -WB:  1081 ±  5 ns/op
> Shenandoah baseline, +WB:  1839 ±  1 ns/op
> Shenandoah patched,  +WB:  1169 ±  1 ns/op  // +57% better
>
> ...which was the minimized version of:
>
> # ObjectInputStream read test
> Parallel:                  10981 ±  32 ns/op
> Shenandoah baseline, -WB:  10256 ± 206 ns/op
> Shenandoah patched,  -WB:  10589 ± 247 ns/op
> Shenandoah baseline, +WB:  12890 ± 414 ns/op
> Shenandoah patched,  +WB:  11153 ± 209 ns/op  // +15% better
>
> ...which was the minimized version of:
>
> # Serial
> Parallel:                  2646 ±  2 ops/s
> Shenandoah baseline, -WB:  2591 ± 32 ops/s
> Shenandoah patched,  -WB:  2553 ± 33 ops/s
> Shenandoah baseline, +WB:  2269 ± 30 ops/s
> Shenandoah patched,  +WB:  2355 ± 35 ops/s  // +4% better
>
> -Aleksey
>
>

Great! :-) Weren't the xml benchmarks also affected by this, or am I confusing things? How do we look vs ZGC now?

Roman