From kim.barrett at oracle.com Fri Nov 1 02:12:36 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 31 Oct 2019 22:12:36 -0400 Subject: RFR (XS): 8232951: TestG1ParallelPhases.java fails with phase NonYoungFreeCSet not found In-Reply-To: References: <27fa21ab-d8ac-95ea-3485-7a72116c22f2@oracle.com> <951EB603-F273-4787-9D8D-32D8194ECFAA@oracle.com> <7734C751-1D9B-45F5-86FB-D51D2BE8985F@oracle.com> <1fff7f47-aebc-bbcd-bda6-bb6185c11c3a@oracle.com> Message-ID: <4560FCD7-91B3-42C4-A2C4-B183C2A12B8A@oracle.com> > On Oct 31, 2019, at 5:51 AM, Thomas Schatzl wrote: > > Updated in place; also fixed Kim's comment about line length. > > http://cr.openjdk.java.net/~tschatzl/8232951/webrev/ Still looks good. > >> I am sorry that my "improvements" probably caused this failure, though just having heaps of code and not understanding why, is probably worse in the long run --- at least that is my thinking. > > The question I have is whether I can push these changes under this CR (and if it occurs again we at least have a log to look at) or use another CR for it? I say go ahead. From dean.long at oracle.com Fri Nov 1 08:05:27 2019 From: dean.long at oracle.com (dean.long at oracle.com) Date: Fri, 1 Nov 2019 01:05:27 -0700 Subject: RFR: 8231955: ARM32: Address displacement is 0 for volatile field access because of Unsafe field access. In-Reply-To: <2vyvfdgdqk-1@aserp2050.oracle.com> References: <20191010143426.BA4B6319F46@aojmv0009> <20191015073212.7FCCA319074@aojmv0009> <587f6363-bbdc-da12-9e50-82acc5bc5853@oracle.com> <2vyvfdgdqk-1@aserp2050.oracle.com> Message-ID: On 10/31/19 2:12 AM, christoph.goettschkes at microdoc.com wrote: >> I see now that BarrierSetC1::resolve_address() is calling >> generate_address(), at least when access isn't patched. So now I'm >> thinking that the address passed to >> volatile_field_load/volatile_field_store should be correct, and the call >> to add_large_constant() isn't necessary. > Yes, this is correct. 
The LIR_Address is created by
> LIRGenerator::generate_address and has a displacement of 0.
> I attached a backtrace of the failing assert at the end of this mail.
>
> Do you think the patch makes sense and can be pushed?
> The HotSpot tier1 JTreg tests are passing with this and other patches I am
> working on applied with a debug VM.

Yes, it looks fine now that I am reminded that arm32 is using ldmia/stmia
in volatile_move_op, which means the displacement must be 0.

dl

> -- Christoph
>
> #0 0x7636b860 in LIRGenerator::add_large_constant
> (this=0x641ae2f0, src=0xe500b, c=0, dest=0xe900b)
> at src/hotspot/cpu/arm/c1_LIRGenerator_arm.cpp:166
> #1 0x7636f266 in LIRGenerator::volatile_field_load
> (this=0x641ae2f0, address=0x6429c970, result=0xdd093, info=0x0)
> at src/hotspot/cpu/arm/c1_LIRGenerator_arm.cpp:1326
> #2 0x762d9806 in BarrierSetC1::load_at_resolved
> (this=0x7602b1f0, access=..., result=0xdd093)
> at src/hotspot/share/gc/shared/c1/barrierSetC1.cpp:183
> #3 0x762d929a in BarrierSetC1::load_at
> (this=0x7602b1f0, access=..., result=0xdd093)
> at src/hotspot/share/gc/shared/c1/barrierSetC1.cpp:94
> #4 0x7635f6cc in LIRGenerator::access_load_at
> (this=0x641ae2f0, decorators=9127331840, type=T_LONG, base=...,
> offset=0xd900b, result=0xdd093, patch_info=0x0, load_emit_info=0x0)
> at src/hotspot/share/c1/c1_LIRGenerator.cpp:1618
> #5 0x7636133e in LIRGenerator::do_UnsafeGetObject
> (this=0x641ae2f0, x=0x6429a0d0)
> at src/hotspot/share/c1/c1_LIRGenerator.cpp:2173
> #6 0x76328bdc in UnsafeGetObject::visit
> (this=0x6429a0d0, v=0x641ae2f0)
> at src/hotspot/share/c1/c1_Instruction.hpp:2407
> #7 0x7635b2d2 in LIRGenerator::do_root
> (this=0x641ae2f0, instr=0x6429a0d0)
> at src/hotspot/share/c1/c1_LIRGenerator.cpp:373
> #8 0x7635b1f2 in LIRGenerator::block_do
> (this=0x641ae2f0, block=0x64299788)
> at src/hotspot/share/c1/c1_LIRGenerator.cpp:354
> #9 0x76337d5a in BlockList::iterate_forward
> (this=0x6429bf00, closure=0x641ae2f4)
> at
src/hotspot/share/c1/c1_Instruction.cpp:921 > #10 0x76332936 in IR::iterate_linear_scan_order > (this=0x642994d0, closure=0x641ae2f4) > at src/hotspot/share/c1/c1_IR.cpp:1221 > #11 0x7630ed10 in Compilation::emit_lir > (this=0x641ae5c0) > at src/hotspot/share/c1/c1_Compilation.cpp:259 > #12 0x7630f2be in Compilation::compile_java_method > (this=0x641ae5c0) > at src/hotspot/share/c1/c1_Compilation.cpp:398 > #13 0x7630f566 in Compilation::compile_method > (this=0x641ae5c0) > at src/hotspot/share/c1/c1_Compilation.cpp:460 > #14 0x7630fabc in Compilation::Compilation > (this=0x641ae5c0, compiler=0x760eb610, env=0x641ae848, > method=0x63d2edc8, osr_bci=-1, buffer_blob=0x73eb7448, > directive=0x760cf858) > at src/hotspot/share/c1/c1_Compilation.cpp:583 > #15 0x76312d6e in Compiler::compile_method > (this=0x760eb610, env=0x641ae848, method=0x63d2edc8, entry_bci=-1, > directive=0x760cf858) > at src/hotspot/share/c1/c1_Compiler.cpp:247 > #16 0x76453704 in CompileBroker::invoke_compiler_on_method > (task=0x642cfa50) > at src/hotspot/share/compiler/compileBroker.cpp:2115 > #17 0x764529ba in CompileBroker::compiler_thread_loop > () > at src/hotspot/share/compiler/compileBroker.cpp:1800 > #18 0x7693548c in compiler_thread_entry > (thread=0x6423b400, __the_thread__=0x6423b400) > at src/hotspot/share/runtime/thread.cpp:3401 > #19 0x769315d4 in JavaThread::thread_main_inner > (this=0x6423b400) > at src/hotspot/share/runtime/thread.cpp:1917 > #20 0x769314ac in JavaThread::run > (this=0x6423b400) > at src/hotspot/share/runtime/thread.cpp:1900 > #21 0x7692e884 in Thread::call_run > (this=0x6423b400) > at src/hotspot/share/runtime/thread.cpp:398 > #22 0x768285ce in thread_native_entry > (thread=0x6423b400) > at src/hotspot/os/linux/os_linux.cpp:790 > #23 0x76f84568 in start_thread() from target:/usr/lib/libpthread.so.0 > #24 0x76ef8ac8 in ?? 
() from target:/usr/lib/libc.so.6 > From aph at redhat.com Fri Nov 1 10:15:37 2019 From: aph at redhat.com (Andrew Haley) Date: Fri, 1 Nov 2019 10:15:37 +0000 Subject: [aarch64-port-dev ] RFR 8233339: Shenandoah: Centralize load barrier decisions into ShenandoahBarrierSet In-Reply-To: <6ef89df6-84db-0ffe-d1fc-7ffde7e622bf@redhat.com> References: <6ef89df6-84db-0ffe-d1fc-7ffde7e622bf@redhat.com> Message-ID: <4ed90469-8689-b49d-69f1-98f644e9edd0@redhat.com> On 10/31/19 6:48 PM, Zhengyu Gu wrote: > Right now, the decisions on, if a load barrier needs load reference > barrier, if so, what kind? and if the reference needs to be kept alive, > are scattered inside interpreter/c1/2 load barrier code, which is hard > to make them consistent. > > I would like to centralize the decision making into > ShenandoahBarrierSet, so them can be consistent and easy to maintain. You should say, at the start of every routine you touch, which registers are inputs, which are outputs, and (important) which may alias with rscratch1 and rscratch2. Please also mark clobbers of rscratch1 and 2. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From shade at redhat.com Fri Nov 1 14:06:50 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 1 Nov 2019 15:06:50 +0100 Subject: RFR (S) 8233387: Shenandoah: passive mode should disable pacing ergonomically Message-ID: <43d07d59-f685-23a9-7a73-38e7284a341f@redhat.com> RFE: https://bugs.openjdk.java.net/browse/JDK-8233387 This is the follow-up from JDK-8232791. There, we have added the ergonomic disable block in ShenandoahPassiveHeuristics. But, there is already the disable block in ShenandoahPassiveMode! In fact, it was there in ShenandoaPassiveHeuristics before introducing ShenandoahPassiveMode. We should revert JDK-8232791, and set the flag ergonomically in ShenandoahPassiveMode. 
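
To illustrate what "set the flag ergonomically" means here, a minimal sketch follows. It assumes an ergonomic change should only apply when the user has not set the flag explicitly. BoolFlag and ergo_disable are hypothetical names for illustration, not HotSpot's actual flag machinery.

```cpp
#include <cassert>

// Hypothetical flag model; HotSpot's real flag machinery is not reproduced here.
struct BoolFlag {
  bool value;                // current value of the flag
  bool set_on_command_line;  // true if the user chose the value explicitly
};

// Disable the flag only when the user did not set it explicitly.
// Returns true if the ergonomic change was applied.
bool ergo_disable(BoolFlag& flag) {
  if (!flag.set_on_command_line && flag.value) {
    flag.value = false;
    return true;
  }
  return false;
}
```

With this shape, a pacing flag left at its default would be switched off in passive mode, while an explicit user setting would be left alone.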
Fix: https://cr.openjdk.java.net/~shade/8233387/webrev.01/ Testing: eyeballing GC logs, hotspot_gc_shenandoah -- Thanks, -Aleksey From zgu at redhat.com Fri Nov 1 14:15:56 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 1 Nov 2019 10:15:56 -0400 Subject: [aarch64-port-dev ] RFR 8233339: Shenandoah: Centralize load barrier decisions into ShenandoahBarrierSet In-Reply-To: <4ed90469-8689-b49d-69f1-98f644e9edd0@redhat.com> References: <6ef89df6-84db-0ffe-d1fc-7ffde7e622bf@redhat.com> <4ed90469-8689-b49d-69f1-98f644e9edd0@redhat.com> Message-ID: >> >> I would like to centralize the decision making into >> ShenandoahBarrierSet, so them can be consistent and easy to maintain. > > You should say, at the start of every routine you touch, which > registers are inputs, which are outputs, and (important) which may > alias with rscratch1 and rscratch2. Please also mark clobbers of > rscratch1 and 2. > Okay, updated: Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233339/webrev.01/index.html Thanks, -Zhengyu From zgu at redhat.com Fri Nov 1 14:55:03 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 1 Nov 2019 10:55:03 -0400 Subject: RFR (S) 8233387: Shenandoah: passive mode should disable pacing ergonomically In-Reply-To: <43d07d59-f685-23a9-7a73-38e7284a341f@redhat.com> References: <43d07d59-f685-23a9-7a73-38e7284a341f@redhat.com> Message-ID: <9e721547-6d76-3ab7-3048-0f335b32186b@redhat.com> Good to me. -Zhengyu On 11/1/19 10:06 AM, Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8233387 > > This is the follow-up from JDK-8232791. There, we have added the ergonomic disable block in > ShenandoahPassiveHeuristics. But, there is already the disable block in ShenandoahPassiveMode! In > fact, it was there in ShenandoaPassiveHeuristics before introducing ShenandoahPassiveMode. We should > revert JDK-8232791, and set the flag ergonomically in ShenandoahPassiveMode. 
> > Fix: > https://cr.openjdk.java.net/~shade/8233387/webrev.01/ > > Testing: eyeballing GC logs, hotspot_gc_shenandoah > From shade at redhat.com Fri Nov 1 15:43:20 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 1 Nov 2019 16:43:20 +0100 Subject: [aarch64-port-dev ] RFR 8233339: Shenandoah: Centralize load barrier decisions into ShenandoahBarrierSet In-Reply-To: References: <6ef89df6-84db-0ffe-d1fc-7ffde7e622bf@redhat.com> <4ed90469-8689-b49d-69f1-98f644e9edd0@redhat.com> Message-ID: <0fb9cd70-0a89-8c14-7469-55205c4c3808@redhat.com> On 11/1/19 3:15 PM, Zhengyu Gu wrote: > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233339/webrev.01/index.html To be honest, it does not look like much of the improvement from the first glance. Maybe we should massage the code a bit to make it more readable? Roman also needs to take a look. *) shenandoahBarrierSetAssembler_x86.cpp, I believe it would be more straightforward to save branching on local variable "need_load_reference_barrier" by spelling out the "disabled" path directly (in fact, I think you are almost there in shenandoahBarrierSetC1.cpp!): if (!ShenandoahBarrierSet::need_load_reference_barrier(decorators, type)) { BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1, tmp_thread); return; } ... code that assumes need_load_reference_barrier = true follows ... Register result_dst = dst; bool use_tmp1_for_dst = false; *) shenandoahBarrierSetC1.cpp: local variable "need_load_reference_barrier" is not needed, there is only a single use *) shenandoahBarrierSetC2.cpp: this block should go all the way up: 557 if (!ShenandoahBarrierSet::need_load_reference_barrier(decorators, type)) { 558 return load; 559 } *) shenandoahBarrierSet.cpp: this is just "return is_reference_type(type)". Saves some inversions. 
78 if (!is_reference_type(type)) return false;
79 return true;

*) shenandoahBarrierSet.cpp: should be "Should be subset of LRB":

83 assert(need_load_reference_barrier(decorators, type), "Why ask?");

*) shenandoahBarrierSet.cpp: seems like this assert is subsumed by the previous one?

84 assert(is_reference_type(type), "Why we here?");

--
Thanks,
-Aleksey

From zgu at redhat.com Fri Nov 1 17:37:49 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 1 Nov 2019 13:37:49 -0400
Subject: [aarch64-port-dev ] RFR 8233339: Shenandoah: Centralize load barrier decisions into ShenandoahBarrierSet
In-Reply-To: <0fb9cd70-0a89-8c14-7469-55205c4c3808@redhat.com>
References: <6ef89df6-84db-0ffe-d1fc-7ffde7e622bf@redhat.com> <4ed90469-8689-b49d-69f1-98f644e9edd0@redhat.com> <0fb9cd70-0a89-8c14-7469-55205c4c3808@redhat.com>
Message-ID:

Hi Aleksey,

On 11/1/19 11:43 AM, Aleksey Shipilev wrote:
> On 11/1/19 3:15 PM, Zhengyu Gu wrote:
>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233339/webrev.01/index.html
>
> To be honest, it does not look like much of the improvement from the first glance. Maybe we should
> massage the code a bit to make it more readable? Roman also needs to take a look.

Right, it is not. But I believe that should be done in a separate CR, as it
may cause backport headaches, right?

Filed: https://bugs.openjdk.java.net/browse/JDK-8233401

As a matter of fact, I would like to hold off on this code review until the
refactor is done.

Thanks,

-Zhengyu

>
> *) shenandoahBarrierSetAssembler_x86.cpp, I believe it would be more straightforward to save
> branching on local variable "need_load_reference_barrier" by spelling out the "disabled" path
> directly (in fact, I think you are almost there in shenandoahBarrierSetC1.cpp!):
>
> if (!ShenandoahBarrierSet::need_load_reference_barrier(decorators, type)) {
> BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1, tmp_thread);
> return;
> }
>
> ... code that assumes need_load_reference_barrier = true follows ...
>
> Register result_dst = dst;
> bool use_tmp1_for_dst = false;
>
> *) shenandoahBarrierSetC1.cpp: local variable "need_load_reference_barrier" is not needed, there is
> only a single use
>
> *) shenandoahBarrierSetC2.cpp: this block should go all the way up:
>
> 557 if (!ShenandoahBarrierSet::need_load_reference_barrier(decorators, type)) {
> 558 return load;
> 559 }
>
> *) shenandoahBarrierSet.cpp: this is just "return is_reference_type(type)". Saves some inversions.
>
> 78 if (!is_reference_type(type)) return false;
> 79 return true;
>
> *) shenandoahBarrierSet.cpp: should be "Should be subset of LRB":
>
> 83 assert(need_load_reference_barrier(decorators, type), "Why ask?");
>
> *) shenandoahBarrierSet.cpp: seems like this assert is subsumed by the previous one?
>
> 84 assert(is_reference_type(type), "Why we here?");
>
>

From sangheon.kim at oracle.com Sat Nov 2 06:08:15 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Fri, 1 Nov 2019 23:08:15 -0700
Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for G1, Logging (3/3)
In-Reply-To: <6449da1d-6dcc-50ba-8ae8-7615e7ad35f9@oracle.com>
References: <743d16cc-499d-784b-79fc-c006643f9ec5@oracle.com> <6449da1d-6dcc-50ba-8ae8-7615e7ad35f9@oracle.com>
Message-ID:

Hi Stefan,

On 10/31/19 8:31 AM, Stefan Johansson wrote:
> Hi Sangheon,
>
> On 2019-10-23 08:39, sangheon.kim at oracle.com wrote:
>> Hi Thomas,
>>
>> I am posting the next webrev as Kim is waiting for it.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.3
>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.3.inc
>
> Here are my comments:
> src/hotspot/share/gc/g1/g1CollectedHeap.hpp
> ---
> 2397     st->print("  remaining free region(s) from each node id: ");
>
> What do you think about changing this to "... region(s) on each NUMA
> node: "? I think we should be clear about the logging being for NUMA.

Changed as you suggested.
> ---
>
> src/hotspot/share/gc/g1/g1EdenRegions.hpp
> ---
> 33 class G1EdenRegions : public G1RegionCounts {
>
> I don't think G1EdenRegions is a G1RegionCounts but rather it should
> have one. So instead of using inheritance here I think G1EdenRegions
> should have a G1RegionCounts. Instead of overloading length I would
> then suggest adding a region_count(uint node_index) to get the count.
>
> Same goes for G1SurvivorRegions.

Changed as you suggested for both G1EdenRegions and G1SurvivorRegions.

> ---
>
> src/hotspot/share/gc/g1/g1NUMA.cpp
> ---
>  279 bool NodeIndexCheckClosure::do_heap_region(HeapRegion* hr) {
>  280   uint preferred_node_index =
> _numa->preferred_node_index_for_index(hr->hrm_index());
>  281   uint active_node_index = _numa->index_of_address(hr->bottom());
>  282
>  283   if (preferred_node_index == active_node_index) {
>  284     _matched[preferred_node_index]++;
>  285   } else if (active_node_index == G1NUMA::UnknownNodeIndex) {
>  286     _unknown++;
>  287   }
>  288   _total++;
>  289
>  290   return false;
>  291 }
>
> As we discussed offline, I would like to know the mismatches as well,
> I think the easiest approach would be to make the total count per node
> as well and that way we can see if there were any regions that didn't
> match. What do you think about printing the info like this:
> [3,009s][trace][gc,heap,numa ] GC(6) NUMA region verification
> (actual/expected): 0: 1024/1024, 1: 270/1024, Unknown: 0

Changed as you suggested to count per node, but deleted unknown, since
unknown is 'total - matched'.

>
> When testing this I also realized this output is problematic in the
> case where we have committed regions that have not yet been used.
> Reading the manual for get_mempolicy (the way we get the numa id for
> the address) says:
> "If no page has yet been allocated for the specified address,
> get_mempolicy() will allocate a page as if the thread had performed a
> read (load) access to that address, and return the ID of the node
> where that page was allocated."

Nice catch. In short, get_mempolicy() doesn't honor the
os::numa_make_local() call that we previously requested, and this problem
occurs when AlwaysPreTouch is disabled. However, my initial implementation,
which uses 'numa_move_pages()', doesn't have this problem, so one
fundamental solution would be to replace the Linux implementation.
Within the current scope of 3 patches, there will be no problem if we add a
'hr->free && !AlwaysPreTouch' condition check; however,
os::numa_get_group_id_for_address() will still have this limitation.
What do you think about changing the Linux implementation? (webrev link is
added at the end)

>
> Doing a read access seems to always get a page on NUMA node 0, so the
> accounting will not be correct in this case.

Right; to be clear, it is the calling process's NUMA id.

>
> One way to fix this would be to only do accounting for regions
> currently used (!hr->is_free()) but I'm not sure that is exactly what
> we want, at least not if we only do this after the GC, then only the
> survivors and old will be checked. We could solve this by also doing
> verification before the GC. I think this might be the way to go, what
> do you think? If my proposal was hard to follow, here's a patch:
> http://cr.openjdk.java.net/~sjohanss/numa/verify-alternative/
>
> The output from this patch would be:
> [9,233s][trace][gc,heap,numa   ] GC(18) GC Start: NUMA region
> verification (actual/expected): 0: 358/358, 1: 361/361, Unknown: 0
> [9,306s][trace][gc,heap,numa   
] GC(18) GC End: NUMA region
> verification (actual/expected): 0: 348/348, 1: 347/347, Unknown: 0
>
> One can also see that this verification takes some time, so maybe it
> would make sense to have this logging under gc+numa+verify.

I think if we avoid calling G1NUMA::index_of_address() when
'hr->free() && !AlwaysPreTouch' but still count the total, we will be fine.
Basically I imported your patch, but the verification only happens at the
beginning of GC. Or, with 'numa_move_pages()', we don't need such a
condition check.

> ---
>
>  234   uint converted_req_index = requested_node_index;
>  235   if(converted_req_index == AnyNodeIndex) {
>  236     converted_req_index = _num_active_node_ids;
>  237   }
>  238   if (converted_req_index <= _num_active_node_ids) {
>  239     _times->update(phase, converted_req_index,
> allocated_node_index);
>  240   }
>
> I had to read this more than once to understand what it really did and
> I think we can simplify it a bit, by just doing an if-else that checks
> for AnyNodeIndex and if so passes in _num_active_node_ids to update().
> This should be ok since requested_node_index never can be larger than
> _num_active_node_ids.

Tried to reflect your comment. :)

> ---
>
> src/hotspot/share/gc/g1/g1ParScanThreadState.cpp
> ---
> I would prefer if we hide all the accounting in helper functions, but
> it might be good to declare them to be inlined.
>
>   85   if (_numa->is_enabled()) {
>   86     LogTarget(Info, gc, heap, numa) lt;
>   87
>   88     if (lt.is_enabled()) {
>   89       uint num_nodes = _numa->num_active_nodes();
>   90       // Record only if there are multiple active nodes.
>   91       _obj_alloc_stat = NEW_C_HEAP_ARRAY(size_t, num_nodes, mtGC);
>   92       memset((void*)_obj_alloc_stat, 0, sizeof(size_t) * num_nodes);
>   93     }
>   94   }
>
> Move to something like initialize_numa_stats().
>
>  108   if (_obj_alloc_stat != NULL) {
>  109     
uint node_index = _numa->index_of_current_thread();
>  110 _numa->copy_statistics(G1NodeTimes::LocalObjProcessAtCopyToSurv,
> node_index, _obj_alloc_stat);
>  111   }
>
> This could be called flush_numa_stats().
>
>  268     if (_obj_alloc_stat != NULL) {
>  269       _obj_alloc_stat[node_index]++;
>  270     }
>
> And this something like update_numa_stats(uint).

All changed with the suggested helper functions.

> --
>
> heapRegionSet.hpp
> ---
> 159   inline void update_length(HeapRegion* hr, bool increase);
> 254   inline void update_length(HeapRegion* hr, bool increase);
>
> Is there any reason for having update_length that takes a bool rather
> than having one function for increments and one for decrements? To me
> it looks like all uses are pretty well defined and it would make the
> code easier to read. I also think we could pass in the node index
> rather than the HeapRegion since the getter length() does this.

Done.

> ---
>
> src/hotspot/share/gc/g1/g1NodeTimes.cpp
> ---
> First, a question about the names, G1NodeTimes signals that it has to
> do with timing, but currently we don't really record any timings. Same
> thing with NodeStatPhases, not really the same type of phases that we
> have for the rest of the GC logging. What do you think about renaming
> the class to G1NUMAStats and the enum to NodeDataItems?

Changed both class and enum names as you suggested.

>
>  166 void G1NodeTimes::print_phase_info(G1NodeTimes::NodeStatPhases
> phase) {
>  167   LogTarget(Info, gc, heap, numa) lt;
>
> I think this should be on debug level, but if you don't agree leave it
> as is.

I feel Info seems okay, so let me leave it as is.

> ---
>
>  191 void G1NodeTimes::print_mutator_alloc_stat_debug() {
>  192   LogTarget(Debug, gc, heap, numa) lt;
>
> And if you agree on moving the above to debug I think this should be
> on trace level.

As is, please.
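
For illustration, the split of update_length(hr, bool) into separate increment and decrement functions that take a node index, as discussed above, could look roughly like the sketch below. PerNodeLength and its members are hypothetical names; this is not the code from the webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for per-node region length bookkeeping.
// Not the class from the webrev; names are illustrative only.
class PerNodeLength {
  std::vector<unsigned> _length_of_node;  // one counter per active NUMA node

public:
  explicit PerNodeLength(std::size_t num_nodes) : _length_of_node(num_nodes, 0) {}

  // Two explicit entry points replace update_length(hr, bool increase).
  // Indices outside the active range are ignored.
  void increase_length(std::size_t node_index) {
    if (node_index < _length_of_node.size()) {
      _length_of_node[node_index]++;
    }
  }

  void decrease_length(std::size_t node_index) {
    if (node_index < _length_of_node.size()) {
      assert(_length_of_node[node_index] > 0 && "length underflow");
      _length_of_node[node_index]--;
    }
  }

  // The getter takes the same kind of argument as the two updaters.
  unsigned length(std::size_t node_index) const {
    return node_index < _length_of_node.size() ? _length_of_node[node_index] : 0;
  }
};
```

Call sites then read as increase_length(index) or decrease_length(index), so the direction of the update is visible without looking up what a bool argument means.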
Here's the webrev: http://cr.openjdk.java.net/~sangheki/8220312/webrev.4 http://cr.openjdk.java.net/~sangheki/8220312/webrev.4.inc http://cr.openjdk.java.net/~sangheki/8220312/webrev.4.inc.numa_move_pages Testing: hs-tier 1 ~ 5 with / without UseNUMA. Thanks, Sangheon > --- > > This is it for now. Thanks, > Stefan > > >> Testing: hs-tier 1 ~ 4 with/without UseNUMA. hs-tier5 is almost >> finished without new failures. >> >> Thanks, >> Sangheon >> >> From zgu at redhat.com Sat Nov 2 15:07:31 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Sat, 2 Nov 2019 11:07:31 -0400 Subject: RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code Message-ID: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> Please review this refactor of Shenandoah load barrier. The goal is to make the barrier structurally similar cross interpreter, C1 and C2, improve readability and maintainability. Bug: https://bugs.openjdk.java.net/browse/JDK-8233401 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.00/index.html Test: hotspot_gc_shenandoah (fastdebug and release) x86_64 and x86_32 on Linux AArch64 on Linux Thanks, -Zhengyu From kim.barrett at oracle.com Mon Nov 4 04:07:33 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Sun, 3 Nov 2019 23:07:33 -0500 Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for G1, Logging (3/3) In-Reply-To: References: <743d16cc-499d-784b-79fc-c006643f9ec5@oracle.com> <6449da1d-6dcc-50ba-8ae8-7615e7ad35f9@oracle.com> Message-ID: > On Nov 2, 2019, at 2:08 AM, sangheon.kim at oracle.com wrote: > > Here's the webrev: > http://cr.openjdk.java.net/~sangheki/8220312/webrev.4 > http://cr.openjdk.java.net/~sangheki/8220312/webrev.4.inc > http://cr.openjdk.java.net/~sangheki/8220312/webrev.4.inc.numa_move_pages > > Testing: hs-tier 1 ~ 5 with / without UseNUMA. I didn't spend much time looking at the actual logging output; Thomas and Stefan have both given that a pretty thorough look. 
Instead I looked for more structural things.

------------------------------------------------------------------------------

There are several functions named "print()" which I think should have
some other name. The classes involved (G1NUMA, G1NUMAStats, maybe
others that I didn't check for) are derived from CHeapObj<>. In a
non-product build, CHeapObj<> is derived from AllocatedObj, which
provides a (non-virtual) "print()" function that calls the virtual
"print_on()" function. By giving these classes their own print(),
this change is overriding this existing public API in non-product
builds only. While overriding a public non-virtual function is
permitted, it is usually denigrated. And having different overriding
behavior based on build type seems particularly confusing, especially
since one is about logging and the other isn't.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1ParScanThreadState.inline.hpp
241 memset((void*)_obj_alloc_stat, 0, sizeof(size_t) * num_nodes);

Unnecessary cast to void*.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1RegionCounts.hpp
28 #include "gc/g1/heapRegion.hpp"

Unnecessary #include here; forward declaration of HeapRegion would be
sufficient.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1RegionCounts.hpp

Why are add, clear, and length virtual? This class doesn't seem to be
used as a base class, and isn't overriding anything from its base.
If it were a base class, then the destructor should not be public and
non-virtual.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1RegionCounts.hpp
34 class G1RegionCounts : public StackObj {

The name of this class seems very generic and isn't at all suggestive
of it having anything at all to do with NUMA.
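
The print() concern raised at the top of this review can be shown with a small sketch. It assumes only what is described above: a base class with a non-virtual print() that forwards to a virtual print_on(), and a derived class that declares its own print(). AllocatedObjLike and NumaStatsLike are hypothetical names, not HotSpot classes, and the functions return strings so the effect is easy to check.

```cpp
#include <cassert>
#include <string>

// Hypothetical miniature of the situation; these are illustrative names,
// not HotSpot classes.
struct AllocatedObjLike {
  virtual ~AllocatedObjLike() = default;
  virtual std::string print_on() const { return "base state"; }
  // Non-virtual public entry point that forwards to the virtual hook.
  std::string print() const { return print_on(); }
};

struct NumaStatsLike : AllocatedObjLike {
  std::string print_on() const override { return "numa state"; }
  // Declaring print() here hides AllocatedObjLike::print() for callers that
  // hold a NumaStatsLike, but not for callers that hold the base type.
  std::string print() const { return "numa log output"; }
};

// The same object answers differently depending on the static type used.
std::string via_derived(const NumaStatsLike& s) { return s.print(); }
std::string via_base(const AllocatedObjLike& s) { return s.print(); }
```

Because print() is not virtual, which function runs depends on the static type at the call site, which is why a different name for the logging function avoids the confusion.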
------------------------------------------------------------------------------
src/hotspot/share/gc/g1/heapRegionManager.cpp
128 if (hr->node_index() < numa->num_active_nodes()) {
129 numa->update_statistics(G1NUMAStats::NewRegionAlloc, requested_node_index, hr->node_index());
130 }

Should we be doing this if NUMA is disabled?

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/heapRegionSet.cpp
323 FreeRegionList::FreeRegionList(const char* name, HeapRegionSetChecker* checker):
324 HeapRegionSetBase(name, checker),
325 _node_info(G1NUMA::numa()->is_enabled() ? new NodeInfo() : NULL) {

Unusual indentation of initializer-list.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/heapRegionSet.hpp
189 const uint requested_node_index);

[pre-existing from earlier patch in this series?]
Useless const qualifier.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/heapRegionSet.hpp
246 class NodeInfo : public CHeapObj {

NodeInfo is an overly generic name for the global namespace.

------------------------------------------------------------------------------

From shade at redhat.com Mon Nov 4 09:10:29 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 4 Nov 2019 10:10:29 +0100
Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code
In-Reply-To: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com>
References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com>
Message-ID:

On 11/2/19 4:07 PM, Zhengyu Gu wrote:
> Please review this refactor of Shenandoah load barrier. The goal is to make the barrier structurally
> similar cross interpreter, C1 and C2, improve readability and maintainability.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8233401
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.00/index.html

This is a cute patch.

*) Typo "non-reference load":

207 // 1: none-reference load, no additional barrier is needed

*) The comment style is inconsistent with other places:

537 Node* ShenandoahBarrierSetC2::load_at_resolved(C2Access& access, const Type* val_type) const {
538 // 1: load reference
539 Node* load = BarrierSetC2::load_at_resolved(access, val_type);
540 // For none-reference load, no additional barrier is needed

*) In constructions like this, it seems more consistent to introduce the local variable for
matching the decorator?

387 // Native barrier is for concurrent root processing
388 if (((decorators & IN_NATIVE) != 0) &&
389 ShenandoahConcurrentRoots::can_do_concurrent_roots()) {

Otherwise looks good. Roman needs to take a look as well.

--
Thanks,
-Aleksey

From aph at redhat.com Mon Nov 4 09:44:34 2019
From: aph at redhat.com (Andrew Haley)
Date: Mon, 4 Nov 2019 09:44:34 +0000
Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code
In-Reply-To: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com>
References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com>
Message-ID:

On 11/2/19 3:07 PM, Zhengyu Gu wrote:
> Please review this refactor of Shenandoah load barrier. The goal is to
> make the barrier structurally similar cross interpreter, C1 and C2,
> improve readability and maintainability.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8233401
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.00/index.html
>
> Test:
> hotspot_gc_shenandoah (fastdebug and release)
> x86_64 and x86_32 on Linux
> AArch64 on Linux

Thanks, this is an improvement. However, it's still weird.
// // Arguments: // // Inputs: // src: oop location to load from, might be clobbered // tmp1: unused // tmp_thread: unused // // Output: // dst: oop loaded from src location // // Kill: // rscratch1 (scratch reg) // // Alias: // dst: rscratch1 (might use rscratch1 as temporary output register to avoid clobbering src) // void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, Register dst, Address src, Register tmp1, Register tmp_thread) { tmp1 and tmp_thread are unused? It'd be a good idea, then, to say if they are safe to use or not. Or maybe even better do this if you want to keep the same arg list: void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, Register dst, Address src, Register, Register) { I guess it really isn't safe to use "tmp1" as a tmp, regardless of its name. If so, better pass it as noreg/ -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From thomas.schatzl at oracle.com Mon Nov 4 10:34:26 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 4 Nov 2019 11:34:26 +0100 Subject: RFR (XS): 8232951: TestG1ParallelPhases.java fails with phase NonYoungFreeCSet not found In-Reply-To: <4560FCD7-91B3-42C4-A2C4-B183C2A12B8A@oracle.com> References: <27fa21ab-d8ac-95ea-3485-7a72116c22f2@oracle.com> <951EB603-F273-4787-9D8D-32D8194ECFAA@oracle.com> <7734C751-1D9B-45F5-86FB-D51D2BE8985F@oracle.com> <1fff7f47-aebc-bbcd-bda6-bb6185c11c3a@oracle.com> <4560FCD7-91B3-42C4-A2C4-B183C2A12B8A@oracle.com> Message-ID: Hi, On 01.11.19 03:12, Kim Barrett wrote: >> On Oct 31, 2019, at 5:51 AM, Thomas Schatzl wrote: >> >> Updated in place; also fixed Kim's comment about line length. >> >> http://cr.openjdk.java.net/~tschatzl/8232951/webrev/ > > Still looks good. 
>
>>
>>> I am sorry that my "improvements" probably caused this failure, though just having heaps of code and not understanding why, is probably worse in the long run --- at least that is my thinking.
>>
>> The question I have is whether I can push these changes under this CR (and if it occurs again we at least have a log to look at) or use another CR for it?
>
> I say go ahead.
>

Thanks Kim, Leo for your input and reviews! Pushed.

Thanks,
  Thomas

From thomas.schatzl at oracle.com Mon Nov 4 10:46:49 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 4 Nov 2019 11:46:49 +0100
Subject: RFR (S): 8233301: Implementation of JEP 366: Deprecate the ParallelScavenge + SerialOld GC Combination
In-Reply-To:
References: <727338ce-845c-067f-e56e-f066d2a602dd@oracle.com>
Message-ID:

Hi,

On 01.11.19 00:05, Bernd Eckenfels wrote:
> The help message:
>
> Use the Parallel Old garbage collector. Deprecated. product(bool, UseParallelOldGC, false, \ "Use the Serial Old garbage collection algorithm for old " \ "generation. Deprecated." \
>
> Looks a bit misleading to me. I know it means the option is deprecated (especially the non-default negative value)
> but it could easily be understood as ParallelOld being deprecated.

I hope the new text is easier to understand and does not lead to the confusion you suggested.

Note that the default value of "false" is correct although maybe surprising, as this change does not modify it. It's only that -XX:+UseParallelGC has actually been a shorthand for "-XX:+UseParallelGC -XX:+UseParallelOldGC" for like forever. I do not want to change this here.

Note that these strings are not the official documentation; the manpage is. It reads:

`-XX:+UseParallelGC`
: Enables the use of the parallel scavenge garbage collector (also known as the throughput collector) to improve the performance of your application by leveraging multiple processors. By default, this option is disabled and the default collector is used.
If it's enabled, then the `-XX:+UseParallelOldGC` option is automatically enabled, unless you explicitly disable it. and (with this change, as part of the "Deprecated" section): `-XX:+UseParallelOldGC` : Enables the use of the parallel garbage collector for full GCs. By default, this option is disabled. Enabling it automatically enables the `-XX:+UseParallelGC` option. Which seems in line with what is now told in the sources too. > > There is no jtreg for +UseParallelOld. It would need to document that deprecation warning is expected for that as well? There is a separate test case in test/hotspot/jtreg/runtime/CommandLine/VMDeprecatedOptions.java which does not seem updated for any and all deprecated options. I added this flag to it anyway. Webrev: http://cr.openjdk.java.net/~tschatzl/8233301/webrev/ Testing: local compilation + local jtreg test Thanks, Thomas From zgu at redhat.com Mon Nov 4 14:08:42 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 4 Nov 2019 09:08:42 -0500 Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code In-Reply-To: References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> Message-ID: Hi Andrew, Thanks for the review. > void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, > Register dst, Address src, Register tmp1, Register tmp_thread) { > > tmp1 and tmp_thread are unused? It'd be a good idea, then, to say if they are > safe to use or not. Or maybe even better do this if you want to keep the same > arg list: > > void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, > Register dst, Address src, Register, Register) { > This is an overrode method. What you get for tmp1 and tmp_thread, is really platform dependent. On AArch64, you usually get noreg for tmp1 and tmp_thread. I can not tell if you can safely use tmp1 if it is valid. 
I don't use tmp1 here, since I don't think it is worth the trouble, as we have spare scratch registers. I do use tmp1 in x86 through. What do you suggest the comment should be? Thanks, -Zhengyu > I guess it really isn't safe to use "tmp1" as a tmp, regardless of its name. > > If so, better pass it as noreg/ > From aph at redhat.com Mon Nov 4 14:32:36 2019 From: aph at redhat.com (Andrew Haley) Date: Mon, 4 Nov 2019 14:32:36 +0000 Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code In-Reply-To: References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> Message-ID: <6ff66df6-cba8-e2a3-30ba-0ba5656e15fb@redhat.com> On 11/4/19 2:08 PM, Zhengyu Gu wrote: > >> void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, >> Register dst, Address src, Register tmp1, Register tmp_thread) { >> >> tmp1 and tmp_thread are unused? It'd be a good idea, then, to say if they are >> safe to use or not. Or maybe even better do this if you want to keep the same >> arg list: >> >> void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, >> Register dst, Address src, Register, Register) { >> > > This is an overrode method. What you get for tmp1 and tmp_thread, is > really platform dependent. > > On AArch64, you usually get noreg for tmp1 and tmp_thread. I can not > tell if you can safely use tmp1 if it is valid. > > I don't use tmp1 here, since I don't think it is worth the trouble, as > we have spare scratch registers. I do use tmp1 in x86 through. OK, so please just do this for now: >> void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, >> Register dst, Address src, Register, Register) { I'm working on a redesign of the way that scratch registers are used in AArch64, and this code is likely to have to be changed. 
Accurate information about register usage is likely to be crucial for that. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From rkennke at redhat.com Mon Nov 4 15:35:52 2019 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 4 Nov 2019 16:35:52 +0100 Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code In-Reply-To: References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> Message-ID: >> Please review this refactor of Shenandoah load barrier. The goal is to make the barrier structurally >> similar cross interpreter, C1 and C2, improve readability and maintainability. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8233401 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.00/index.html > > This is cute patch. > > *) Typo "non-reference load": > > 207 // 1: none-reference load, no additional barrier is needed > > *) The comment style is inconsistent with other places: > > 537 Node* ShenandoahBarrierSetC2::load_at_resolved(C2Access& access, const Type* val_type) const { > 538 // 1: load reference > 539 Node* load = BarrierSetC2::load_at_resolved(access, val_type); > 540 // For none-reference load, no additional barrier is needed > > *) In constructions like this, it seems more consistent to introduce the local variable for matching > the decorator? > > 387 // Native barrier is for concurrent root processing > 388 if (((decorators & IN_NATIVE) != 0) && > 389 ShenandoahConcurrentRoots::can_do_concurrent_roots()) { > > Otherwise looks good. Roman needs to take a look as well. Yes, otherwise looks good. 
Thanks, Roman From stefan.johansson at oracle.com Mon Nov 4 16:20:11 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 4 Nov 2019 17:20:11 +0100 Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for G1, Logging (3/3) In-Reply-To: References: <743d16cc-499d-784b-79fc-c006643f9ec5@oracle.com> <6449da1d-6dcc-50ba-8ae8-7615e7ad35f9@oracle.com> Message-ID: <2c4e49d8-91f6-4109-b9a3-78b1999ccda3@oracle.com> Hi Sangheon, Thanks for addressing my comments, some comments inline and additional comments below. On 2019-11-02 07:08, sangheon.kim at oracle.com wrote: >> ... >> As we discussed offline, I would like to know the mismatches as well, >> I think the easiest approach would be to make the total count per node >> as well and that way we can see if there were any regions that didn't >> match. What do you think about printing the info like this: >> [3,009s][trace][gc,heap,numa ] GC(6) NUMA region verification >> (actual/expected): 0: 1024/1024, 1: 270/1024, Unknown: 0 > Changed as you suggested to have per node but deleted unknown as unknown > is 'total - matched'. I'm fine with skipping 'Unknown', but not all regions in (total - matched) must be 'Unknown' since there is the possibility of a real mismatch as well. > >> >> When testing this I also realized this output is problematic in the >> case where we have committed regions that have not yet been used. >> Reading the manual for get_mempolicy (the way we get the numa id for >> the address) say: >> "If no page has yet been allocated for the specified address, >> get_mempolicy() will allocate a page as if the thread had performed a >> read (load) access to that address, and return the ID of the node >> where that page was allocated." > Nice catch. > > Shortly saying get_mempolicy() doesn't honor os::numa_make_local() call > that we previously requested. And this problem occurs when > AlwaysPreTouch is disabled. 
>
> However, my initial implementation which uses 'numa_move_pages()'
> doesn't have this problem. So one fundamental solution would be
> replacing linux implementation.
> In current scope of 3 patches, there will be no problem if we add
> 'hr->free && !AlwaysPreTouch' condition check however
> os::numa_get_group_id_for_address() will still have such limitation.
>
> What do you think about changing the Linux implementation?
> (webrev link is added at the end)
>

I agree that we should use numa_move_pages(), but I think we might want to change things a bit more because of this. I will add those comments below the patch to have all code comments in one place.

>> ...
>> One can also see that this verification takes some time, so maybe it
>> would make sense to have this logging under gc+numa+verify.
> I think if we avoid calling G1NUMA::index_of_address() if 'hr->free() &&
> !AlwaysPreTouch' but count total, we will be fine.
> Basically I imported your patch but the verification is only happening
> at the beginning of GC.

Any reason for not including it at the end of GC as well? I guess that could be good to be able to diagnose if there was an increased imbalance due to the collection.

>> ...
>>  166 void G1NodeTimes::print_phase_info(G1NodeTimes::NodeStatPhases
>> phase) {
>>  167   LogTarget(Info, gc, heap, numa) lt;
>>
>> I think this should be on debug level, but if you don't agree leave it
>> as is.
> I feel Info seems okay, so let me leave as is.
>
>> ---
>>
>>  191 void G1NodeTimes::print_mutator_alloc_stat_debug() {
>>  192   LogTarget(Debug, gc, heap, numa) lt;
>>
>> And if you agree on moving the above to debug I think this should be
>> on trace level.
> As is, please.

I'm okay with that, I just wanted to point out that compared to the other G1 logging on Info/Debug level these seemed to be one level off.
> > Here's the webrev: > http://cr.openjdk.java.net/~sangheki/8220312/webrev.4 Some additional comments: src/hotspot/share/gc/g1/g1CollectedHeap.cpp --- 2598 void G1CollectedHeap::verify_numa_regions() { 2599 LogTarget(Trace, gc, heap, numa) lt; I realize I didn't include that in my example patch, but did you have anything against changing the tags here to 'gc, heap, verify'? --- src/hotspot/share/gc/g1/g1NUMA.cpp --- 281 _ls->print("%d: %u / %u, ", numa_ids[i], _matched[i], _total[i]); This will leave a trailing ',' now when we don't have the 'Unknown' output. I think just having a space as the separator between the nodes could be fine to avoid this. --- 292 if (hr->is_free() && !AlwaysPreTouch) { 293 active_node_index = G1NUMA::UnknownNodeIndex; 294 } else { 295 active_node_index = _numa->index_of_address(hr->bottom()); 296 } 297 298 if (preferred_node_index == active_node_index) { 299 _matched[preferred_node_index]++; 300 } 301 _total[preferred_node_index]++; As I touched upon above, I think we still lack some important information here. If we are to look at all committed regions, which probably is good, we need to keep a mismatch count as well. Otherwise we can't know if the diff between matched and total is just unused regions or actual mismatches without relying on other logs and analysis. Or am I missing something? -- src/hotspot/share/gc/g1/g1SurvivorRegions.hpp --- 30 class G1RegionCounts; Just having a forward declaration here is not enough, so please add back the include. --- src/hotspot/share/gc/g1/g1NUMAStats.hpp --- 99 void print_phase_info(G1NUMAStats::NodeDataItems phase); Sorry for being so picky on the naming, but since we changed the class and enum I think this method, the comments and the parameters should change as well. 
For example:
void print_info(G1NUMAStats::NodeDataItems type);

> http://cr.openjdk.java.net/~sangheki/8220312/webrev.4.inc
> http://cr.openjdk.java.net/~sangheki/8220312/webrev.4.inc.numa_move_pages

As I said above, I think we should go with this implementation of getting the node given an address, but I have a comment:

src/hotspot/os/linux/os_linux.cpp
---
3018   if (os::Linux::numa_move_pages(0, 1, pages, NULL, &id, 0) == -1) {
3019     return -1;
3020   }
3021   if (id < 0) {
3022     return -1;
3023   }

I think we should distinguish between the case where the call fails (return -1) and the case where the id returned is negative. From reading the man pages for 'move_pages()' (used by numa_move_pages()) it seems we can expect to either get -EFAULT (for normal pages) or -ENOENT (for large pages) when a region has not yet been used. I'm not completely sure how much logic we want to add to this method, so maybe creating an RFE to look at this in more detail later is good. But for this first version I think your proposal is good.

One thought I just got: with this implementation it is more like 'Unused' rather than 'Unknown'. But such a notion change could also be looked at in the RFE.
---

Thanks,
Stefan

>
> Testing: hs-tier 1 ~ 5 with / without UseNUMA.
>
> Thanks,
> Sangheon
>
>
>> ---
>>
>> This is it for now. Thanks,
>> Stefan
>>
>>
>>> Testing: hs-tier 1 ~ 4 with/without UseNUMA. hs-tier5 is almost
>>> finished without new failures.
>>>
>>> Thanks,
>>> Sangheon
>>>
>>>
>

From thomas.schatzl at oracle.com Mon Nov 4 16:40:05 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 4 Nov 2019 17:40:05 +0100
Subject: RFR (M): 8189737: Make HeapRegion not derive from Space
In-Reply-To:
References: <4fdeb066-e9eb-c0e6-5fd2-5ec9a368bc23@oracle.com>
Message-ID: <07ef2312-d974-be19-c887-828696a8493f@oracle.com>

Hi,

On 01.11.19 00:20, Thomas Schatzl wrote:
> Hi Kim,
>
> thanks for your review.
> > On Thu, 2019-10-31 at 18:12 -0400, Kim Barrett wrote: >>> On Oct 31, 2019, at 9:43 AM, Thomas Schatzl < >>> thomas.schatzl at oracle.com> wrote: >>> >>> Hi all, >>> >>> can I get reviews for this refactoring that removes the >>> inheritance of HeapRegion from Space? >>> >>> > [...] >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8189737 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8189737/webrev/ >>> Testing: >>> hs-tier-1-5 >>> >>> Thanks, >>> Thomas [..] >> >> ------------------------------------------------------------------- >> ----------- >> >> Looks good. >> >> I don't need a new webrev for the parameter list indentation fix. >> > > I will update the webrev later in place. Done. Thanks for your review. Thomas From zgu at redhat.com Mon Nov 4 16:55:59 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 4 Nov 2019 11:55:59 -0500 Subject: RFR 8233500: Shenandoah: Shenandoah load barrier should save registers before calling keep alive barrier on x86 Message-ID: This bug was found and fixed during concurrent class unloading work in shenandoah/jdk. However, I don't think it is concurrent class unloading specific issue, and could result hard to find problem in jdk/jdk. BTW: AArch64 already does right thing. Bug: https://bugs.openjdk.java.net/browse/JDK-8233500 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233500/webrev.00/ Test: hotspot_gc_shenandoah (fastdebug and release) x86_64 and x86_32 on Linux Thanks, -Zhengyu From shade at redhat.com Mon Nov 4 17:07:58 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 4 Nov 2019 18:07:58 +0100 Subject: RFR 8233500: Shenandoah: Shenandoah load barrier should save registers before calling keep alive barrier on x86 In-Reply-To: References: Message-ID: On 11/4/19 5:55 PM, Zhengyu Gu wrote: > This bug was found and fixed during concurrent class unloading work in shenandoah/jdk. However, I > don't think it is concurrent class unloading specific issue, and could result hard to find problem > in jdk/jdk. 
> > BTW: AArch64 already does right thing. Where? Please be specific when saying this (i.e. point to code), for archival reasons. > Bug: https://bugs.openjdk.java.net/browse/JDK-8233500 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233500/webrev.00/ I don't understand this. SATB handling is similar to G1 is doing, where's the similar code in G1? The patch adds save/restore at in SBSA::load_at, but there is a similar block in SBSA::store_at, why it is not needed there? -- Thanks, -Aleksey From zgu at redhat.com Mon Nov 4 17:32:11 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 4 Nov 2019 12:32:11 -0500 Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code In-Reply-To: <6ff66df6-cba8-e2a3-30ba-0ba5656e15fb@redhat.com> References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> <6ff66df6-cba8-e2a3-30ba-0ba5656e15fb@redhat.com> Message-ID: >> On AArch64, you usually get noreg for tmp1 and tmp_thread. I can not >> tell if you can safely use tmp1 if it is valid. >> >> I don't use tmp1 here, since I don't think it is worth the trouble, as >> we have spare scratch registers. I do use tmp1 in x86 through. > > OK, so please just do this for now: Thanks! -Zhengyu > >>> void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, >>> Register dst, Address src, Register, Register) { > > I'm working on a redesign of the way that scratch registers are used in > AArch64, and this code is likely to have to be changed. Accurate information > about register usage is likely to be crucial for that. 
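
The signature suggested in the quoted text relies on a plain C++ rule: a parameter in a function definition may be left unnamed, which makes it impossible for the body to use it by accident while keeping the override's argument list unchanged. A minimal sketch of that idea follows. `Register`, `noreg`, `BarrierAssembler` and `ScratchOnlyAssembler` are hypothetical stand-ins invented for illustration, not the HotSpot types.

```cpp
#include <string>

// Hypothetical stand-ins for illustration only; these are not HotSpot types.
struct Register { int encoding; };
static const Register noreg = { -1 };

class BarrierAssembler {
public:
  virtual ~BarrierAssembler() {}
  // The base version names its temporaries because it is allowed to use them.
  virtual std::string load_at(Register dst, Register tmp1, Register tmp_thread) {
    (void)tmp1;        // unused in this sketch
    (void)tmp_thread;  // unused in this sketch
    return "plain load into r" + std::to_string(dst.encoding);
  }
};

class ScratchOnlyAssembler : public BarrierAssembler {
public:
  // Same argument list, but the two temporaries are unnamed: the signature
  // itself documents that this override never touches them.
  std::string load_at(Register dst, Register, Register) override {
    return "barrier load into r" + std::to_string(dst.encoding);
  }
};

// Callers are unaffected by whether the override names its parameters.
std::string emit_load(BarrierAssembler& as, Register dst) {
  return as.load_at(dst, noreg, noreg);
}
```

One limit of this approach: an unnamed parameter cannot be forwarded, so an override that passes the temporaries on to the base-class implementation has to keep the names.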
> From zgu at redhat.com Mon Nov 4 17:33:20 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 4 Nov 2019 12:33:20 -0500 Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code In-Reply-To: References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> Message-ID: <87b115fb-5353-b21b-3cbc-f862bd932b3e@redhat.com> Updated: http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.01/index.html Okay now? Thanks, -Zhengyu On 11/4/19 10:35 AM, Roman Kennke wrote: >>> Please review this refactor of Shenandoah load barrier. The goal is to make the barrier structurally >>> similar cross interpreter, C1 and C2, improve readability and maintainability. >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8233401 >>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.00/index.html >> >> This is cute patch. >> >> *) Typo "non-reference load": >> >> 207 // 1: none-reference load, no additional barrier is needed >> >> *) The comment style is inconsistent with other places: >> >> 537 Node* ShenandoahBarrierSetC2::load_at_resolved(C2Access& access, const Type* val_type) const { >> 538 // 1: load reference >> 539 Node* load = BarrierSetC2::load_at_resolved(access, val_type); >> 540 // For none-reference load, no additional barrier is needed >> >> *) In constructions like this, it seems more consistent to introduce the local variable for matching >> the decorator? >> >> 387 // Native barrier is for concurrent root processing >> 388 if (((decorators & IN_NATIVE) != 0) && >> 389 ShenandoahConcurrentRoots::can_do_concurrent_roots()) { >> >> Otherwise looks good. Roman needs to take a look as well. > > Yes, otherwise looks good. 
> > Thanks, > Roman > > From aph at redhat.com Mon Nov 4 17:38:14 2019 From: aph at redhat.com (Andrew Haley) Date: Mon, 4 Nov 2019 17:38:14 +0000 Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code In-Reply-To: <87b115fb-5353-b21b-3cbc-f862bd932b3e@redhat.com> References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> <87b115fb-5353-b21b-3cbc-f862bd932b3e@redhat.com> Message-ID: <3d70db1c-c927-48f8-23ab-8937838e0302@redhat.com> On 11/4/19 5:33 PM, Zhengyu Gu wrote: > Updated: http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.01/index.html > > Okay now? AArch64 still says void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, Register dst, Address src, Register tmp1, Register tmp_thread) { instead of void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, Register dst, Address src, Register, Register) { -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From shade at redhat.com Mon Nov 4 17:47:24 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 4 Nov 2019 18:47:24 +0100 Subject: RFR (S) 8233520: Shenandoah: do not sleep when thread is attaching Message-ID: <04f22413-4d71-7a63-85e5-563b13710aa2@redhat.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8233520 Fix: http://cr.openjdk.java.net/~shade/8233520/webrev.01/ This was exposed by recently added assert. But the bug itself is legit Shenandoah bug, and should be fixed everywhere. 
Testing: affected test; hotspot_gc_shenandoah -- Thanks, -Aleksey From zgu at redhat.com Mon Nov 4 18:18:38 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 4 Nov 2019 13:18:38 -0500 Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code In-Reply-To: <3d70db1c-c927-48f8-23ab-8937838e0302@redhat.com> References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> <87b115fb-5353-b21b-3cbc-f862bd932b3e@redhat.com> <3d70db1c-c927-48f8-23ab-8937838e0302@redhat.com> Message-ID: <0d347d16-f870-798f-0165-1ee4dfae511b@redhat.com> On 11/4/19 12:38 PM, Andrew Haley wrote: > On 11/4/19 5:33 PM, Zhengyu Gu wrote: >> Updated: http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.01/index.html >> >> Okay now? > AArch64 still says > > void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, > Register dst, Address src, Register tmp1, Register tmp_thread) { > > instead of > > void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type, > Register dst, Address src, Register, Register) { They are still needed for calling super class's load_at(). Even though, they are not used there neither. 
  // 1: non-reference load, no additional barrier is needed
  if (!is_reference_type(type) ) {
    BarrierSetAssembler::load_at(masm, decorators, type, dst, src, tmp1, tmp_thread);
    return;
  }

-Zhengyu

>

From zgu at redhat.com Mon Nov 4 18:23:12 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 4 Nov 2019 13:23:12 -0500
Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code
In-Reply-To: <0d347d16-f870-798f-0165-1ee4dfae511b@redhat.com>
References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> <87b115fb-5353-b21b-3cbc-f862bd932b3e@redhat.com> <3d70db1c-c927-48f8-23ab-8937838e0302@redhat.com> <0d347d16-f870-798f-0165-1ee4dfae511b@redhat.com>
Message-ID: <859e48d6-9af5-b4af-32ac-4b07ce92e94d@redhat.com>

On 11/4/19 1:18 PM, Zhengyu Gu wrote:
>
>
> On 11/4/19 12:38 PM, Andrew Haley wrote:
>> On 11/4/19 5:33 PM, Zhengyu Gu wrote:
>>> Updated:
>>> http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.01/index.html
>>>
>>> Okay now?
>> AArch64 still says
>>
>>   void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm,
>> DecoratorSet decorators, BasicType type,
>>                                               Register dst, Address
>> src, Register tmp1, Register tmp_thread) {
>>
>> instead of
>>
>>   void ShenandoahBarrierSetAssembler::load_at(MacroAssembler* masm,
>> DecoratorSet decorators, BasicType type,
>>                                               Register dst, Address
>> src, Register, Register) {
>
> They are still needed for calling super class's load_at(). Even though,
> they are not used there neither.

Or I should say, they are not used there right now, but may be used in future ...

-Zhengyu

>
>   // 1: non-reference load, no additional barrier is needed
>   if (!is_reference_type(type) ) {
>     BarrierSetAssembler::load_at(masm, decorators, type, dst, src,
> tmp1, tmp_thread);
>     return;
>
} > > > -Zhengyu > >> From rkennke at redhat.com Mon Nov 4 18:23:51 2019 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 4 Nov 2019 19:23:51 +0100 Subject: RFR (S) 8233520: Shenandoah: do not sleep when thread is attaching In-Reply-To: <04f22413-4d71-7a63-85e5-563b13710aa2@redhat.com> References: <04f22413-4d71-7a63-85e5-563b13710aa2@redhat.com> Message-ID: <31515f5f-505d-4bfb-8105-9ffcf86870da@redhat.com> Ok. Thanks, Roman > Bug: > https://bugs.openjdk.java.net/browse/JDK-8233520 > > Fix: > http://cr.openjdk.java.net/~shade/8233520/webrev.01/ > > This was exposed by recently added assert. But the bug itself is legit Shenandoah bug, and should be > fixed everywhere. > > Testing: affected test; hotspot_gc_shenandoah > From zgu at redhat.com Mon Nov 4 18:34:45 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 4 Nov 2019 13:34:45 -0500 Subject: RFR 8233500: Shenandoah: Shenandoah load barrier should save registers before calling keep alive barrier on x86 In-Reply-To: References: Message-ID: <1078cf3a-3d99-4cc3-dd0f-55a63967caa5@redhat.com> On 11/4/19 12:07 PM, Aleksey Shipilev wrote: > On 11/4/19 5:55 PM, Zhengyu Gu wrote: >> This bug was found and fixed during concurrent class unloading work in shenandoah/jdk. However, I >> don't think it is concurrent class unloading specific issue, and could result hard to find problem >> in jdk/jdk. >> >> BTW: AArch64 already does right thing. > > Where? Please be specific when saying this (i.e. point to code), for archival reasons. http://hg.openjdk.java.net/jdk/jdk/file/33f9271b3167/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp#l383 > > >> Bug: https://bugs.openjdk.java.net/browse/JDK-8233500 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233500/webrev.00/ > > I don't understand this. > > SATB handling is similar to G1 is doing, where's the similar code in G1? The patch adds save/restore > at in SBSA::load_at, but there is a similar block in SBSA::store_at, why it is not needed there? 
Because we do self-fixing in LRB and have to reshuffle registers. Not sure about SBSA::store_at(), because it still similar to G1 code? Thanks, -Zhengyu > From shade at redhat.com Mon Nov 4 18:39:33 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 4 Nov 2019 19:39:33 +0100 Subject: RFR 8233500: Shenandoah: Shenandoah load barrier should save registers before calling keep alive barrier on x86 In-Reply-To: <1078cf3a-3d99-4cc3-dd0f-55a63967caa5@redhat.com> References: <1078cf3a-3d99-4cc3-dd0f-55a63967caa5@redhat.com> Message-ID: On 11/4/19 7:34 PM, Zhengyu Gu wrote: >> SATB handling is similar to G1 is doing, where's the similar code in G1? The patch adds save/restore >> at in SBSA::load_at, but there is a similar block in SBSA::store_at, why it is not needed there? > > Because we do self-fixing in LRB and have to reshuffle registers. Okay. So AArch64 does enter()/leave(), why x86 needs the entire IU_state pushed/popped? My concern is that pushing/popping the entire state explodes code size (we don't care about performance much, but we do care about hitting the stub boundaries), and probably hides some bugs with register shuffles. -- Thanks, -Aleksey From zgu at redhat.com Mon Nov 4 18:42:19 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 4 Nov 2019 13:42:19 -0500 Subject: RFR 8233500: Shenandoah: Shenandoah load barrier should save registers before calling keep alive barrier on x86 In-Reply-To: References: <1078cf3a-3d99-4cc3-dd0f-55a63967caa5@redhat.com> Message-ID: <4719fe38-c2be-bfe6-13a5-a2b050c7796e@redhat.com> On 11/4/19 1:39 PM, Aleksey Shipilev wrote: > On 11/4/19 7:34 PM, Zhengyu Gu wrote: >>> SATB handling is similar to G1 is doing, where's the similar code in G1? The patch adds save/restore >>> at in SBSA::load_at, but there is a similar block in SBSA::store_at, why it is not needed there? >> >> Because we do self-fixing in LRB and have to reshuffle registers. > > Okay. 
So AArch64 does enter()/leave(), why x86 needs the entire IU_state pushed/popped? Roman suggested. Roman, could you answer? Thanks, -Zhengyu > > My concern is that pushing/popping the entire state explodes code size (we don't care about > performance much, but we do care about hitting the stub boundaries), and probably hides some bugs > with register shuffles. > From rkennke at redhat.com Mon Nov 4 18:59:14 2019 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 4 Nov 2019 19:59:14 +0100 Subject: RFR 8233500: Shenandoah: Shenandoah load barrier should save registers before calling keep alive barrier on x86 In-Reply-To: <4719fe38-c2be-bfe6-13a5-a2b050c7796e@redhat.com> References: <1078cf3a-3d99-4cc3-dd0f-55a63967caa5@redhat.com> <4719fe38-c2be-bfe6-13a5-a2b050c7796e@redhat.com> Message-ID: <3dbf88b0-5de6-a2b9-93b5-086927e88f82@redhat.com> >>>> SATB handling is similar to G1 is doing, where's the similar code in >>>> G1? The patch adds save/restore >>>> at in SBSA::load_at, but there is a similar block in SBSA::store_at, >>>> why it is not needed there? >>> >>> Because we do self-fixing in LRB and have to reshuffle registers. >> >> Okay. So AArch64 does enter()/leave(), why x86 needs the entire >> IU_state pushed/popped? > Roman suggested. > > Roman, could you answer? enter()/leave() sets up/tears down the stub frame for the runtime call. push/pop_IU_state() saves/restores the registers. Aarch64 code also saves/restores the registers via push/pop_call_clobbered_registers(). Roman > Thanks, > > -Zhengyu > >> >> My concern is that pushing/popping the entire state explodes code size >> (we don't care about >> performance much, but we do care about hitting the stub boundaries), >> and probably hides some bugs >> with register shuffles. 
>> From zgu at redhat.com Mon Nov 4 19:12:35 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 4 Nov 2019 14:12:35 -0500 Subject: RFR 8233500: Shenandoah: Shenandoah load barrier should save registers before calling keep alive barrier on x86 In-Reply-To: <3dbf88b0-5de6-a2b9-93b5-086927e88f82@redhat.com> References: <1078cf3a-3d99-4cc3-dd0f-55a63967caa5@redhat.com> <4719fe38-c2be-bfe6-13a5-a2b050c7796e@redhat.com> <3dbf88b0-5de6-a2b9-93b5-086927e88f82@redhat.com> Message-ID: On 11/4/19 1:59 PM, Roman Kennke wrote: >>>>> SATB handling is similar to G1 is doing, where's the similar code in >>>>> G1? The patch adds save/restore >>>>> at in SBSA::load_at, but there is a similar block in SBSA::store_at, >>>>> why it is not needed there? >>>> >>>> Because we do self-fixing in LRB and have to reshuffle registers. >>> >>> Okay. So AArch64 does enter()/leave(), why x86 needs the entire >>> IU_state pushed/popped? >> Roman suggested. >> >> Roman, could you answer? > > enter()/leave() sets up/tears down the stub frame for the runtime call. Ha, I misunderstood enter()/leave(). > push/pop_IU_state() saves/restores the registers. Aarch64 code also > saves/restores the registers via push/pop_call_clobbered_registers(). I think we are still okay with AArch64, because unlike x86, it at most clobbers rscratch1. -Zhengyu > > Roman > > >> Thanks, >> >> -Zhengyu >> >>> >>> My concern is that pushing/popping the entire state explodes code size >>> (we don't care about >>> performance much, but we do care about hitting the stub boundaries), >>> and probably hides some bugs >>> with register shuffles. 
>>> > From sangheon.kim at oracle.com Mon Nov 4 19:13:39 2019 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Mon, 4 Nov 2019 11:13:39 -0800 Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for G1, Logging (3/3) In-Reply-To: References: <743d16cc-499d-784b-79fc-c006643f9ec5@oracle.com> <6449da1d-6dcc-50ba-8ae8-7615e7ad35f9@oracle.com> Message-ID: Hi Kim, On 11/3/19 8:07 PM, Kim Barrett wrote: >> On Nov 2, 2019, at 2:08 AM, sangheon.kim at oracle.com wrote: >> >> Here's the webrev: >> http://cr.openjdk.java.net/~sangheki/8220312/webrev.4 >> http://cr.openjdk.java.net/~sangheki/8220312/webrev.4.inc >> http://cr.openjdk.java.net/~sangheki/8220312/webrev.4.inc.numa_move_pages >> >> Testing: hs-tier 1 ~ 5 with / without UseNUMA. > I didn't spend much time looking at the actual logging output; Thomas > and Stefan have both given that a pretty thorough look. Instead I > looked for more structural things. Thanks for your review. > > ------------------------------------------------------------------------------ > > There are several functions named "print()" which I think should have > some other name. The classes involved (G1NUMA, G1NUMAStats, maybe > other that I didn't check for) are derived from CHeapObj<>. In a > non-product build, CHeapObj<> is derived from AllocatedObj, which > provides a (non-virtual) "print()" function that calls the virtual > "print_on()" function. > > By giving these classes their own print(), this change is overriding > this existing public API in non-product builds only. While overriding > a public non-virtual function is permitted, it is usually denigrated. > And having different overriding behavior based on build type seems > particularly confusing, especially since one is about logging and the > other isn't. Changed to have other name of 'print_statistics()'. 
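To make the name-hiding concern above concrete, here is a minimal standalone C++ sketch. BaseObj, HidingStats, RenamedStats and print_via_base are hypothetical stand-ins, not the real AllocatedObj, CHeapObj<> or G1NUMAStats, and the functions return strings rather than log so the difference is easy to check. It only assumes what the review says: the base offers a non-virtual print() that forwards to a virtual print_on().

```cpp
#include <string>

// Hypothetical stand-in for a base class with a non-virtual print() that
// forwards to a virtual print_on(). This is NOT the real HotSpot AllocatedObj.
struct BaseObj {
  virtual ~BaseObj() = default;
  virtual std::string print_on() const { return "BaseObj"; }
  // Non-virtual: which print() runs depends on the static type at the call site.
  std::string print() const { return "debug:" + print_on(); }
};

// Hypothetical stats class that declares its own print(), which hides
// BaseObj::print() whenever the object is used through its own type.
struct HidingStats : BaseObj {
  std::string print_on() const override { return "HidingStats"; }
  std::string print() const { return "log:statistics"; }
};

// The same class with the logging function renamed, as done in the review
// (print_statistics()); BaseObj::print() stays reachable and unambiguous.
struct RenamedStats : BaseObj {
  std::string print_on() const override { return "RenamedStats"; }
  std::string print_statistics() const { return "log:statistics"; }
};

// Calls print() through the base type, so BaseObj::print() is always chosen.
std::string print_via_base(const BaseObj& obj) { return obj.print(); }
```

With HidingStats, obj.print() means two different things depending on whether the caller holds a HidingStats or a BaseObj reference. With the renamed function, print() always means the base-class debug printer.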
> > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1ParScanThreadState.inline.hpp > 241 memset((void*)_obj_alloc_stat, 0, sizeof(size_t) * num_nodes); > > Unnecessary cast to void*. Done. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1RegionCounts.hpp > 28 #include "gc/g1/heapRegion.hpp" > > Unnecessary #include here; forward declaration of HeapRegion would be > sufficient. Done. > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1RegionCounts.hpp > > Why are add, clear, and length virtual? This class doesn't seem to be > used as a base class, and isn't overriding anything from it's base. > > If it were a base class, then the destructor should not be public and > non-virtual. Removed 'virtual'. It was a base class before. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1RegionCounts.hpp > 34 class G1RegionCounts : public StackObj { > > The name of this class seems very generic and isn't at all suggestive > of it having anything at all to do with NUMA. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/heapRegionManager.cpp > 128 if (hr->node_index() < numa->num_active_nodes()) { > 129 numa->update_statistics(G1NUMAStats::NewRegionAlloc, requested_node_index, hr->node_index()); > 130 } > > Should we be doing this if NUMA disabled? Added G1NUMA::is_enabled(). Inside of G1NUMA::update_statistics() also checks is_enabled(), so I didn't add checking it before. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/heapRegionSet.cpp > 323 FreeRegionList::FreeRegionList(const char* name, HeapRegionSetChecker* checker): > 324 HeapRegionSetBase(name, checker), > 325 _node_info(G1NUMA::numa()->is_enabled() ? 
new NodeInfo() : NULL) { > > Unusual indentation of initializer-list. Fixed. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/heapRegionSet.hpp > 189 const uint requested_node_index); > > [pre-existing from earlier patch in this series?] > Useless const qualifier. Removed const qualifier. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/heapRegionSet.hpp > 246 class NodeInfo : public CHeapObj { > > NodeInfo is an overly generic name for the global namespace. Moved into FreeRegionList. I addressed all your comments but let me post the next webrev after reflecting Stefan's comment as well. Thanks, Sangheon > > ------------------------------------------------------------------------------ > From fujie at loongson.cn Tue Nov 5 03:07:29 2019 From: fujie at loongson.cn (Jie Fu) Date: Tue, 5 Nov 2019 11:07:29 +0800 Subject: RFR(trivial): 8233574: Shenandoah: build is broken without jfr Message-ID: <0a6ba9dc-06c6-e1ef-993d-7117ff827a09@loongson.cn> Hi all, May I get reviews for the one-line change? This bug was found while I was testing David's patch for JDK-8233454 [1]. JBS: https://bugs.openjdk.java.net/browse/JDK-8233574 Webrev: http://cr.openjdk.java.net/~jiefu/8233574/webrev.00/ Thanks a lot.
Best regards, Jie [1] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-November/036788.html From sangheon.kim at oracle.com Tue Nov 5 06:22:19 2019 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Mon, 4 Nov 2019 22:22:19 -0800 Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for G1, Logging (3/3) In-Reply-To: <2c4e49d8-91f6-4109-b9a3-78b1999ccda3@oracle.com> References: <743d16cc-499d-784b-79fc-c006643f9ec5@oracle.com> <6449da1d-6dcc-50ba-8ae8-7615e7ad35f9@oracle.com> <2c4e49d8-91f6-4109-b9a3-78b1999ccda3@oracle.com> Message-ID: Hi Stefan, On 11/4/19 8:20 AM, Stefan Johansson wrote: > Hi Sangheon, > > Thanks for addressing my comments, some comments inline and additional > comments below. > > On 2019-11-02 07:08, sangheon.kim at oracle.com wrote: >>> ... >>> As we discussed offline, I would like to know the mismatches as >>> well, I think the easiest approach would be to make the total count >>> per node as well and that way we can see if there were any regions >>> that didn't match. What do you think about printing the info like this: >>> [3,009s][trace][gc,heap,numa ] GC(6) NUMA region verification >>> (actual/expected): 0: 1024/1024, 1: 270/1024, Unknown: 0 >> Changed as you suggested to have per node but deleted unknown as >> unknown is 'total - matched'. > I'm fine with skipping 'Unknown', but not all regions in (total - > matched) must be 'Unknown' since there is the possibility of a real > mismatch as well. Basically I agree with you that 'total - matched' remains some other cases. i.e. 'mismatch' and 'unknown (mostly not yet touched)'. Previously I just wanted to highlight the most interesting part which I think is 'matched and total'. New patch counts, matched / mismatched / total. >> >>> >>> When testing this I also realized this output is problematic in the >>> case where we have committed regions that have not yet been used. 
>>> Reading the manual for get_mempolicy (the way we get the numa id for >>> the address) say: >>> "If no page has yet been allocated for the specified address, >>> get_mempolicy() will allocate a page as if the thread had performed >>> a read (load) access to that address, and return the ID of the node >>> where that page was allocated." >> Nice catch. >> >> Shortly saying get_mempolicy() doesn't honor os::numa_make_local() >> call that we previously requested. And this problem occurs when >> AlwaysPreTouch is disabled. >> >> However, my initial implementation which uses 'numa_move_pages()' >> doesn't have this problem. So one fundamental solution would be >> replacing linux implementation. >> In current scope of 3 patches, there will be no problem if we add >> 'hr->free && !AlwaysPreTouch' condition check however >> os::numa_get_group_id_for_address() will still have such limitation. >> >> What do you think about changing the Linux implementation? >> (webrev link is added at the end) >> > I agree that we should use numa_move_pages(), but I think we might > want to change things a bit more because of this. I will add those > comments below the patch to have all code comments in one place. > okay >>> ... >> One can also see that this verification takes some time, so >>> maybe it >>> would make sense to have this logging under gc+numa+verify. >> I think if we avoid calling G1NUMAA:index_of_address() if 'hr->free() >> && !AlwaysPreTouch'? but count total, we will be fine. >> Basically I imported your patch but the verification is only >> happening at the beginning of GC. > > Any reason for not including it at the end of GC as well? I guess that > could be good to be able to diagnose if there was an increase > imbalance due to the collection. Added after gc as well. I just wanted to avoid logging to much. But as you said, node information after gc seems helpful too. > >>> ... 
>>> 166 void G1NodeTimes::print_phase_info(G1NodeTimes::NodeStatPhases >>> phase) { >>> 167 LogTarget(Info, gc, heap, numa) lt; >>> >>> I think this should be on debug level, but if you don't agree leave >>> it as is. >> I feel Info seems okay, so let me leave as is. >> >>> --- >>> >>> 191 void G1NodeTimes::print_mutator_alloc_stat_debug() { >>> 192 LogTarget(Debug, gc, heap, numa) lt; >>> >>> And if you agree on moving the above to debug I think this should be >>> on trace level. >> As is, please. > I'm okay with that, I just wanted to point out that compared to the > other G1 logging on Info/Debug level these seemed to be one off. OK > >> >> Here's the webrev: >> http://cr.openjdk.java.net/~sangheki/8220312/webrev.4 > > Some additional comments: > > src/hotspot/share/gc/g1/g1CollectedHeap.cpp > --- > 2598 void G1CollectedHeap::verify_numa_regions() { > 2599 LogTarget(Trace, gc, heap, numa) lt; > > I realize I didn't include that in my example patch, but did you have > anything against changing the tags here to 'gc, heap, verify'? I'm with gc+heap+verify so update it. > --- > > src/hotspot/share/gc/g1/g1NUMA.cpp > --- > 281 _ls->print("%d: %u / %u, ", numa_ids[i], _matched[i], > _total[i]); > > This will leave a trailing ',' now when we don't have the 'Unknown' > output. I think just having a space as the separator between the nodes > could be fine to avoid this. Right, just a space. > --- > > 292 if (hr->is_free() && !AlwaysPreTouch) { > 293 active_node_index = G1NUMA::UnknownNodeIndex; > 294 } else { > 295 active_node_index = _numa->index_of_address(hr->bottom()); > 296 } > 297 > 298 if (preferred_node_index == active_node_index) { > 299 _matched[preferred_node_index]++; > 300 } > 301 _total[preferred_node_index]++; > > As I touched upon above, I think we still lack some important > information here. If we are to look at all committed regions, which > probably is good, we need to keep a mismatch count as well.
Otherwise > we can't know if the diff between matched and total is just unused > regions or actual mismatches without relying on other logs and > analysis. Or am I missing something? Added 'mismatch' which doesn't count active_node_index == UnknownNodeIndex. So now we are printing 'matched / mismatched / total'. We are not printing unknown but 'total = matched + mismatched + unknown', so we can calculate it. One may want to explicitly print 'unknown' additionally or print 'unknown' instead of 'total'. Or we can split into 2 levels too. But I think matched / mismatched / total seems okay for now. > -- > > src/hotspot/share/gc/g1/g1SurvivorRegions.hpp > --- > 30 class G1RegionCounts; > > Just having a forward declaration here is not enough, so please add > back the include. Oops, reverted. > --- > > src/hotspot/share/gc/g1/g1NUMAStats.hpp > --- > 99 void print_phase_info(G1NUMAStats::NodeDataItems phase); > > Sorry for being so picky on the naming, but since we changed the class > and enum I think this method, the comments and the parameters should > change as well. For example: > void print_info(G1NUMAStats::NodeDataItems type); :) Okay > >> http://cr.openjdk.java.net/~sangheki/8220312/webrev.4.inc >> http://cr.openjdk.java.net/~sangheki/8220312/webrev.4.inc.numa_move_pages >> > As I said above, I think we should go with this implementation of > getting the node given an address, but I have a comment: > src/hotspot/os/linux/os_linux.cpp > --- > 3018 if (os::Linux::numa_move_pages(0, 1, pages, NULL, &id, 0) == -1) { > 3019 return -1; > 3020 } > 3021 if (id < 0) { > 3022 return -1; > 3023 } > > I think we should differ between the case where the call fails (return > -1) and the id returned is negative. From reading the man_pages for > 'move_pages()' (used by numa_move_pages()) it seems we can expect to > either get -EFAULT (for normal pages) or -ENOENT (for large pages) > when a region has not yet been used.
I'm not completely sure how much > logic we want to add to this method, so maybe creating a RFE to look > at this in more detail later is good. But for this first version I > think your proposal is good. I understand your comment to distinguish 'not yet faulted in (not yet touched)' case which seems nice to have. But I don't have good usage for it on top of my head. This may be used when record / print at NodeIndexCheckClosure::do_heap_region(). Filed a RFE, https://bugs.openjdk.java.net/browse/JDK-8233535 > > One thought I just got, with this implementation is more like > 'Unused', rather then 'Unknown'. But such a notion change could also > be looked at in the RFE. I'm not sure I fully agree / understand with you. :) I think when we combine 'unused or unknown' with 'node index / id', unknown seems better fit. 'unknown node index' which means the value is unknown for some reason(mostly not yet faulted in) seems better than 'unused node index' . To me 'unused' seems better when combined with 'page or address'. e..g unused page or unused address etc. If you still think further notion change is necessary in separate CR, I can file for you. :) webrev: http://cr.openjdk.java.net/~sangheki/8220312/webrev.5 http://cr.openjdk.java.net/~sangheki/8220312/webrev.5.inc - All comments from Kim and Stefan. - Includes class / methods name changes which Kim suggested to me directly. Testing: hs-tier 1 ~ 5, with/without UseNUMA. Thanks, Sangheon > --- > > Thanks, > Stefan > >> >> Testing: hs-tier 1 ~ 5 with / without UseNUMA. >> >> Thanks, >> Sangheon >> >> >>> --- >>> >>> This is it for now. Thanks, >>> Stefan >>> >>> >>>> Testing: hs-tier 1 ~ 4 with/without UseNUMA. hs-tier5 is almost >>>> finished without new failures. 
>>>> >>>> Thanks, >>>> Sangheon >>>> >>>> >> From rkennke at redhat.com Tue Nov 5 07:50:45 2019 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 5 Nov 2019 08:50:45 +0100 Subject: RFR(trivial): 8233574: Shenandoah: build is broken without jfr In-Reply-To: <0a6ba9dc-06c6-e1ef-993d-7117ff827a09@loongson.cn> References: <0a6ba9dc-06c6-e1ef-993d-7117ff827a09@loongson.cn> Message-ID: Hi Jie, the change looks good (and trivial), thank you! Roman > Hi all, > > May I get reviews for the one-line change? > This bug was found while I was testing David's patch for JDK-8233454 [1]. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8233574 > Webrev: http://cr.openjdk.java.net/~jiefu/8233574/webrev.00/ > > Thanks a lot. > Best regards, > Jie > > [1] > https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-November/036788.html > > > From fujie at loongson.cn Tue Nov 5 08:09:48 2019 From: fujie at loongson.cn (Jie Fu) Date: Tue, 5 Nov 2019 16:09:48 +0800 Subject: RFR(trivial): 8233574: Shenandoah: build is broken without jfr In-Reply-To: References: <0a6ba9dc-06c6-e1ef-993d-7117ff827a09@loongson.cn> Message-ID: <981d5500-11ed-1677-9fff-1733b3e149b3@loongson.cn> Thanks Roman for your review. Could you please sponsor it? Thanks a lot. Best regards, Jie On 2019/11/5 3:50 PM, Roman Kennke wrote: > Hi Jie, > > the change looks good (and trivial), thank you! > > Roman > >> Hi all, >> >> May I get reviews for the one-line change? >> This bug was found while I was testing David's patch for JDK-8233454 [1]. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8233574 >> Webrev: http://cr.openjdk.java.net/~jiefu/8233574/webrev.00/ >> >> Thanks a lot.
>> Best regards, >> Jie >> >> [1] >> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-November/036788.html >> >> >> From per.liden at oracle.com Tue Nov 5 08:26:07 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 5 Nov 2019 09:26:07 +0100 Subject: RFR (S): 8233301: Implementation of JEP 366: Deprecate the ParallelScavenge + SerialOld GC Combination In-Reply-To: References: <727338ce-845c-067f-e56e-f066d2a602dd@oracle.com> Message-ID: <70424c6c-e598-f12a-f261-a0b62cd5869a@oracle.com> Hi Thomas, On 11/4/19 11:46 AM, Thomas Schatzl wrote: > Hi, > > On 01.11.19 00:05, Bernd Eckenfels wrote: >> The help message: >> >> Use the Parallel Old garbage collector. Deprecated. > > product(bool, UseParallelOldGC, false, \ > "Use the Serial Old garbage collection algorithm for old " \ > "generation. Deprecated." \ I know it's not easy to get this right, but I think the above might perhaps be even more confusing. How about something like: UseParallelGC -> "Use the Parallel garbage collector" UseParallelOldGC -> "Use the Parallel or the Serial garbage collector when collecting the old generation (Deprecated)." cheers, Per > >> >> Looks a bit misleading to me. I know it means the option is >> deprecated (especially the non default negative value > > but it could easily be understood as ParallelOld being deprecated. > > I hope the new text is better understandable and does not lead to the > confusion you suggested. > > Note that the default value of "false" is correct although maybe > surprising, as this change does not modify it. It's only that > -XX:+UseParallelGC is actually a shorthand for "-XX:+UseParallelGC > -XX:+UseParallelOldGC" for like forever. > > I do not want to change this here. > > Note that these strings are not the official documentation, but the > manpage. They read: > > `-XX:+UseParallelGC` > : Enables the use of the parallel scavenge garbage collector (also > known as >
the throughput collector) to improve the performance of your > application by > leveraging multiple processors. > > By default, this option is disabled and the default collector is used. > If it's enabled, then the `-XX:+UseParallelOldGC` option is > automatically enabled, unless you explicitly disable it. > > and (with this change, as part of the "Deprecated" section): > > `-XX:+UseParallelOldGC` > : Enables the use of the parallel garbage collector for full GCs. By > default, > this option is disabled. Enabling it automatically enables the > `-XX:+UseParallelGC` option. > > Which seems in line with what is now told in the sources too. > >> >> There is no jtreg for +UseParallelOld. It would need to document that >> deprecation warning is expected for that as well? > > There is a separate test case in > test/hotspot/jtreg/runtime/CommandLine/VMDeprecatedOptions.java which > does not seem updated for any and all deprecated options. I added this > flag to it anyway. > > Webrev: > http://cr.openjdk.java.net/~tschatzl/8233301/webrev/ > Testing: > local compilation + local jtreg test > > Thanks, > Thomas From stefan.johansson at oracle.com Tue Nov 5 08:24:32 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 5 Nov 2019 09:24:32 +0100 Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for G1, Logging (3/3) In-Reply-To: References: <743d16cc-499d-784b-79fc-c006643f9ec5@oracle.com> <6449da1d-6dcc-50ba-8ae8-7615e7ad35f9@oracle.com> <2c4e49d8-91f6-4109-b9a3-78b1999ccda3@oracle.com> Message-ID: <0ee0f822-a4e4-41b4-d31d-658ef2d06015@oracle.com> Hi Sangheon, Again, thanks for working through our comments. Some inline responses and one minor additional thing. On 2019-11-05 07:22, sangheon.kim at oracle.com wrote: > Hi Stefan, > > On 11/4/19 8:20 AM, Stefan Johansson wrote: >> Hi Sangheon, >> >> Thanks for addressing my comments, some comments inline and additional >> comments below.
>> >> On 2019-11-02 07:08, sangheon.kim at oracle.com wrote: >>>> ... >>>> As we discussed offline, I would like to know the mismatches as >>>> well, I think the easiest approach would be to make the total count >>>> per node as well and that way we can see if there were any regions >>>> that didn't match. What do you think about printing the info like this: >>>> [3,009s][trace][gc,heap,numa ] GC(6) NUMA region verification >>>> (actual/expected): 0: 1024/1024, 1: 270/1024, Unknown: 0 >>> Changed as you suggested to have per node but deleted unknown as >>> unknown is 'total - matched'. >> I'm fine with skipping 'Unknown', but not all regions in (total - >> matched) must be 'Unknown' since there is the possibility of a real >> mismatch as well. > Basically I agree with you that 'total - matched' remains some other > cases. i.e. 'mismatch' and 'unknown (mostly not yet touched)'. > Previously I just wanted to highlight the most interesting part which I > think is 'matched and total'. > > New patch counts, matched / mismatched / total. This is great, then we all get the info we like :) > ... >> Any reason for not including it at the end of GC as well? I guess that >> could be good to be able to diagnose if there was an increase >> imbalance due to the collection. > Added after gc as well. > I just wanted to avoid logging to much. But as you said, node > information after gc seems helpful too. > ... >> I realize I didn't include that in my example patch, but did you have >> anything against changing the tags here to 'gc, heap, verify'? > I'm with gc+heap+verify so update it. Nice =) > > Added 'mismatch' which doesn't count active_node_index == UnknownNodeIndex. > So now we are printing 'matched / mismatched / total'. We are not > printing unknown but 'total = matched + mismatched + unknown', so we can > calculate it. > > One may want to explicitly print 'unknown' additionally or print > 'unknown' instead of 'total'. > Or we can split into 2 levels too. 
> But I think matched / mismatched / total seems okay for now. > Agreed >> ... >> I think we should differ between the case where the call fails (return >> -1) and the id returned is negative. From reading the man_pages for >> 'move_pages()' (used by numa_move_pages()) it seems we can expect to >> either get -EFAULT (for normal pages) or -ENOENT (for large pages) >> when a region has not yet been used. I'm not completely sure how much >> logic we want to add to this method, so maybe creating a RFE to look >> at this in more detail later is good. But for this first version I >> think your proposal is good. > I understand your comment to distinguish 'not yet faulted in (not yet > touched)' case which seems nice to have. But I don't have good usage for > it on top of my head. This may be used when record / print at > NodeIndexCheckClosure::do_heap_region(). > > Filed a RFE, https://bugs.openjdk.java.net/browse/JDK-8233535 > Thanks, I'm not sure about the use case either so a new RFE to investigate is good. >> >> One thought I just got, with this implementation is more like >> 'Unused', rather then 'Unknown'. But such a notion change could also >> be looked at in the RFE. > I'm not sure I fully agree / understand with you. :) > > I think when we combine 'unused or unknown' with 'node index / id', > unknown seems better fit. > 'unknown node index' which means the value is unknown for some > reason(mostly not yet faulted in) seems better than 'unused node index' . > To me 'unused' seems better when combined with 'page or address'. e..g > unused page or unused address etc. > > If you still think further notion change is necessary in separate CR, I > can file for you. 
:) I agree with you that currently Unknown is better, but one possible use case if distinguish between failing the syscall and getting the expected status for not yet faulted pages would be that we could say that some pages are unused and some are unknown (but that would probably lead to most just being unused). Well, I don't need another CR for this, just thinking out in the mail =) > > webrev: > http://cr.openjdk.java.net/~sangheki/8220312/webrev.5 > http://cr.openjdk.java.net/~sangheki/8220312/webrev.5.inc This all looks good, but there is an else-if statement that you can simplify a bit: src/hotspot/share/gc/g1/g1NUMA.cpp --- 297 if (preferred_node_index == active_node_index) { 298 _matched[preferred_node_index]++; 299 } else if (preferred_node_index != active_node_index && 300 active_node_index != G1NUMA::UnknownNodeIndex) { 301 _mismatched[preferred_node_index]++; 302 } The first condition in the else-if statement will always be true (or the if-branch will be taken) and can be removed. --- Thanks, Stefan > > - All comments from Kim and Stefan. > - Includes class / methods name changes which Kim suggested to me directly. > > Testing: hs-tier 1 ~ 5, with/without UseNUMA. > > Thanks, > Sangheon > > >> --- >> >> Thanks, >> Stefan >> >>> >>> Testing: hs-tier 1 ~ 5 with / without UseNUMA. >>> >>> Thanks, >>> Sangheon >>> >>> >>>> --- >>>> >>>> This is it for now. Thanks, >>>> Stefan >>>> >>>> >>>>> Testing: hs-tier 1 ~ 4 with/without UseNUMA. hs-tier5 is almost >>>>> finished without new failures. >>>>> >>>>> Thanks, >>>>> Sangheon >>>>> >>>>> >>> > From shade at redhat.com Tue Nov 5 08:51:50 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 5 Nov 2019 09:51:50 +0100 Subject: RFR(trivial): 8233574: Shenandoah: build is broken without jfr In-Reply-To: <0a6ba9dc-06c6-e1ef-993d-7117ff827a09@loongson.cn> References: <0a6ba9dc-06c6-e1ef-993d-7117ff827a09@loongson.cn> Message-ID: On 11/5/19 4:07 AM, Jie Fu wrote: > JBS:??? 
https://bugs.openjdk.java.net/browse/JDK-8233574 > Webrev: http://cr.openjdk.java.net/~jiefu/8233574/webrev.00/ Minor nit, once you include ".inline.hpp", you don't need to include the ".hpp" 31 #include "gc/shenandoah/shenandoahHeap.hpp" <--- drop this 32 #include "gc/shenandoah/shenandoahHeap.inline.hpp" -- Thanks, -Aleksey From aph at redhat.com Tue Nov 5 08:52:20 2019 From: aph at redhat.com (Andrew Haley) Date: Tue, 5 Nov 2019 08:52:20 +0000 Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code In-Reply-To: <859e48d6-9af5-b4af-32ac-4b07ce92e94d@redhat.com> References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> <87b115fb-5353-b21b-3cbc-f862bd932b3e@redhat.com> <3d70db1c-c927-48f8-23ab-8937838e0302@redhat.com> <0d347d16-f870-798f-0165-1ee4dfae511b@redhat.com> <859e48d6-9af5-b4af-32ac-4b07ce92e94d@redhat.com> Message-ID: <6c110878-a477-df8a-e566-84b113806044@redhat.com> On 11/4/19 6:23 PM, Zhengyu Gu wrote: >> They are still needed for calling super class's load_at(). Even though, >> they are not used there neither. Aha! Sorry, I missed that. > Or I should say, they are not used there right now, but may be used in > future ... So add them in the future, surely. All you're doing by passing unused args is confusing the reader. It definitely succeeded with me... -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From fujie at loongson.cn Tue Nov 5 09:07:44 2019 From: fujie at loongson.cn (Jie Fu) Date: Tue, 5 Nov 2019 17:07:44 +0800 Subject: RFR(trivial): 8233574: Shenandoah: build is broken without jfr In-Reply-To: References: <0a6ba9dc-06c6-e1ef-993d-7117ff827a09@loongson.cn> Message-ID: <12cfede6-de28-b459-7236-657068f8865e@loongson.cn> Hi Aleksey, Thanks for your review. Good catch. Updated: http://cr.openjdk.java.net/~jiefu/8233574/webrev.01/ Could you or Roman help to sponsor it? Thanks a lot. 
Best regards, Jie On 2019/11/5 4:51 PM, Aleksey Shipilev wrote: > On 11/5/19 4:07 AM, Jie Fu wrote: >> JBS: https://bugs.openjdk.java.net/browse/JDK-8233574 >> Webrev: http://cr.openjdk.java.net/~jiefu/8233574/webrev.00/ > Minor nit, once you include ".inline.hpp", you don't need to include the ".hpp" > > 31 #include "gc/shenandoah/shenandoahHeap.hpp" <--- drop this > 32 #include "gc/shenandoah/shenandoahHeap.inline.hpp" > From shade at redhat.com Tue Nov 5 09:08:48 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 5 Nov 2019 10:08:48 +0100 Subject: RFR(trivial): 8233574: Shenandoah: build is broken without jfr In-Reply-To: <12cfede6-de28-b459-7236-657068f8865e@loongson.cn> References: <0a6ba9dc-06c6-e1ef-993d-7117ff827a09@loongson.cn> <12cfede6-de28-b459-7236-657068f8865e@loongson.cn> Message-ID: On 11/5/19 10:07 AM, Jie Fu wrote: > Updated: http://cr.openjdk.java.net/~jiefu/8233574/webrev.01/ Looks good. > Could you or Roman help to sponsor it? I'll sponsor it. -- Thanks, -Aleksey From fujie at loongson.cn Tue Nov 5 09:09:49 2019 From: fujie at loongson.cn (Jie Fu) Date: Tue, 5 Nov 2019 17:09:49 +0800 Subject: RFR(trivial): 8233574: Shenandoah: build is broken without jfr In-Reply-To: References: <0a6ba9dc-06c6-e1ef-993d-7117ff827a09@loongson.cn> <12cfede6-de28-b459-7236-657068f8865e@loongson.cn> Message-ID: Thank you so much, Aleksey. On 2019/11/5 5:08 PM, Aleksey Shipilev wrote: > I'll sponsor it.
From thomas.schatzl at oracle.com Tue Nov 5 10:05:06 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 5 Nov 2019 11:05:06 +0100 Subject: RFR (S): 8233301: Implementation of JEP 366: Deprecate the ParallelScavenge + SerialOld GC Combination In-Reply-To: <70424c6c-e598-f12a-f261-a0b62cd5869a@oracle.com> References: <727338ce-845c-067f-e56e-f066d2a602dd@oracle.com> <70424c6c-e598-f12a-f261-a0b62cd5869a@oracle.com> Message-ID: <6052c853-552f-06ab-ebe6-50c6bc4da9d5@oracle.com> Hi, On 05.11.19 09:26, Per Liden wrote: > Hi Thomas, > > On 11/4/19 11:46 AM, Thomas Schatzl wrote: >> Hi, >> >> On 01.11.19 00:05, Bernd Eckenfels wrote: >>> The help message: >>> >>> Use the Parallel Old garbage collector. Deprecated. >> >> product(bool, UseParallelOldGC, false, \ >> "Use the Serial Old garbage collection algorithm for old " >> \ >> "generation. Deprecated." \ > > I know it's not easy to get this right, but I think the above might > perhaps be even more confusing. How about something like: > > UseParallelGC -> "Use the Parallel garbage collector" > > UseParallelOldGC -> "Use the Parallel or the Serial garbage collector > when collecting the old generation (Deprecated)." > thanks for your input and regenerated webrev. Changed to your suggestion. Bernd?
Thanks, Thomas From ecki at zusammenkunft.net Tue Nov 5 10:22:09 2019 From: ecki at zusammenkunft.net (Bernd Eckenfels) Date: Tue, 5 Nov 2019 10:22:09 +0000 Subject: RFR (S): 8233301: Implementation of JEP 366: Deprecate the ParallelScavenge + SerialOld GC Combination In-Reply-To: <6052c853-552f-06ab-ebe6-50c6bc4da9d5@oracle.com> References: <727338ce-845c-067f-e56e-f066d2a602dd@oracle.com> <70424c6c-e598-f12a-f261-a0b62cd5869a@oracle.com>, <6052c853-552f-06ab-ebe6-50c6bc4da9d5@oracle.com> Message-ID: Yes, looks better to me as well, sorry for the bike shedding :) -- http://bernd.eckenfels.net ________________________________ From: hotspot-gc-dev on behalf of Thomas Schatzl Sent: Tuesday, November 5, 2019 11:07 AM To: Per Liden; hotspot-gc-dev at openjdk.java.net Subject: Re: RFR (S): 8233301: Implementation of JEP 366: Deprecate the ParallelScavenge + SerialOld GC Combination Hi, On 05.11.19 09:26, Per Liden wrote: > Hi Thomas, > > On 11/4/19 11:46 AM, Thomas Schatzl wrote: >> Hi, >> >> On 01.11.19 00:05, Bernd Eckenfels wrote: >>> The help message: >>> >>> Use the Parallel Old garbage collector. Deprecated. >> >> product(bool, UseParallelOldGC, false, \ >> "Use the Serial Old garbage collection algorithm for old " >> \ >> "generation. Deprecated." \ > > I know it's not easy to get this right, but I think the above might > perhaps be even more confusing. How about something like: > > UseParallelGC -> "Use the Parallel garbage collector" > > UseParallelOldGC -> "Use the Parallel or the Serial garbage collector > when collecting the old generation (Deprecated)." > thanks for your input and regenerated webrev. Changed to your suggestion. Bernd?
Thanks, Thomas From erik.osterlund at oracle.com Tue Nov 5 10:40:53 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Tue, 5 Nov 2019 11:40:53 +0100 Subject: RFR: 8233299: Implementation: JEP 365: ZGC on Windows In-Reply-To: <8fbffe58-7045-52e4-687c-35cb8c146365@oracle.com> References: <8fbffe58-7045-52e4-687c-35cb8c146365@oracle.com> Message-ID: Hi Stefan, Awesome. Looks good! Thanks, /Erik On 10/31/19 11:18 AM, Stefan Karlsson wrote: > Hi all, > > Please review this patch to add ZGC support on Windows. > > https://cr.openjdk.java.net/~stefank/8233299/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8233299 > > As mentioned in the JEP (https://openjdk.java.net/jeps/365), there > were some preparation patches that needed to go in to pave the way for > this patch: > > ??? 8232601: ZGC: Parameterize the ZGranuleMap table size > ??? 8232602: ZGC: Make ZGranuleMap ZAddress agnostic > ??? 8232604: ZGC: Make ZVerifyViews mapping and unmapping precise > ??? 8232648: ZGC: Move ATTRIBUTE_ALIGNED to the front of declarations > ??? 8232649: ZGC: Add callbacks to ZMemoryManager > ??? 8232650: ZGC: Add initialization hooks for OS specific code > ??? 8232651: Add implementation of os::processor_id() for Windows > > ... they have all been pushed now. > > One important key-point to this implementation is to use the new > Windows APIs that support reservation and mapping of memory through > "placeholders":? VirtualAlloc2, VirtualFreeEx, MapViewOfFile3, and > UnmapViewOfFile2. These functions are available starting from version > 1803 of Windows 10 and Windows Server. ZGC will lookup these symbols > to determine if the Windows version supports these functions. > > > Correlating the text in the JEP with the code: > > * '"Support for multi-mapping memory". ZGC's use of colored pointers > requires support for heap multi-mapping, so that the same physical > memory can be accessed from multiple different locations in the > process address space. 
On Windows, paging-file backed memory provides > physical memory with an identity (a handle), which is unrelated to the > virtual address where it is mapped. Using this identity allows ZGC to > map the same physical memory into multiple locations.' > > We commit memory via paging file mappings and map views into that memory. > > The function ZMapper::create_and_commit_paging_file_mapping uses > CreateFileMappingW with SEC_RESERVE to create this mapping, > MapViewOfFile3 to map a temporary view into the mapping, VirtualAlloc2 > to commit the memory, and then UnmapViewOfFile2 to unmap the view. > > The reason to use SEC_RESERVE and the extra VirtualAlloc2, instead of > SEC_COMMIT, is to ensure that the later multi-mappings of committed > file mappings don't fail under low-memory situations. Earlier > prototypes used SEC_COMMIT and saw these kind of OOME errors when > mapping new views to already committed memory. The current > platform-independent ZGC code isn't prepared to handle OOME errors > when mapping views, so we chose this solution. > > MapViewOfFile3 is then used to multi-map into the committed memory. > > * '"Support for mapping paging-file backed memory into a reserved > address space". The Windows memory management API is not as flexible > as POSIX's mmap/munmap, especially when it comes to mapping file > backed memory into a previously reserved address space region. To do > this, ZGC will use the Windows concept of address space placeholders. > The placeholder concept was introduced in version 1803 of Windows 10 > and Windows Server. ZGC support for older versions of Windows will not > be implemented.' > > Before the placeholder APIs there was no way to first reserve a > specific virtual memory range, and then map a view of a committed > paging file over that range. The VirtuaAlloc function could be used to > first reserve and then commit anonymous memory, but nothing similar > existed for mapped views. 
Now with placeholders, we can create a > placeholder reservation of memory with VirtualAlloc2, and then replace > that reservation with MapViewOfFile3. When memory is unmapped, we can > use UnmapViewOfFile2 to "preserve" the placeholder memory reservation. > > > * '"Support for mapping and unmapping arbitrary parts of the heap". > ZGC's heap layout in combination with its dynamic sizing (and > re-sizing) of heap pages requires support for mapping and unmapping > arbitrary heap granules. This requirement in combination with Windows > address space placeholders requires special attention, since > placeholders must be explicitly split/coalesced by the program, as > opposed to being automatically split/coalesced by the operating system > (as on Linux).' > > Half of the preparation patches were put in place to support this. > When replacing a placeholder with a view of the backing file, we need > to exactly match the address and size of a placeholder. Also, when > unmapping a view, we need to exactly match the address and size of the > view, and replace it with a placeholder. > > To make it easier to map and unmap arbitrary parts of the heap, we > split reserved memory into ZGranuleSize-sized placeholders. So, > whenever we perform any of these operations, we know that any given > memory range could be dealt with as a number of granules. > > When memory is reserved, but not mapped, it is registered in the > ZVirtualMemoryManager. It splits memory into granule-sized placholders > when reserved memory is fetched, and coalesces placeholders when > reserved memory is handed back. > > > * '"Support for committing and uncommitting arbitrary parts of the > heap". ZGC can commit and uncommit physical memory dynamically while > the Java program is running. To support these operations the physical > memory will be divided into, and backed by, multiple paging-file > segments. 
> Each paging-file segment corresponds to a ZGC heap granule,
> and can be committed and uncommitted independently of other segments.'
>
> Just like we can map and unmap in granules, we want to be able to
> commit and uncommit memory in granules. You can see how memory is
> committed and uncommitted in granules in
> ZBackingFile::commit_from_paging_file and
> ZBackingFile::uncommit_from_paging_file. Each committed granule is
> associated with one registered handle. When memory for a granule is
> uncommitted, the handle is closed. At this point, no views exist to
> the mapping and the memory is handed back to the OS.
>
>
> Final point about ZPhysicalMemoryBacking. We've tried to make this
> file similar on all OSes, with the hope to be able to combine them
> when both the Windows and macOS ports have been merged.
>
> Thanks,
> StefanK

From per.liden at oracle.com Tue Nov 5 10:44:14 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 5 Nov 2019 11:44:14 +0100
Subject: RFR: 8233299: Implementation: JEP 365: ZGC on Windows
In-Reply-To: <8fbffe58-7045-52e4-687c-35cb8c146365@oracle.com>
References: <8fbffe58-7045-52e4-687c-35cb8c146365@oracle.com>
Message-ID: <394365e7-cb06-0995-5913-de55989dce14@oracle.com>

Looks good!

/Per

On 10/31/19 11:18 AM, Stefan Karlsson wrote:
> [...]

From stefan.karlsson at oracle.com Tue Nov 5 10:50:45 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 5 Nov 2019 11:50:45 +0100
Subject: RFR: 8233299: Implementation: JEP 365: ZGC on Windows
In-Reply-To:
References: <8fbffe58-7045-52e4-687c-35cb8c146365@oracle.com>
Message-ID: <0ffd72d2-47e2-ffa0-1eeb-b541623aee44@oracle.com>

Thanks, Erik!

StefanK

On 2019-11-05 11:40, erik.osterlund at oracle.com wrote:
> Hi Stefan,
>
> Awesome. Looks good!
>
> Thanks,
> /Erik
>
> On 10/31/19 11:18 AM, Stefan Karlsson wrote:
>> [...]

From stefan.karlsson at oracle.com Tue Nov 5 10:50:53 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 5 Nov 2019 11:50:53 +0100
Subject: RFR: 8233299: Implementation: JEP 365: ZGC on Windows
In-Reply-To: <394365e7-cb06-0995-5913-de55989dce14@oracle.com>
References: <8fbffe58-7045-52e4-687c-35cb8c146365@oracle.com> <394365e7-cb06-0995-5913-de55989dce14@oracle.com>
Message-ID: <58ff47a9-49cf-0523-41d8-0e35c7b32305@oracle.com>

Thanks, Per!

StefanK

On 2019-11-05 11:44, Per Liden wrote:
> Looks good!
>
> /Per
>
> On 10/31/19 11:18 AM, Stefan Karlsson wrote:
>> [...]

From thomas.schatzl at oracle.com Tue Nov 5 11:00:30 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 5 Nov 2019 12:00:30 +0100
Subject: RFR (S): 8233301: Implementation of JEP 366: Deprecate the ParallelScavenge + SerialOld GC Combination
In-Reply-To:
References: <727338ce-845c-067f-e56e-f066d2a602dd@oracle.com> <70424c6c-e598-f12a-f261-a0b62cd5869a@oracle.com> <6052c853-552f-06ab-ebe6-50c6bc4da9d5@oracle.com>
Message-ID: <375df6d9-215d-4c20-cea1-619c9acac185@oracle.com>

Hi Bernd,

On 05.11.19 11:22, Bernd Eckenfels wrote:
> Yes, looks better to me as well, sorry for the bike shedding :)

thanks for your review :)

Thomas

From stefan.johansson at oracle.com Tue Nov 5 11:35:38 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 5 Nov 2019 12:35:38 +0100
Subject: RFR (M): 8189737: Make HeapRegion not derive from Space
In-Reply-To: <07ef2312-d974-be19-c887-828696a8493f@oracle.com>
References: <4fdeb066-e9eb-c0e6-5fd2-5ec9a368bc23@oracle.com> <07ef2312-d974-be19-c887-828696a8493f@oracle.com>
Message-ID:

Hi Thomas,

On 2019-11-04 17:40, Thomas Schatzl wrote:
> Hi,
>
> On 01.11.19 00:20, Thomas Schatzl wrote:
>> Hi Kim,
>>
>>    thanks for your review.
>>
>> On Thu, 2019-10-31 at 18:12 -0400, Kim Barrett wrote:
>>>> On Oct 31, 2019, at 9:43 AM, Thomas Schatzl <
>>>> thomas.schatzl at oracle.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>>   can I get reviews for this refactoring that removes the
>>>> inheritance of HeapRegion from Space?
>>>>
>>>>
>> [...]
>>>> CR:
>>>> https://bugs.openjdk.java.net/browse/JDK-8189737
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~tschatzl/8189737/webrev/

I think this looks really good in general, so nice to get rid of this
inheritance. I have one question/comment though. I took a look at the
HeapRegion implementation in the SA, and there it still extends
CompactibleSpace (which extends Space). I think we should remove this
and add _bottom and _top for HeapRegion in vmStructs_g1.hpp. This also
has the effect that the PrintRegionClosure needs to be updated to not
depend on Space either. It is only used by G1, so I think it should
simply be moved from shared to g1 and changed to not depend on Space.

Thanks,
Stefan

>>>> Testing:
>>>> hs-tier-1-5
>>>>
>>>> Thanks,
>>>>   Thomas
> [..]
>
>>>
>>> ------------------------------------------------------------------------------
>>>
>>> Looks good.
>>>
>>> I don't need a new webrev for the parameter list indentation fix.
>>>
>>
>> I will update the webrev later in place.
>
> Done. Thanks for your review.
>
> Thomas
>

From zgu at redhat.com Tue Nov 5 12:33:19 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 5 Nov 2019 07:33:19 -0500
Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code
In-Reply-To: <6c110878-a477-df8a-e566-84b113806044@redhat.com>
References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> <87b115fb-5353-b21b-3cbc-f862bd932b3e@redhat.com> <3d70db1c-c927-48f8-23ab-8937838e0302@redhat.com> <0d347d16-f870-798f-0165-1ee4dfae511b@redhat.com> <859e48d6-9af5-b4af-32ac-4b07ce92e94d@redhat.com> <6c110878-a477-df8a-e566-84b113806044@redhat.com>
Message-ID: <84394d85-1b99-8139-3baf-7fbedba702c0@redhat.com>

On 11/5/19 3:52 AM, Andrew Haley wrote:
> On 11/4/19 6:23 PM, Zhengyu Gu wrote:
>>> They are still needed for calling super class's load_at(). Even though
>>> they are not used there either.
>
> Aha! Sorry, I missed that.
> >> Or I should say, they are not used there right now, but may be used in >> future ... > > So add them in the future, surely. All you're doing by passing unused > args is confusing the reader. It definitely succeeded with me... > Sorry, I should just remove 'unused' comments. Okay with you? Thanks, -Zhengyu From zgu at redhat.com Tue Nov 5 12:50:11 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 5 Nov 2019 07:50:11 -0500 Subject: RFR 8233500: Shenandoah: Shenandoah load barrier should save registers before calling keep alive barrier on x86 In-Reply-To: References: <1078cf3a-3d99-4cc3-dd0f-55a63967caa5@redhat.com> Message-ID: <0044dc42-1e6b-4e90-fc49-c0cc45d1626f@redhat.com> On 11/4/19 1:39 PM, Aleksey Shipilev wrote: > On 11/4/19 7:34 PM, Zhengyu Gu wrote: >>> SATB handling is similar to G1 is doing, where's the similar code in G1? The patch adds save/restore >>> at in SBSA::load_at, but there is a similar block in SBSA::store_at, why it is not needed there? >> >> Because we do self-fixing in LRB and have to reshuffle registers. > > Okay. So AArch64 does enter()/leave(), why x86 needs the entire IU_state pushed/popped? > > My concern is that pushing/popping the entire state explodes code size (we don't care about > performance much, but we do care about hitting the stub boundaries), and probably hides some bugs > with register shuffles. > push_IU_state()/pop_IU_state(), each adds 3 instructions on x86_64, 2 on x86_32. 
void MacroAssembler::push_IU_state() {
  // Push flags first because pusha kills them
  pushf();
  // Make sure rsp stays 16-byte aligned
  LP64_ONLY(subq(rsp, 8));
  pusha();
}

void MacroAssembler::pop_IU_state() {
  popa();
  LP64_ONLY(addq(rsp, 8));
  popf();
}

-Zhengyu

From aph at redhat.com Tue Nov 5 16:26:19 2019
From: aph at redhat.com (Andrew Haley)
Date: Tue, 5 Nov 2019 16:26:19 +0000
Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code
In-Reply-To: <84394d85-1b99-8139-3baf-7fbedba702c0@redhat.com>
References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com>
 <87b115fb-5353-b21b-3cbc-f862bd932b3e@redhat.com>
 <3d70db1c-c927-48f8-23ab-8937838e0302@redhat.com>
 <0d347d16-f870-798f-0165-1ee4dfae511b@redhat.com>
 <859e48d6-9af5-b4af-32ac-4b07ce92e94d@redhat.com>
 <6c110878-a477-df8a-e566-84b113806044@redhat.com>
 <84394d85-1b99-8139-3baf-7fbedba702c0@redhat.com>
Message-ID: <2be52de0-6f12-d989-cf69-5807b2160cb0@redhat.com>

On 11/5/19 12:33 PM, Zhengyu Gu wrote:
>
>
> On 11/5/19 3:52 AM, Andrew Haley wrote:
>> On 11/4/19 6:23 PM, Zhengyu Gu wrote:
>>>> They are still needed for calling super class's load_at(). Even though,
>>>> they are not used there neither.
>>
>> Aha! Sorry, I missed that.
>>
>>> Or I should say, they are not used there right now, but may be used in
>>> future ...
>>
>> So add them in the future, surely. All you're doing by passing unused
>> args is confusing the reader. It definitely succeeded with me...
>>
> Sorry, I should just remove 'unused' comments. Okay with you?

OK.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From kim.barrett at oracle.com Tue Nov 5 19:57:17 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 5 Nov 2019 14:57:17 -0500
Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for G1, Logging (3/3)
In-Reply-To: <0ee0f822-a4e4-41b4-d31d-658ef2d06015@oracle.com>
References: <743d16cc-499d-784b-79fc-c006643f9ec5@oracle.com>
 <6449da1d-6dcc-50ba-8ae8-7615e7ad35f9@oracle.com>
 <2c4e49d8-91f6-4109-b9a3-78b1999ccda3@oracle.com>
 <0ee0f822-a4e4-41b4-d31d-658ef2d06015@oracle.com>
Message-ID: <9515D8F6-9D01-45AA-BAC4-24B28970F804@oracle.com>

> On Nov 5, 2019, at 3:24 AM, Stefan Johansson wrote:
> On 2019-11-05 07:22, sangheon.kim at oracle.com wrote:
>> webrev:
>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.5
>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.5.inc
>
> This all looks good, but there is an else-if statement that you can simplify a bit:
> src/hotspot/share/gc/g1/g1NUMA.cpp
> ---
> 297   if (preferred_node_index == active_node_index) {
> 298     _matched[preferred_node_index]++;
> 299   } else if (preferred_node_index != active_node_index &&
> 300              active_node_index != G1NUMA::UnknownNodeIndex) {
> 301     _mismatched[preferred_node_index]++;
> 302   }
>
> The first condition in the else-if statement will always be true (or the if-branch will be taken) and can be removed.

Looks good to me too, except for that same else-if. I don't need a new webrev for that.

From kim.barrett at oracle.com Wed Nov 6 00:57:49 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 5 Nov 2019 19:57:49 -0500
Subject: RFR (M): 8228609: G1 copy cost prediction uses used vs.
 actual copied bytes
In-Reply-To:
References:
Message-ID: <3A799B0C-76B1-4145-A000-1071672BD566@oracle.com>

> On Oct 22, 2019, at 1:30 PM, Thomas Schatzl wrote:
>
> Hi all,
>
> can I have reviews for this change that makes G1 calculate and then use the actual amount of bytes copied for Object Copy phase estimation?
>
> The problem is that the "used" value that is currently used for this can differ a lot from the number of actually copied bytes during the parallel phases.
>
> Sources for differences are:
> - TLAB sizing
> - TLAB/region fragmentation
> - all of that multiplied by the number of threads
>
> Particularly if the amount of copied data is small compared to the number of regions all this can add up and disturb the prediction quite a lot, although overall it's not that bad.
>
> It's only that this and other small inaccuracies add up.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8228609
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8228609/webrev/
> Testing:
> hs-tier1-5
>
> Thanks,
> Thomas

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1ParScanThreadState.cpp
 105 size_t G1ParScanThreadState::copied_words() {
 106   size_t result = _surviving_words;
 107   _surviving_words = 0;
 108   return result;
 109 }

The reset behavior seems unexpected, based on the name, which looks like an accessor. I think the reset behavior is to avoid double-counting by the recording in evacuate_live_objects.

That led me to consider suggesting a more appropriate place for the reset might be in G1PSTS::flush(), where the lab_waste and lab_undo_waste (that were recorded nearby) also get reset. But I don't think that flush() is happening in the right place to prevent double-counting of the waste values. Bug?

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1ParScanThreadState.cpp
 319   _surviving_words += word_sz;

Is it really worth having a separate accumulator for the total?
It seems like we could instead have copied_words() return the sum over the _surviving_young_words. But that might not work because of the (lack of) reset in the right place, per above.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1Policy.cpp
 782   double cost_per_byte_ms = (average_time_ms(G1GCPhaseTimes::ObjCopy) + average_time_ms(G1GCPhaseTimes::OptObjCopy)) / copied_bytes;

[pre-existing]
I think this is computing the rate at which active_workers worker threads copy bytes. What if active_workers changes?

------------------------------------------------------------------------------

From leihouyju at gmail.com Wed Nov 6 07:17:47 2019
From: leihouyju at gmail.com (Haoyu Li)
Date: Wed, 6 Nov 2019 15:17:47 +0800
Subject: RFR: 8220465: Use shadow regions for faster ParallelGC full GCs
In-Reply-To: <955C4FA4-FC18-446C-851A-0D04A916D88D@oracle.com>
References: <4F02DD53-EA98-4A1A-B871-C6E9D9610B2C@oracle.com>
 <9B69AFD1-7AE2-4B50-BFCF-C9C6E2594240@oracle.com>
 <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com>
 <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com>
 <400df998-171a-5bbe-9f3e-01af1781afb4@oracle.com>
 <955C4FA4-FC18-446C-851A-0D04A916D88D@oracle.com>
Message-ID:

Hi Stefan,

Sorry for the late update. I have attached both a full patch (shadow-region-v3.patch) and the incremental changes (shadow-region-incr.patch) in this mail, and details are as follows.

> Regarding the current patch, I think that it looks good in general, but I
> thought a bit more around how to share stuff between the closures and I
> agree that adding those extra virtual functions doesn't really feel worth
> it. I'm wondering if a solution where we revert back to letting destination
> be the "real destination" (not ever pointing to the shadow region) and add
> a copy_destination which is destination + offset. To make this work the
> normal MoveAndUpdateClosure would also have an offset, but it would always
> be 0.
If do_addr() is then updated to use the copy_destination() in some
> places we might end up with something pretty nice, but maybe I'm missing
> something.
>

It is an excellent idea to let MoveAndUpdateClosure have an _offset equal to 0, so ShadowClosure can reuse more code from it. I have made the above changes in the new patch.

> I also realized that the current patch will trigger an assert because
> destination is expected not to be the shadow address:
> # Internal Error
> (open/src/hotspot/share/gc/parallel/psParallelCompact.cpp:3045), pid=12649,
> tid=12728
> # assert(src_cp->destination() == destination) failed: first live obj in
> the space must match the destination
>
> So this also suggests that we should keep destination() returning the real
> destination.
>
> Some other comments:
> src/hotspot/share/gc/parallel/psParallelCompact.cpp
> ---
> 3383 void ShadowClosure::complete_region(ParCompactionManager *cm,
> HeapWord *dest_addr,
> 3384 PSParallelCompact::RegionData
> *region_ptr) {
> 3385 assert(region_ptr->shadow_state() ==
> ParallelCompactData::RegionData::FINISH, "Region should be finished");
>
> This assertion will also trigger when running with a debug build and at
> this point the shadow state should be SHADOW not FINISH.
> ---
>

Sorry for these buggy assertions. The shadow_state in ShadowClosure::complete_region should be SHADOW instead of FINISH, and I've corrected it. Moreover, while I was testing it in debug mode, I found another interesting case, in which a region should return to the normal path if it becomes available before invoking fill_shadow_region (the branch where shadow_region == 0 at psParallelCompact.cpp:3182). Therefore, I added a new function ParallelCompactData::RegionData::mark_normal() to handle this special case, so the assertion in MoveAndUpdateClosure::complete_region will succeed.

> src/hotspot/share/gc/parallel/psParallelCompact.hpp
> ---
> 632 inline bool ParallelCompactData::RegionData::mark_filled() {
> 633   return Atomic::cmpxchg(FILLED, &_shadow_state, SHADOW) == SHADOW;
> 634 }
>
> Since we never check the return value here we should make it void and
> maybe instead add an assert that the return value is SHADOW.
> ---
>

Thanks for the suggestion. I have changed mark_filled() to void.

I really appreciate your reviews. If there are any issues in the patch, please let me know at any time. Thanks again!

Best Regards,
Haoyu Li

On Tue, Oct 29, 2019 at 3:03 PM, Stefan Johansson wrote:

> Hi Haoyu,
>
> I've looked through the patch in detail now and created a new webrev at:
> http://cr.openjdk.java.net/~sjohanss/8220465/01/
>
> I took the liberty of removing the removal of move_and_update from your
> patch since I'm addressing that separately in JDK-8233065. The webrev above
> is still based on that removal, but I expect that to be pushed tomorrow or
> Wednesday so that should be fine.
>
> I also changed the subject to make it more clear that this is now a review
> of:
> https://bugs.openjdk.java.net/browse/JDK-8220465
>
> Regarding the current patch, I think that it looks good in general, but I
> thought a bit more around how to share stuff between the closures and I
> agree that adding those extra virtual functions doesn't really feel worth
> it. I'm wondering if a solution where we revert back to letting destination
> be the "real destination" (not ever pointing to the shadow region) and add
> a copy_destination which is destination + offset. To make this work the
> normal MoveAndUpdateClosure would also have an offset, but it would always
> be 0. If do_addr() is then updated to use the copy_destination() in some
> places we might end up with something pretty nice, but maybe I'm missing
> something.
>
> I also realized that the current patch will trigger an assert because
> destination is expected not to be the shadow address:
> # Internal Error
> (open/src/hotspot/share/gc/parallel/psParallelCompact.cpp:3045), pid=12649,
> tid=12728
> # assert(src_cp->destination() == destination) failed: first live obj in
> the space must match the destination
>
> So this also suggests that we should keep destination() returning the real
> destination.
>
> Some other comments:
> src/hotspot/share/gc/parallel/psParallelCompact.cpp
> ---
> 3383 void ShadowClosure::complete_region(ParCompactionManager *cm,
> HeapWord *dest_addr,
> 3384 PSParallelCompact::RegionData
> *region_ptr) {
> 3385 assert(region_ptr->shadow_state() ==
> ParallelCompactData::RegionData::FINISH, "Region should be finished");
>
> This assertion will also trigger when running with a debug build and at
> this point the shadow state should be SHADOW not FINISH.
> ---
>
> src/hotspot/share/gc/parallel/psParallelCompact.hpp
> ---
> 632 inline bool ParallelCompactData::RegionData::mark_filled() {
> 633   return Atomic::cmpxchg(FILLED, &_shadow_state, SHADOW) == SHADOW;
> 634 }
>
> Since we never check the return value here we should make it void and
> maybe instead add an assert that the return value is SHADOW.
> ---
>
> When you have addressed these comments, would it be possible to include both
> the full patch and the incremental changes from the current version?
> That makes it easier for the reviewers to see what changed between versions
> of the patch.
>
> Thanks,
> Stefan
>
> > On 24 Oct 2019, at 14:16, Stefan Johansson <
> stefan.johansson at oracle.com> wrote:
> >
> > Hi Haoyu,
> >
> > On 2019-10-23 17:15, Haoyu Li wrote:
> >> Hi Stefan,
> >> Thanks for your constructive feedback. I've addressed all the issues
> you mentioned, and the updated patch is attached in this email.
> > Nice, I will look at the patch next week, but I'll shortly answer your
> questions right away.
> > > >> During refining the patch, I have a couple of questions: > >> 1) Now the MoveAndUpdateClosure and ShadowClosure assume the > destination address is the very beginning of a region, instead of an > arbitrary address like what it used to be. However, there is an unused > function named PSParallelCompact::move_and_update() uses the > MoveAndUpdateClosure to process a region from its middle, which conflicts > with the assumption. I notice that you removed this function in your patch, > and so did I in the updated patch. Does it matter? > > Yes, I found this function during my code review and it should be > removed, but I think that should be handled as a separate issue. We can do > this removal before this patch goes in. > > > >> 2) Using the same do_addr() in MoveAndUpdateClosure and ShadowClosure > is doable, but it does not reuse all the code neatly. Because storing the > address of the shadow region in _destination requires extra virtual > functions to handle allocating blocks in the start_array and setting > addresses of deferred objects. In particular, allocate_blocks() and > set_deferred_object_for() in both closures are added. Is it worth avoiding > to use _offset to calculate the shadow_destination? > > Ok, sounds like it might be better to have specific do_addr() functions > then. I'll think some more around this when reviewing the new patch in > depth. > > > >> If there are any problems with this patch, please contact me anytime. > I'm more than happy to keep improving the code. Thanks again for reviewing. > >> > > Sound good, thanks, > > Stefan > > -------------- next part -------------- A non-text attachment was scrubbed... Name: shadow-region-incr.patch Type: text/x-patch Size: 9843 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: shadow-region-v3.patch
Type: text/x-patch
Size: 28686 bytes
Desc: not available
URL:

From thomas.schatzl at oracle.com Wed Nov 6 09:22:44 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 6 Nov 2019 10:22:44 +0100
Subject: RFR (M): 8189737: Make HeapRegion not derive from Space
In-Reply-To:
References: <4fdeb066-e9eb-c0e6-5fd2-5ec9a368bc23@oracle.com>
 <07ef2312-d974-be19-c887-828696a8493f@oracle.com>
Message-ID: <1a3b60e0-f41d-c422-8fe6-d7d19925c148@oracle.com>

Hi Stefan,

  thanks for your review.

On 05.11.19 12:35, Stefan Johansson wrote:
> Hi Thomas,
>
> On 2019-11-04 17:40, Thomas Schatzl wrote:
>> Hi,
>>
>> On 01.11.19 00:20, Thomas Schatzl wrote:
>>> Hi Kim,
>>>
>>>   thanks for your review.
>>>
>>> On Thu, 2019-10-31 at 18:12 -0400, Kim Barrett wrote:
>>>>> On Oct 31, 2019, at 9:43 AM, Thomas Schatzl <
>>>>> thomas.schatzl at oracle.com> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>>   can I get reviews for this refactoring that removes the
>>>>> inheritance of HeapRegion from Space?
>>>>>
>>>>>
>>> [...]
>>>>> CR:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8189737
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~tschatzl/8189737/webrev/
>
> I think this looks really good in general, so nice to get rid of this
> inheritance.
>
> I have one question/comment though. I took a look at the HeapRegion
> implementation in the SA and there it still extends CompactibleSpace
> (which extends Space), I think we should remove this and add _bottom and
> _top for HeapRegion in vmStructs_g1.hpp. This also has the effect that
> the PrintRegionClosure needs to be updated to not depend on space
> either. It is only used by G1 so I think it should simply be moved from
> shared to g1 and changed to not depend on space.

Good catch.
Changed in http://cr.openjdk.java.net/~tschatzl/8189737/webrev.0_to_1/ (diff) http://cr.openjdk.java.net/~tschatzl/8189737/webrev.1 (full) I ran hs-tier6 (containing G1 SA tests) and jtreg/serviceability/sa tests with that with no issue. Thanks, Thomas From thomas.schatzl at oracle.com Wed Nov 6 09:32:13 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 6 Nov 2019 10:32:13 +0100 Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for G1, Logging (3/3) In-Reply-To: <9515D8F6-9D01-45AA-BAC4-24B28970F804@oracle.com> References: <743d16cc-499d-784b-79fc-c006643f9ec5@oracle.com> <6449da1d-6dcc-50ba-8ae8-7615e7ad35f9@oracle.com> <2c4e49d8-91f6-4109-b9a3-78b1999ccda3@oracle.com> <0ee0f822-a4e4-41b4-d31d-658ef2d06015@oracle.com> <9515D8F6-9D01-45AA-BAC4-24B28970F804@oracle.com> Message-ID: <369d92f6-1d93-b881-a8d7-4ac933ef2537@oracle.com> Hi, On 05.11.19 20:57, Kim Barrett wrote: >> On Nov 5, 2019, at 3:24 AM, Stefan Johansson wrote: >> On 2019-11-05 07:22, sangheon.kim at oracle.com wrote: >>> webrev: >>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.5 >>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.5.inc >> >> This all looks good, but there is an else-if statement that you can simplify a bit: >> src/hotspot/share/gc/g1/g1NUMA.cpp >> --- >> 297 if (preferred_node_index == active_node_index) { >> 298 _matched[preferred_node_index]++; >> 299 } else if (preferred_node_index != active_node_index && >> 300 active_node_index != G1NUMA::UnknownNodeIndex) { >> 301 _mismatched[preferred_node_index]++; >> 302 } >> >> The first condition in the else-if statement will always be true (or the if-branch will be taken) and can be removed. > > Looks good to me too, except for that same else-if. I don?t need a new webrev for that. > > same. Plus please rename NodeIndexCheckClosure to G1NodeIndexCheckClosure (I know that HeapRegion* does not have a G1 prefix either, but let's not add to the issue). 
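
For reference, the else-if simplification discussed in this thread can be checked with a small standalone sketch. Only the shape of the condition follows the quoted g1NUMA.cpp lines 297-302. The names Stats, record_original, record_simplified, variants_agree and the value chosen for kUnknownNodeIndex are assumptions made up for illustration and are not taken from the webrev.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for G1NUMA::UnknownNodeIndex; the real value is not
// shown in this thread.
constexpr uint32_t kUnknownNodeIndex = UINT32_MAX;

// Hypothetical per-node counters, mirroring _matched and _mismatched.
struct Stats {
  std::vector<uint32_t> matched;
  std::vector<uint32_t> mismatched;
  explicit Stats(size_t nodes) : matched(nodes, 0), mismatched(nodes, 0) {}
  bool operator==(const Stats& other) const {
    return matched == other.matched && mismatched == other.mismatched;
  }
};

// Shape of the condition as quoted in the review.
inline void record_original(Stats& s, uint32_t preferred, uint32_t active) {
  if (preferred == active) {
    s.matched[preferred]++;
  } else if (preferred != active && active != kUnknownNodeIndex) {
    s.mismatched[preferred]++;
  }
}

// Simplified form: inside the else branch, preferred != active already holds,
// so only the check against the unknown index remains.
inline void record_simplified(Stats& s, uint32_t preferred, uint32_t active) {
  if (preferred == active) {
    s.matched[preferred]++;
  } else if (active != kUnknownNodeIndex) {
    s.mismatched[preferred]++;
  }
}

// Runs both variants over every (preferred, active) pair, including the
// unknown index, and reports whether the resulting counters agree.
inline bool variants_agree(uint32_t nodes) {
  Stats a(nodes), b(nodes);
  for (uint32_t preferred = 0; preferred < nodes; preferred++) {
    for (uint32_t i = 0; i <= nodes; i++) {
      uint32_t active = (i == nodes) ? kUnknownNodeIndex : i;
      record_original(a, preferred, active);
      record_simplified(b, preferred, active);
    }
  }
  return a == b;
}
```

Calling variants_agree() with a small node count exercises every combination, which is enough to see that dropping the first half of the else-if condition does not change which counter is bumped.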
I believe this rename can be done without a re-review from me.

Thanks,
  Thomas

From stefan.johansson at oracle.com Wed Nov 6 11:03:15 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 6 Nov 2019 12:03:15 +0100
Subject: RFR (M): 8189737: Make HeapRegion not derive from Space
In-Reply-To: <1a3b60e0-f41d-c422-8fe6-d7d19925c148@oracle.com>
References: <4fdeb066-e9eb-c0e6-5fd2-5ec9a368bc23@oracle.com>
 <07ef2312-d974-be19-c887-828696a8493f@oracle.com>
 <1a3b60e0-f41d-c422-8fe6-d7d19925c148@oracle.com>
Message-ID: <9381220a-b034-992a-011d-adcb9859cf45@oracle.com>

Hi Thomas,

On 2019-11-06 10:22, Thomas Schatzl wrote:
> Hi Stefan,
>
>   thanks for your review.
>
> On 05.11.19 12:35, Stefan Johansson wrote:
>> Hi Thomas,
>>
>> On 2019-11-04 17:40, Thomas Schatzl wrote:
>>> Hi,
>>>
>>> On 01.11.19 00:20, Thomas Schatzl wrote:
>>>> Hi Kim,
>>>>
>>>>   thanks for your review.
>>>>
>>>> On Thu, 2019-10-31 at 18:12 -0400, Kim Barrett wrote:
>>>>>> On Oct 31, 2019, at 9:43 AM, Thomas Schatzl <
>>>>>> thomas.schatzl at oracle.com> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>>   can I get reviews for this refactoring that removes the
>>>>>> inheritance of HeapRegion from Space?
>>>>>>
>>>>>>
>>>> [...]
>>>>>> CR:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8189737
>>>>>> Webrev:
>>>>>> http://cr.openjdk.java.net/~tschatzl/8189737/webrev/
>>
>> I think this looks really good in general, so nice to get rid of this
>> inheritance.
>>
>> I have one question/comment though. I took a look at the HeapRegion
>> implementation in the SA and there it still extends CompactibleSpace
>> (which extends Space), I think we should remove this and add _bottom
>> and _top for HeapRegion in vmStructs_g1.hpp. This also has the effect
>> that the PrintRegionClosure needs to be updated to not depend on space
>> either. It is only used by G1 so I think it should simply be moved
>> from shared to g1 and changed to not depend on space.
>
> Good catch.
Changed in > > http://cr.openjdk.java.net/~tschatzl/8189737/webrev.0_to_1/ (diff) > http://cr.openjdk.java.net/~tschatzl/8189737/webrev.1 (full) Thanks for fixing this, looks good. I don't think we need to add _compaction_top, from what I can see it is not used. I don't need a new webrev for this if I'm right. Thanks, Stefan > > I ran hs-tier6 (containing G1 SA tests) and jtreg/serviceability/sa > tests with that with no issue. > > Thanks, > ? Thomas From zgu at redhat.com Wed Nov 6 12:18:28 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 6 Nov 2019 07:18:28 -0500 Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code In-Reply-To: <2be52de0-6f12-d989-cf69-5807b2160cb0@redhat.com> References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> <87b115fb-5353-b21b-3cbc-f862bd932b3e@redhat.com> <3d70db1c-c927-48f8-23ab-8937838e0302@redhat.com> <0d347d16-f870-798f-0165-1ee4dfae511b@redhat.com> <859e48d6-9af5-b4af-32ac-4b07ce92e94d@redhat.com> <6c110878-a477-df8a-e566-84b113806044@redhat.com> <84394d85-1b99-8139-3baf-7fbedba702c0@redhat.com> <2be52de0-6f12-d989-cf69-5807b2160cb0@redhat.com> Message-ID: <93330192-7143-ca82-9872-fe627a97772e@redhat.com> On 11/5/19 11:26 AM, Andrew Haley wrote: > On 11/5/19 12:33 PM, Zhengyu Gu wrote: >> >> >> On 11/5/19 3:52 AM, Andrew Haley wrote: >>> On 11/4/19 6:23 PM, Zhengyu Gu wrote: >>>>> They are still needed for calling super class's load_at(). Even though, >>>>> they are not used there neither. >>> >>> Aha! Sorry, I missed that. >>> >>>> Or I should say, they are not used there right now, but may be used in >>>> future ... >>> >>> So add them in the future, surely. All you're doing by passing unused >>> args is confusing the reader. It definitely succeeded with me... >>> >> Sorry, I should just remove 'unused' comments. Okay with you? > > OK. 
Updated: http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.02/index.html Thanks, -Zhengyu > From aph at redhat.com Wed Nov 6 12:35:54 2019 From: aph at redhat.com (Andrew Haley) Date: Wed, 6 Nov 2019 12:35:54 +0000 Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code In-Reply-To: <93330192-7143-ca82-9872-fe627a97772e@redhat.com> References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> <87b115fb-5353-b21b-3cbc-f862bd932b3e@redhat.com> <3d70db1c-c927-48f8-23ab-8937838e0302@redhat.com> <0d347d16-f870-798f-0165-1ee4dfae511b@redhat.com> <859e48d6-9af5-b4af-32ac-4b07ce92e94d@redhat.com> <6c110878-a477-df8a-e566-84b113806044@redhat.com> <84394d85-1b99-8139-3baf-7fbedba702c0@redhat.com> <2be52de0-6f12-d989-cf69-5807b2160cb0@redhat.com> <93330192-7143-ca82-9872-fe627a97772e@redhat.com> Message-ID: <7e9ace3d-8d15-e87a-f01c-90fc4b6faa6a@redhat.com> On 11/6/19 12:18 PM, Zhengyu Gu wrote: > Updated: http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.02/index.html OK. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From shade at redhat.com Wed Nov 6 13:45:27 2019 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 6 Nov 2019 14:45:27 +0100 Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code In-Reply-To: <93330192-7143-ca82-9872-fe627a97772e@redhat.com> References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> <87b115fb-5353-b21b-3cbc-f862bd932b3e@redhat.com> <3d70db1c-c927-48f8-23ab-8937838e0302@redhat.com> <0d347d16-f870-798f-0165-1ee4dfae511b@redhat.com> <859e48d6-9af5-b4af-32ac-4b07ce92e94d@redhat.com> <6c110878-a477-df8a-e566-84b113806044@redhat.com> <84394d85-1b99-8139-3baf-7fbedba702c0@redhat.com> <2be52de0-6f12-d989-cf69-5807b2160cb0@redhat.com> <93330192-7143-ca82-9872-fe627a97772e@redhat.com> Message-ID: <7f8dd01f-30f7-f8c9-544a-c06f2a49eea0@redhat.com> On 11/6/19 1:18 PM, Zhengyu Gu wrote: > Updated: http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.02/index.html Minor nits: *) shenandoahBarrierSetAssembler_aarch64.cpp: excess space between parentheses: 368 if (!is_reference_type(type) ) { *) shenandoahBarrierSetC1.cpp: so, native oop loads used to call to ShenandoahRuntime::load_reference_barrier_native before this refactoring? That would mean it is enabled even when "passive" is enabled (which implies -ShenandoahLRB)? Current change looks fine, but we need to recognize this is the behavioral change. Please link the issue where that regression was introduced. Otherwise looks fine to me, let Roman ack it too. 
-- Thanks, -Aleksey From zgu at redhat.com Wed Nov 6 14:15:55 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 6 Nov 2019 09:15:55 -0500 Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code In-Reply-To: <7f8dd01f-30f7-f8c9-544a-c06f2a49eea0@redhat.com> References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> <87b115fb-5353-b21b-3cbc-f862bd932b3e@redhat.com> <3d70db1c-c927-48f8-23ab-8937838e0302@redhat.com> <0d347d16-f870-798f-0165-1ee4dfae511b@redhat.com> <859e48d6-9af5-b4af-32ac-4b07ce92e94d@redhat.com> <6c110878-a477-df8a-e566-84b113806044@redhat.com> <84394d85-1b99-8139-3baf-7fbedba702c0@redhat.com> <2be52de0-6f12-d989-cf69-5807b2160cb0@redhat.com> <93330192-7143-ca82-9872-fe627a97772e@redhat.com> <7f8dd01f-30f7-f8c9-544a-c06f2a49eea0@redhat.com> Message-ID: <0251231d-047e-0117-25b0-8ecfc9b30b7f@redhat.com> Thanks for the review, Aleksey. On 11/6/19 8:45 AM, Aleksey Shipilev wrote: > On 11/6/19 1:18 PM, Zhengyu Gu wrote: >> Updated: http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.02/index.html > > Minor nits: > > *) shenandoahBarrierSetAssembler_aarch64.cpp: excess space between parentheses: > > 368 if (!is_reference_type(type) ) { Will fix before push. > > *) shenandoahBarrierSetC1.cpp: so, native oop loads used to call to > ShenandoahRuntime::load_reference_barrier_native before this refactoring? That would mean it is > enabled even when "passive" is enabled (which implies -ShenandoahLRB)? Current change looks fine, > but we need to recognize this is the behavioral change. Please link the issue where that regression > was introduced. Correct, we don't need load_reference_barrier_native barrier if weak roots are processed at STW pauses. Added comments about this behavioral change in CR and linked to JDK-8227635. -Zhengyu > > Otherwise looks fine to me, let Roman ack it too. 
> From rkennke at redhat.com Wed Nov 6 14:39:58 2019 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 6 Nov 2019 15:39:58 +0100 Subject: [aarch64-port-dev ] RFR 8233401: Shenandoah: Refactor/cleanup Shenandoah load barrier code In-Reply-To: <7e9ace3d-8d15-e87a-f01c-90fc4b6faa6a@redhat.com> References: <45287c04-370c-cb0b-1603-c93fe15da3d9@redhat.com> <87b115fb-5353-b21b-3cbc-f862bd932b3e@redhat.com> <3d70db1c-c927-48f8-23ab-8937838e0302@redhat.com> <0d347d16-f870-798f-0165-1ee4dfae511b@redhat.com> <859e48d6-9af5-b4af-32ac-4b07ce92e94d@redhat.com> <6c110878-a477-df8a-e566-84b113806044@redhat.com> <84394d85-1b99-8139-3baf-7fbedba702c0@redhat.com> <2be52de0-6f12-d989-cf69-5807b2160cb0@redhat.com> <93330192-7143-ca82-9872-fe627a97772e@redhat.com> <7e9ace3d-8d15-e87a-f01c-90fc4b6faa6a@redhat.com> Message-ID: >> Updated: http://cr.openjdk.java.net/~zgu/JDK-8233401/webrev.02/index.html > > OK. Ok too. Roman From thomas.schatzl at oracle.com Wed Nov 6 15:24:54 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 6 Nov 2019 16:24:54 +0100 Subject: RFR: 8232588: G1 concurrent System.gc can return early or late In-Reply-To: <8518409B-159B-48B2-97E4-D5D4C3B2BC5A@oracle.com> References: <8518409B-159B-48B2-97E4-D5D4C3B2BC5A@oracle.com> Message-ID: Hi, On 31.10.19 21:53, Kim Barrett wrote: > RFR: 8232588: G1 concurrent System.gc can return early or late > RFR: 8233279: G1: GCLocker GC with +GCLockerInvokesConcurrent spins while cycle in progress > > Please review this refactoring and fixing of the state machine used by > G1CollectedHeap::collect for handling requests for concurrent collections. > > The handling of concurrent collection requests is now split out into a > helper function for that purpose. All of the state machine logic for > checking for completion, waiting for completions, and performing retries is > now in that new helper function, rather than being distributed between > try_collect() and various parts of the VMOp. [...] 
> A change is that waiting by a user-requested GC for a concurrent marking > cycle to complete used to be performed with the thread transitioned to > native and without safepoint checks on the associated monitor lock and wait. > This was noted as having been cribbed from CMS. Coleen and I looked at this > and could not come up with a reason for doing that for G1 (anymore, after > the recent spate of locking improvements), so there's a new G1-specific > monitor being used and the locking and waiting is now "normal". (This makes > the FullGCCount_lock monitor largely CMS-specific.) I do not see a reason for the extra monitor (why it is better to have a separate monitor instead of reusing the existing one with almost the same name?), it does not seem to add information. It is still used in GenCollectedHeap/Serial, so can't be removed with CMS. > For other concurrent GC requests, the only intentional change is for > _gc_locker with GCLockerInvokesConcurrent. Previously it would spin in > try_collect while there was a concurrent marking cycle in progress, also > blocking any callers of GCLocker::stall_until_clear() (JDK-8233279). Now it > returns in that situation, though it's not clear that's a great idea either. > Indeed, even when that option was introduced (for CMS, as part of fixing a > bad interaction between GCLocker GCs and +ExplicitGCInvokesConcurrent) it > was not clear it was a good idea (see JDK-6919638). Fortunately it's off by > default. JDK-8233280 has been filed to remove this option. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8233279 > https://bugs.openjdk.java.net/browse/JDK-8232588 > > Webrev: > https://cr.openjdk.java.net/~kbarrett/8232588/open.00/ > > Testing: > mach5 tier1-6 > > Local (linux-x64) testing with a program that allocates some live data in > the old gen, then has several threads all repeatedly looping on System.gc(). 
> Looked at output from new logging in try_collect_concurrently and verified
> the interleavings of GC start/end and new log messages were as expected.
>

- Not sure, but maybe this logging should be moved to trace level? In either case I would suggest to improve it to cover the try_collect call too, i.e. with appropriate "attempt"/"complete"/"discard" messages. Otherwise the logging looks a bit incomplete to me, i.e. only attempts to initiate a concurrent collection cause messages. I understand that this might generate too many messages for gc+debug, that is the reason to potentially move to gc+trace.

- feel free to ignore: in such log messages, if I add thread id for debugging purposes I tend to put the thread id value first so that if you have a log in front of you, and ask your viewer to highlight a particular id, all highlights are in the same column. I.e.

.... stuff from unified logging] Message

instead of having the thread-id somewhere located in the message like in this case where it is somewhere in the middle. (I kind of also prefer the raw Thread::current() value instead of the pretty name, fwiw)

I do not feel strongly about this at all, just something that came to my mind.

- a comment why we need a separate gc_counter_less_than method instead of a < comparison would be nice (I know, because of roll-over of these counts).

- pre-existing: g1VMOperations.cpp:124-127: maybe this could be changed to

} else if (g1h->should_upgrade....) {
}

Also please fix the indentation of the parameter list in the call to do_full_collection() in line 133.

Looks good otherwise.
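
As background for the roll-over remark above, here is a minimal sketch of a wrap-around-aware comparison for an unsigned event counter. It is an illustration under assumptions, not the code from the webrev: the name counter_less_than and the 32-bit width are made up here, and the "at most half the range apart" convention is one common way to order counters that may wrap.

```cpp
#include <cstdint>

// Hypothetical helper (not the method from the webrev): treats x as "before"
// y when y is ahead of x by no more than half of the 32-bit counter range.
// The subtraction is done modulo 2^32, so the result stays meaningful after
// the counter has wrapped from UINT32_MAX back to 0.
constexpr bool counter_less_than(uint32_t x, uint32_t y) {
  return static_cast<uint32_t>(x - y) > (UINT32_MAX / 2);
}

// Ordinary case: 1 comes before 2, and not the other way round.
static_assert(counter_less_than(1u, 2u), "1 is before 2");
static_assert(!counter_less_than(2u, 1u), "2 is not before 1");

// Equal counters are not "less than" each other.
static_assert(!counter_less_than(5u, 5u), "equal values are not ordered");

// Roll-over case: a value taken just before the wrap is still considered
// older than one taken just after it, which a plain < would get wrong.
static_assert(counter_less_than(UINT32_MAX, 0u), "pre-wrap is before post-wrap");
static_assert(!(UINT32_MAX < 0u), "plain < misorders values across the wrap");
```

The static_asserts double as a usage example: a plain < says the post-wrap value is smaller, while the modular difference keeps the two samples in their real order as long as they are less than half the counter range apart.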
Thanks, Thomas From thomas.schatzl at oracle.com Wed Nov 6 19:21:06 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 6 Nov 2019 20:21:06 +0100 Subject: RFR (M): 8189737: Make HeapRegion not derive from Space In-Reply-To: <9381220a-b034-992a-011d-adcb9859cf45@oracle.com> References: <4fdeb066-e9eb-c0e6-5fd2-5ec9a368bc23@oracle.com> <07ef2312-d974-be19-c887-828696a8493f@oracle.com> <1a3b60e0-f41d-c422-8fe6-d7d19925c148@oracle.com> <9381220a-b034-992a-011d-adcb9859cf45@oracle.com> Message-ID: Hi, On 06.11.19 12:03, Stefan Johansson wrote: > Hi Thomas, [...] >>> >>> I think this looks really good in general, so nice to get rid of this >>> inheritance. >>> >>> I have one question/comment though. I took a look at the HeapRegion >>> implementation in the SA and there it still extends CompactibleSpace >>> (which extends Space), I think we should remove this and add _bottom >>> and _top for HeapRegion in vmStructs_g1.hpp. This also has the effect >>> that the PrintRegionClosure needs to be updated to not depend on >>> space either. It is only used by G1 so I think it should simply be >>> moved from shared to g1 and changed to not depend on space. >> >> Good catch. Changed in >> >> http://cr.openjdk.java.net/~tschatzl/8189737/webrev.0_to_1/ (diff) >> http://cr.openjdk.java.net/~tschatzl/8189737/webrev.1 (full) > > Thanks for fixing this, looks good. I don't think we need to add > _compaction_top, from what I can see it is not used. I don't need a new > webrev for this if I'm right. Updated in place. 
Thanks, Thomas From sangheon.kim at oracle.com Wed Nov 6 21:58:41 2019 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Wed, 6 Nov 2019 13:58:41 -0800 Subject: RFR(L): 8220312: Implementation: NUMA-Aware Memory Allocation for G1, Logging (3/3) In-Reply-To: <369d92f6-1d93-b881-a8d7-4ac933ef2537@oracle.com> References: <743d16cc-499d-784b-79fc-c006643f9ec5@oracle.com> <6449da1d-6dcc-50ba-8ae8-7615e7ad35f9@oracle.com> <2c4e49d8-91f6-4109-b9a3-78b1999ccda3@oracle.com> <0ee0f822-a4e4-41b4-d31d-658ef2d06015@oracle.com> <9515D8F6-9D01-45AA-BAC4-24B28970F804@oracle.com> <369d92f6-1d93-b881-a8d7-4ac933ef2537@oracle.com> Message-ID: <634ef56c-9afb-19e9-59ee-ecd6a56380c8@oracle.com> Hi all, Many thanks for your thorough reviews, Kim, Stefan and Thomas! You didn't request another webrev, but let me post one for the record. Webrev: http://cr.openjdk.java.net/~sangheki/8220312/webrev.6 http://cr.openjdk.java.net/~sangheki/8220312/webrev.6.inc Testing: local build Thanks, Sangheon On 11/6/19 1:32 AM, Thomas Schatzl wrote: > Hi, > > On 05.11.19 20:57, Kim Barrett wrote: >>> On Nov 5, 2019, at 3:24 AM, Stefan Johansson >>> wrote: >>> On 2019-11-05 07:22, sangheon.kim at oracle.com wrote: >>>> webrev: >>>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.5 >>>> http://cr.openjdk.java.net/~sangheki/8220312/webrev.5.inc >>> >>> This all looks good, but there is an else-if statement that you can >>> simplify a bit: >>> src/hotspot/share/gc/g1/g1NUMA.cpp >>> --- >>> 297   if (preferred_node_index == active_node_index) { >>> 298     _matched[preferred_node_index]++; >>> 299   } else if (preferred_node_index != active_node_index && >>> 300              active_node_index != G1NUMA::UnknownNodeIndex) { >>> 301     _mismatched[preferred_node_index]++; >>> 302   } >>> >>> The first condition in the else-if statement will always be true (or >>> the if-branch will be taken) and can be removed. >> >> Looks good to me too, except for that same else-if.
I don't need a >> new webrev for that. >> >> > > same. > > Plus please rename NodeIndexCheckClosure to G1NodeIndexCheckClosure (I > know that HeapRegion* does not have a G1 prefix either, but let's not > add to the issue). > > I believe this rename can be done without a re-review from me. > > Thanks, >   Thomas From kim.barrett at oracle.com Wed Nov 6 22:39:02 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 6 Nov 2019 17:39:02 -0500 Subject: RFR (M): 8189737: Make HeapRegion not derive from Space In-Reply-To: References: <4fdeb066-e9eb-c0e6-5fd2-5ec9a368bc23@oracle.com> <07ef2312-d974-be19-c887-828696a8493f@oracle.com> <1a3b60e0-f41d-c422-8fe6-d7d19925c148@oracle.com> <9381220a-b034-992a-011d-adcb9859cf45@oracle.com> Message-ID: > On Nov 6, 2019, at 2:21 PM, Thomas Schatzl wrote: > > Hi, > > On 06.11.19 12:03, Stefan Johansson wrote: >> Hi Thomas, > [...] >>>> >>>> I think this looks really good in general, so nice to get rid of this inheritance. >>>> >>>> I have one question/comment though. I took a look at the HeapRegion implementation in the SA and there it still extends CompactibleSpace (which extends Space), I think we should remove this and add _bottom and _top for HeapRegion in vmStructs_g1.hpp. This also has the effect that the PrintRegionClosure needs to be updated to not depend on space either. It is only used by G1 so I think it should simply be moved from shared to g1 and changed to not depend on space. >>> >>> Good catch. Changed in >>> >>> http://cr.openjdk.java.net/~tschatzl/8189737/webrev.0_to_1/ (diff) >>> http://cr.openjdk.java.net/~tschatzl/8189737/webrev.1 (full) >> Thanks for fixing this, looks good. I don't think we need to add _compaction_top, from what I can see it is not used. I don't need a new webrev for this if I'm right. > > Updated in place. > > Thanks, > Thomas Still looks good.
From kim.barrett at oracle.com Thu Nov 7 00:14:03 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 6 Nov 2019 19:14:03 -0500 Subject: RFR: 8232588: G1 concurrent System.gc can return early or late In-Reply-To: References: <8518409B-159B-48B2-97E4-D5D4C3B2BC5A@oracle.com> Message-ID: > On Nov 6, 2019, at 10:24 AM, Thomas Schatzl wrote: > > Hi, > > On 31.10.19 21:53, Kim Barrett wrote: >> A change is that waiting by a user-requested GC for a concurrent marking >> cycle to complete used to be performed with the thread transitioned to >> native and without safepoint checks on the associated monitor lock and wait. >> This was noted as having been cribbed from CMS. Coleen and I looked at this >> and could not come up with a reason for doing that for G1 (anymore, after >> the recent spate of locking improvements), so there's a new G1-specific >> monitor being used and the locking and waiting is now "normal". (This makes >> the FullGCCount_lock monitor largely CMS-specific.) > > I do not see a reason for the extra monitor (why it is better to have a separate monitor instead of reusing the existing one with almost the same name?), it does not seem to add information. It is still used in GenCollectedHeap/Serial, so can't be removed with CMS. The old monitor is _safepoint_check_never and uses no_safepoint_checking locking and waiting, while the new one is _safepoint_check_always and uses corresponding locking and waiting (which only checks for safepoints from Java threads). The latter is the proper configuration for G1's usage. I didn't want to change the other one and test and such, with CMS about to go away. And I think with CMS gone there are no longer any waiters for it and it becomes rather vestigial and can probably be eliminated. I was planning to file an RFE to explore that after this change and CMS removal have happened. > - Not sure, but maybe this logging should be moved to trace level?
In either case I would suggest to improve it to cover the try_collect call too, i.e. with appropriate "attempt"/"complete"/"discard" messages. Agreed on trace vs debug level. There are already "completion" messages. Maybe what you are asking for is more detail on those? > Otherwise the logging looks a bit incomplete to me, i.e. only attempts to initiate a concurrent collection cause messages. I understand that this might generate too many messages for gc+debug, that is the reason to potentially move to gc+trace. I'm intentionally only minimally touching the other cases in try_collect(). Or am I misunderstanding your comment? (There are logging messages for completion, waiting, and retries, in addition to the attempts to initiate messages.) Or maybe this is more about wanting additional detail in "completion" messages? > - feel free to ignore: in such log messages, if I add thread id for debugging purposes I tend to put the thread id value first so that if you have a log in front of you, and ask your viewer to highlight a particular id, all highlights are in the same column. > > I.e. > > .... stuff from unified logging] Message > > instead of having the thread-id somewhere located in the message like in > this case where it is somewhere in the middle. > > (I kind of also prefer the raw Thread::current() value instead of the pretty name, fwiw) I don't care that much how the thread is identified; I used the name as something human readable in a log file. The address of the Thread doesn't seem that useful though. (Or are you saying the logging system provides something for threads that I'm not aware of? Wouldn't be the first time I've overlooked some feature of logging.) I'll move it to the front of the log message though. > - a comment why we need a separate gc_counter_less_than method instead of a < comparison would be nice (I know, because of roll-over of these counts). Done.
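The roll-over concern behind gc_counter_less_than can be sketched as follows. This is only a minimal stand-alone illustration, assuming the GC counts are plain unsigned integers that wrap around; the function body is an assumption for the sake of the example and is not copied from the webrev.

```cpp
#include <climits>

// Hypothetical stand-alone sketch; the real gc_counter_less_than in the
// webrev may be written differently.
//
// The counters are treated as points on a circle of size UINT_MAX + 1.
// x counts as "less than" y when the wrapped difference x - y lands in the
// upper half of the unsigned range, i.e. y is ahead of x by less than half
// the range. Unsigned subtraction wraps by definition, so this stays well
// defined even after the counter has rolled over.
static bool gc_counter_less_than(unsigned int x, unsigned int y) {
  return (x - y) > (UINT_MAX / 2);
}
```

With a plain < comparison, a count of UINT_MAX would never compare as older than a count that has just wrapped to 0; with the wrapped difference it does, as long as the two values are less than half the range apart.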
> - pre-existing: > > g1VMOperations.cpp:124-127: maybe this could be changed to > > } else if (g1h->should_upgrade....) { > } > > Also please fix the indentation of the parameter list in the call to do_full_collection() in line 133. Done. Also fixed one or two other formatting nits. And added some more comments to the new VMOp's doit. But I wonder if there might be a pre-existing bug here. The call to should_upgrade was introduced as part of JDK-8211425. This changed the previous check to add policy()->force_upgrade_to_full(), which has a default false that is overridden by G1HeterogeneousHeapPolicy to return true for _manager->has_borrowed_regions(). From reading the RFR thread discussion, the point is that if we were to proceed without the full GC here we could apparently have a heap size that is greater than MaxHeapSize. But I don't see why we couldn't be in that situation and still be able to successfully perform the requested allocation and return. So I wonder if the should_upgrade should be called unconditionally, and then attempt the requested allocation (if any). New webrevs: full: https://cr.openjdk.java.net/~kbarrett/8232588/open.01/ incr: https://cr.openjdk.java.net/~kbarrett/8232588/open.01.inc/ From thomas.schatzl at oracle.com Thu Nov 7 09:58:28 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 7 Nov 2019 10:58:28 +0100 Subject: RFR: 8232588: G1 concurrent System.gc can return early or late In-Reply-To: References: <8518409B-159B-48B2-97E4-D5D4C3B2BC5A@oracle.com> Message-ID: Hi, On 07.11.19 01:14, Kim Barrett wrote: >> On Nov 6, 2019, at 10:24 AM, Thomas Schatzl wrote: >> >> Hi >> >> On 31.10.19 21:53, Kim Barrett wrote: >>> A change is that waiting by a user-requested GC for a concurrent marking >>> cycle to complete used to be performed with the thread transitioned to >>> native and without safepoint checks on the associated monitor lock and wait. >>> This was noted as having been cribbed from CMS. 
Coleen and I looked at this >>> and could not come up with a reason for doing that for G1 (anymore, after >>> the recent spate of locking improvements), so there's a new G1-specific >>> monitor being used and the locking and waiting is now "normal". (This makes >>> the FullGCCount_lock monitor largely CMS-specific.) >> >> I do not see a reason for the extra monitor (why it is better to have a separate monitor instead of reusing the existing one with almost the same name?), it does not seem to add information. It is still used in GenCollectedHeap/Serial, so can't be removed with CMS. > > The old monitor is _safepoint_check_never and uses > no_safepoint_checking locking and waiting, while the new one is > _safepoint_check_always and uses corresponding locking and waiting > (which only checks for safepoints from Java threads). The latter is > the proper configuration for G1's usage. I didn't want to change the > other one and test and such, with CMS about to go away. And I think > with CMS gone there are no longer any waiters for it and it becomes > rather vestigial and can probably be eliminated. I was planning to file > an RFE to explore that after this change and CMS removal have > happened. > Okay. >> - Not sure, but maybe this logging should be moved to trace level? In either case I would suggest to improve it to cover the try_collect call too, i.e. with appropriate "attempt"/"complete"/"discard" messages. > > Agreed on trace vs debug level. There are already "completion" messages. Maybe > what you are asking for is more detail on those? One case I am missing is a "discard" message in the case described in g1CollectedHeap.cpp:2266. > >> Otherwise the logging looks a bit incomplete to me, i.e. only attempts to initiate a concurrent collection cause messages. I understand that this might generate too many messages for gc+debug, that is the reason to potentially move to gc+trace. > > I'm intentionally only minimally touching the other cases in > try_collect().
Or am I misunderstanding your comment? (There are > logging messages for completion, waiting, and retries, in addition to > the attempts to initiate messages.) Or maybe this is more about > wanting additional detail in "completion" messages? In this case I was looking at the log messages only (on gc+debug/trace level), sorry. I found the other messages at gc+alloc=trace in the code. It would be nice to think about consolidating these messages a bit (in a separate CR) as they are somewhat different in style now (and use different log tags). While looking through the code, I found another minor issue I think: to schedule a standard evacuation pause (in line 2270+), wouldn't it be better to call do_collection_pause() instead of executing the VMOps directly? The existing code there seems to miss the check whether the VMOps gc_prologue succeeded. >> - feel free to ignore: in such log messages, if I add thread id for debugging purposes I tend to put the thread id value first so that if you have a log in front of you, and ask your viewer to highlight a particular id, all highlights are in the same column. >> >> I.e. >> >> .... stuff from unified logging] Message >> >> instead of having the thread-id somewhere located in the message like in >> this case where it is somewhere in the middle. >> >> (I kind of also prefer the raw Thread::current() value instead of the pretty name, fwiw) > > I don't care that much how the thread is identified; I used the name > as something human readable in a log file. The address of the Thread > doesn't seem that useful though. (Or are you saying the logging system > provides something for threads that I'm not aware of? Wouldn't be the > first time I've overlooked some feature of logging.) I'll move it to > the front of the log message though. Nah, it's fine to use the name, just some (obvious) rambling thought. And you're right with your suspicion, there is a decorator for the thread id...
> >> - a comment why we need a separate gc_counter_less_than method instead of a < comparison would be nice (I know, because of roll-over of these counts). > > Done. > >> - pre-existing: >> >> g1VMOperations.cpp:124-127: maybe this could be changed to >> >> } else if (g1h->should_upgrade....) { >> } >> >> Also please fix the indentation of the parameter list in the call to do_full_collection() in line 133. > > Done. Also fixed one or two other formatting nits. And added some > more comments to the new VMOp's doit. > > But I wonder if there might be a pre-existing bug here. The call to > should_upgrade was introduced as part of JDK-8211425. This changed the > previous check to add policy()->force_upgrade_to_full(), which has a > default false that is overridden by G1HeterogeneousHeapPolicy to > return true for _manager->has_borrowed_regions(). From reading the RFR > thread discussion, the point is that if we were to proceed without the > full GC here we could apparently have a heap size that is greater than > MaxHeapSize. > > But I don't see why we couldn't be in that situation and still be able > to successfully perform the requested allocation and return. > > So I wonder if the should_upgrade should be called unconditionally, > and then attempt the requested allocation (if any). > I agree here. The should_upgrade_to_full() check should be first. > New webrevs: > full: https://cr.openjdk.java.net/~kbarrett/8232588/open.01/ > incr: https://cr.openjdk.java.net/~kbarrett/8232588/open.01.inc/ > > Looks good (obviously missing the items discussed here). 
Thanks, Thomas From thomas.schatzl at oracle.com Thu Nov 7 10:02:21 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 7 Nov 2019 11:02:21 +0100 Subject: RFR (M): 8189737: Make HeapRegion not derive from Space In-Reply-To: References: <4fdeb066-e9eb-c0e6-5fd2-5ec9a368bc23@oracle.com> <07ef2312-d974-be19-c887-828696a8493f@oracle.com> <1a3b60e0-f41d-c422-8fe6-d7d19925c148@oracle.com> <9381220a-b034-992a-011d-adcb9859cf45@oracle.com> Message-ID: Hi, On 06.11.19 23:39, Kim Barrett wrote: >> On Nov 6, 2019, at 2:21 PM, Thomas Schatzl wrote: >> >> Hi, >> >> On 06.11.19 12:03, Stefan Johansson wrote: >>> Hi Thomas, >> [...] >>>>> >>>>> I think this looks really good in general, so nice to get rid of this inheritance. >>>>> >>>>> I have one question/comment though. I took a look at the HeapRegion implementation in the SA and there it still extends CompactibleSpace (which extends Space), I think we should remove this and add _bottom and _top for HeapRegion in vmStructs_g1.hpp. This also has the effect that the PrintRegionClosure needs to be updated to not depend on space either. It is only used by G1 so I think it should simply be moved from shared to g1 and changed to not depend on space. >>>> >>>> Good catch. Changed in >>>> >>>> http://cr.openjdk.java.net/~tschatzl/8189737/webrev.0_to_1/ (diff) >>>> http://cr.openjdk.java.net/~tschatzl/8189737/webrev.1 (full) >>> Thanks for fixing this, looks good. I don't think we need to add _compaction_top, from what I can see it is not used. I don't need a new webrev for this if I'm right. >> >> Updated in place. >> >> Thanks, >> Thomas > > Still looks good. > thanks Kim and Stefan for your reviews. Thomas From zgu at redhat.com Thu Nov 7 14:33:00 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 7 Nov 2019 09:33:00 -0500 Subject: RFR(T) 8233796: Shenandoah is broken after 8233708 Message-ID: <32fa496f-6ad9-5f07-2f4f-ad3cdb538a88@redhat.com> Please review this trivial patch to unbreak Shenandoah build. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8233796 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233796/webrev.00/ Thanks, -Zhengyu From rkennke at redhat.com Thu Nov 7 14:39:54 2019 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 7 Nov 2019 15:39:54 +0100 Subject: RFR(T) 8233796: Shenandoah is broken after 8233708 In-Reply-To: <32fa496f-6ad9-5f07-2f4f-ad3cdb538a88@redhat.com> References: <32fa496f-6ad9-5f07-2f4f-ad3cdb538a88@redhat.com> Message-ID: <8593ac07-5c43-f8be-489e-3e5e697cb179@redhat.com> Ok. Roman > Please review this trivial patch to unbreak Shenandoah build. > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8233796 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233796/webrev.00/ > > > Thanks, > > -Zhengyu > From zgu at redhat.com Thu Nov 7 14:55:13 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 7 Nov 2019 09:55:13 -0500 Subject: RFR(XS) 8233337: Shenandoah: Cleanup AArch64 SBSA::load_reference_barrier_not_null() Message-ID: <5de422c4-ea76-81b0-8413-d3e81f60a09d@redhat.com> Please review this cleanup patch suggested by Andrew Haley. Please see [1] for details Bug: https://bugs.openjdk.java.net/browse/JDK-8233337 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233337/webrev.00/ Test: hotspot_gc_shenandoah (fastdebug and release) on AArch64 Linux Thanks, -Zhengyu [1] https://mail.openjdk.java.net/pipermail/shenandoah-dev/2019-October/010976.html From erik.osterlund at oracle.com Thu Nov 7 15:19:58 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 7 Nov 2019 16:19:58 +0100 Subject: RFR: 8233797: ZGC: Unify naming convention for functions using atomics Message-ID: Hi, Functions in ZGC that use atomics sometimes have an _atomic postfix in the name, and sometimes not. This enhancement is about unifying that. The proposal is to remove the _atomic postfix in situations where there is no non-atomic counterpart. This patch applies on top of 8233061. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8233797 Webrev: http://cr.openjdk.java.net/~eosterlund/8233797/webrev.00/ Thanks, /Erik From rkennke at redhat.com Thu Nov 7 15:37:08 2019 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 7 Nov 2019 16:37:08 +0100 Subject: [aarch64-port-dev ] RFR(XS) 8233337: Shenandoah: Cleanup AArch64 SBSA::load_reference_barrier_not_null() In-Reply-To: <5de422c4-ea76-81b0-8413-d3e81f60a09d@redhat.com> References: <5de422c4-ea76-81b0-8413-d3e81f60a09d@redhat.com> Message-ID: Looks good, thanks! Roman > Please review this cleanup patch suggested by Andrew Haley. Please see > [1] for details > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8233337 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233337/webrev.00/ > > Test: >   hotspot_gc_shenandoah (fastdebug and release) >   on AArch64 Linux > > Thanks, > > -Zhengyu > > > > > [1] > https://mail.openjdk.java.net/pipermail/shenandoah-dev/2019-October/010976.html > > From per.liden at oracle.com Thu Nov 7 16:19:05 2019 From: per.liden at oracle.com (Per Liden) Date: Thu, 7 Nov 2019 17:19:05 +0100 Subject: RFR: 8233797: ZGC: Unify naming convention for functions using atomics In-Reply-To: References: Message-ID: Looks good! /Per On 11/7/19 4:19 PM, Erik Österlund wrote: > Hi, > > Functions in ZGC that use atomics sometimes have an _atomic postfix in > the name, and sometimes not. This enhancement is about unifying that. > The proposal is to remove the _atomic postfix in situations where there > is no non-atomic counterpart. > > This patch applies on top of 8233061.
> > Bug: > https://bugs.openjdk.java.net/browse/JDK-8233797 > > Webrev: > http://cr.openjdk.java.net/~eosterlund/8233797/webrev.00/ > > Thanks, > /Erik From erik.osterlund at oracle.com Thu Nov 7 17:55:53 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 7 Nov 2019 18:55:53 +0100 Subject: RFR: 8233797: ZGC: Unify naming convention for functions using atomics In-Reply-To: References: Message-ID: Hi Per, Thanks for the review! /Erik On 2019-11-07 17:19, Per Liden wrote: > Looks good! > > /Per > > On 11/7/19 4:19 PM, Erik Österlund wrote: >> Hi, >> >> Functions in ZGC that use atomics sometimes have an _atomic postfix >> in the name, and sometimes not. This enhancement is about unifying >> that. The proposal is to remove the _atomic postfix in situations >> where there is no non-atomic counterpart. >> >> This patch applies on top of 8233061. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8233797 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8233797/webrev.00/ >> >> Thanks, >> /Erik From zgu at redhat.com Thu Nov 7 19:01:42 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 7 Nov 2019 14:01:42 -0500 Subject: [aarch64-port-dev ] RFR 8233339: Shenandoah: Centralize load barrier decisions into ShenandoahBarrierSet In-Reply-To: References: <6ef89df6-84db-0ffe-d1fc-7ffde7e622bf@redhat.com> <4ed90469-8689-b49d-69f1-98f644e9edd0@redhat.com> <0fb9cd70-0a89-8c14-7469-55205c4c3808@redhat.com> Message-ID: <9f4e51fd-dd2c-1f74-e695-51923c75a52a@redhat.com> > > Filed: https://bugs.openjdk.java.net/browse/JDK-8233401 Rebased on top of JDK-8233401 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233339/webrev.02/index.html Thanks, -Zhengyu > > Matter of fact, I would like to hold off this code review, till refactor > is done.
> > Thanks, > > -Zhengyu > >> >> *) shenandoahBarrierSetAssembler_x86.cpp, I believe it would be more >> straightforward to save >> branching on local variable "need_load_reference_barrier" by spelling >> out the "disabled" path >> directly (in fact, I think you are almost there in >> shenandoahBarrierSetC1.cpp!): >> >>    if (!ShenandoahBarrierSet::need_load_reference_barrier(decorators, >> type)) { >>      BarrierSetAssembler::load_at(masm, decorators, type, dst, src, >> tmp1, tmp_thread); >>      return; >>    } >> >>    ... code that assumes need_load_reference_barrier = true follows ... >> >>    Register result_dst = dst; >>    bool use_tmp1_for_dst = false; >> >> *) shenandoahBarrierSetC1.cpp: local variable >> "need_load_reference_barrier" is not needed, there is >> only a single use >> >> *) shenandoahBarrierSetC2.cpp: this block should go all the way up: >> >>   557   if >> (!ShenandoahBarrierSet::need_load_reference_barrier(decorators, type)) { >>   558     return load; >>   559   } >> >> *) shenandoahBarrierSet.cpp: this is just "return >> is_reference_type(type)". Saves some inversions. >> >>    78   if (!is_reference_type(type)) return false; >>    79   return true; >> >> *) shenandoahBarrierSet.cpp: should be "Should be subset of LRB": >> >>    83   assert(need_load_reference_barrier(decorators, type), "Why >> ask?"); >> >> *) shenandoahBarrierSet.cpp: seems like this assert is subsumed by the >> previous one? >> >>     84
assert(is_reference_type(type), "Why we here?"); >> >> From rkennke at redhat.com Thu Nov 7 19:41:00 2019 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 7 Nov 2019 20:41:00 +0100 Subject: [aarch64-port-dev ] RFR 8233339: Shenandoah: Centralize load barrier decisions into ShenandoahBarrierSet In-Reply-To: <9f4e51fd-dd2c-1f74-e695-51923c75a52a@redhat.com> References: <6ef89df6-84db-0ffe-d1fc-7ffde7e622bf@redhat.com> <4ed90469-8689-b49d-69f1-98f644e9edd0@redhat.com> <0fb9cd70-0a89-8c14-7469-55205c4c3808@redhat.com> <9f4e51fd-dd2c-1f74-e695-51923c75a52a@redhat.com> Message-ID: <4b24e6ac-2109-4a7d-83aa-c2427343e22b@redhat.com> That looks good to me. Thanks, Roman >> >> Filed: https://bugs.openjdk.java.net/browse/JDK-8233401 > > Rebased on top of JDK-8233401 > > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233339/webrev.02/index.html > > Thanks, > > -Zhengyu > > >> >> Matter of fact, I would like to hold off this code review, till >> refactor is done. >> >> Thanks, >> >> -Zhengyu >> >>> >>> *) shenandoahBarrierSetAssembler_x86.cpp, I believe it would be more >>> straightforward to save >>> branching on local variable "need_load_reference_barrier" by spelling >>> out the "disabled" path >>> directly (in fact, I think you are almost there in >>> shenandoahBarrierSetC1.cpp!): >>> >>>    if (!ShenandoahBarrierSet::need_load_reference_barrier(decorators, >>> type)) { >>>      BarrierSetAssembler::load_at(masm, decorators, type, dst, src, >>> tmp1, tmp_thread); >>>      return; >>>    } >>> >>>    ... code that assumes need_load_reference_barrier = true follows ... >>> >>>    Register result_dst = dst; >>>    bool use_tmp1_for_dst = false; >>> >>> *) shenandoahBarrierSetC1.cpp: local variable >>> "need_load_reference_barrier" is not needed, there is >>> only a single use >>> >>> *) shenandoahBarrierSetC2.cpp: this block should go all the way up: >>> >>>   557   if >>> (!ShenandoahBarrierSet::need_load_reference_barrier(decorators, type)) { >>>   558
return load; >>>   559   } >>> >>> *) shenandoahBarrierSet.cpp: this is just "return >>> is_reference_type(type)". Saves some inversions. >>> >>>    78   if (!is_reference_type(type)) return false; >>>    79   return true; >>> >>> *) shenandoahBarrierSet.cpp: should be "Should be subset of LRB": >>> >>>    83   assert(need_load_reference_barrier(decorators, type), "Why >>> ask?"); >>> >>> *) shenandoahBarrierSet.cpp: seems like this assert is subsumed by >>> the previous one? >>> >>>     84   assert(is_reference_type(type), "Why we here?"); >>> >>> From zgu at redhat.com Thu Nov 7 19:42:27 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 7 Nov 2019 14:42:27 -0500 Subject: [aarch64-port-dev ] RFR 8233339: Shenandoah: Centralize load barrier decisions into ShenandoahBarrierSet In-Reply-To: <4b24e6ac-2109-4a7d-83aa-c2427343e22b@redhat.com> References: <6ef89df6-84db-0ffe-d1fc-7ffde7e622bf@redhat.com> <4ed90469-8689-b49d-69f1-98f644e9edd0@redhat.com> <0fb9cd70-0a89-8c14-7469-55205c4c3808@redhat.com> <9f4e51fd-dd2c-1f74-e695-51923c75a52a@redhat.com> <4b24e6ac-2109-4a7d-83aa-c2427343e22b@redhat.com> Message-ID: <31415213-3464-619a-0741-ca14f7b9cbcf@redhat.com> Thanks for the review, Roman -Zhengyu On 11/7/19 2:41 PM, Roman Kennke wrote: > That looks good to me. > > Thanks, > Roman > >>> >>> Filed: https://bugs.openjdk.java.net/browse/JDK-8233401 >> >> Rebased on top of JDK-8233401 >> >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233339/webrev.02/index.html >> >> Thanks, >> >> -Zhengyu >> >> >>> >>> Matter of fact, I would like to hold off this code review, till >>> refactor is done. >>> >>> Thanks, >>> >>> -Zhengyu >>> >>>> >>>> *) shenandoahBarrierSetAssembler_x86.cpp, I believe it would be more >>>> straightforward to save >>>> branching on local variable "need_load_reference_barrier" by spelling >>>> out the "disabled" path >>>> directly (in fact, I think you are almost there in >>>> shenandoahBarrierSetC1.cpp!): >>>> >>>>
if (!ShenandoahBarrierSet::need_load_reference_barrier(decorators, >>>> type)) { >>>>      BarrierSetAssembler::load_at(masm, decorators, type, dst, src, >>>> tmp1, tmp_thread); >>>>      return; >>>>    } >>>> >>>>    ... code that assumes need_load_reference_barrier = true follows ... >>>> >>>>    Register result_dst = dst; >>>>    bool use_tmp1_for_dst = false; >>>> >>>> *) shenandoahBarrierSetC1.cpp: local variable >>>> "need_load_reference_barrier" is not needed, there is >>>> only a single use >>>> >>>> *) shenandoahBarrierSetC2.cpp: this block should go all the way up: >>>> >>>>   557   if >>>> (!ShenandoahBarrierSet::need_load_reference_barrier(decorators, type)) { >>>>   558     return load; >>>>   559   } >>>> >>>> *) shenandoahBarrierSet.cpp: this is just "return >>>> is_reference_type(type)". Saves some inversions. >>>> >>>>    78   if (!is_reference_type(type)) return false; >>>>    79   return true; >>>> >>>> *) shenandoahBarrierSet.cpp: should be "Should be subset of LRB": >>>> >>>>    83   assert(need_load_reference_barrier(decorators, type), "Why >>>> ask?"); >>>> >>>> *) shenandoahBarrierSet.cpp: seems like this assert is subsumed by >>>> the previous one? >>>> >>>>     84
assert(is_reference_type(type), "Why we here?"); >>>> >>>> > From mark.reinhold at oracle.com Thu Nov 7 20:00:39 2019 From: mark.reinhold at oracle.com (mark.reinhold at oracle.com) Date: Thu, 7 Nov 2019 12:00:39 -0800 (PST) Subject: New candidate JEP: 366: Deprecate the ParallelScavenge + SerialOld GC Combination Message-ID: <20191107200039.E54A130D3D7@eggemoggin.niobe.net> https://openjdk.java.net/jeps/366 - Mark From kim.barrett at oracle.com Thu Nov 7 21:02:37 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 7 Nov 2019 16:02:37 -0500 Subject: RFR: 8232588: G1 concurrent System.gc can return early or late In-Reply-To: References: <8518409B-159B-48B2-97E4-D5D4C3B2BC5A@oracle.com> Message-ID: <3B101D78-ED8C-4810-B711-A41AE0CC11C6@oracle.com> > On Nov 7, 2019, at 4:58 AM, Thomas Schatzl wrote: > On 07.11.19 01:14, Kim Barrett wrote: >>> On Nov 6, 2019, at 10:24 AM, Thomas Schatzl wrote: >>> >>> On 31.10.19 21:53, Kim Barrett wrote: > >>> - Not sure, but maybe this logging should be moved to trace level? In either case I would suggest to improve it to cover the try_collect call too, i.e. with appropriate "attempt"/"complete"/"discard" messages. >> Agreed on trace vs debug level. There are already "completion" messages. Maybe >> what you are asking for is more detail on those? > > One case I am missing is a "discard" message in the case described in g1CollectedHeap.cpp:2266. That's one of the parts that I'm minimally touching as out of scope for this change. New RFE: https://bugs.openjdk.java.net/browse/JDK-8233821 >>> Otherwise the logging looks a bit incomplete to me, i.e. only attempts to initiate a concurrent collection cause messages. I understand that this might generate too many messages for gc+debug, that is the reason to potentially move to gc+trace. >> I'm intentionally only minimally touching the other cases in >> try_collect(). Or am I misunderstanding your comment?
(There are >> logging messages for completion, waiting, and retries, in addition to >> the attempts to initiate messages.) Or maybe this is more about >> wanting additional detail in ?completion? messages? > > In this case I was looking at the log messages only (on gc+debug/trace level), sorry. I found the other messages at gc+alloc=trace in the code. > > It would be nice to think about consolidating these messages a bit (in a separate CR) as they are somewhat different in style now (and use different log tags). Can I leave it to you to file any needed RFEs for this? I think you have a clearer idea of what you want. > While looking through the code, I found another minor issue I think: to schedule a standard evacuation pause (in line 2270+, wouldn't it be better to call do_collection_pause() instead of executing the VMOps directly? > > The existing code there seems to miss the check whether the VMOps gc_prologue succeeded. I don't understand the prologue_suceeded part of do_collection_pause. If the prologue fails then the doit won't even be run so won't update succeeded from it's initial false value. The code at line 2270 would be very slightly simpler if using do_collection_pause rather than directly using the VMOp. Though I wonder if it might be better to have another VMOp that doesn't have a word-size and doesn't conditionally do an allocation. >>> - feel free to ignore: in such log messages, if I add thread id for debugging purposes I tend to put the thread id value first so that if you have a log in front of you, and ask your viewer to highlight a particular id, all highlights are in the same column. >>> >>> I.e. >>> >>> .... stuff from unified logging] Message >>> >>> instead of having the thread-id somewhere located in the message like in >>> this case where it is somewhere in the middle. 
>>> >>> (I kind of also prefer the raw Thread::current() value instead of the pretty name, fwiw) >> I don't care that much how the thread is identified; I used the name >> as something human readable in a log file. The address of the Thread >> doesn't seem that useful though. (Or are you saying the logging system >> provides something for threads that I'm not aware of? Wouldn't be the >> first time I've overlooked some feature of logging.) I'll move it to >> the front of the log message though. > > Nah, it's fine to use the name, just some (obvious) rambling thought. > > And you're right with your suspicion, there is a decorattor for the thread id? Yes, I found it. But it seems like log decorators are hard to use in a local fashion. At least, I didn't figure out a way to do that. >> But I wonder if there might be a pre-existing bug here. The call to >> should_upgrade was introduced as part of JDK-8211425. This changed the >> previous check to add policy()->force_upgrade_to_full(), which has a >> default false that is overridden by G1HeterogeneousHeapPolicy to >> return true for _manager->has_borrowed_regions(). From reading the RFR >> thread discussion, the point is that if we were to proceed without the >> full GC here we could apparently have a heap size that is greater than >> MaxHeapSize. >> But I don't see why we couldn't be in that situation and still be able >> to successfully perform the requested allocation and return. >> So I wonder if the should_upgrade should be called unconditionally, >> and then attempt the requested allocation (if any). > > I agree here. The should_upgrade_to_full() check should be first. https://bugs.openjdk.java.net/browse/JDK-8233822 >> New webrevs: >> full: https://cr.openjdk.java.net/~kbarrett/8232588/open.01/ >> incr: https://cr.openjdk.java.net/~kbarrett/8232588/open.01.inc/ > > Looks good (obviously missing the items discussed here). Thanks. 
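
The suggestion earlier in this message, to put the thread identifier first in a log message so that a viewer highlights it in the same column on every line, can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `thread_first_message` is a hypothetical helper, it is not part of HotSpot's unified logging, and the bracketed layout is an assumed format rather than what the patch emits.

```cpp
#include <string>

// Hypothetical helper (not a HotSpot API): builds a log message body with
// the thread identifier first, followed by the free-form text. Because the
// identifier always starts the body, it lines up in one column when the
// fixed-width logging decorations precede it.
std::string thread_first_message(const std::string& thread_name,
                                 const std::string& text) {
  return "[" + thread_name + "] " + text;
}
```

With this shape, two messages from the same thread start with the same prefix, which is the property the review comment asks for.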

From per.liden at oracle.com Fri Nov 8 13:38:01 2019
From: per.liden at oracle.com (Per Liden)
Date: Fri, 8 Nov 2019 14:38:01 +0100
Subject: RFR: 8233061: ZGC: Enforce memory ordering in segmented bit maps
In-Reply-To: <311b863b-e2dc-c56c-7115-d13afb7c4f4b@oracle.com>
References: <311b863b-e2dc-c56c-7115-d13afb7c4f4b@oracle.com>
Message-ID:

On 10/28/19 4:53 PM, Erik Österlund wrote:
> Hi,
>
> In ZGC, bitmaps are lazily cleared in a segmented fashion. In this
> scheme, liveness is determined by looking at a counter, a segment bit
> map and finally the flat bit map structure. The accesses for the various
> stages need to be ordered properly. This patch sprinkles some
> OrderAccess calls to enforce this ordering.
>
> Out of curiosity, I disassembled libjvm.so with and without this patch
> to see if the reordering has bitten us in practice on x86_64.
> Fortunately, according to my analysis, it has not; we seem to have been
> lucky. But there is a lot of machine code, so I could have missed
> something. However, given that we now have an AArch64 port which is
> definitely affected by this problem, and compilers really are free to do
> whatever they want to in the future, it seems in order to enforce this
> explicitly.
>
> This patch depends on https://bugs.openjdk.java.net/browse/JDK-8233073
> which exposes some memory ordering aware getters on BitMap. I did not
> want to just wrap the existing API in ZGC, so I split that out to a
> separate RFE.
>
> CR:
> http://cr.openjdk.java.net/~eosterlund/8233061/webrev.00/

The rebased webrev.01 looks good.
/Per > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8233061 > > Thanks, > /Erik From erik.osterlund at oracle.com Fri Nov 8 14:51:15 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Fri, 8 Nov 2019 15:51:15 +0100 Subject: RFR: 8233061: ZGC: Enforce memory ordering in segmented bit maps In-Reply-To: References: <311b863b-e2dc-c56c-7115-d13afb7c4f4b@oracle.com> Message-ID: Hi Per, Thanks for the review. /Erik On 11/8/19 2:38 PM, Per Liden wrote: > > On 10/28/19 4:53 PM, Erik ?sterlund wrote: >> Hi, >> >> In ZGC, bitmaps are lazily cleared in a segmented fashion. In this >> scheme, liveness is determined by looking at a counter, a segment bit >> map and finally the flat bit map structure. The accesses for the >> various stages need to be ordered properly. This patch sprinkles some >> OrderAccess calls to enforce this ordering. >> >> Out of curiosity, I disassembled libjvm.so with and without this >> patch to see if the reordering has bitten us in practice on x86_64. >> Fortunately, according to my analysis, it has not; we seem to have >> been lucky. But there is a lot of machine code, so I could have >> missed something. However, given that we now have an AArch64 port >> which is definitely affected by this problem, and compilers really >> are free to do whatever they want to in the future, it seems in order >> to enforce this explicitly. >> >> This patch depends on >> https://bugs.openjdk.java.net/browse/JDK-8233073 which exposes some >> memory ordering aware getters on BitMap. I did not want to just wrap >> the existing API in ZGC, so I split that out to a separate RFE. >> >> CR: >> http://cr.openjdk.java.net/~eosterlund/8233061/webrev.00/ > > The rebased webrev.01 looks good. 
>
> /Per
>
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8233061
>>
>> Thanks,
>> /Erik

From zgu at redhat.com Fri Nov 8 16:38:44 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 8 Nov 2019 11:38:44 -0500
Subject: RFR 8233850: Shenandoah: Shenandoah thread count ergonomics should be container aware
Message-ID: <2b9e8e24-d4bf-9e60-aa59-1b0daf98091a@redhat.com>

Please review this small patch that uses container-aware
os::initial_active_processor_count() API for calculating worker thread
count ergonomics.

Bug: https://bugs.openjdk.java.net/browse/JDK-8233850
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233850/webrev.00/

Test:
  hotspot_gc_shenandoah (fastdebug and release)

Thanks,

-Zhengyu

From rkennke at redhat.com Fri Nov 8 16:40:15 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 8 Nov 2019 17:40:15 +0100
Subject: RFR 8233850: Shenandoah: Shenandoah thread count ergonomics should be container aware
In-Reply-To: <2b9e8e24-d4bf-9e60-aa59-1b0daf98091a@redhat.com>
References: <2b9e8e24-d4bf-9e60-aa59-1b0daf98091a@redhat.com>
Message-ID: <46de5f98-5bab-73e9-c0de-f66f819bd2b3@redhat.com>

Looks good to me. Thanks!

Roman

> Please review this small patch that uses container-aware
> os::initial_active_processor_count() API for calculating worker thread
> count ergonomics.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8233850
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8233850/webrev.00/
>
> Test:
>   hotspot_gc_shenandoah (fastdebug and release)
>
>
> Thanks,
>
> -Zhengyu
>

From linuxhippy at gmail.com Sat Nov 9 14:54:27 2019
From: linuxhippy at gmail.com (Clemens Eisserer)
Date: Sat, 9 Nov 2019 15:54:27 +0100
Subject: State of G1's "throughput barriers"?
Message-ID:

Hi,

With great excitement I read about the proposal to add a
throughput-mode to G1 - has there been any progress on this?
In some cases the throughput overhead G1 introduced compared to CMS is
quite noticeable, especially for the case where 2-5s pauses are quite
tolerable - I really hoped to get an option to trade a bit of latency
for better throughput.

Thanks, Clemens

From stefan.johansson at oracle.com Mon Nov 11 15:08:10 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Mon, 11 Nov 2019 16:08:10 +0100
Subject: RFR: 8220465: Use shadow regions for faster ParallelGC full GCs
In-Reply-To:
References: <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com> <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com> <400df998-171a-5bbe-9f3e-01af1781afb4@oracle.com> <955C4FA4-FC18-446C-851A-0D04A916D88D@oracle.com>
Message-ID: <4fce596c-a4eb-5da6-1b47-b4e2314de3c5@oracle.com>

Hi Haoyu,

Thanks for the updated patches, I think they look good in general, just
one comment inline below.

Here are the updated webrevs:
Full: http://cr.openjdk.java.net/~sjohanss/8220465/02
Inc: http://cr.openjdk.java.net/~sjohanss/8220465/01-02

On 2019-11-06 08:17, Haoyu Li wrote:
> Hi Stefan,
>
> Sorry for the late update. I have attached both a full patch
> (shadow-region-v3.patch) and the incremental changes
> (shadow-region-incr.patch) in this mail, and details are as follows.
>
> Regarding the current patch, I think that it looks good in general,
> but I thought a bit more around how to share stuff between the
> closures and I agree that adding those extra virtual functions
> doesn't really feel worth it. I'm wondering if a solution where we
> revert back to letting destination be the "real destination" (not
> ever pointing to the shadow region) and add a copy_destination which
> is destination + offset. To make this work the normal
> MoveAndUpdateClosure would also have an offset, but it would always
> be 0. If do_addr() is then updated to use the copy_destination() in
> some places we might end up with something pretty nice, but maybe
> I'm missing something.
>
>
> It is an excellent idea to let MoveAndUpdateClosure have an _offset
> equal to 0, so ShadowClosure can reuse more code from it. I have made
> the above changes in the new patch.

Yes, using this approach looks very nice.

>
> I also realized that the current patch will trigger an assert
> because destination is expected not to be the shadow address:
> # Internal Error
> (open/src/hotspot/share/gc/parallel/psParallelCompact.cpp:3045),
> pid=12649, tid=12728
> # assert(src_cp->destination() == destination) failed: first live
> obj in the space must match the destination
>
> So this also suggests that we should keep destination() returning
> the real destination.
>
> Some other comments:
> src/hotspot/share/gc/parallel/psParallelCompact.cpp
>
> 3383 void ShadowClosure::complete_region(ParCompactionManager *cm,
> HeapWord *dest_addr,
> 3384
> PSParallelCompact::RegionData *region_ptr) {
> 3385   assert(region_ptr->shadow_state() ==
> ParallelCompactData::RegionData::FINISH, "Region should be finished");
>
> This assertion will also trigger when running with a debug build and
> at this point the shadow state should be SHADOW not FINISH.
>
>
>
> Sorry for these buggy assertions. The shadow_state in
> ShadowClosure::complete_region should be SHADOW instead of FINISH, and
> I've corrected it. Moreover, while I was testing it in the debug mode, I
> found another interesting case, in which a region should return to the
> normal path if it becomes available before invoking fill_shadow_region
> (the branch that shadow_region == 0 at psParallelCompact.cpp:3182).
> Therefore, I add a new function
> ParallelCompactData::RegionData::mark_normal() to handle this special
> case, so the assertion in MoveAndUpdateClosure::complete_region will
> succeed.

Nice, I think it would make sense to use cmpxchg in mark_normal() as
well and assert that the returned value is SHADOW.

Thanks,
Stefan

>
> src/hotspot/share/gc/parallel/psParallelCompact.hpp
>
>  632 inline bool ParallelCompactData::RegionData::mark_filled() {
>  633   return Atomic::cmpxchg(FILLED, &_shadow_state, SHADOW) ==
> SHADOW;
>  634 }
>
> Since we never check the return value here we should make it void
> and maybe instead add an assert that the return value is SHADOW.
>
>
>
> Thanks for the suggestion. I have changed mark_filled() to void.
>
> I really appreciate your reviews. If there are any issues in the patch,
> please let me know at any time. Thanks again!
> Best Regards,
> Haoyu Li
>
> Stefan Johansson wrote on 2019-10-29 at 3:03:
>
> Hi Haoyu,
>
> I've looked through the patch in detail now and created a new webrev at:
> http://cr.openjdk.java.net/~sjohanss/8220465/01/
>
> I took the liberty of removing the removal of move_and_update from
> your patch since I'm addressing that separately in JDK-8233065. The
> webrev above is still based on that removal, but I expect that to be
> pushed tomorrow or Wednesday so that should be fine.
>
> I also changed the subject to make it more clear that this is now a
> review of:
> https://bugs.openjdk.java.net/browse/JDK-8220465
>
> Regarding the current patch, I think that it looks good in general,
> but I thought a bit more around how to share stuff between the
> closures and I agree that adding those extra virtual functions
> doesn't really feel worth it. I'm wondering if a solution where we
> revert back to letting destination be the "real destination" (not
> ever pointing to the shadow region) and add a copy_destination which
> is destination + offset. To make this work the normal
> MoveAndUpdateClosure would also have an offset, but it would always
> be 0. If do_addr() is then updated to use the copy_destination() in
> some places we might end up with something pretty nice, but maybe
> I'm missing something.
>
> I also realized that the current patch will trigger an assert
> because destination is expected not to be the shadow address:
> #
Internal Error > (open/src/hotspot/share/gc/parallel/psParallelCompact.cpp:3045), > pid=12649, tid=12728 > #? assert(src_cp->destination() == destination) failed: first live > obj in the space must match the destination > > So this also suggests that we should keep destination() returning > the real destination. > > Some other comments: > src/hotspot/share/gc/parallel/psParallelCompact.cpp > ? > 3383 void ShadowClosure::complete_region(ParCompactionManager *cm, > HeapWord *dest_addr, > 3384 > ?PSParallelCompact::RegionData *region_ptr) { > 3385? ?assert(region_ptr->shadow_state() == > ParallelCompactData::RegionData::FINISH, "Region should be finished?); > > This assertion will also trigger when running with a debug build and > at this point the shadow state should be SHADOW not FINISH. > ? > > src/hotspot/share/gc/parallel/psParallelCompact.hpp > ? > ?632 inline bool ParallelCompactData::RegionData::mark_filled() { > ?633? ?return Atomic::cmpxchg(FILLED, &_shadow_state, SHADOW) == > SHADOW; > ?634 } > > Since we never check the return value here we should make it void > and maybe instead add an assert that the return value is SHADOW. > ? > > When you addressed these comments, would it be possible to include > both the full patch and and the incremental changes from the current > version. That makes it easier for the reviewers to see what changed > between version of the patch. > > Thanks, > Stefan > > > 24 okt. 2019 kl. 14:16 skrev Stefan Johansson > >: > > > > Hi Haoyu, > > > > On 2019-10-23 17:15, Haoyu Li wrote: > >> Hi Stefan, > >> Thanks for your constructive feedback. I've addressed all the > issues you mentioned, and the updated patch is attached in this email. > > Nice, I will look at the patch next week, but I'll shortly answer > your questions right away. 
> > > >> During refining the patch, I have a couple of questions: > >> 1) Now the MoveAndUpdateClosure and ShadowClosure assume the > destination address is the very beginning of a region, instead of an > arbitrary address like what it used to be. However, there is an > unused function named PSParallelCompact::move_and_update() uses the > MoveAndUpdateClosure to process a region from its middle, which > conflicts with the assumption. I notice that you removed this > function in your patch, and so did I in the updated patch. Does it > matter? > > Yes, I found this function during my code review and it should be > removed, but I think that should be handled as a separate issue. We > can do this removal before this patch goes in. > > > >> 2) Using the same do_addr() in MoveAndUpdateClosure and > ShadowClosure is doable, but it does not reuse all the code neatly. > Because storing the address of the shadow region in _destination > requires extra virtual functions to handle allocating blocks in the > start_array and setting addresses of deferred objects. In > particular, allocate_blocks() and set_deferred_object_for() in both > closures are added. Is it worth avoiding to use _offset to calculate > the shadow_destination? > > Ok, sounds like it might be better to have specific do_addr() > functions then. I'll think some more around this when reviewing the > new patch in depth. > > > >> If there are any problems with this patch, please contact me > anytime. I'm more than happy to keep improving the code. Thanks > again for reviewing. > >> > > Sound good, thanks, > > Stefan > From thomas.schatzl at oracle.com Mon Nov 11 15:44:27 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 11 Nov 2019 16:44:27 +0100 Subject: RFR (XS): 8233792: TestG1ParallelPhases.java fails with phase NonYoungFreeCSet not found (2) Message-ID: Hi all, can I have reviews for this small test fix to avoid this test failing sometimes? 

The test tries to force mixed gcs to wait for a particular JFR event
only sent at that time.
The way it forces those is a bit wrong: in some cases it may happen that
at the time it starts a concurrent marking, another one just finished
without being able to clean out the old gen. This means that the initial
mark for this forced marking will be upgraded to a full gc (which also
ends that marking), and the following forced young collections are
young-only gcs only.

In total, no mixed gcs happen in that case, and so that JFR event is
never sent.

The fix is to make sure that before forcing mixed gc (which works) we
force the heap into a state where the upgrade to full gc may not happen
- by forcing a full gc.

Without the patch the test fails about 6 times in 4000 runs; with the
change it does not fail in 3k runs.

CR:
https://bugs.openjdk.java.net/browse/JDK-8233792
Webrev:
http://cr.openjdk.java.net/~tschatzl/8233792/webrev/
Testing:
see above.

Thanks,
  Thomas

From leo.korinth at oracle.com Mon Nov 11 16:45:43 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Mon, 11 Nov 2019 17:45:43 +0100
Subject: RFR (XS): 8233792: TestG1ParallelPhases.java fails with phase NonYoungFreeCSet not found (2)
In-Reply-To:
References:
Message-ID: <6d4a1476-51c2-094c-859c-0e8719a4fd6d@oracle.com>

On 11/11/2019 16:44, Thomas Schatzl wrote:
> Hi all,
>
>   can I have reviews for this small test fix to avoid this test failing
> sometimes?
>
> The test tries to force mixed gcs to wait for a particular JFR event
> only sent at that time.
> The way it forces those is a bit wrong: in some cases it may happen that
> at the time it starts a concurrent marking, another one just finished
> without being able to clean out the old gen. This means that the initial
> mark for this forced marking will be upgraded to a full gc (which also
> ends that marking), and the following forced young collections are
> young-only gcs only.
> > In total, no mixed gcs happen in that case, and so that JFR event is > never sent. > > The fix is to make sure that before forcing mixed gc (which works) we > force the heap into a state where the upgrade to full gc may not happen > - by forcing a full gc. > > Without the patch the fails like 6 times in 4000 runs, with the change > it does not fail after 3k runs. Looks good, thanks for finding and fixing this Thomas! After looking at your fix, I also took a look at TestLogging.java and TestOldGenCollectionUsage.java (from which I copied the code). Those test cases does, in addition to this test case, also allocate in a loop at the end and provokes a mixed gc by also setting -XX:G1MixedGCLiveThresholdPercent=100. Even if those tests do "work", I think those test cases ought to be cleaned up so that they do not fool more people like me. If you think that is a good idea, I will create an enhancement for cleanup of those test cases. Thanks, Leo > CR: > https://bugs.openjdk.java.net/browse/JDK-8233792 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8233792/webrev/ > Testing: > see above. > > Thanks, > ? Thomas From kim.barrett at oracle.com Mon Nov 11 19:40:54 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 11 Nov 2019 14:40:54 -0500 Subject: RFR: 8232588: G1 concurrent System.gc can return early or late In-Reply-To: <3B101D78-ED8C-4810-B711-A41AE0CC11C6@oracle.com> References: <8518409B-159B-48B2-97E4-D5D4C3B2BC5A@oracle.com> <3B101D78-ED8C-4810-B711-A41AE0CC11C6@oracle.com> Message-ID: > On Nov 7, 2019, at 4:02 PM, Kim Barrett wrote: > >> On Nov 7, 2019, at 4:58 AM, Thomas Schatzl wrote: >>> New webrevs: >>> full: https://cr.openjdk.java.net/~kbarrett/8232588/open.01/ >>> incr: https://cr.openjdk.java.net/~kbarrett/8232588/open.01.inc/ >> >> Looks good (obviously missing the items discussed here). 

After sending out the open.01* set of webrevs I noticed the comment
describing should_do_concurrent_full_gc had fallen out of date a while
ago (should have been updated by JDK-8212657), and could also use a
bit of tidying up. I'd like to deal with this now, since it's minimal
and I'm waiting on a 2nd reviewer.

Changes are to add missing _g1_periodic_collection to the list and
regularize punctuation there, and improve wording of the description.
Not bothering with a webrev for now; here's the diff:

diff -r f95bdf58fc7e -r a772f9ce0594 src/hotspot/share/gc/g1/g1CollectedHeap.hpp
--- a/src/hotspot/share/gc/g1/g1CollectedHeap.hpp Wed Nov 06 18:52:16 2019 -0500
+++ b/src/hotspot/share/gc/g1/g1CollectedHeap.hpp Mon Nov 11 14:36:22 2019 -0500
@@ -256,14 +256,14 @@
 
   G1HRPrinter _hr_printer;
 
-  // It decides whether an explicit GC should start a concurrent cycle
-  // instead of doing a STW GC. Currently, a concurrent cycle is
-  // explicitly started if:
-  // (a) cause == _gc_locker and +GCLockerInvokesConcurrent, or
-  // (b) cause == _g1_humongous_allocation
-  // (c) cause == _java_lang_system_gc and +ExplicitGCInvokesConcurrent.
-  // (d) cause == _dcmd_gc_run and +ExplicitGCInvokesConcurrent.
-  // (e) cause == _wb_conc_mark
+  // Return true if an explicit GC should start a concurrent cycle instead
+  // of doing a STW full GC. A concurrent cycle should be started if:
+  // (a) cause == _gc_locker and +GCLockerInvokesConcurrent,
+  // (b) cause == _g1_humongous_allocation,
+  // (c) cause == _java_lang_system_gc and +ExplicitGCInvokesConcurrent,
+  // (d) cause == _dcmd_gc_run and +ExplicitGCInvokesConcurrent,
+  // (e) cause == _wb_conc_mark,
+  // (f) cause == _g1_periodic_collection and +G1PeriodicGCInvokesConcurrent.
   bool should_do_concurrent_full_gc(GCCause::Cause cause);
 
   // Attempt to start a concurrent cycle with the indicated cause.
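
The six cases (a) to (f) in the updated comment can be read as a small decision function. The following is a minimal standalone model under stated assumptions, not the HotSpot implementation: the `Cause` enum and the `Flags` struct are hypothetical stand-ins for GCCause::Cause and the -XX flags named in the comment, and only the conditions listed there are modeled.

```cpp
// Hypothetical stand-in for the GCCause::Cause values named in the comment.
enum class Cause {
  gc_locker,
  g1_humongous_allocation,
  java_lang_system_gc,
  dcmd_gc_run,
  wb_conc_mark,
  g1_periodic_collection,
  other  // any cause not listed in the comment
};

// Hypothetical stand-in for the three -XX flags named in the comment.
struct Flags {
  bool GCLockerInvokesConcurrent = false;
  bool ExplicitGCInvokesConcurrent = false;
  bool G1PeriodicGCInvokesConcurrent = false;
};

// Returns true if an explicit GC with this cause should start a concurrent
// cycle instead of doing a STW full GC, following cases (a) to (f).
bool should_do_concurrent_full_gc(Cause cause, const Flags& flags) {
  switch (cause) {
    case Cause::gc_locker:               return flags.GCLockerInvokesConcurrent;     // (a)
    case Cause::g1_humongous_allocation: return true;                                // (b)
    case Cause::java_lang_system_gc:     return flags.ExplicitGCInvokesConcurrent;   // (c)
    case Cause::dcmd_gc_run:             return flags.ExplicitGCInvokesConcurrent;   // (d)
    case Cause::wb_conc_mark:            return true;                                // (e)
    case Cause::g1_periodic_collection:  return flags.G1PeriodicGCInvokesConcurrent; // (f)
    default:                             return false;  // not listed: no concurrent cycle
  }
}
```

Read this way, the point of the comment fix is easier to see: case (f) was already a distinct flag-gated cause, and the old comment simply did not list it.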
From kim.barrett at oracle.com Mon Nov 11 19:42:54 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 11 Nov 2019 14:42:54 -0500 Subject: RFR (XS): 8233792: TestG1ParallelPhases.java fails with phase NonYoungFreeCSet not found (2) In-Reply-To: References: Message-ID: > On Nov 11, 2019, at 10:44 AM, Thomas Schatzl wrote: > > Hi all, > > can I have reviews for this small test fix to avoid this test failing sometimes? > > The tests tries to force mixed gcs to wait for a particular JFR event only sent at that time. > The way it forces those is a bit wrong: in some cases it may happen that at the time it starts a concurrent marking, another one just finished without being able to clean out the old gen. This means that the initial mark for this forced marking will be upgraded to a full gc (which also ends that marking), and the following forced young collections are young-only gcs only. > > In total, no mixed gcs happen in that case, and so that JFR event is never sent. > > The fix is to make sure that before forcing mixed gc (which works) we force the heap into a state where the upgrade to full gc may not happen - by forcing a full gc. > > Without the patch the fails like 6 times in 4000 runs, with the change it does not fail after 3k runs. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8233792 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8233792/webrev/ > Testing: > see above. > > Thanks, > Thomas Looks good. From per.liden at oracle.com Mon Nov 11 20:41:59 2019 From: per.liden at oracle.com (Per Liden) Date: Mon, 11 Nov 2019 21:41:59 +0100 Subject: RFR: 8230661: ZGC: Stop reloading oops in load barriers In-Reply-To: References: <954589ac-147d-9419-f5eb-2a4fceb61cf5@oracle.com> Message-ID: Erik, Stefan and I discussed the original patch and agreed to make some adjustments. The changes are: 1) Broke out self healing logic into ZBarrier::self_heal(), and restructured ZBarrier::barrier() and ZBarrier::weak_barrier() to use that function. 
2) Made sure the slow path is only ever executed once, even when self healing is re-applied. 3) In the resurrection blocked window, allow weak oop refs to be healed to the good state, not just the remapped state. 4) Moved the handshake in ZUnload::unload() out into ZHeap, to make it more front-and-center, since it's no longer just needed for class unloading. Diff: http://cr.openjdk.java.net/~pliden/8230661/webrev.1-diff Full: http://cr.openjdk.java.net/~pliden/8230661/webrev.1 Testing: Passed tier 1-7 With these adjustments this patch looks good to me! cheers, Per On 10/28/19 5:44 PM, Erik ?sterlund wrote: > Oops. CR link was a bug link. For anyone that couldn't figure out what > the CR link could possibly be, here it is: > http://cr.openjdk.java.net/~eosterlund/8230661/webrev.00/ > > /Erik > > On 2019-10-28 17:38, Erik ?sterlund wrote: >> Hi, >> >> In ZGC, an oop is first loaded somewhere, by e.g. JIT compiled code. >> Then it passes a load barrier that typically does not take a slow >> path. But when it does take a slow path, the oop is sometimes >> reloaded, at historically three different places, and now two places. >> >> 1) We used to do that as part of the mechanism that transferred >> execution to the slow path because it was easier to write that stub >> code if the original oop died. Since then, the compiler slow paths >> have been rewritten to not reload the oop. >> >> 2) Once in the slow path, we sometimes reload weak oops during the >> resurrection block window, because there used to be a race when it >> closed. After concurrent class unloading integrated, there is a >> thread-local handshake before closing the resurrection block window. >> Therefore, that race no longer exists (when class unloading is used). >> >> 3) Once the final oop of a slow path has been determined, self-healing >> kicks in. The self-healing CAS may fail. When it does, the oop is >> reloaded. But this is completely unnecessary. 
>> >> With obstacle 1 gone, and 2 and 3 having no reason to be in the code >> any more, I propose to get rid of all reloading of the oops in the >> slow paths, so that it becomes easier to reason about the code. The >> object captured by the original load, is then always the same object >> as the object found after the load barrier completes, although >> possibly with a new bit representation. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8230661 >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8230661 >> >> Thanks, >> /Erik > From manc at google.com Tue Nov 12 02:26:29 2019 From: manc at google.com (Man Cao) Date: Mon, 11 Nov 2019 18:26:29 -0800 Subject: RFR (M): 8087198: G1 card refinement: batching, sorting Message-ID: Hi all, Can I have reviews for an updated implementation for batching card refinement? RFE: https://bugs.openjdk.java.net/browse/JDK-8087198 Webrev: https://cr.openjdk.java.net/~manc/8087198/webrev.00/ Old review thread is: http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2015-June/013798.html. Major differences from the 2015 webrev: - New version does not save the MemRegions for the cards in a buffer. I noticed considerable memory overhead with BigRamTester if we save the MemRegions. - New version handles SuspendibleThreadSetJoiner::should_yield() in a more timely fashion. Instead of forcing refining all buffered cards, the new version can abandon the buffered cards. - New version only batches and sorts the cards, not joining and prefetching. I have not investigated whether joining and prefetching help much. I think it is OK to investigate them in a separate RFE later. Please refer to the RFE page for some performance results. 

For correctness, tested with:
- Submit repo: tier1
- Local fastdebug build: tier2
- Fastdebug stress testing DaCapo h2 and BigRamTester with following
option combinations in addition to -XX:+VerifyRememberedSets:
default options -XX:-G1UseAdaptiveConcRefinement -XX:G1UpdateBufferSize=4 -XX:G1ConcRefinementGreenZone=0 -XX:G1ConcRefinementYellowZone=1 -XX:G1ConcRefinementThreads=0

-Man

From erik.osterlund at oracle.com Tue Nov 12 08:14:29 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 12 Nov 2019 09:14:29 +0100
Subject: RFR: 8230661: ZGC: Stop reloading oops in load barriers
In-Reply-To:
References:
Message-ID: <7F589002-E4BD-4A25-A26C-423E57D9AE7D@oracle.com>

Hi Per,

That proposal looks good to me too.

Thanks,
/Erik

> On 11 Nov 2019, at 21:42, Per Liden wrote:
>
> Erik, Stefan and I discussed the original patch and agreed to make some adjustments. The changes are:
>
> 1) Broke out self healing logic into ZBarrier::self_heal(), and restructured ZBarrier::barrier() and ZBarrier::weak_barrier() to use that function.
>
> 2) Made sure the slow path is only ever executed once, even when self healing is re-applied.
>
> 3) In the resurrection blocked window, allow weak oop refs to be healed to the good state, not just the remapped state.
>
> 4) Moved the handshake in ZUnload::unload() out into ZHeap, to make it more front-and-center, since it's no longer just needed for class unloading.
>
> Diff: http://cr.openjdk.java.net/~pliden/8230661/webrev.1-diff
> Full: http://cr.openjdk.java.net/~pliden/8230661/webrev.1
> Testing: Passed tier 1-7
>
> With these adjustments this patch looks good to me!
>
> cheers,
> Per
>
>> On 10/28/19 5:44 PM, Erik Österlund wrote:
>> Oops. CR link was a bug link.
For anyone that couldn't figure out what the CR link could possibly be, here it is: >> http://cr.openjdk.java.net/~eosterlund/8230661/webrev.00/ >> /Erik >>> On 2019-10-28 17:38, Erik ?sterlund wrote: >>> Hi, >>> >>> In ZGC, an oop is first loaded somewhere, by e.g. JIT compiled code. Then it passes a load barrier that typically does not take a slow path. But when it does take a slow path, the oop is sometimes reloaded, at historically three different places, and now two places. >>> >>> 1) We used to do that as part of the mechanism that transferred execution to the slow path because it was easier to write that stub code if the original oop died. Since then, the compiler slow paths have been rewritten to not reload the oop. >>> >>> 2) Once in the slow path, we sometimes reload weak oops during the resurrection block window, because there used to be a race when it closed. After concurrent class unloading integrated, there is a thread-local handshake before closing the resurrection block window. Therefore, that race no longer exists (when class unloading is used). >>> >>> 3) Once the final oop of a slow path has been determined, self-healing kicks in. The self-healing CAS may fail. When it does, the oop is reloaded. But this is completely unnecessary. >>> >>> With obstacle 1 gone, and 2 and 3 having no reason to be in the code any more, I propose to get rid of all reloading of the oops in the slow paths, so that it becomes easier to reason about the code. The object captured by the original load, is then always the same object as the object found after the load barrier completes, although possibly with a new bit representation. 
>>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8230661 >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8230661 >>> >>> Thanks, >>> /Erik From thomas.schatzl at oracle.com Tue Nov 12 08:44:32 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Nov 2019 09:44:32 +0100 Subject: RFR (XS): 8233792: TestG1ParallelPhases.java fails with phase NonYoungFreeCSet not found (2) In-Reply-To: <6d4a1476-51c2-094c-859c-0e8719a4fd6d@oracle.com> References: <6d4a1476-51c2-094c-859c-0e8719a4fd6d@oracle.com> Message-ID: <940752d7-28b8-a20b-8780-e333be0ee5eb@oracle.com> Hi Kim, Leo, thanks for your reviews :) On 11.11.19 17:45, Leo Korinth wrote: > On 11/11/2019 16:44, Thomas Schatzl wrote: >> Hi all, >> >> ?? can I have reviews for this small test fix to avoid this test >> failing sometimes? >>[...] >> The fix is to make sure that before forcing mixed gc (which works) we >> force the heap into a state where the upgrade to full gc may not >> happen - by forcing a full gc. >> >> Without the patch the fails like 6 times in 4000 runs, with the change >> it does not fail after 3k runs. > > Looks good, thanks for finding and fixing this Thomas! > > After looking at your fix, I also took a look at TestLogging.java and > TestOldGenCollectionUsage.java (from which I copied the code). Those > test cases does, in addition to this test case, also allocate in a loop > at the end and provokes a mixed gc by also setting > -XX:G1MixedGCLiveThresholdPercent=100. Even if those tests do "work", I > think those test cases ought to be cleaned up so that they do not fool > more people like me. If you think that is a good idea, I will create an > enhancement for cleanup of those test cases. Sure, go ahead. 
Thanks, Thomas From thomas.schatzl at oracle.com Tue Nov 12 08:53:05 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Nov 2019 09:53:05 +0100 Subject: RFR: 8232588: G1 concurrent System.gc can return early or late In-Reply-To: References: <8518409B-159B-48B2-97E4-D5D4C3B2BC5A@oracle.com> <3B101D78-ED8C-4810-B711-A41AE0CC11C6@oracle.com> Message-ID: <4a1e8c99-56a6-21fd-08b1-db2498a218c0@oracle.com> Hi, On 11.11.19 20:40, Kim Barrett wrote: >> On Nov 7, 2019, at 4:02 PM, Kim Barrett wrote: >> >>> On Nov 7, 2019, at 4:58 AM, Thomas Schatzl wrote: >>>> New webrevs: >>>> full: https://cr.openjdk.java.net/~kbarrett/8232588/open.01/ >>>> incr: https://cr.openjdk.java.net/~kbarrett/8232588/open.01.inc/ >>> >>> Looks good (obviously missing the items discussed here). > > After sending out the open.01* set of webrevs I noticed the comment > describing should_do_concurrent_full_gc had fallen out of date a while > ago (should have been updated by JDK-8212657), and could also use a > bit of tidying up. I'd like to deal with this now, since it's minimal > and I'm waiting on a 2nd reviewer. > > Changes are to add missing _g1_periodic_collection to the list and > regularize punctuation there, and improve wording of the description. > Not bothering with a webrev for now; here's the diff: > > diff -r f95bdf58fc7e -r a772f9ce0594 src/hotspot/share/gc/g1/g1CollectedHeap.hpp > --- a/src/hotspot/share/gc/g1/g1CollectedHeap.hpp Wed Nov 06 18:52:16 2019 -0500 > +++ b/src/hotspot/share/gc/g1/g1CollectedHeap.hpp Mon Nov 11 14:36:22 2019 -0500 > @@ -256,14 +256,14 @@ > > G1HRPrinter _hr_printer; > [...] > + // (f) cause == _g1_periodic_collection and +G1PeriodicGCInvokesConcurrent. > bool should_do_concurrent_full_gc(GCCause::Cause cause); > > // Attempt to start a concurrent cycle with the indicated cause. > still looks good. 
Thomas From thomas.schatzl at oracle.com Tue Nov 12 09:06:36 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Nov 2019 10:06:36 +0100 Subject: RFR (M): 8228609: G1 copy cost prediction uses used vs. actual copied bytes In-Reply-To: <3A799B0C-76B1-4145-A000-1071672BD566@oracle.com> References: <3A799B0C-76B1-4145-A000-1071672BD566@oracle.com> Message-ID: <96c45857-03b3-40c4-ca90-2ec7bc2fa2d8@oracle.com> Hi Kim, sorry for the late reply - I have been working on updating this, but other things went in-between. Sorry. On 06.11.19 01:57, Kim Barrett wrote: >> On Oct 22, 2019, at 1:30 PM, Thomas Schatzl wrote: >> >> Hi all, >> >> can I have reviews for this change that makes G1 calculate and the use actual amount of bytes copied for Object Copy phase estimation? >> >> The problem is that the "used" value that is currently used for this can differ a lot from the number of actually copied bytes during the parallel phases. >> >> Sources for differences are: >> - TLAB sizing >> - TLAB/region fragmentation >> - all of that multiplied by the number of threads >> >> Particularly if the amount of copied data is small compared to the number of regions all this can add up and disturb the prediction quite a lot, although overall it's not that bad. >> >> It's only that this and other small inaccuracies add up. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8228609 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8228609/webrev/ >> Testing: >> hs-tier1-5 >> >> Thanks, >> Thomas > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1ParScanThreadState.cpp > 105 size_t G1ParScanThreadState::copied_words() { > 106 size_t result = _surviving_words; > 107 _surviving_words = 0; > 108 return result; > 109 } > > The reset behavior seems unexpected, based on the name, which looks > like an accessor. > > I think the reset behavior is to avoid double-counting by the > recording in evacuate_live_objects. 
That led me to consider suggesting > a more appropriate place for the reset might be in G1PSTS::flush(), > where the lab_waste and lab_undo_waste (that were recorded nearby) > also get reset. But I don't think that flush() is happening in the > right place to prevent double-counting of the waste values. Bug? Bug. Fixed with the new version. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1ParScanThreadState.cpp > 319 _surviving_words += word_sz; > > Is it really worth having a separate accumulator for the total? It > seems like we could instead have copied_words() return the sum over > the _surviving_young_words. > > But that might not work because of the (lack of) reset in the right > place, per above. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1Policy.cpp > 782 double cost_per_byte_ms = (average_time_ms(G1GCPhaseTimes::ObjCopy) + average_time_ms(G1GCPhaseTimes::OptObjCopy)) / copied_bytes; > > [pre-existing] > > I think this is computing the rate at which active_workers worker > threads copies bytes. What if active_workers changes? > That is an existing issue, and I believe more metrics than that do not take the number of threads into account properly. There is also the issue that this value is not necessarily scaling linearly with the number of threads (it includes e.g. work stealing time), so simple linear interpolation will not work. I filed https://bugs.openjdk.java.net/browse/JDK-8233985. Given your above comments I changed the code to collect these values in the "Merge Per-Thread State" phase as we only need a total value anyway. At this point the necessary calculations are free, as we already iterate over all _surviving_young_words entries. 
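A rough sketch of why the total comes for free during the merge step, assuming each worker keeps one surviving-words counter per region. The type `PerThreadState` and the function `merge_and_total` are hypothetical stand-ins, not the code in the webrev:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one worker's per-region surviving word counts.
struct PerThreadState {
  std::vector<size_t> surviving_young_words;
};

// Merge every worker's counts into the global per-region array and, in the
// same pass, accumulate the total number of words copied.
size_t merge_and_total(const std::vector<PerThreadState>& states,
                       std::vector<size_t>& global_surviving_words) {
  size_t total = 0;
  for (const PerThreadState& s : states) {
    assert(s.surviving_young_words.size() <= global_surviving_words.size());
    for (size_t i = 0; i < s.surviving_young_words.size(); ++i) {
      global_surviving_words[i] += s.surviving_young_words[i];
      total += s.surviving_young_words[i];
    }
  }
  return total;
}
```

Since the merge already visits every entry, summing them adds no extra traversal, and no separate accumulator with a reset is needed in the copy path.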
Webrev:
http://cr.openjdk.java.net/~tschatzl/8228609/webrev.0_to_1/ (diff)
http://cr.openjdk.java.net/~tschatzl/8228609/webrev.1/ (full)
Testing:
hs-tier1-5

Thanks,
  Thomas

From thomas.schatzl at oracle.com Tue Nov 12 09:41:57 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 12 Nov 2019 10:41:57 +0100
Subject: State of G1's "throughput barriers"?
In-Reply-To:
References:
Message-ID:

Hi Clemens,

On 09.11.19 15:54, Clemens Eisserer wrote:
> Hi,
>
> With great excitement I read about the proposal to add a
> throughput-mode to G1 - has there been any progress on this?
> In some cases the throughput overhead G1 introduced compared to CMS is
> quite noticeable, especially for the case where 2-5s pauses are quite
> tolerable - i really hoped to get an option to trade a bit of latency
> for better throughput.
>
> Thanks, Clemens
>

the throughput barrier effort (or actually: optimize G1 when disabling
refinement) afaict consists of two main steps:

- changing the existing barrier so that the throughput barrier is not
completely different, allowing some interesting further optimizations.

This is mostly JDK-8087198 (currently out for review), and JDK-8226731,
which improves the existing barrier already a bit.

- enabling the smaller barrier if concurrent refinement is disabled (via
a new -XX:-G1UseConcRefinement).

This is JDK-8134303 (and ultimately JDK-8226197).

Man Cao from Google is working on this, and it seems that we'll get at
least the first big part soon. :)

Maybe Man wants to chime in for further details.

Thanks,
  Thomas

From thomas.schatzl at oracle.com Tue Nov 12 10:23:44 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 12 Nov 2019 11:23:44 +0100
Subject: RFR (XS): 8233597: Clean up code in G1Analytics::compute_pause_time_ratio
Message-ID:

Hi all,

  can I have reviews for this change that cleans up some code equivalent
to the clamp() method introduced in JDK-8233702?
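For readers unfamiliar with the pattern, a small sketch of what "code equivalent to clamp()" usually looks like. Both functions below are hypothetical illustrations written for this note; they are neither the HotSpot helper nor the code in the webrev:

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical clamp helper: restrict 'value' to the range [lo, hi].
template <typename T>
T clamp_value(T value, T lo, T hi) {
  assert(lo <= hi);
  return std::min(std::max(value, lo), hi);
}

// Hand-rolled equivalent of the kind such a clean-up typically replaces.
double clamp_by_hand(double value, double lo, double hi) {
  if (value < lo) return lo;
  if (value > hi) return hi;
  return value;
}
```

Both return the same result for any value as long as `lo <= hi`, which is why such a replacement is not expected to change behaviour.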
CR:
https://bugs.openjdk.java.net/browse/JDK-8233597
Webrev:
http://cr.openjdk.java.net/~tschatzl/8233597/webrev/
Testing:
hs-tier1-5 with other patches

Thanks,
  Thomas

From leo.korinth at oracle.com Tue Nov 12 12:33:43 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Tue, 12 Nov 2019 13:33:43 +0100
Subject: RFR (XS): 8233597: Clean up code in G1Analytics::compute_pause_time_ratio
In-Reply-To:
References:
Message-ID: <91c49faa-71d6-1ef4-df47-9835028154f5@oracle.com>

On 12/11/2019 11:23, Thomas Schatzl wrote:
> Hi all,
>
>   can I have reviews for this change that cleans up some code
> equivalent to the clamp() method introduced in JDK-8233702?

I can find no change in behaviour. Great clean-up!

Thanks,
Leo

> CR:
> https://bugs.openjdk.java.net/browse/JDK-8233597
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8233597/webrev/
> Testing:
> hs-tier1-5 with other patches
>
> Thanks,
>   Thomas

From stefan.johansson at oracle.com Tue Nov 12 13:19:52 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 12 Nov 2019 14:19:52 +0100
Subject: RFR: 8232588: G1 concurrent System.gc can return early or late
In-Reply-To:
References: <8518409B-159B-48B2-97E4-D5D4C3B2BC5A@oracle.com> <3B101D78-ED8C-4810-B711-A41AE0CC11C6@oracle.com>
Message-ID:

Hi Kim,

Thanks for cleaning up this part of the code. Looks good overall but I
have a couple of small comments below.

On 2019-11-11 20:40, Kim Barrett wrote:
>> On Nov 7, 2019, at 4:02 PM, Kim Barrett wrote:
>>
>>> On Nov 7, 2019, at 4:58 AM, Thomas Schatzl wrote:
>>>> New webrevs:
>>>> full: https://cr.openjdk.java.net/~kbarrett/8232588/open.01/
src/hotspot/share/gc/g1/g1CollectedHeap.cpp
---
2089 // Return true if (x < y) with allowance for wraparound.
2090 static bool gc_counter_less_than(uint x, uint y) {
2091   return (x - y) > (UINT_MAX/2);
2092 }

This code makes me think too much about what it does. What do you think
about using size_t instead of uint and just doing simple comparisons?
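To make the quoted helper easier to reason about, here is a standalone sketch of the same comparison, with `unsigned int` substituted for HotSpot's `uint` typedef. It relies only on the fact that unsigned subtraction in C++ wraps modulo 2^N:

```cpp
#include <climits>

// Same logic as the quoted helper: x counts as "less than" y when y is
// ahead of x by no more than half the counter range, even if the counter
// has wrapped around in between.
static bool gc_counter_less_than(unsigned int x, unsigned int y) {
  // If x is behind y, (x - y) wraps to a value in the upper half of the
  // range; if x is equal to or ahead of y, it stays in the lower half.
  return (x - y) > (UINT_MAX / 2);
}
```

A plain `x < y` would report that `UINT_MAX` is not less than `0`, although a counter that has just wrapped from `UINT_MAX` to `0` has in fact advanced. The subtraction form handles that case, at the cost of being valid only while the two values are less than half the range apart.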
---
src/hotspot/share/runtime/mutexLocker.hpp
---
  71 extern Monitor* G1FullGCCount_lock;              // in support of "concurrent" full gc

I think the name and the comment could be improved now that it's G1
specific. Something like G1OldGCCount_lock; to me, Full signals that it's
only STW.
---

Thanks,
Stefan

>>>> incr: https://cr.openjdk.java.net/~kbarrett/8232588/open.01.inc/
>>>
>>> Looks good (obviously missing the items discussed here).
>
> After sending out the open.01* set of webrevs I noticed the comment
> describing should_do_concurrent_full_gc had fallen out of date a while
> ago (should have been updated by JDK-8212657), and could also use a
> bit of tidying up. I'd like to deal with this now, since it's minimal
> and I'm waiting on a 2nd reviewer.
>
> Changes are to add missing _g1_periodic_collection to the list and
> regularize punctuation there, and improve wording of the description.
> Not bothering with a webrev for now; here's the diff:
>
> diff -r f95bdf58fc7e -r a772f9ce0594 src/hotspot/share/gc/g1/g1CollectedHeap.hpp
> --- a/src/hotspot/share/gc/g1/g1CollectedHeap.hpp Wed Nov 06 18:52:16 2019 -0500
> +++ b/src/hotspot/share/gc/g1/g1CollectedHeap.hpp Mon Nov 11 14:36:22 2019 -0500
> @@ -256,14 +256,14 @@
>
>    G1HRPrinter _hr_printer;
>
> -  // It decides whether an explicit GC should start a concurrent cycle
> -  // instead of doing a STW GC. Currently, a concurrent cycle is
> -  // explicitly started if:
> -  // (a) cause == _gc_locker and +GCLockerInvokesConcurrent, or
> -  // (b) cause == _g1_humongous_allocation
> -  // (c) cause == _java_lang_system_gc and +ExplicitGCInvokesConcurrent.
> -  // (d) cause == _dcmd_gc_run and +ExplicitGCInvokesConcurrent.
> -  // (e) cause == _wb_conc_mark
> +  // Return true if an explicit GC should start a concurrent cycle instead
> +  // of doing a STW full GC. A concurrent cycle should be started if:
> +  // (a) cause == _gc_locker and +GCLockerInvokesConcurrent,
> +  // (b) cause == _g1_humongous_allocation,
> +  // (c) cause == _java_lang_system_gc and +ExplicitGCInvokesConcurrent,
> +  // (d) cause == _dcmd_gc_run and +ExplicitGCInvokesConcurrent,
> +  // (e) cause == _wb_conc_mark,
> +  // (f) cause == _g1_periodic_collection and +G1PeriodicGCInvokesConcurrent.
>    bool should_do_concurrent_full_gc(GCCause::Cause cause);
>
>    // Attempt to start a concurrent cycle with the indicated cause.
>

From stefan.johansson at oracle.com Tue Nov 12 13:56:00 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 12 Nov 2019 14:56:00 +0100
Subject: RFR (XS): 8233597: Clean up code in G1Analytics::compute_pause_time_ratio
In-Reply-To: <91c49faa-71d6-1ef4-df47-9835028154f5@oracle.com>
References: <91c49faa-71d6-1ef4-df47-9835028154f5@oracle.com>
Message-ID: <87a0844a-29e8-d806-7a64-4cea2fcf965f@oracle.com>

On 2019-11-12 13:33, Leo Korinth wrote:
> On 12/11/2019 11:23, Thomas Schatzl wrote:
>> Hi all,
>>
>>   can I have reviews for this change that cleans up some code
>> equivalent to the clamp() method introduced in JDK-8233702?
>
> I can find no change in behaviour. Great clean-up!
>
+1

Thanks for cleaning up the code :)
Stefan

> Thanks,
> Leo
>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8233597
>> Webrev:
>> http://cr.openjdk.java.net/~tschatzl/8233597/webrev/
>> Testing:
>> hs-tier1-5 with other patches
>>
>> Thanks,
>>   Thomas

From thomas.schatzl at oracle.com Tue Nov 12 14:10:01 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 12 Nov 2019 15:10:01 +0100
Subject: RFR (M): 8227434: G1 predictions may over/underflow with high variance input
Message-ID: <8112fc27-69a4-0249-de00-54e907ee38e4@oracle.com>

Hi all,

  can I have reviews for this change that tries to fix possible
underflows and overflows in our predictor use in case there is high
variance input?
I did not analyze for every case whether the issue actually happened, but changed the get_new_prediction() calls to something I believe is appropriate for the given sequence. Cases where there has already been some clamping going on were obvious of course. It's a bit boring to review... CR: https://bugs.openjdk.java.net/browse/JDK-8227434 Webrev: http://cr.openjdk.java.net/~tschatzl/8227434/webrev/ Testing: hs-tier1-5 Thanks, Thomas From stefan.karlsson at oracle.com Tue Nov 12 14:21:20 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 12 Nov 2019 15:21:20 +0100 Subject: RFR: 8233061: ZGC: Enforce memory ordering in segmented bit maps In-Reply-To: References: <311b863b-e2dc-c56c-7115-d13afb7c4f4b@oracle.com> Message-ID: Hi Erik, This looks good and should be pushed. Some comments for further cleanups in this area. 1) I think it would be nice to be explicit about the memory order used in is_segment_live, instead of relying on the default value: inline bool ZLiveMap::is_segment_live(BitMap::idx_t segment) const { return segment_live_bits().par_at(segment); } inline bool ZLiveMap::set_segment_live_atomic(BitMap::idx_t segment) { return segment_live_bits().par_set_bit(segment, memory_order_release); } This way it's more apparent that the writer intentionally chose memory_order_acquire and didn't forget to specify the memory order to use. 2) I think would should be more explicit and use _bitmap.par_at(index, memory_order_relaxed) here: inline bool ZLiveMap::get(size_t index) const { BitMap::idx_t segment = index_to_segment(index); return is_marked() && // Page is marked is_segment_live(segment) && // Segment is marked _bitmap.at(index); // Object is marked } Thanks, StefanK On 2019-11-08 15:51, erik.osterlund at oracle.com wrote: > Hi Per, > > Thanks for the review. > > /Erik > > On 11/8/19 2:38 PM, Per Liden wrote: >> >> On 10/28/19 4:53 PM, Erik ?sterlund wrote: >>> Hi, >>> >>> In ZGC, bitmaps are lazily cleared in a segmented fashion. 
In this >>> scheme, liveness is determined by looking at a counter, a segment bit >>> map and finally the flat bit map structure. The accesses for the >>> various stages need to be ordered properly. This patch sprinkles some >>> OrderAccess calls to enforce this ordering. >>> >>> Out of curiosity, I disassembled libjvm.so with and without this >>> patch to see if the reordering has bitten us in practice on x86_64. >>> Fortunately, according to my analysis, it has not; we seem to have >>> been lucky. But there is a lot of machine code, so I could have >>> missed something. However, given that we now have an AArch64 port >>> which is definitely affected by this problem, and compilers really >>> are free to do whatever they want to in the future, it seems in order >>> to enforce this explicitly. >>> >>> This patch depends on >>> https://bugs.openjdk.java.net/browse/JDK-8233073 which exposes some >>> memory ordering aware getters on BitMap. I did not want to just wrap >>> the existing API in ZGC, so I split that out to a separate RFE. >>> >>> CR: >>> http://cr.openjdk.java.net/~eosterlund/8233061/webrev.00/ >> >> The rebased webrev.01 looks good. >> >> /Per >> >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8233061 >>> >>> Thanks, >>> /Erik > From stefan.karlsson at oracle.com Tue Nov 12 14:24:27 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 12 Nov 2019 15:24:27 +0100 Subject: RFR: 8233797: ZGC: Unify naming convention for functions using atomics In-Reply-To: References: Message-ID: <749bd51e-30f2-b9c1-a853-cbca525564bd@oracle.com> Looks good. StefanK On 2019-11-07 16:19, Erik ?sterlund wrote: > Hi, > > Functions in ZGC that use atomics sometimes have an _atomic postfix in > the name, and sometimes not. This enhancement is about unifying that. > The proposal is to remove the _atomic postfix in situations where there > is no non-atomic counterpart. > > This patch applies on top of 8233061. 
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8233797
>
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8233797/webrev.00/
>
> Thanks,
> /Erik

From thomas.schatzl at oracle.com Tue Nov 12 14:42:14 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 12 Nov 2019 15:42:14 +0100
Subject: RFR (S): 8234000: Make HeapRegion::bottom/end/hrm_index const
Message-ID: <0bd70faf-e418-a719-3f48-51877b0d59f1@oracle.com>

Hi all,

  can I have reviews for this very small change that removes the
possibility to change the bottom/end/hrm_index members in HeapRegion?
This is not required functionality, so I thought it would be good to
clean up.

CR:
https://bugs.openjdk.java.net/browse/JDK-8234000
Webrev:
http://cr.openjdk.java.net/~tschatzl/8234000/webrev/
Testing:
local compilation, hs-tier1

Thanks,
  Thomas

From erik.osterlund at oracle.com Tue Nov 12 15:04:44 2019
From: erik.osterlund at oracle.com (erik.osterlund at oracle.com)
Date: Tue, 12 Nov 2019 16:04:44 +0100
Subject: RFR: 8233061: ZGC: Enforce memory ordering in segmented bit maps
In-Reply-To:
References: <311b863b-e2dc-c56c-7115-d13afb7c4f4b@oracle.com>
Message-ID:

Hi Stefan,

Thank you for the review. Let's revisit later if we want to polish
passing in default memory orderings in our code or not.

Thanks,
/Erik

On 11/12/19 3:21 PM, Stefan Karlsson wrote:
> Hi Erik,
>
> This looks good and should be pushed.
>
> Some comments for further cleanups in this area.
>
> 1) I think it would be nice to be explicit about the memory order used
> in is_segment_live, instead of relying on the default value:
>
>   inline bool ZLiveMap::is_segment_live(BitMap::idx_t segment) const {
>     return segment_live_bits().par_at(segment);
>   }
>
>   inline bool ZLiveMap::set_segment_live_atomic(BitMap::idx_t segment) {
>     return segment_live_bits().par_set_bit(segment,
> memory_order_release);
>   }
>
> This way it's more apparent that the writer intentionally chose
> memory_order_acquire and didn't forget to specify the memory order to
> use.
>
> 2) I think we should be more explicit and use _bitmap.par_at(index,
> memory_order_relaxed) here:
>
> inline bool ZLiveMap::get(size_t index) const {
>   BitMap::idx_t segment = index_to_segment(index);
>   return is_marked() &&              // Page is marked
>          is_segment_live(segment) && // Segment is marked
>          _bitmap.at(index);          // Object is marked
> }
>
> Thanks,
> StefanK
>
> On 2019-11-08 15:51, erik.osterlund at oracle.com wrote:
>> Hi Per,
>>
>> Thanks for the review.
>>
>> /Erik
>>
>> On 11/8/19 2:38 PM, Per Liden wrote:
>>>
>>> On 10/28/19 4:53 PM, Erik Österlund wrote:
>>>> Hi,
>>>>
>>>> In ZGC, bitmaps are lazily cleared in a segmented fashion. In this
>>>> scheme, liveness is determined by looking at a counter, a segment
>>>> bit map and finally the flat bit map structure. The accesses for
>>>> the various stages need to be ordered properly. This patch
>>>> sprinkles some OrderAccess calls to enforce this ordering.
>>>>
>>>> Out of curiosity, I disassembled libjvm.so with and without this
>>>> patch to see if the reordering has bitten us in practice on x86_64.
>>>> Fortunately, according to my analysis, it has not; we seem to have
>>>> been lucky. But there is a lot of machine code, so I could have
>>>> missed something. However, given that we now have an AArch64 port
>>>> which is definitely affected by this problem, and compilers really
>>>> are free to do whatever they want to in the future, it seems in
>>>> order to enforce this explicitly.
>>>>
>>>> This patch depends on
>>>> https://bugs.openjdk.java.net/browse/JDK-8233073 which exposes some
>>>> memory ordering aware getters on BitMap. I did not want to just
>>>> wrap the existing API in ZGC, so I split that out to a separate RFE.
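A simplified model of the ordering requirement described in the quoted message, assuming a single marking thread and one 64-bit word per segment. The class `SegmentedBitmap` and its layout are hypothetical and use `std::atomic` in place of HotSpot's BitMap and OrderAccess APIs:

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical model: each segment has a "live" flag guarding one word of
// the flat bitmap. Words start out holding stale bits from an earlier
// cycle and are only cleared lazily, the first time the segment is used.
class SegmentedBitmap {
  static const size_t kSegments = 4;
  std::atomic<uint64_t> _words[kSegments];
  std::atomic<bool>     _segment_live[kSegments];

public:
  SegmentedBitmap() {
    for (size_t i = 0; i < kSegments; ++i) {
      _words[i].store(~uint64_t(0), std::memory_order_relaxed);  // stale bits
      _segment_live[i].store(false, std::memory_order_relaxed);
    }
  }

  // Called by the (single, in this sketch) marking thread; index < 256.
  void set(size_t index) {
    const size_t seg = index / 64;
    const uint64_t mask = uint64_t(1) << (index % 64);
    if (!_segment_live[seg].load(std::memory_order_acquire)) {
      _words[seg].store(0, std::memory_order_relaxed);  // lazy clear
      // Release: the clear above must be visible before the flag is.
      _segment_live[seg].store(true, std::memory_order_release);
    }
    _words[seg].fetch_or(mask, std::memory_order_relaxed);
  }

  // May be called concurrently by readers; index < 256.
  bool get(size_t index) const {
    const size_t seg = index / 64;
    const uint64_t mask = uint64_t(1) << (index % 64);
    // Acquire: if the flag is seen as set, the cleared word is seen too,
    // so stale bits are never reported as live.
    return _segment_live[seg].load(std::memory_order_acquire) &&
           (_words[seg].load(std::memory_order_relaxed) & mask) != 0;
  }
};
```

Without the release/acquire pair, a reader on a weakly ordered CPU could observe the live flag before the clear and read a stale bit as marked, which is the kind of reordering the message is concerned about.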
>>>> >>>> CR: >>>> http://cr.openjdk.java.net/~eosterlund/8233061/webrev.00/ >>> >>> The rebased webrev.01 looks good. >>> >>> /Per >>> >>>> >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8233061 >>>> >>>> Thanks, >>>> /Erik >> From erik.osterlund at oracle.com Tue Nov 12 15:04:57 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Tue, 12 Nov 2019 16:04:57 +0100 Subject: RFR: 8233797: ZGC: Unify naming convention for functions using atomics In-Reply-To: <749bd51e-30f2-b9c1-a853-cbca525564bd@oracle.com> References: <749bd51e-30f2-b9c1-a853-cbca525564bd@oracle.com> Message-ID: <4ecb5536-edaa-5830-1704-784b2863ecd9@oracle.com> Hi Stefan, Thanks for the review. /Erik On 11/12/19 3:24 PM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2019-11-07 16:19, Erik ?sterlund wrote: >> Hi, >> >> Functions in ZGC that use atomics sometimes have an _atomic postfix >> in the name, and sometimes not. This enhancement is about unifying >> that. The proposal is to remove the _atomic postfix in situations >> where there is no non-atomic counterpart. >> >> This patch applies on top of 8233061. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8233797 >> >> Webrev: >> http://cr.openjdk.java.net/~eosterlund/8233797/webrev.00/ >> >> Thanks, >> /Erik From stefan.karlsson at oracle.com Tue Nov 12 15:05:08 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 12 Nov 2019 16:05:08 +0100 Subject: RFR: 8230661: ZGC: Stop reloading oops in load barriers In-Reply-To: References: <954589ac-147d-9419-f5eb-2a4fceb61cf5@oracle.com> Message-ID: <98c51c5f-96c5-fe3e-551b-02a50e2ecdc6@oracle.com> Hi, This looks good to me. As a separate patch, I think that we should: 1) swap places between the calls to ZResurrection::unblock and _unload.purge(). There every stale weak/phantom reference has been cleaned out at this point, and there's no need to hold the resurrection blocked windows open longer than necessary. 
2) change the implementation of ZResurrection to use Atomic::load/store now that we rely on the handshakes to prevent someone from having a weak oop to a "dead" object while finding out that the resurrection has been unblocked. Thanks, StefanK On 2019-11-11 21:41, Per Liden wrote: > Erik, Stefan and I discussed the original patch and agreed to make some > adjustments. The changes are: > > 1) Broke out self healing logic into ZBarrier::self_heal(), and > restructured ZBarrier::barrier() and ZBarrier::weak_barrier() to use > that function. > > 2) Made sure the slow path is only ever executed once, even when self > healing is re-applied. > > 3) In the resurrection blocked window, allow weak oop refs to be healed > to the good state, not just the remapped state. > > 4) Moved the handshake in ZUnload::unload() out into ZHeap, to make it > more front-and-center, since it's no longer just needed for class > unloading. > > Diff: http://cr.openjdk.java.net/~pliden/8230661/webrev.1-diff > Full: http://cr.openjdk.java.net/~pliden/8230661/webrev.1 > Testing: Passed tier 1-7 > > With these adjustments this patch looks good to me! > > cheers, > Per > > On 10/28/19 5:44 PM, Erik ?sterlund wrote: >> Oops. CR link was a bug link. For anyone that couldn't figure out what >> the CR link could possibly be, here it is: >> http://cr.openjdk.java.net/~eosterlund/8230661/webrev.00/ >> >> /Erik >> >> On 2019-10-28 17:38, Erik ?sterlund wrote: >>> Hi, >>> >>> In ZGC, an oop is first loaded somewhere, by e.g. JIT compiled code. >>> Then it passes a load barrier that typically does not take a slow >>> path. But when it does take a slow path, the oop is sometimes >>> reloaded, at historically three different places, and now two places. >>> >>> 1) We used to do that as part of the mechanism that transferred >>> execution to the slow path because it was easier to write that stub >>> code if the original oop died. Since then, the compiler slow paths >>> have been rewritten to not reload the oop. 
>>>
>>> 2) Once in the slow path, we sometimes reload weak oops during the
>>> resurrection block window, because there used to be a race when it
>>> closed. After concurrent class unloading integrated, there is a
>>> thread-local handshake before closing the resurrection block window.
>>> Therefore, that race no longer exists (when class unloading is used).
>>>
>>> 3) Once the final oop of a slow path has been determined,
>>> self-healing kicks in. The self-healing CAS may fail. When it does,
>>> the oop is reloaded. But this is completely unnecessary.
>>>
>>> With obstacle 1 gone, and 2 and 3 having no reason to be in the code
>>> any more, I propose to get rid of all reloading of the oops in the
>>> slow paths, so that it becomes easier to reason about the code. The
>>> object captured by the original load, is then always the same object
>>> as the object found after the load barrier completes, although
>>> possibly with a new bit representation.
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8230661
>>>
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8230661
>>>
>>> Thanks,
>>> /Erik
>>

From leihouyju at gmail.com Tue Nov 12 15:11:36 2019
From: leihouyju at gmail.com (Haoyu Li)
Date: Tue, 12 Nov 2019 23:11:36 +0800
Subject: RFR: 8220465: Use shadow regions for faster ParallelGC full GCs
In-Reply-To: <4fce596c-a4eb-5da6-1b47-b4e2314de3c5@oracle.com>
References: <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com> <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com> <400df998-171a-5bbe-9f3e-01af1781afb4@oracle.com> <955C4FA4-FC18-446C-851A-0D04A916D88D@oracle.com> <4fce596c-a4eb-5da6-1b47-b4e2314de3c5@oracle.com>
Message-ID:

Hi Stefan,

Thanks for your advice!

> Nice, I think it would make sense to use cmpxchg in mark_normal() as
> well and assert that the returned value is SHADOW.

I've changed mark_normal() to use Atomic::cmpxchg and added an assertion.
Please find the changes in the attached patches. Thanks!
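A sketch of the suggested pattern, using `std::atomic` in place of HotSpot's `Atomic::cmpxchg`. The state names and their values, the `NORMAL` target state, and the class shape are assumptions made for illustration; the attached patch is not reproduced here:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical shadow-state values; the real constants live in
// ParallelCompactData::RegionData.
enum ShadowState { UNUSED, SHADOW, FILLED, FINISH, NORMAL };

class RegionData {
  std::atomic<int> _shadow_state;

public:
  explicit RegionData(int initial) : _shadow_state(initial) {}

  // Move the region back to the normal path. The compare-and-exchange
  // doubles as a check that nobody changed the state underneath us.
  void mark_normal() {
    int expected = SHADOW;
    const bool swapped = _shadow_state.compare_exchange_strong(expected, NORMAL);
    assert(swapped && "region should be in the SHADOW state");
    (void)swapped;  // silence unused-variable warnings in release builds
  }

  int shadow_state() const { return _shadow_state.load(); }
};
```

Compared with a plain store, the CAS turns an unexpected concurrent state change into an assertion failure in debug builds instead of silently overwriting it.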
Best Regards, Haoyu Li, Stefan Johansson ?2019?11?11??? ??11:10??? > Hi Haoyu, > > Thanks for the updated patches, I think they look good in general, just > one comment inline below. > > Here are some updated webrev: > Full: http://cr.openjdk.java.net/~sjohanss/8220465/02 > Inc: http://cr.openjdk.java.net/~sjohanss/8220465/01-02 > > On 2019-11-06 08:17, Haoyu Li wrote: > > Hi Stefan, > > > > Sorry for the late update. I have attached both a full patch > > (shadow-region-v3.patch) and the incremental changes > > (shadow-region-incr.patch) in this mail, and details are as follows. > > > > Regarding the current patch, I think that it looks good in general, > > but I thought a bit more around how to share stuff between the > > closures and I agree that adding those extra virtual functions > > doesn?t really feel worth it. I?m wondering if a solution where we > > revert back to letting destination be the ?real destination? (not > > ever pointing to the shadow region) and add a copy_destination which > > is destination + offset. To make this work the normal > > MoveAndUpdateClosure would also have an offset, but it would always > > be 0. If do_addr() is then updated to use the copy_destination() in > > some places we might end up with something pretty nice, but maybe > > I?m missing something. > > > > > > It is an excellent idea to let MoveAndUpdateClosure have an _offset > > equal to 0, so ShadowClosure can reuse more code from it. I have made > > the above changes in the new patch. > Yes, using this approach looks very nice. 
> > > > > I also realized that the current patch will trigger an assert > > because destination is expected not to be the shadow address: > > # Internal Error > > (open/src/hotspot/share/gc/parallel/psParallelCompact.cpp:3045), > > pid=12649, tid=12728 > > # assert(src_cp->destination() == destination) failed: first live > > obj in the space must match the destination > > > > So this also suggests that we should keep destination() returning > > the real destination. > > > > Some other comments: > > src/hotspot/share/gc/parallel/psParallelCompact.cpp > > ? > > 3383 void ShadowClosure::complete_region(ParCompactionManager *cm, > > HeapWord *dest_addr, > > 3384 > > PSParallelCompact::RegionData *region_ptr) { > > 3385 assert(region_ptr->shadow_state() == > > ParallelCompactData::RegionData::FINISH, "Region should be > finished?); > > > > This assertion will also trigger when running with a debug build and > > at this point the shadow state should be SHADOW not FINISH. > > ? > > > > > > Sorry for these buggy assertions. The shadow_state in > > ShadowClosure::complete_region should be SHADOW instead of FINISH, and > > I've corrected it. Moreover, while I was testing it in the debug mode, I > > found another interesting case, in which a region should return to the > > normal path if it becomes available before invoking fill_shadow_region > > (the branch that shadow_region == 0 at psParallelCompact.cpp:3182). > > Therefore, I add a new function > > ParallelCompactData::RegionData::mark_normal() to handle this special > > case, so the assertion in MoveAndUpdateClosure::complete_region will > > success. > Nice, I think it would make sense to used cmpxchg in mark_normal() as > well and assert that the returned value is SHADOW. > > Thanks, > Stefan > > > > > src/hotspot/share/gc/parallel/psParallelCompact.hpp > > ? 
> >     632 inline bool ParallelCompactData::RegionData::mark_filled() {
> >     633   return Atomic::cmpxchg(FILLED, &_shadow_state, SHADOW) == SHADOW;
> >     634 }
> >
> >     Since we never check the return value here we should make it void
> >     and maybe instead add an assert that the return value is SHADOW.
> >     ...
> >
> >
> > Thanks for the suggestion. I have changed mark_filled() to void.
> >
> > I really appreciate your reviews. If there are any issues in the patch,
> > please let me know at any time. Thanks again!
> > Best Regards,
> > Haoyu Li
> >
> > Stefan Johansson wrote on Tue, Oct 29, 2019 at 3:03:
> >
> >     Hi Haoyu,
> >
> >     I've looked through the patch in detail now and created a new webrev at:
> >     http://cr.openjdk.java.net/~sjohanss/8220465/01/
> >
> >     I took the liberty of removing the removal of move_and_update from
> >     your patch since I'm addressing that separately in JDK-8233065. The
> >     webrev above is still based on that removal, but I expect that to be
> >     pushed tomorrow or Wednesday so that should be fine.
> >
> >     I also changed the subject to make it more clear that this is now a
> >     review of:
> >     https://bugs.openjdk.java.net/browse/JDK-8220465
> >
> >     Regarding the current patch, I think that it looks good in general,
> >     but I thought a bit more around how to share stuff between the
> >     closures and I agree that adding those extra virtual functions
> >     doesn't really feel worth it. I'm wondering if a solution where we
> >     revert back to letting destination be the "real destination" (not
> >     ever pointing to the shadow region) and add a copy_destination which
> >     is destination + offset. To make this work the normal
> >     MoveAndUpdateClosure would also have an offset, but it would always
> >     be 0. If do_addr() is then updated to use the copy_destination() in
> >     some places we might end up with something pretty nice, but maybe
> >     I'm missing something.
> >
> >     I also realized that the current patch will trigger an assert
> >     because destination is expected not to be the shadow address:
> >     # Internal Error
> >     (open/src/hotspot/share/gc/parallel/psParallelCompact.cpp:3045),
> >     pid=12649, tid=12728
> >     # assert(src_cp->destination() == destination) failed: first live
> >     obj in the space must match the destination
> >
> >     So this also suggests that we should keep destination() returning
> >     the real destination.
> >
> >     Some other comments:
> >     src/hotspot/share/gc/parallel/psParallelCompact.cpp
> >     ...
> >     3383 void ShadowClosure::complete_region(ParCompactionManager *cm, HeapWord *dest_addr,
> >     3384                                     PSParallelCompact::RegionData *region_ptr) {
> >     3385   assert(region_ptr->shadow_state() == ParallelCompactData::RegionData::FINISH, "Region should be finished");
> >
> >     This assertion will also trigger when running with a debug build and
> >     at this point the shadow state should be SHADOW not FINISH.
> >     ...
> >
> >     src/hotspot/share/gc/parallel/psParallelCompact.hpp
> >     ...
> >     632 inline bool ParallelCompactData::RegionData::mark_filled() {
> >     633   return Atomic::cmpxchg(FILLED, &_shadow_state, SHADOW) == SHADOW;
> >     634 }
> >
> >     Since we never check the return value here we should make it void
> >     and maybe instead add an assert that the return value is SHADOW.
> >     ...
> >
> >     When you addressed these comments, would it be possible to include
> >     both the full patch and and the incremental changes from the current
> >     version. That makes it easier for the reviewers to see what changed
> >     between version of the patch.
> >
> >     Thanks,
> >     Stefan
> >
> >     > On 24 Oct 2019, at 14:16, Stefan Johansson wrote:
> >     >
> >     > Hi Haoyu,
> >     >
> >     > On 2019-10-23 17:15, Haoyu Li wrote:
> >     >> Hi Stefan,
> >     >> Thanks for your constructive feedback. I've addressed all the
> >     issues you mentioned, and the updated patch is attached in this email.
> > > Nice, I will look at the patch next week, but I'll shortly answer > > your questions right away. > > > > > >> During refining the patch, I have a couple of questions: > > >> 1) Now the MoveAndUpdateClosure and ShadowClosure assume the > > destination address is the very beginning of a region, instead of an > > arbitrary address like what it used to be. However, there is an > > unused function named PSParallelCompact::move_and_update() uses the > > MoveAndUpdateClosure to process a region from its middle, which > > conflicts with the assumption. I notice that you removed this > > function in your patch, and so did I in the updated patch. Does it > > matter? > > > Yes, I found this function during my code review and it should be > > removed, but I think that should be handled as a separate issue. We > > can do this removal before this patch goes in. > > > > > >> 2) Using the same do_addr() in MoveAndUpdateClosure and > > ShadowClosure is doable, but it does not reuse all the code neatly. > > Because storing the address of the shadow region in _destination > > requires extra virtual functions to handle allocating blocks in the > > start_array and setting addresses of deferred objects. In > > particular, allocate_blocks() and set_deferred_object_for() in both > > closures are added. Is it worth avoiding to use _offset to calculate > > the shadow_destination? > > > Ok, sounds like it might be better to have specific do_addr() > > functions then. I'll think some more around this when reviewing the > > new patch in depth. > > > > > >> If there are any problems with this patch, please contact me > > anytime. I'm more than happy to keep improving the code. Thanks > > again for reviewing. 
> > >> > > > Sound good, thanks, > > > Stefan > > > From thomas.schatzl at oracle.com Tue Nov 12 15:24:13 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Nov 2019 16:24:13 +0100 Subject: RFR (M): 8233919: Incrementally calculate the occupied cards in a heap region remembered set Message-ID: <23b30b41-b109-4d05-606f-fa6a87a07897@oracle.com> Hi all, may I have reviews for this change that ultimately makes sure that the number of occupied cards in a remembered set is only growing by providing a per-OtherRegionsTable count that is atomically updated when adding a remembered set entry. Note that this count may not be completely accurate due to races when deleting a PerRegionTable (which is a known issue) from an OtherRegionsTable; but that is no different than before. This helps improving the predictions in the young gen remset sampling thread, and increase the performance of getting the occupancy count. Based on JDK-8233997, and JDK-8233998 also out for review. CR: https://bugs.openjdk.java.net/browse/JDK-8233919 Webrev: http://cr.openjdk.java.net/~tschatzl/8233919/webrev/ Testing: hs-tier1-5, Thanks, Thomas From thomas.schatzl at oracle.com Tue Nov 12 15:23:46 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Nov 2019 16:23:46 +0100 Subject: RFR (XXS): 8233997: Some members of HeapRegion are not cleared in HeapRegion::hr_clear() Message-ID: <0d249a19-eb7d-9c91-7034-1ab030302019@oracle.com> Hi all, can I get reviews for this small change that fixes some reinitialization problem of HeapRegions with the young remset sampling thread? So the young remset sampling thread recalculates young gen remset sizes, and it happens that without this fix it may read the (old) remset size of the previous use of a given young gen region (also because of JDK-8233998). The fix is to properly clear the remaining members not cleared yet in hr_clear. 
CR: https://bugs.openjdk.java.net/browse/JDK-8233997 Webrev: http://cr.openjdk.java.net/~tschatzl/8233997/webrev/ Testing: hs-tier1-5 with other similar patches, specjvm2008 specjvm.validation with -XX:G1ConcRefinementServiceIntervalMillis=1 (that one causes issues quickly if you assert that the remembered sets for regions only grows) Thanks, Thomas From thomas.schatzl at oracle.com Tue Nov 12 15:24:05 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Nov 2019 16:24:05 +0100 Subject: RFR (S): 8233998: New young regions registered too early in collection set Message-ID: Hi, can I have reviews for this change that changes the place in which new mutator regions are published in the collection set list? Previously a new eden region has been published before some data that would be read by the young gen sampling thread could be visible. This change simply does the member updates before adding the regions to the collection set. CR: https://bugs.openjdk.java.net/browse/JDK-8233998 Webrev: http://cr.openjdk.java.net/~tschatzl/8233998/webrev/ Testing: hs-tier1-5 with other similar patches, specjvm2008 specjvm.validation with -XX:G1ConcRefinementServiceIntervalMillis=1 (that one causes issues quickly if you assert that the remembered sets for regions only grows) Thanks, Thomas From thomas.schatzl at oracle.com Tue Nov 12 15:23:58 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 12 Nov 2019 16:23:58 +0100 Subject: RFR (M): 8233588: Clean up SurvRateGroup Message-ID: <1ab7fd39-fc0c-79ee-e925-fc8b88e32177@oracle.com> Hi all, can I have some reviews for this change that cleans up the SurvRateGroup class. In particular, while working with it I found that it contains two members that are duplicates of others. This removed a few methods, which in turn made some others obsolete. Further I tried to improve encapsulation so that not everyone needs to know all details down to SurvRateGroup. 
That's why this change is a bit larger than you'd probably expect, but it contains a significant amount of code deletion! :) CR: https://bugs.openjdk.java.net/browse/JDK-8233588 Webrev: http://cr.openjdk.java.net/~tschatzl/8233588/webrev/ Testing: hs-tier1-5 Thanks, Thomas From erik.osterlund at oracle.com Tue Nov 12 15:39:16 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Tue, 12 Nov 2019 16:39:16 +0100 Subject: RFR: 8230661: ZGC: Stop reloading oops in load barriers In-Reply-To: <98c51c5f-96c5-fe3e-551b-02a50e2ecdc6@oracle.com> References: <954589ac-147d-9419-f5eb-2a4fceb61cf5@oracle.com> <98c51c5f-96c5-fe3e-551b-02a50e2ecdc6@oracle.com> Message-ID: <2ec0eb63-e8d1-3acd-2288-177007d7bede@oracle.com> Hi Stefan, Thanks for the review. I will file an RFE about the proposed enhancements (which I agree about). /Erik On 11/12/19 4:05 PM, Stefan Karlsson wrote: > Hi, > > This looks good to me. > > As a separate patch, I think that we should: > > 1) swap places between the calls to ZResurrection::unblock and > _unload.purge(). There every stale weak/phantom reference has been > cleaned out at this point, and there's no need to hold the > resurrection blocked windows open longer than necessary. > > 2) change the implementation of ZResurrection to use > Atomic::load/store now that we rely on the handshakes to prevent > someone from having a weak oop to a "dead" object while finding out > that the resurrection has been unblocked. > > Thanks, > StefanK > > On 2019-11-11 21:41, Per Liden wrote: >> Erik, Stefan and I discussed the original patch and agreed to make >> some adjustments. The changes are: >> >> 1) Broke out self healing logic into ZBarrier::self_heal(), and >> restructured ZBarrier::barrier() and ZBarrier::weak_barrier() to use >> that function. >> >> 2) Made sure the slow path is only ever executed once, even when self >> healing is re-applied. 
>>
>> 3) In the resurrection blocked window, allow weak oop refs to be
>> healed to the good state, not just the remapped state.
>>
>> 4) Moved the handshake in ZUnload::unload() out into ZHeap, to make
>> it more front-and-center, since it's no longer just needed for class
>> unloading.
>>
>> Diff: http://cr.openjdk.java.net/~pliden/8230661/webrev.1-diff
>> Full: http://cr.openjdk.java.net/~pliden/8230661/webrev.1
>> Testing: Passed tier 1-7
>>
>> With these adjustments this patch looks good to me!
>>
>> cheers,
>> Per
>>
>> On 10/28/19 5:44 PM, Erik Österlund wrote:
>>> Oops. CR link was a bug link. For anyone that couldn't figure out
>>> what the CR link could possibly be, here it is:
>>> http://cr.openjdk.java.net/~eosterlund/8230661/webrev.00/
>>>
>>> /Erik
>>>
>>> On 2019-10-28 17:38, Erik Österlund wrote:
>>>> Hi,
>>>>
>>>> In ZGC, an oop is first loaded somewhere, by e.g. JIT compiled
>>>> code. Then it passes a load barrier that typically does not take a
>>>> slow path. But when it does take a slow path, the oop is sometimes
>>>> reloaded, at historically three different places, and now two places.
>>>>
>>>> 1) We used to do that as part of the mechanism that transferred
>>>> execution to the slow path because it was easier to write that stub
>>>> code if the original oop died. Since then, the compiler slow paths
>>>> have been rewritten to not reload the oop.
>>>>
>>>> 2) Once in the slow path, we sometimes reload weak oops during the
>>>> resurrection block window, because there used to be a race when it
>>>> closed. After concurrent class unloading integrated, there is a
>>>> thread-local handshake before closing the resurrection block
>>>> window. Therefore, that race no longer exists (when class unloading
>>>> is used).
>>>>
>>>> 3) Once the final oop of a slow path has been determined,
>>>> self-healing kicks in. The self-healing CAS may fail. When it does,
>>>> the oop is reloaded. But this is completely unnecessary.
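To make point 3 concrete, here is a minimal sketch of a load barrier whose self-healing CAS is allowed to fail without reloading the slot. It is not the ZGC implementation: the single "good" bit, the trivial slow_path(), and the function names are assumptions made up for illustration, and std::atomic stands in for HotSpot's Atomic class.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical colored-pointer layout (assumption): the low bit marks "good".
constexpr std::uintptr_t kGoodBit = 1;

inline bool is_good_or_null(std::uintptr_t addr) {
  return addr == 0 || (addr & kGoodBit) != 0;
}

// Stand-in for the barrier slow path: returns the good representation of
// the same object that `addr` refers to.
inline std::uintptr_t slow_path(std::uintptr_t addr) {
  assert(addr != 0);
  return addr | kGoodBit;
}

// Try to write the healed value back. Returns false if the slot no longer
// holds the value that was originally loaded.
inline bool self_heal(std::atomic<std::uintptr_t>* slot,
                      std::uintptr_t loaded, std::uintptr_t healed) {
  return slot->compare_exchange_strong(loaded, healed);
}

// The result depends only on the value captured by the first load; a failed
// CAS does not cause the slot to be read again.
inline std::uintptr_t load_barrier(std::atomic<std::uintptr_t>* slot) {
  const std::uintptr_t loaded = slot->load();
  if (is_good_or_null(loaded)) {
    return loaded;  // fast path
  }
  const std::uintptr_t healed = slow_path(loaded);
  (void)self_heal(slot, loaded, healed);  // failure is deliberately ignored
  return healed;
}
```

If the CAS fails, some other thread has already changed the slot; the caller still gets the healed form of the object it originally loaded, which is the property described in the paragraph that follows.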
>>>> >>>> With obstacle 1 gone, and 2 and 3 having no reason to be in the >>>> code any more, I propose to get rid of all reloading of the oops in >>>> the slow paths, so that it becomes easier to reason about the code. >>>> The object captured by the original load, is then always the same >>>> object as the object found after the load barrier completes, >>>> although possibly with a new bit representation. >>>> >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8230661 >>>> >>>> CR: >>>> https://bugs.openjdk.java.net/browse/JDK-8230661 >>>> >>>> Thanks, >>>> /Erik >>> From erik.osterlund at oracle.com Tue Nov 12 16:00:20 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Tue, 12 Nov 2019 17:00:20 +0100 Subject: RFR (M): 8087198: G1 card refinement: batching, sorting In-Reply-To: References: Message-ID: <869e9b18-2e35-95c4-a753-4e5c3697b1db@oracle.com> Hi Man, Wow - a blast from the past! Since this is based on my old prototype patch from 2015, I thought I should probably provide some feedback. First of all, thanks for productizing my idea. The idea why I saved the MemRegions in my prototype, as far as I can remember, was so that I could coalesce consecutive ranges, which to me had the perceived benefit of avoiding redundant lookups for e.g. which object is the first from the block offset table, when you have consecutive cards. It might have been a premature optimization, I'm not sure. But I thought that I should at least explain what the thought was (if I remember correctly - it was a while ago). Investigating if that's worth it or not as a separate thing seems reasonable. Looks good to me, and thanks for taking this further. Thanks, /Erik On 11/12/19 3:26 AM, Man Cao wrote: > Hi all, > > Can I have reviews for an updated implementation for batching card > refinement? 
> RFE: https://bugs.openjdk.java.net/browse/JDK-8087198 > Webrev: https://cr.openjdk.java.net/~manc/8087198/webrev.00/ > > Old review thread is: > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2015-June/013798.html. > Major differences from the 2015 webrev: > - New version does not save the MemRegions for the cards in a buffer. I > noticed considerable memory overhead with BigRamTester if we save the > MemRegions. > - New version handles SuspendibleThreadSetJoiner::should_yield() in a more > timely fashion. Instead of forcing refining all buffered cards, the new > version can abandon the buffered cards. > - New version only batches and sorts the cards, not joining and > prefetching. I have not investigated whether joining and prefetching help > much. I think it is OK to investigate them in a separate RFE later. > Please refer to the RFE page for some performance results. > > For correctness, tested with: > - Submit repo: tier1 > - Local fastdebug build: tier2 > - Fastdebug stress testing DaCapo h2 and BigRamTester with following > option combinations in addition to -XX:+VerifyRememberedSets: > default options > -XX:-G1UseAdaptiveConcRefinement -XX:G1UpdateBufferSize=4 > -XX:G1ConcRefinementGreenZone=0 -XX:G1ConcRefinementYellowZone=1 > -XX:G1ConcRefinementThreads=0 > > -Man From kim.barrett at oracle.com Tue Nov 12 17:26:42 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 12 Nov 2019 12:26:42 -0500 Subject: RFR (M): 8087198: G1 card refinement: batching, sorting In-Reply-To: References: Message-ID: <34BD822F-20DD-43E3-956F-09622EB62148@oracle.com> > On Nov 11, 2019, at 9:26 PM, Man Cao wrote: > > Hi all, > > Can I have reviews for an updated implementation for batching card > refinement? > RFE: https://bugs.openjdk.java.net/browse/JDK-8087198 > Webrev: https://cr.openjdk.java.net/~manc/8087198/webrev.00/ > > Old review thread is: > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2015-June/013798.html. 
> Major differences from the 2015 webrev: > - New version does not save the MemRegions for the cards in a buffer. I > noticed considerable memory overhead with BigRamTester if we save the > MemRegions. > - New version handles SuspendibleThreadSetJoiner::should_yield() in a more > timely fashion. Instead of forcing refining all buffered cards, the new > version can abandon the buffered cards. > - New version only batches and sorts the cards, not joining and > prefetching. I have not investigated whether joining and prefetching help > much. I think it is OK to investigate them in a separate RFE later. > Please refer to the RFE page for some performance results. > > For correctness, tested with: > - Submit repo: tier1 > - Local fastdebug build: tier2 > - Fastdebug stress testing DaCapo h2 and BigRamTester with following > option combinations in addition to -XX:+VerifyRememberedSets: > default options > -XX:-G1UseAdaptiveConcRefinement -XX:G1UpdateBufferSize=4 > -XX:G1ConcRefinementGreenZone=0 -XX:G1ConcRefinementYellowZone=1 > -XX:G1ConcRefinementThreads=0 > > -Man Some initial thoughts, not a full review yet. The approach looks good. It might also simplify some ideas I've been playing with in the background. I think the decision to defer investigation of additional batching, joining, and prefetching is fine. I'd like to see more performance testing. I'll probably do some, once I think the change looks more settled. ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1RemSet.cpp 1264 bool G1RemSet::clean_card_before_refine(CardValue*& card_ptr) { ... 1279 // If the card is no longer dirty, nothing to do. 1280 if (*card_ptr != G1CardTable::dirty_card_val()) { 1281 return false; 1282 } Having not looked at this code for a while, I started to wonder why this card value check wasn't being done up front. Then I remembered that we might uncommit parts of the card table covering uncommitted regions. 
A comment about that might save time for future readers. ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp 229 class G1RefineBufferedCards : public StackObj { ... 234 CardTable::CardValue** const _cards; I think the temporary _cards buffer isn't needed. collect_and_clean_cards could use two-finger compaction of the _node_buffer in place. (Similar to the SATB buffer filtering.) abandon_cards then doesn't need to memcpy card pointers back into the _node_buffer either (though that's very rare, so not important for performance). This will obviously also affect the iteration ranges and such in various places. ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp 279 void abandon_cards(size_t num_collected) { I think this function is poorly named. We're not abandoning the cards. We are instead abandoning the refinement of the cards in part of the buffer. Something like keep_unrefined_cards might be better. ------------------------------------------------------------------------------ 290 G1RefineBufferedCards(BufferNode* node, The initializer list is indented in an unusual way. ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp 315 // This fence serves two purposes. ... 316 // ... Second, we can't proceed with 317 // processing a region until after the read of the region's top in 318 // collect_and_clean_cards(), ... This change re-reads top() after the fence. In the old-gen region case that's okay, because top() is stable while concurrent with the mutator (we never allocate old in that phase). Similarly, archive region tops are stable because we don't allocate in them at all. I *think* re-read is okay for humongous regions too, but haven't fully convinced myself of that yet. 
(The argument would be that if the comparison before the barrier (while cleaning) passed, then the value is subsequently stable. I've not yet grovelled through code to verify that.) I think the argument for that needs to be made in the commentary. If it's not okay, then a temporary buffer to pass scan_limit values from cleaning to refining might be needed. ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1ConcurrentRefineThread.cpp 114 ResourceMark rm; I don't think this is the right place for the ResourceMark. I think it belongs in G1RefineBufferedCards. Wrapping a ResourceMark around an unbounded iteration isn't really safe; if the allocations and frees are all nested properly it will work, but if not it can explode. All this assumes the ResourceMark to deal with the temporary buffer allocated by G1RefineBufferedCards is still needed. If there are no temporary buffers... ------------------------------------------------------------------------------ From kim.barrett at oracle.com Tue Nov 12 18:23:30 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 12 Nov 2019 13:23:30 -0500 Subject: RFR: 8232588: G1 concurrent System.gc can return early or late In-Reply-To: References: <8518409B-159B-48B2-97E4-D5D4C3B2BC5A@oracle.com> <3B101D78-ED8C-4810-B711-A41AE0CC11C6@oracle.com> Message-ID: <36C18183-7786-4226-911A-6244F01B7C57@oracle.com> > On Nov 12, 2019, at 8:19 AM, Stefan Johansson wrote: > > Hi Kim, > > Thanks for cleaning up this part of the code. Looks good overall but I have a couple of small comments below. Thanks. > On 2019-11-11 20:40, Kim Barrett wrote: >>> On Nov 7, 2019, at 4:02 PM, Kim Barrett wrote: >>> >>>> On Nov 7, 2019, at 4:58 AM, Thomas Schatzl wrote: >>>>> New webrevs: >>>>> full: https://cr.openjdk.java.net/~kbarrett/8232588/open.01/ > > src/hotspot/share/gc/g1/g1CollectedHeap.cpp > --- > 2089 // Return true if (x < y) with allowance for wraparound. 
> 2090 static bool gc_counter_less_than(uint x, uint y) {
> 2091   return (x - y) > (UINT_MAX/2);
> 2092 }
>
> This code makes me have to think to much, with regards to what it does.
> What do you think about using size_t instead of uint and just do simple
> comparisons?

The fanout from changing these counters from uint to size_t is pretty
large. It also doesn't help for a long-running 32bit application.

I thought about adding a utility for this kind of thing, since this isn't
the only occurrence. But I ran into problems for which I don't currently
have a solution (fixed by C++14).

> src/hotspot/share/runtime/mutexLocker.hpp
> ---
> 71 extern Monitor* G1FullGCCount_lock; // in support of "concurrent" full gc
>
> I think the name and the comment could be improve now when it's G1
> specific. Something like G1OldGCCount_lock, to me Full signals that it's
> only STW.

The overloading of "full" (concurrent full vs stw full) occurs in many
places. But this particular area is using "old" rather than "full", so
I've changed the name.

New webrevs:
full: https://cr.openjdk.java.net/~kbarrett/8232588/open.02/
incr: https://cr.openjdk.java.net/~kbarrett/8232588/open.02.inc/

From thomas.schatzl at oracle.com Tue Nov 12 19:27:37 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 12 Nov 2019 20:27:37 +0100
Subject: RFR (M): 8087198: G1 card refinement: batching, sorting
In-Reply-To: <34BD822F-20DD-43E3-956F-09622EB62148@oracle.com>
References: <34BD822F-20DD-43E3-956F-09622EB62148@oracle.com>
Message-ID: <5c700c0a-954e-6187-67b4-cc523cace95c@oracle.com>

Hi,

On 12.11.19 18:26, Kim Barrett wrote:
>> On Nov 11, 2019, at 9:26 PM, Man Cao wrote:
>>
>> Hi all,
>>
>> Can I have reviews for an updated implementation for batching card
>> refinement?
>> RFE: https://bugs.openjdk.java.net/browse/JDK-8087198
>> Webrev: https://cr.openjdk.java.net/~manc/8087198/webrev.00/
>>
>> Old review thread is:
>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2015-June/013798.html.
>> Major differences from the 2015 webrev: >> - New version does not save the MemRegions for the cards in a buffer. I >> noticed considerable memory overhead with BigRamTester if we save the >> MemRegions. >> - New version handles SuspendibleThreadSetJoiner::should_yield() in a more >> timely fashion. Instead of forcing refining all buffered cards, the new >> version can abandon the buffered cards. >> - New version only batches and sorts the cards, not joining and >> prefetching. I have not investigated whether joining and prefetching help >> much. I think it is OK to investigate them in a separate RFE later. >> Please refer to the RFE page for some performance results. >> >> For correctness, tested with: >> - Submit repo: tier1 >> - Local fastdebug build: tier2 >> - Fastdebug stress testing DaCapo h2 and BigRamTester with following >> option combinations in addition to -XX:+VerifyRememberedSets: >> default options >> -XX:-G1UseAdaptiveConcRefinement -XX:G1UpdateBufferSize=4 >> -XX:G1ConcRefinementGreenZone=0 -XX:G1ConcRefinementYellowZone=1 >> -XX:G1ConcRefinementThreads=0 >> >> -Man > > Some initial thoughts, not a full review yet. Same here. > > The approach looks good. It might also simplify some ideas I've been > playing with in the background. > > I think the decision to defer investigation of additional batching, > joining, and prefetching is fine. I already split this CR into two. > > I'd like to see more performance testing. I'll probably do some, once > I think the change looks more settled. From a functional POV hs-tier1-5 pass with the change. > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1ConcurrentRefineThread.cpp > 114 ResourceMark rm; > > I don't think this is the right place for the ResourceMark. I think > it belongs in G1RefineBufferedCards. 
Wrapping a ResourceMark around > an unbounded iteration isn't really safe; if the allocations and frees > are all nested properly it will work, but if not it can explode. > > All this assumes the ResourceMark to deal with the temporary buffer > allocated by G1RefineBufferedCards is still needed. If there are no > temporary buffers... > I also noticed those, there are two places where ResourceMarks without any obvious use are missing, probably leftovers of debugging code? -------------------------------- The code snippet: 365 G1RefineBufferedCards buffered_cards(node, 366 buffer_size(), 367 counter); 368 bool result = buffered_cards.refine(worker_id); could probably be wrapped into a method as it is done twice. The initialization of the G1RefineBufferedCards breaks a bit the flow of the code without later use of the buffered_cards variable. Maybe it is used in follow-up patches in these places? Thanks, Thomas From kim.barrett at oracle.com Tue Nov 12 20:03:40 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 12 Nov 2019 15:03:40 -0500 Subject: RFR (XXS): 8233997: Some members of HeapRegion are not cleared in HeapRegion::hr_clear() In-Reply-To: <0d249a19-eb7d-9c91-7034-1ab030302019@oracle.com> References: <0d249a19-eb7d-9c91-7034-1ab030302019@oracle.com> Message-ID: > On Nov 12, 2019, at 10:23 AM, Thomas Schatzl wrote: > > Hi all, > > can I get reviews for this small change that fixes some reinitialization problem of HeapRegions with the young remset sampling thread? > > So the young remset sampling thread recalculates young gen remset sizes, and it happens that without this fix it may read the (old) remset size of the previous use of a given young gen region (also because of JDK-8233998). > > The fix is to properly clear the remaining members not cleared yet in hr_clear. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8233997 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8233997/webrev/ > Testing: > hs-tier1-5 with other similar patches, specjvm2008 specjvm.validation with -XX:G1ConcRefinementServiceIntervalMillis=1 (that one causes issues quickly if you assert that the remembered sets for regions only grows) > > Thanks, > Thomas Looks good. From kim.barrett at oracle.com Tue Nov 12 20:27:59 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 12 Nov 2019 15:27:59 -0500 Subject: RFR (S): 8234000: Make HeapRegion::bottom/end/hrm_index const In-Reply-To: <0bd70faf-e418-a719-3f48-51877b0d59f1@oracle.com> References: <0bd70faf-e418-a719-3f48-51877b0d59f1@oracle.com> Message-ID: <67C4DC64-6A3F-42A0-B288-51C814DFE3E5@oracle.com> > On Nov 12, 2019, at 9:42 AM, Thomas Schatzl wrote: > > Hi all, > > can I have reviews for this very small change that removes the possibility to change the bottom/end/hrm_index members in HeapRegion? > > This is not required functionality, so I thought it would be good to clean up. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8234000 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8234000/webrev/ > Testing: > local compilation, hs-tier1 > > Thanks, > Thomas Looks good. From stefan.johansson at oracle.com Tue Nov 12 20:56:53 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 12 Nov 2019 21:56:53 +0100 Subject: RFR (S): 8234000: Make HeapRegion::bottom/end/hrm_index const In-Reply-To: <67C4DC64-6A3F-42A0-B288-51C814DFE3E5@oracle.com> References: <0bd70faf-e418-a719-3f48-51877b0d59f1@oracle.com> <67C4DC64-6A3F-42A0-B288-51C814DFE3E5@oracle.com> Message-ID: <4F7CA490-E9BF-4910-A339-9912C7F0A578@oracle.com> Nice cleanup Thomas, > 12 nov. 2019 kl. 
21:27 skrev Kim Barrett : > >> On Nov 12, 2019, at 9:42 AM, Thomas Schatzl wrote: >> >> Hi all, >> >> can I have reviews for this very small change that removes the possibility to change the bottom/end/hrm_index members in HeapRegion? >> >> This is not required functionality, so I thought it would be good to clean up. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8234000 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8234000/webrev/ >> Testing: >> local compilation, hs-tier1 >> >> Thanks, >> Thomas > > Looks good. > Looks good, Stefan From stefan.johansson at oracle.com Tue Nov 12 21:12:28 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 12 Nov 2019 22:12:28 +0100 Subject: RFR (XXS): 8233997: Some members of HeapRegion are not cleared in HeapRegion::hr_clear() In-Reply-To: <0d249a19-eb7d-9c91-7034-1ab030302019@oracle.com> References: <0d249a19-eb7d-9c91-7034-1ab030302019@oracle.com> Message-ID: <89152154-5BB2-40E2-983D-17B909FDBBF6@oracle.com> Hi Thomas, > 12 nov. 2019 kl. 16:23 skrev Thomas Schatzl : > > Hi all, > > can I get reviews for this small change that fixes some reinitialization problem of HeapRegions with the young remset sampling thread? > > So the young remset sampling thread recalculates young gen remset sizes, and it happens that without this fix it may read the (old) remset size of the previous use of a given young gen region (also because of JDK-8233998). > > The fix is to properly clear the remaining members not cleared yet in hr_clear. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8233997 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8233997/webrev/ Looks good, Stefan > Testing: > hs-tier1-5 with other similar patches, specjvm2008 specjvm.validation with -XX:G1ConcRefinementServiceIntervalMillis=1 (that one causes issues quickly if you assert that the remembered sets for regions only grows) > > Thanks, > Thomas From stefan.johansson at oracle.com Tue Nov 12 22:03:58 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 12 Nov 2019 23:03:58 +0100 Subject: RFR: 8232588: G1 concurrent System.gc can return early or late In-Reply-To: <36C18183-7786-4226-911A-6244F01B7C57@oracle.com> References: <8518409B-159B-48B2-97E4-D5D4C3B2BC5A@oracle.com> <3B101D78-ED8C-4810-B711-A41AE0CC11C6@oracle.com> <36C18183-7786-4226-911A-6244F01B7C57@oracle.com> Message-ID: <4ADB4679-F1B6-4D4F-8748-A5D956B1039B@oracle.com> > 12 nov. 2019 kl. 19:23 skrev Kim Barrett : > >> On Nov 12, 2019, at 8:19 AM, Stefan Johansson wrote: >> >> Hi Kim, >> >> Thanks for cleaning up this part of the code. Looks good overall but I have a couple of small comments below. > > Thanks. > >> On 2019-11-11 20:40, Kim Barrett wrote: >>>> On Nov 7, 2019, at 4:02 PM, Kim Barrett wrote: >>>> >>>>> On Nov 7, 2019, at 4:58 AM, Thomas Schatzl wrote: >>>>>> New webrevs: >>>>>> full: https://cr.openjdk.java.net/~kbarrett/8232588/open.01/ >> >> src/hotspot/share/gc/g1/g1CollectedHeap.cpp >> --- >> 2089 // Return true if (x < y) with allowance for wraparound. >> 2090 static bool gc_counter_less_than(uint x, uint y) { >> 2091 return (x - y) > (UINT_MAX/2); >> 2092 } >> >> This code makes me have to think to much, with regards to what it does. What do you think about using size_t instead of uint and just do simple comparisons? > > The fanout from changing these counters from uint to size_t is pretty > large. It also doesn't help for a long-running 32bit application. I see. 
> > I thought about adding a utility for this kind of thing, since this isn't the only occurrence. > But I ran into problems for which I don't currently have a solution (fixed by C++14). Would a generic utility be implemented differently? To me this only works because we know that the values never differ much (1 or 2 I think), but I might be missing something. Another approach would be to reset those values in a well defined manner to ensure they never wrap. From what I can see we only use them to ensure this state machine is doing what we expect, or is it possible that we can end up in a situation where we never could reset the values? > >> src/hotspot/share/runtime/mutexLocker.hpp >> --- >> 71 extern Monitor* G1FullGCCount_lock; // in support of "concurrent" full gc >> >> I think the name and the comment could be improve now when it's G1 specific. Something like G1OldGCCount_lock, to me Full signals that it's only STW. > > The overloading of "full" (concurrent full vs stw full) occurs in many > places. But this particular area is using "old" rather than "full", so > I've changed the name. > > New webrevs: > full: https://cr.openjdk.java.net/~kbarrett/8232588/open.02/ > incr: https://cr.openjdk.java.net/~kbarrett/8232588/open.02.inc/ Looks good, Stefan From kim.barrett at oracle.com Tue Nov 12 23:15:58 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 12 Nov 2019 18:15:58 -0500 Subject: RFR: 8232588: G1 concurrent System.gc can return early or late In-Reply-To: <4ADB4679-F1B6-4D4F-8748-A5D956B1039B@oracle.com> References: <8518409B-159B-48B2-97E4-D5D4C3B2BC5A@oracle.com> <3B101D78-ED8C-4810-B711-A41AE0CC11C6@oracle.com> <36C18183-7786-4226-911A-6244F01B7C57@oracle.com> <4ADB4679-F1B6-4D4F-8748-A5D956B1039B@oracle.com> Message-ID: <35F216D3-722E-4C6E-9BA0-2C53D1FBC2F9@oracle.com> > On Nov 12, 2019, at 5:03 PM, Stefan Johansson wrote: > >> 12 nov. 2019 kl. 
19:23 skrev Kim Barrett : >> I thought about adding a utility for this kind of thing, since this isn't the only occurrence. >> But I ran into problems for which I don't currently have a solution (fixed by C++14). > > Would a generic utility be implemented differently? To me this only works because we know that the values never differ much (1 or 2 I think), but I might be missing something. One of the constraints on using such a utility is that one "knows" the difference "never" gets "too large". Another place doing this sort of thing is GlobalCounter. I think I've seen others, but couldn't find any just now. A generic utility should be templated over the integer type, and would use std::numeric_limits::max(). But that doesn't currently work on Solaris, where that expression produces a reference to a library that we don't link to. C++14 (or maybe C++11) made that expression constexpr, so *must* be known at compile-time. > Another approach would be to reset those values in a well defined manner to ensure they never wrap. From what I can see we only use them to ensure this state machine is doing what we expect, or is it possible that we can end up in a situation where we never could reset the values? It's true that the current values of those counters never differ by more than 1 or 2. But the values captured by some thread that gets stalled might differ by more. And I don't see a good reset place because of the values captured by various threads. >> New webrevs: >> full: https://cr.openjdk.java.net/~kbarrett/8232588/open.02/ >> incr: https://cr.openjdk.java.net/~kbarrett/8232588/open.02.inc/ > > Looks good, > Stefan Thanks. 
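A minimal sketch of the kind of generic utility discussed in this message, assuming C++11 or later and a toolchain where std::numeric_limits<T>::max() is usable at compile time. The name counter_less_than is hypothetical and this is not code from the webrev; it only generalizes the quoted gc_counter_less_than(uint, uint) over unsigned integer types. It relies on the constraint stated in the message: the two values compared must never be more than half the type's range apart.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical generic form of gc_counter_less_than: returns true if x is
// "before" y, with allowance for wraparound of an unsigned counter.
// Only meaningful while the true distance between x and y stays below
// half of T's range.
template <typename T>
bool counter_less_than(T x, T y) {
  static_assert(std::is_unsigned<T>::value, "requires an unsigned counter type");
  // Cast back to T: for types narrower than int, x - y is computed in int
  // after integral promotion, so the wrapped difference must be re-truncated.
  return static_cast<T>(x - y) > (std::numeric_limits<T>::max() / 2);
}

// A few worked cases, including ones that straddle the wrap point.
inline void counter_less_than_examples() {
  assert(counter_less_than<std::uint32_t>(1u, 2u));            // plain case
  assert(!counter_less_than<std::uint32_t>(2u, 1u));
  assert(!counter_less_than<std::uint32_t>(7u, 7u));           // equal is not less
  assert(counter_less_than<std::uint32_t>(UINT32_MAX, 1u));    // y has wrapped
  assert(!counter_less_than<std::uint32_t>(1u, UINT32_MAX));
  assert(counter_less_than<std::uint8_t>(250, 3));             // narrow type
}
```

The cast is the only difference from the uint version: without it, a uint8_t or uint16_t instantiation would compare a possibly negative int against the limit.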
From stefan.johansson at oracle.com Wed Nov 13 08:11:32 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 13 Nov 2019 09:11:32 +0100 Subject: RFR: 8232588: G1 concurrent System.gc can return early or late In-Reply-To: <35F216D3-722E-4C6E-9BA0-2C53D1FBC2F9@oracle.com> References: <8518409B-159B-48B2-97E4-D5D4C3B2BC5A@oracle.com> <3B101D78-ED8C-4810-B711-A41AE0CC11C6@oracle.com> <36C18183-7786-4226-911A-6244F01B7C57@oracle.com> <4ADB4679-F1B6-4D4F-8748-A5D956B1039B@oracle.com> <35F216D3-722E-4C6E-9BA0-2C53D1FBC2F9@oracle.com> Message-ID: On 2019-11-13 00:15, Kim Barrett wrote: >> On Nov 12, 2019, at 5:03 PM, Stefan Johansson wrote: >> >>> 12 nov. 2019 kl. 19:23 skrev Kim Barrett : >>> I thought about adding a utility for this kind of thing, since this isn't the only occurrence. >>> But I ran into problems for which I don't currently have a solution (fixed by C++14). >> >> Would a generic utility be implemented differently? To me this only works because we know that the values never differ much (1 or 2 I think), but I might be missing something. > > One of the constraints on using such a utility is that one "knows" the difference "never" gets "too large". > Another place doing this sort of thing is GlobalCounter. I think I've seen others, but couldn't find any > just now. > > A generic utility should be templated over the integer type, and would use std::numeric_limits::max(). > But that doesn't currently work on Solaris, where that expression produces a reference to a library > that we don't link to. C++14 (or maybe C++11) made that expression constexpr, so *must* be known > at compile-time. > Thanks for the explanation. >> Another approach would be to reset those values in a well defined manner to ensure they never wrap. From what I can see we only use them to ensure this state machine is doing what we expect, or is it possible that we can end up in a situation where we never could reset the values? 
> > It's true that the current values of those counters never differ by more than 1 or 2. But the values > captured by some thread that gets stalled might differ by more. And I don't see a good reset > place because of the values captured by various threads. In that case I think we should go with your proposed solution. Thanks, Stefan > >>> New webrevs: >>> full: https://cr.openjdk.java.net/~kbarrett/8232588/open.02/ >>> incr: https://cr.openjdk.java.net/~kbarrett/8232588/open.02.inc/ >> >> Looks good, >> Stefan > > Thanks. > From thomas.schatzl at oracle.com Wed Nov 13 08:32:42 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 13 Nov 2019 09:32:42 +0100 Subject: RFR (XXS): 8233997: Some members of HeapRegion are not cleared in HeapRegion::hr_clear() In-Reply-To: <89152154-5BB2-40E2-983D-17B909FDBBF6@oracle.com> References: <0d249a19-eb7d-9c91-7034-1ab030302019@oracle.com> <89152154-5BB2-40E2-983D-17B909FDBBF6@oracle.com> Message-ID: Hi Stefan, Kim, thanks for your reviews. Thomas On 12.11.19 22:12, Stefan Johansson wrote: > Hi Thomas, > >> 12 nov. 2019 kl. 16:23 skrev Thomas Schatzl : >> >> Hi all, >> >> can I get reviews for this small change that fixes some reinitialization problem of HeapRegions with the young remset sampling thread? >> >> So the young remset sampling thread recalculates young gen remset sizes, and it happens that without this fix it may read the (old) remset size of the previous use of a given young gen region (also because of JDK-8233998). >> >> The fix is to properly clear the remaining members not cleared yet in hr_clear. 
>> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8233997 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8233997/webrev/ > > Looks good, > Stefan > >> Testing: >> hs-tier1-5 with other similar patches, specjvm2008 specjvm.validation with -XX:G1ConcRefinementServiceIntervalMillis=1 (that one causes issues quickly if you assert that the remembered sets for regions only grows) >> >> Thanks, >> Thomas > From thomas.schatzl at oracle.com Wed Nov 13 08:33:15 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 13 Nov 2019 09:33:15 +0100 Subject: RFR (S): 8234000: Make HeapRegion::bottom/end/hrm_index const In-Reply-To: <4F7CA490-E9BF-4910-A339-9912C7F0A578@oracle.com> References: <0bd70faf-e418-a719-3f48-51877b0d59f1@oracle.com> <67C4DC64-6A3F-42A0-B288-51C814DFE3E5@oracle.com> <4F7CA490-E9BF-4910-A339-9912C7F0A578@oracle.com> Message-ID: <400dd930-7d74-66f1-f86a-467370506cae@oracle.com> Hi Stefan, Kim, thanks for your reviews. Thomas On 12.11.19 21:56, Stefan Johansson wrote: > Nice cleanup Thomas, > >> 12 nov. 2019 kl. 21:27 skrev Kim Barrett : >> >>> On Nov 12, 2019, at 9:42 AM, Thomas Schatzl wrote: >>> >>> Hi all, >>> >>> can I have reviews for this very small change that removes the possibility to change the bottom/end/hrm_index members in HeapRegion? >>> >>> This is not required functionality, so I thought it would be good to clean up. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8234000 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8234000/webrev/ >>> Testing: >>> local compilation, hs-tier1 >>> >>> Thanks, >>> Thomas >> >> Looks good. 
>> > > Looks good, > Stefan > From stefan.johansson at oracle.com Wed Nov 13 09:17:11 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 13 Nov 2019 10:17:11 +0100 Subject: RFR (M): 8233306: Sort members in G1's HeapRegion after removal of Space dependency In-Reply-To: <568f7bca-3c39-f554-b557-953e5f7f157c@oracle.com> References: <568f7bca-3c39-f554-b557-953e5f7f157c@oracle.com> Message-ID: Hi Thomas, On 2019-10-31 14:47, Thomas Schatzl wrote: > Hi all, > > ?after the change to HeapRegion in JDK-8233306 the declaration fo the > HeapRegion class is a bit messed up (merging G1ContiguousSpace, adding a > few members needed from ContiguousSpace). > > This change tries to fix this as much as possible by shuffling around > stuff (i.e. grouping allocation related methods, evacuation related > methods, some helper pointers in HeapRegion, etc). > > Depends on JDK-8189737 also out for review. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8233306 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8233306/webrev/ Looks good, Stefan > Testing: > hs-tier1-5 > > Thanks, > ? Thomas From stefan.johansson at oracle.com Wed Nov 13 10:02:46 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 13 Nov 2019 11:02:46 +0100 Subject: RFR (M): 8228609: G1 copy cost prediction uses used vs. actual copied bytes In-Reply-To: <96c45857-03b3-40c4-ca90-2ec7bc2fa2d8@oracle.com> References: <3A799B0C-76B1-4145-A000-1071672BD566@oracle.com> <96c45857-03b3-40c4-ca90-2ec7bc2fa2d8@oracle.com> Message-ID: <6a80c62b-fac7-88fb-7064-b3c4c5c3e5e3@oracle.com> Hi Thomas, On 2019-11-12 10:06, Thomas Schatzl wrote: > ... > > > Given your above comments I changed the code to collect these values in > the "Merge Per-Thread State" phase as we only need a total value anyway. > At this point the necessary calculations are free, as we already iterate > over all _surviving_young_words entries. 
> > Webrev: > http://cr.openjdk.java.net/~tschatzl/8228609/webrev.0_to_1/ (diff) > http://cr.openjdk.java.net/~tschatzl/8228609/webrev.1/ (full) I think this looks good, Stefan > Testing: > hs-tier1-5 > > Thanks, > ? Thomas From manc at google.com Thu Nov 14 01:59:17 2019 From: manc at google.com (Man Cao) Date: Wed, 13 Nov 2019 17:59:17 -0800 Subject: RFR (M): 8087198: G1 card refinement: batching, sorting In-Reply-To: <5c700c0a-954e-6187-67b4-cc523cace95c@oracle.com> References: <34BD822F-20DD-43E3-956F-09622EB62148@oracle.com> <5c700c0a-954e-6187-67b4-cc523cace95c@oracle.com> Message-ID: Thanks for the reviews and testing, very helpful! I have addressed all comments, new webrevs: http://cr.openjdk.java.net/~manc/8087198/webrev.01/ Incremental: https://cr.openjdk.java.net/~manc/8087198/webrev.00-01.inc/ Some tests involving G1AddMetaspaceDependency are crashing in debug/fastdebug builds, due to https://bugs.openjdk.java.net/browse/JDK-8234127 and G1UpdateBufferSize=1. I'm working on fixing this bug for the hashtable. I also see some Windows build failure on Submit repo, not sure if it is related to JDK-8234127, could someone find the logs for this run? Job: mach5-one-manc-JDK-8087198-1-20191113-2315-6689774 I also tested the performance for sorting order of the cards on BigRamTester, below are the CPU time for refinement threads, averaged across 34 trials with 95% confidence intervals in brackets: base (no batching): 437656.676 [425374.295, 449939.057] batching and decreasing order: 424714.529 [412841.728, 436587.330] batching and increasing order: 459918.294 [448483.304, 471353.284] Additional replies below. ------------------------------------------------------------------------------ > First of all, thanks for productizing my idea. > The idea why I saved the MemRegions in my prototype, as far as I can > remember, was so that I could coalesce consecutive ranges You are welcome, Erik. Thanks for the explanation! 
------------------------------------------------------------------------------ > collect_and_clean_cards could use two-finger compaction of the > _node_buffer in place. (Similar to the SATB buffer filtering.) Great! I should have thought of this after removing the MemRegions. Now it's doing the two-finger compaction, and no more memcpy and ResourceMark. ------------------------------------------------------------------------------ > I think this function is poorly named. We're not abandoning the > cards. We are instead abandoning the refinement of the cards in part > of the buffer. Something like keep_unrefined_cards might be better. Renamed to redirty_unrefined_cards. ------------------------------------------------------------------------------ > This change re-reads top() after the fence. > ... > I *think* re-read is okay for humongous regions too, but haven't fully > convinced myself of that yet. I thought about this further and convinced myself it is safe. I added some DEBUG_ONLY code to check the two reads of top() should return the same value, by using a KVHashtable. I'm not sure if such checking code is desirable. Please advise. Maybe I should use ResourceHashtable as Ioi suggested in JDK-8234127. ------------------------------------------------------------------------------ > I already split this CR into two. > From a functional POV hs-tier1-5 pass with the change. Thanks for the work! Let's finalize what to do with the DEBUG_ONLY KVHashtable first, then do another round of testing. ------------------------------------------------------------------------------ > Maybe it is used in follow-up patches in these places? I haven't got to the follow-up patches yet. I will create a CR for the epoch synchronization protocol, as a subtask for JDK-8226731. Then I'll describe its implementation details and a few yet unresolved corner cases, so we can think about them together. 
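A minimal sketch of the in-place two-finger compaction mentioned earlier in this message, assuming a plain array and a caller-supplied predicate. The name compact_keepers_to_tail and the discard predicate are hypothetical; this is not the code in the webrev, only an illustration of the general technique as used for buffer filtering: one finger scans up from the bottom looking for entries to keep, the other scans down from the top looking for entries to discard, and each keeper found at the bottom overwrites a discard found at the top. Order is not preserved.

```cpp
#include <cassert>
#include <cstddef>

// Moves every entry that should be kept to the tail of buf[0, n) without
// using a second buffer. Returns the index of the lowest retained entry,
// or n if every entry was discarded.
template <typename T, typename Discard>
std::size_t compact_keepers_to_tail(T* buf, std::size_t n, Discard discard) {
  T* src = buf;      // low finger: scans upward for keepers
  T* dst = buf + n;  // high finger: scans downward for discards
  for (; src < dst; ++src) {
    if (!discard(*src)) {
      // Found a keeper at the low end; search high to low for a discard.
      while (src < --dst) {
        if (discard(*dst)) {
          *dst = *src;  // replace the discard with the keeper
          break;
        }
      }
      // If the search failed, src == dst and the keeper is already the
      // lowest retained entry; the outer loop ends as well.
    }
  }
  return static_cast<std::size_t>(dst - buf);
}

// Worked case: discard odd values, keep even ones.
inline std::size_t two_finger_example(int (&a)[6]) {
  return compact_keepers_to_tail(a, 6, [](int v) { return v % 2 != 0; });
}
```

For the input {1, 2, 3, 4, 5, 6} the retained entries 2, 4 and 6 end up in the last three slots and the returned index is 3; entries below that index are leftovers that the caller ignores.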
-Man From kim.barrett at oracle.com Thu Nov 14 04:36:45 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 13 Nov 2019 23:36:45 -0500 Subject: RFR: 8233280: Remove GCLockerInvokesConcurrent Message-ID: Please review this change to obsolete the GCLockerInvokesConcurrent product option, as a step toward removal. Normally removal of a product option involves a 3-step process of deprecate, obsolete, and remove. In this case, for reasons discussed in the CR and the associated CSR, we propose to skip the deprecation step and go straight to obsolete. CR: https://bugs.openjdk.java.net/browse/JDK-8233280 Webrev: https://cr.openjdk.java.net/~kbarrett/8233280/open.00/ Testing: Local (linux-x64) build and hotspot:tier1. Searched the entire source tree (including Oracle's closed part) for references, finding none except the new entry in the special_jvm_flags table. Verified this option is not mentioned in the GC Tuning Guide. From stefan.johansson at oracle.com Thu Nov 14 08:25:37 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Thu, 14 Nov 2019 09:25:37 +0100 Subject: RFR: 8233280: Remove GCLockerInvokesConcurrent In-Reply-To: References: Message-ID: Hi Kim, On 2019-11-14 05:36, Kim Barrett wrote: > Please review this change to obsolete the GCLockerInvokesConcurrent > product option, as a step toward removal. Normally removal of a > product option involves a 3-step process of deprecate, obsolete, and > remove. In this case, for reasons discussed in the CR and the > associated CSR, we propose to skip the deprecation step and go > straight to obsolete. > I fully agree with the reasoning in the CR and CSR. > CR: > https://bugs.openjdk.java.net/browse/JDK-8233280 > > Webrev: > https://cr.openjdk.java.net/~kbarrett/8233280/open.00/ > Looks good, Stefan > Testing: > Local (linux-x64) build and hotspot:tier1. 
> > Searched the entire source tree (including Oracle's closed part) for > references, finding none except the new entry in the special_jvm_flags > table. > > Verified this option is not mentioned in the GC Tuning Guide. > From thomas.schatzl at oracle.com Thu Nov 14 10:17:11 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 14 Nov 2019 11:17:11 +0100 Subject: RFR: 8233280: Remove GCLockerInvokesConcurrent In-Reply-To: References: Message-ID: Hi, On 14.11.19 05:36, Kim Barrett wrote: > Please review this change to obsolete the GCLockerInvokesConcurrent > product option, as a step toward removal. Normally removal of a > product option involves a 3-step process of deprecate, obsolete, and > remove. In this case, for reasons discussed in the CR and the > associated CSR, we propose to skip the deprecation step and go > straight to obsolete. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8233280 > > Webrev: > https://cr.openjdk.java.net/~kbarrett/8233280/open.00/ > > Testing: > Local (linux-x64) build and hotspot:tier1. > > Searched the entire source tree (including Oracle's closed part) for > references, finding none except the new entry in the special_jvm_flags > table. > > Verified this option is not mentioned in the GC Tuning Guide. > looks good. Thomas From thomas.schatzl at oracle.com Thu Nov 14 12:42:37 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 14 Nov 2019 13:42:37 +0100 Subject: RFR (M): 8233306: Sort members in G1's HeapRegion after removal of Space dependency In-Reply-To: References: <568f7bca-3c39-f554-b557-953e5f7f157c@oracle.com> Message-ID: <412699eb-1ae0-edd8-1acd-ce7673872101@oracle.com> Hi Stefan, On 13.11.19 10:17, Stefan Johansson wrote: > Hi Thomas, > > On 2019-10-31 14:47, Thomas Schatzl wrote: >> Hi all, >> >> ??after the change to HeapRegion in JDK-8233306 the declaration fo the >> HeapRegion class is a bit messed up (merging G1ContiguousSpace, adding >> a few members needed from ContiguousSpace). 
>> >> This change tries to fix this as much as possible by shuffling around >> stuff (i.e. grouping allocation related methods, evacuation related >> methods, some helper pointers in HeapRegion, etc). >> >> Depends on JDK-8189737 also out for review. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8233306 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8233306/webrev/ > Looks good, > Stefan > thanks for your review. Fyi, there has been one merge issue with latest NUMA changes: in heapRegion.cpp, in the initializer list of HeapRegion::HeapRegion, NUMA added a _node_index member at the end. This caused the merge logic to bail out because the context of the source hunk and the current code did not exactly match. I updated the webrev. Thanks, Thomas From kim.barrett at oracle.com Thu Nov 14 20:49:53 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 14 Nov 2019 15:49:53 -0500 Subject: RFR: 8233280: Remove GCLockerInvokesConcurrent In-Reply-To: References: Message-ID: <7B4E49A0-85CC-42C5-AD01-890172A47178@oracle.com> > On Nov 14, 2019, at 3:25 AM, Stefan Johansson wrote: > > Hi Kim, > > On 2019-11-14 05:36, Kim Barrett wrote: >> Please review this change to obsolete the GCLockerInvokesConcurrent >> product option, as a step toward removal. Normally removal of a >> product option involves a 3-step process of deprecate, obsolete, and >> remove. In this case, for reasons discussed in the CR and the >> associated CSR, we propose to skip the deprecation step and go >> straight to obsolete. >> > I fully agree with the reasoning in the CR and CSR. Thanks. Can you review the CSR? > >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8233280 >> Webrev: >> https://cr.openjdk.java.net/~kbarrett/8233280/open.00/ > Looks good, > Stefan > >> Testing: >> Local (linux-x64) build and hotspot:tier1. >> Searched the entire source tree (including Oracle's closed part) for >> references, finding none except the new entry in the special_jvm_flags >> table. 
>> Verified this option is not mentioned in the GC Tuning Guide. From kim.barrett at oracle.com Thu Nov 14 20:50:04 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 14 Nov 2019 15:50:04 -0500 Subject: RFR: 8233280: Remove GCLockerInvokesConcurrent In-Reply-To: References: Message-ID: <944A4C94-0130-44F7-A692-3FD2EE06B549@oracle.com> > On Nov 14, 2019, at 5:17 AM, Thomas Schatzl wrote: > > Hi, > > On 14.11.19 05:36, Kim Barrett wrote: >> Please review this change to obsolete the GCLockerInvokesConcurrent >> product option, as a step toward removal. Normally removal of a >> product option involves a 3-step process of deprecate, obsolete, and >> remove. In this case, for reasons discussed in the CR and the >> associated CSR, we propose to skip the deprecation step and go >> straight to obsolete. >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8233280 >> Webrev: >> https://cr.openjdk.java.net/~kbarrett/8233280/open.00/ >> Testing: >> Local (linux-x64) build and hotspot:tier1. >> Searched the entire source tree (including Oracle's closed part) for >> references, finding none except the new entry in the special_jvm_flags >> table. >> Verified this option is not mentioned in the GC Tuning Guide. > > looks good. > > Thomas Thanks. From manc at google.com Fri Nov 15 02:02:11 2019 From: manc at google.com (Man Cao) Date: Thu, 14 Nov 2019 18:02:11 -0800 Subject: State of G1's "throughput barriers"? In-Reply-To: References: Message-ID: Hi Clemens, Great to hear more interests on this! Yes, I'm still working on it. The most challenging part is JDK-8226731, which requires implementing a new synchronization mechanism between Java mutator threads and GC refinement threads. There are some unresolved issues, and I will draft a separate CR for this synchronization mechanism soon. As for the throughput barrier itself (JDK-8226197 or JDK-8230187), it is actually simpler to implement. 
I have implemented a prototype for JDK11 and deployed for some production services internally. However, from a code health and maintenance perspective, it is much cleaner to push it after JDK-8226731 is resolved. -Man On Tue, Nov 12, 2019 at 1:42 AM Thomas Schatzl wrote: > Hi Clemens, > > On 09.11.19 15:54, Clemens Eisserer wrote: > > Hi, > > > > With great excitement I read about the proposal to add a > > throughput-mode to G1 - has there been any progress on this? > > In some cases the throughput overhead G1 introduced compared to CMS is > > quite noticeable, especially for the case where 2-5s pauses are quite > > tolerable - i really hoped to get an option to trade a bit of latency > > for better throughput. > > > > Thanks, Clemens > > > > the throughput barrier effort (or actually: optimize G1 when > disabling refinement) afaict consists of two main steps: > > - changing the existing barrier so that the throughput barrier is not > completely different, allowing some interesting further optimizations. > > This is mostly JDK-8087198 (currently out for review), and JDK-8226731, > which improves the existing barrier already a bit. > > - enabling the smaller barrier if concurrent refinement is disabled (via > a new -XX:-G1UseConcRefinement). This is JDK-8134303 and ultimately > JDK-8226197). > > Man Cao from Google is working on this, and it seems that we'll get at > least the first big part soon. :) > > Maybe Man wants to chime in for further details. > > Thanks, > Thomas > From manc at google.com Fri Nov 15 02:11:48 2019 From: manc at google.com (Man Cao) Date: Thu, 14 Nov 2019 18:11:48 -0800 Subject: RFR (S): 8234208: Logging reports zero total refined cards under "Before GC RS summary" Message-ID: Hi all, Can I have reviews for this fix for GC logging in G1RemSetSummary? 
Webrev: https://cr.openjdk.java.net/~manc/8234208/webrev.00/ Bug: https://bugs.openjdk.java.net/browse/JDK-8234208 This fix makes G1RemSetSummary independent of G1Policy, which in my opinion is cleaner than reordering the callsites of policy()->record_collection_pause_start() and rem_set()->print_periodic_summary_info(). The cost of summing up the total number of refined cards should be negligible. I also removed a null check for _rs_threads_vtimes, which is unnecessary after JDK-8183226. -Man From kim.barrett at oracle.com Fri Nov 15 02:22:50 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 14 Nov 2019 21:22:50 -0500 Subject: RFR (S): 8234208: Logging reports zero total refined cards under "Before GC RS summary" In-Reply-To: References: Message-ID: > On Nov 14, 2019, at 9:11 PM, Man Cao wrote: > > Hi all, > > Can I have reviews for this fix for GC logging in G1RemSetSummary? > Webrev: https://cr.openjdk.java.net/~manc/8234208/webrev.00/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8234208 > > This fix makes G1RemSetSummary independent of G1Policy, which in my opinion > is cleaner than reordering the > callsites of policy()->record_collection_pause_start() and > rem_set()->print_periodic_summary_info(). The cost of summing up the total > number of refined cards should be negligible. > > I also removed a null check for _rs_threads_vtimes, which is unnecessary > after JDK-8183226. > > -Man ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1RemSetSummary.cpp 53 _total_mutator_refined_cards = G1BarrierSet::dirty_card_queue_set().total_mutator_refined_cards(); Should be #including g1BarrierSet.hpp for this. ------------------------------------------------------------------------------ Looks good, other than that. I don't need another webrev for the additional #include. 
From fujie at loongson.cn Fri Nov 15 07:21:20 2019 From: fujie at loongson.cn (Jie Fu) Date: Fri, 15 Nov 2019 15:21:20 +0800 Subject: RFR: 8234232: [TESTBUG] gc/shenandoah/jvmti/TestHeapDump.java fails with -Xcomp Message-ID: <7769ecf2-a539-29ac-dcc4-21b71ae960c6@loongson.cn> Hi all, May I get reviews for this small fix? JBS:    https://bugs.openjdk.java.net/browse/JDK-8234232 Webrev: http://cr.openjdk.java.net/~jiefu/8234232/webrev.00/ And I need a sponsor. Thanks a lot. Best regards, Jie From thomas.schatzl at oracle.com Fri Nov 15 09:01:47 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 15 Nov 2019 10:01:47 +0100 Subject: RFR (S): 8234208: Logging reports zero total refined cards under "Before GC RS summary" In-Reply-To: References: Message-ID: <74c0dc30-b357-b1cf-4244-046ee808b983@oracle.com> Hi, On 15.11.19 03:11, Man Cao wrote: > Hi all, > > Can I have reviews for this fix for GC logging in G1RemSetSummary? > Webrev: https://cr.openjdk.java.net/~manc/8234208/webrev.00/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8234208 > > This fix makes G1RemSetSummary independent of G1Policy, which in my opinion > is cleaner than reordering the > callsites of policy()->record_collection_pause_start() and > rem_set()->print_periodic_summary_info(). The cost of summing up the total > number of refined cards should be negligible. > > I also removed a null check for _rs_threads_vtimes, which is unnecessary > after JDK-8183226. > looks good, sans Kim's suggestion. I do not need a re-review either. 
Thanks, Thomas From stefan.johansson at oracle.com Fri Nov 15 09:21:50 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 15 Nov 2019 10:21:50 +0100 Subject: RFR: 8220465: Use shadow regions for faster ParallelGC full GCs In-Reply-To: References: <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com> <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com> <400df998-171a-5bbe-9f3e-01af1781afb4@oracle.com> <955C4FA4-FC18-446C-851A-0D04A916D88D@oracle.com> <4fce596c-a4eb-5da6-1b47-b4e2314de3c5@oracle.com> Message-ID: <26ff7fee-2457-53df-c933-37feb3c87ae9@oracle.com> Hi Haoyu, Thanks for the updates, here are new webrevs: Full: http://cr.openjdk.java.net/~sjohanss/8220465/03/ Inc: http://cr.openjdk.java.net/~sjohanss/8220465/02-03/ I have asked more people to look at the change now, so you might await some more feedback. Thanks, Stefan On 2019-11-12 16:11, Haoyu Li wrote: > Hi Stefan, > > Thanks for your advice! > > Nice, I think it would make sense to used cmpxchg in mark_normal() as > well and assert that the returned value is SHADOW. > > I've changed mark_normal() to use Atomic::cmpxchg and added an > assertion. Please find the changes in the attached patchs. Thanks! > Best Regards, > Haoyu Li, > > Stefan Johansson > wrote on 2019-11-11 at 11:10: > > Hi Haoyu, > > Thanks for the updated patches, I think they look good in general, just > one comment inline below. > > Here are some updated webrev: > Full: http://cr.openjdk.java.net/~sjohanss/8220465/02 > Inc: http://cr.openjdk.java.net/~sjohanss/8220465/01-02 > > On 2019-11-06 08:17, Haoyu Li wrote: > > Hi Stefan, > > > > Sorry for the late update. I have attached both a full patch > > (shadow-region-v3.patch) and the incremental changes > > (shadow-region-incr.patch) in this mail, and details are as follows. > > > >     Regarding the current patch, I think that it looks good in > general, > >     but I thought a bit more around how to share stuff between the > >
closures and I agree that adding those extra virtual functions > >     doesn't really feel worth it. I'm wondering if a solution > where we > >     revert back to letting destination be the "real destination" (not > >     ever pointing to the shadow region) and add a > copy_destination which > >     is destination + offset. To make this work the normal > >     MoveAndUpdateClosure would also have an offset, but it would > always > >     be 0. If do_addr() is then updated to use the > copy_destination() in > >     some places we might end up with something pretty nice, but maybe > >     I'm missing something. > > > > > > It is an excellent idea to let MoveAndUpdateClosure have an _offset > > equal to 0, so ShadowClosure can reuse more code from it. I have > made > > the above changes in the new patch. > Yes, using this approach looks very nice. > > > > >     I also realized that the current patch will trigger an assert > >     because destination is expected not to be the shadow address: > >     #  Internal Error > >     (open/src/hotspot/share/gc/parallel/psParallelCompact.cpp:3045), > >     pid=12649, tid=12728 > >     #  assert(src_cp->destination() == destination) failed: first > live > >     obj in the space must match the destination > > > >     So this also suggests that we should keep destination() returning > >     the real destination. > > > >     Some other comments: > >     src/hotspot/share/gc/parallel/psParallelCompact.cpp > >     --- > >     3383 void ShadowClosure::complete_region(ParCompactionManager > *cm, > >     HeapWord *dest_addr, > >     3384 > >       PSParallelCompact::RegionData *region_ptr) { > >     3385   assert(region_ptr->shadow_state() == > >     ParallelCompactData::RegionData::FINISH, "Region should be > finished"); > > > >     This assertion will also trigger when running with a debug > build and > >     at this point the shadow state should be SHADOW not FINISH. > >     --- > > > > > > Sorry for these buggy assertions. 
The shadow_state in > > ShadowClosure::complete_region should be SHADOW instead of > FINISH, and > > I've corrected it. Moreover, while I was testing it in the debug > mode, I > > found another interesting case, in which a region should return > to the > > normal path if it becomes available before invoking > fill_shadow_region > > (the branch that shadow_region == 0 at psParallelCompact.cpp:3182). > > Therefore, I add a new function > > ParallelCompactData::RegionData::mark_normal() to handle this > special > > case, so the assertion in MoveAndUpdateClosure::complete_region will > > success. > Nice, I think it would make sense to used cmpxchg in mark_normal() as > well and assert that the returned value is SHADOW. > > Thanks, > Stefan > > > > >     src/hotspot/share/gc/parallel/psParallelCompact.hpp > >     --- > >       632 inline bool > ParallelCompactData::RegionData::mark_filled() { > >       633   return Atomic::cmpxchg(FILLED, &_shadow_state, SHADOW) == > >     SHADOW; > >       634 } > > > >     Since we never check the return value here we should make it void > >     and maybe instead add an assert that the return value is SHADOW. > >     --- > > > > > > Thanks for the suggestion. I have changed mark_filled() to void. > > > > I really appreciate your reviews. If there are any issues in the > patch, > > please let me know at any time. Thanks again! > > Best Regards, > > Haoyu Li > > > > Stefan Johansson > > >> wrote on 2019-10-29 at 3:03: > > > >     Hi Haoyu, > > > >     I've looked through the patch in detail now and created a new > webrev at: > > http://cr.openjdk.java.net/~sjohanss/8220465/01/ > > > >     I took the liberty of removing the removal of move_and_update > from > >     your patch since I'm addressing that separately in > JDK-8233065. The > >     webrev above is still based on that removal, but I expect > that to be > >     pushed tomorrow or Wednesday so that should be fine. > > > >
?I also changed the subject to make it more clear that this is > now a > >? ? ?review of: > > https://bugs.openjdk.java.net/browse/JDK-8220465 > > > >? ? ?Regarding the current patch, I think that it looks good in > general, > >? ? ?but I thought a bit more around how to share stuff between the > >? ? ?closures and I agree that adding those extra virtual functions > >? ? ?doesn?t really feel worth it. I?m wondering if a solution > where we > >? ? ?revert back to letting destination be the ?real destination? (not > >? ? ?ever pointing to the shadow region) and add a > copy_destination which > >? ? ?is destination + offset. To make this work the normal > >? ? ?MoveAndUpdateClosure would also have an offset, but it would > always > >? ? ?be 0. If do_addr() is then updated to use the > copy_destination() in > >? ? ?some places we might end up with something pretty nice, but maybe > >? ? ?I?m missing something. > > > >? ? ?I also realized that the current patch will trigger an assert > >? ? ?because destination is expected not to be the shadow address: > >? ? ?#? Internal Error > >? ? ?(open/src/hotspot/share/gc/parallel/psParallelCompact.cpp:3045), > >? ? ?pid=12649, tid=12728 > >? ? ?#? assert(src_cp->destination() == destination) failed: first > live > >? ? ?obj in the space must match the destination > > > >? ? ?So this also suggests that we should keep destination() returning > >? ? ?the real destination. > > > >? ? ?Some other comments: > >? ? ?src/hotspot/share/gc/parallel/psParallelCompact.cpp > >? ? ?? > >? ? ?3383 void ShadowClosure::complete_region(ParCompactionManager > *cm, > >? ? ?HeapWord *dest_addr, > >? ? ?3384 > >? ? ? ?PSParallelCompact::RegionData *region_ptr) { > >? ? ?3385? ?assert(region_ptr->shadow_state() == > >? ? ?ParallelCompactData::RegionData::FINISH, "Region should be > finished?); > > > >? ? ?This assertion will also trigger when running with a debug > build and > >? ? ?at this point the shadow state should be SHADOW not FINISH. > >? ? ?? 
> > > >? ? ?src/hotspot/share/gc/parallel/psParallelCompact.hpp > >? ? ?? > >? ? ? ?632 inline bool > ParallelCompactData::RegionData::mark_filled() { > >? ? ? ?633? ?return Atomic::cmpxchg(FILLED, &_shadow_state, SHADOW) == > >? ? ?SHADOW; > >? ? ? ?634 } > > > >? ? ?Since we never check the return value here we should make it void > >? ? ?and maybe instead add an assert that the return value is SHADOW. > >? ? ?? > > > >? ? ?When you addressed these comments, would it be possible to > include > >? ? ?both the full patch and and the incremental changes from the > current > >? ? ?version. That makes it easier for the reviewers to see what > changed > >? ? ?between version of the patch. > > > >? ? ?Thanks, > >? ? ?Stefan > > > >? ? ? > 24 okt. 2019 kl. 14:16 skrev Stefan Johansson > >? ? ? > >>: > >? ? ? > > >? ? ? > Hi Haoyu, > >? ? ? > > >? ? ? > On 2019-10-23 17:15, Haoyu Li wrote: > >? ? ? >> Hi Stefan, > >? ? ? >> Thanks for your constructive feedback. I've addressed all the > >? ? ?issues you mentioned, and the updated patch is attached in > this email. > >? ? ? > Nice, I will look at the patch next week, but I'll shortly > answer > >? ? ?your questions right away. > >? ? ? > > >? ? ? >> During refining the patch, I have a couple of questions: > >? ? ? >> 1) Now the MoveAndUpdateClosure and ShadowClosure assume the > >? ? ?destination address is the very beginning of a region, > instead of an > >? ? ?arbitrary address like what it used to be. However, there is an > >? ? ?unused function named PSParallelCompact::move_and_update() > uses the > >? ? ?MoveAndUpdateClosure to process a region from its middle, which > >? ? ?conflicts with the assumption. I notice that you removed this > >? ? ?function in your patch, and so did I in the updated patch. > Does it > >? ? ?matter? > >? ? ? > Yes, I found this function during my code review and it > should be > >? ? ?removed, but I think that should be handled as a separate > issue. We > >? ? 
> >      > can do this removal before this patch goes in.
> >      >
> >      >> 2) Using the same do_addr() in MoveAndUpdateClosure and
> >      >> ShadowClosure is doable, but it does not reuse all the code neatly.
> >      >> Because storing the address of the shadow region in _destination
> >      >> requires extra virtual functions to handle allocating blocks in the
> >      >> start_array and setting addresses of deferred objects. In
> >      >> particular, allocate_blocks() and set_deferred_object_for() in both
> >      >> closures are added. Is it worth avoiding to use _offset to calculate
> >      >> the shadow_destination?
> >      > Ok, sounds like it might be better to have specific do_addr()
> >      > functions then. I'll think some more around this when reviewing the
> >      > new patch in depth.
> >      >
> >      >> If there are any problems with this patch, please contact me
> >      >> anytime. I'm more than happy to keep improving the code. Thanks
> >      >> again for reviewing.
> >      >>
> >      > Sound good, thanks,
> >      > Stefan
> >
>

From stefan.johansson at oracle.com  Fri Nov 15 09:38:42 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Fri, 15 Nov 2019 10:38:42 +0100
Subject: RFR: 8141637: Parallelize single threaded heap region iteration during Pre Evacuate Collection Set
Message-ID:

Hi,

Please review this fix to parallelize parts of the G1 young gc preparation.

Issue: https://bugs.openjdk.java.net/browse/JDK-8141637
Webrev: http://cr.openjdk.java.net/~sjohanss/8141637/00/

Summary
Two sub-phases of the "Pre Evacuate Collection Set" phase, "Region
Register" and "Prepare Heap Roots", previously did a single-threaded
iteration over all regions in the heap. For heaps with a lot of regions
this becomes a problem, so this change groups these iterations into one
task that is executed in parallel.

Testing
Manual performance testing shows good results on large heaps and no
problem on small heaps. Functional tests using mach5 tier 1-4.
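
As a rough illustration of the idea only (this is not the code in the
webrev; `Region`, `prepare_regions_in_parallel` and the chunk size are
names and values assumed for the example), several workers can claim
chunks of region indices from a shared counter, so that every region is
visited exactly once without any further locking:

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a heap region; "visits" counts how often the
// per-region preparation work ran for it.
struct Region {
  int visits = 0;
};

// Workers repeatedly claim [start, start + chunk_size) from a shared atomic
// counter and process only the regions in the chunk they claimed.
inline void prepare_regions_in_parallel(std::vector<Region>& regions,
                                        unsigned num_workers,
                                        std::size_t chunk_size) {
  assert(chunk_size > 0);
  std::atomic<std::size_t> next_chunk{0};

  auto worker = [&]() {
    for (;;) {
      // fetch_add hands out disjoint chunks, so no two workers share a region.
      std::size_t start = next_chunk.fetch_add(chunk_size, std::memory_order_relaxed);
      if (start >= regions.size()) {
        break;
      }
      std::size_t end = std::min(start + chunk_size, regions.size());
      for (std::size_t i = start; i < end; ++i) {
        regions[i].visits++;  // placeholder for the real per-region work
      }
    }
  };

  std::vector<std::thread> threads;
  for (unsigned i = 0; i < num_workers; ++i) {
    threads.emplace_back(worker);
  }
  // join() also makes the workers' writes visible to the caller.
  for (std::thread& t : threads) {
    t.join();
  }
}
```

With a small heap the first worker may claim everything before the others
start, which matches the expectation above that small heaps should see no
regression, only no benefit.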

Thanks,
Stefan

From thomas.schatzl at oracle.com  Fri Nov 15 09:49:35 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 15 Nov 2019 10:49:35 +0100
Subject: RFR (M): 8087198: G1 card refinement: batching, sorting
In-Reply-To:
References: <34BD822F-20DD-43E3-956F-09622EB62148@oracle.com>
 <5c700c0a-954e-6187-67b4-cc523cace95c@oracle.com>
Message-ID:

Hi,

On 14.11.19 02:59, Man Cao wrote:
> Thanks for the reviews and testing, very helpful!
>
> I have addressed all comments, new webrevs:
> http://cr.openjdk.java.net/~manc/8087198/webrev.01/
> Incremental: https://cr.openjdk.java.net/~manc/8087198/webrev.00-01.inc/
>
> Some tests involving G1AddMetaspaceDependency are crashing in
> debug/fastdebug builds, due to
> https://bugs.openjdk.java.net/browse/JDK-8234127 and G1UpdateBufferSize=1.
> I'm working on fixing this bug for the hashtable.
> I also see some Windows build failure on Submit repo, not sure if it is
> related to JDK-8234127, could someone find the logs for this run?
> Job: mach5-one-manc-JDK-8087198-1-20191113-2315-6689774
>
> I also tested the performance for sorting order of the cards on
> BigRamTester, below are the CPU time for refinement threads, averaged
> across 34 trials with 95% confidence intervals in brackets:
> base (no batching):            437656.676 [425374.295, 449939.057]
> batching and decreasing order: 424714.529 [412841.728, 436587.330]
> batching and increasing order: 459918.294 [448483.304, 471353.284]

I suggest improving the comment in g1DirtyCardQueue.cpp to say that "tests
showed that this order is preferable to not sorting or increasing address
order" instead of the less specific "improves performance".

> Additional replies below.
> ------------------------------------------------------------------------------
>> collect_and_clean_cards could use two-finger compaction of the
>> _node_buffer in place. (Similar to the SATB buffer filtering.)
> Great! I should have thought of this after removing the MemRegions.
> Now it's doing the two-finger compaction, and no more memcpy and
> ResourceMark. :)
> ------------------------------------------------------------------------------
>> This change re-reads top() after the fence.
>> ...
>> I *think* re-read is okay for humongous regions too, but haven't fully
>> convinced myself of that yet.
> I thought about this further and convinced myself it is safe.
> I added some DEBUG_ONLY code to check the two reads of top()
> should return the same value, by using a KVHashtable.
> I'm not sure if such checking code is desirable. Please advise.
> Maybe I should use ResourceHashtable as Ioi suggested in JDK-8234127.

The assert looks safe (and the code valid) because in case of a safepoint
(e.g. remark) the card is re-evaluated with the potentially new top
anyway. I.e. there cannot be a safepoint with potential removal of the
humongous regions between cleaning and actual refining. The tops of other
old gen regions never change.

The worst that can happen is:

  clean_cards(X)   // X is a buffer; reads top1 = top
  re-enqueue(X)
  clean_cards(X)   // re-reads top2 = top
  refine_cards(X)  // uses top2

(The latter two steps can also occur in the safepoint during the card
scanning if this is an evacuating safepoint.)

So I am not sure whether the KVHashtable check is useful here, but I'm not
opposing it either.

> ------------------------------------------------------------------------------
>> I already split this CR into two.
>> From a functional POV hs-tier1-5 pass with the change.
> Thanks for the work! Let's finalize what to do with the DEBUG_ONLY
> KVHashtable first, then do another round of testing.

Okay.

> ------------------------------------------------------------------------------
>> Maybe it is used in follow-up patches in these places?
> I haven't got to the follow-up patches yet.
> I will create a CR for the epoch synchronization protocol, as a subtask for
> JDK-8226731.
> Then I'll describe its implementation details and a few yet unresolved
> corner cases, so we can think about them together.

That is a good idea - it is often easier to think about the issues given
specific code than some explanation.

One nit: in G1RefineBufferedCards::refine_cleaned_cards, use "++" instead
of "+=1".

Looks good otherwise.

Thanks,
  Thomas

From thomas.schatzl at oracle.com  Fri Nov 15 10:35:08 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 15 Nov 2019 11:35:08 +0100
Subject: RFR: 8141637: Parallelize single threaded heap region iteration during Pre Evacuate Collection Set
In-Reply-To:
References:
Message-ID: <96d72eba-7ad9-30a0-18d2-944e6cf826be@oracle.com>

Hi,

On 15.11.19 10:38, Stefan Johansson wrote:
> Hi,
>
> Please review this fix to parallelize parts of the G1 young gc preparation.
>
> Issue: https://bugs.openjdk.java.net/browse/JDK-8141637
> Webrev: http://cr.openjdk.java.net/~sjohanss/8141637/00/
>
> Summary
> To sub-phases of the "Pre Evacuate Collection Set" phase, "Region
> Register" and "Prepare Heap Roots" previously did single threaded
> iteration of all regions in the heap. For heaps with a lot of regions
> this become a problem so this change groups these iterations into one
> task that is executed in parallel.
>
> Testing
> Manual performance testing show good results on large heaps and no
> problem on small heaps. Functional tests using mach5 tier 1-4.
>

A few nits:

- G1RemSet::prepare_region_for_scanning: for naming consistency I would
prefer "prepare_region_for_scan".

Also there is now a near-name clash with
prepare_for_scan_heap_roots(uint region_idx). Maybe rename the latter
(which is a pre-existing bad name) to "disable_scan_for_region" or
something similar, with a better comment?

- please move the comment for G1RemSet::prepare_region_for_scanning() in
the cpp file to the hpp file.

- existing: in G1RemSet::prepare_region_for_scanning: please add a comment
in the code path for regions in the collection set that "we do not need to
disable scanning for these regions as the default is to not scan".

Actually I think it is useful to move the clear_scan_top() call for all
regions to here from G1RemSetScanState::prepare(), which makes it clear
that we reset it for all regions here (obviating the need for this
comment).

Also the remaining reset of G1RemSetScanState::_collection_set_iter_state
in G1RemSetScanState::prepare() could be moved here (or replaced by a call
to a method that initializes a given region in G1RemSetScanState).

- move the declaration of G1RemSet::prepare_region_for_scanning() below
cleanup_after_scan_heap_roots. The comment of
prepare_for_scan_heap_roots() refers to the cleanup method as following
the prepare method, i.e. the new method declaration is confusingly placed
in between.

- unnecessary newlines: before
G1PrepareRegionsClosure::humongous_region_is_candidate(); in
G1PrepareEvacuationTask::G1PrepareEvacuationTask() (probably just collapse
the closing bracket onto the same line as the opening bracket); and after
the closing bracket of G1PrepareRegionsClosure::do_heap_region().

Thanks,
  Thomas

From zgu at redhat.com  Fri Nov 15 12:29:39 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 15 Nov 2019 07:29:39 -0500
Subject: RFR: 8234232: [TESTBUG] gc/shenandoah/jvmti/TestHeapDump.java fails with -Xcomp
In-Reply-To: <7769ecf2-a539-29ac-dcc4-21b71ae960c6@loongson.cn>
References: <7769ecf2-a539-29ac-dcc4-21b71ae960c6@loongson.cn>
Message-ID: <46e98d9d-328d-e216-3319-4086f231bf0d@redhat.com>

Looks good.

Please send me final patch, I will push for you.

Thanks,

-Zhengyu

On 11/15/19 2:21 AM, Jie Fu wrote:
> Hi all,
>
> May I get reviews for this small fix?
>
> JBS:    https://bugs.openjdk.java.net/browse/JDK-8234232
> Webrev: http://cr.openjdk.java.net/~jiefu/8234232/webrev.00/
>
> And I need a sponsor.
>
> Thanks a lot.
> Best regards,
> Jie
>
>

From fujie at loongson.cn  Fri Nov 15 12:46:08 2019
From: fujie at loongson.cn (Jie Fu)
Date: Fri, 15 Nov 2019 20:46:08 +0800
Subject: RFR: 8234232: [TESTBUG] gc/shenandoah/jvmti/TestHeapDump.java fails with -Xcomp
In-Reply-To: <46e98d9d-328d-e216-3319-4086f231bf0d@redhat.com>
References: <7769ecf2-a539-29ac-dcc4-21b71ae960c6@loongson.cn>
 <46e98d9d-328d-e216-3319-4086f231bf0d@redhat.com>
Message-ID: <7b88d4d4-b6e5-fed1-224a-bff086b4d244@loongson.cn>

Thanks Zhengyu for your review and sponsoring.

Please see the patch attached.

Thanks a lot.
Best regards,
Jie

On 2019/11/15 8:29 PM, Zhengyu Gu wrote:
> Looks good.
>
> Please send me final patch, I will push for you.
>
> Thanks,
>
> -Zhengyu
>
> On 11/15/19 2:21 AM, Jie Fu wrote:
>> Hi all,
>>
>> May I get reviews for this small fix?
>>
>> JBS:    https://bugs.openjdk.java.net/browse/JDK-8234232
>> Webrev: http://cr.openjdk.java.net/~jiefu/8234232/webrev.00/
>>
>> And I need a sponsor.
>>
>> Thanks a lot.
>> Best regards,
>> Jie
>>
>>
>
-------------- next part --------------
# HG changeset patch
# User jiefu
# Date 1573821566 -28800
#      Fri Nov 15 20:39:26 2019 +0800
# Node ID 805888f0415f3519606d18ee0e61380c664e1e5c
# Parent  6f42d2a19117e61a10362315ffda6875187e3a20
8234232: [TESTBUG] gc/shenandoah/jvmti/TestHeapDump.java fails with -Xcomp
Reviewed-by: zgu

diff --git a/test/hotspot/jtreg/gc/shenandoah/jvmti/TestHeapDump.java b/test/hotspot/jtreg/gc/shenandoah/jvmti/TestHeapDump.java
--- a/test/hotspot/jtreg/gc/shenandoah/jvmti/TestHeapDump.java
+++ b/test/hotspot/jtreg/gc/shenandoah/jvmti/TestHeapDump.java
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2017, 2018, Red Hat, Inc. All rights reserved.
+ * Copyright (c) 2017, 2019, Red Hat, Inc. All rights reserved.
  *
  * This code is free software; you can redistribute it and/or modify it
  * under the terms of the GNU General Public License version 2 only, as
@@ -40,6 +40,8 @@
  * @run main/othervm/native/timeout=300 -agentlib:TestHeapDump -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -Xmx128m -XX:ShenandoahGCHeuristics=aggressive -XX:-UseCompressedOops TestHeapDump
  */
 
+import java.lang.ref.Reference;
+
 public class TestHeapDump {
 
     private static final int NUM_ITER = 10000;
@@ -86,6 +88,8 @@
                 throw new RuntimeException("Expected " + EXPECTED_OBJECTS + " objects, but got " + numObjs);
             }
         }
+        Reference.reachabilityFence(array);
+        Reference.reachabilityFence(localRoot);
     }
 
     // We look for the instances of this class during the heap scan

From zgu at redhat.com  Fri Nov 15 13:21:41 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 15 Nov 2019 08:21:41 -0500
Subject: RFR: 8234232: [TESTBUG] gc/shenandoah/jvmti/TestHeapDump.java fails with -Xcomp
In-Reply-To: <7b88d4d4-b6e5-fed1-224a-bff086b4d244@loongson.cn>
References: <7769ecf2-a539-29ac-dcc4-21b71ae960c6@loongson.cn>
 <46e98d9d-328d-e216-3319-4086f231bf0d@redhat.com>
 <7b88d4d4-b6e5-fed1-224a-bff086b4d244@loongson.cn>
Message-ID: <246d760f-66e1-8869-04cc-0827ecc7a757@redhat.com>

Pushed: http://hg.openjdk.java.net/jdk/jdk/rev/8c4c358272a9

-Zhengyu

On 11/15/19 7:46 AM, Jie Fu wrote:
> Thanks Zhengyu for your review and sponsoring.
>
> Please see the patch attached.
>
> Thanks a lot.
> Best regards,
> Jie
>
> On 2019/11/15 8:29 PM, Zhengyu Gu wrote:
>> Looks good.
>>
>> Please send me final patch, I will push for you.
>>
>> Thanks,
>>
>> -Zhengyu
>>
>> On 11/15/19 2:21 AM, Jie Fu wrote:
>>> Hi all,
>>>
>>> May I get reviews for this small fix?
>>>
>>> JBS:    https://bugs.openjdk.java.net/browse/JDK-8234232
>>> Webrev: http://cr.openjdk.java.net/~jiefu/8234232/webrev.00/
>>>
>>> And I need a sponsor.
>>>
>>> Thanks a lot.
>>> Best regards,
>>> Jie
>>>
>>>
>>

From manc at google.com  Fri Nov 15 19:00:47 2019
From: manc at google.com (Man Cao)
Date: Fri, 15 Nov 2019 11:00:47 -0800
Subject: RFR (S): 8234208: Logging reports zero total refined cards under "Before GC RS summary"
In-Reply-To: <74c0dc30-b357-b1cf-4244-046ee808b983@oracle.com>
References: <74c0dc30-b357-b1cf-4244-046ee808b983@oracle.com>
Message-ID:

Thanks for the reviews. "#include" added.

-Man

From fujie at loongson.cn  Fri Nov 15 23:12:58 2019
From: fujie at loongson.cn (Jie Fu)
Date: Sat, 16 Nov 2019 07:12:58 +0800
Subject: RFR: 8234232: [TESTBUG] gc/shenandoah/jvmti/TestHeapDump.java fails with -Xcomp
In-Reply-To: <246d760f-66e1-8869-04cc-0827ecc7a757@redhat.com>
References: <7769ecf2-a539-29ac-dcc4-21b71ae960c6@loongson.cn>
 <46e98d9d-328d-e216-3319-4086f231bf0d@redhat.com>
 <7b88d4d4-b6e5-fed1-224a-bff086b4d244@loongson.cn>
 <246d760f-66e1-8869-04cc-0827ecc7a757@redhat.com>
Message-ID:

Thank you so much, Zhengyu.

On 2019/11/15 9:21 PM, Zhengyu Gu wrote:
> Pushed: http://hg.openjdk.java.net/jdk/jdk/rev/8c4c358272a9
>
> -Zhengyu
>
> On 11/15/19 7:46 AM, Jie Fu wrote:
>> Thanks Zhengyu for your review and sponsoring.
>>
>> Please see the patch attached.
>>
>> Thanks a lot.
>> Best regards,
>> Jie
>>
>> On 2019/11/15 8:29 PM, Zhengyu Gu wrote:
>>> Looks good.
>>>
>>> Please send me final patch, I will push for you.
>>>
>>> Thanks,
>>>
>>> -Zhengyu
>>>
>>> On 11/15/19 2:21 AM, Jie Fu wrote:
>>>> Hi all,
>>>>
>>>> May I get reviews for this small fix?
>>>>
>>>> JBS:    https://bugs.openjdk.java.net/browse/JDK-8234232
>>>> Webrev: http://cr.openjdk.java.net/~jiefu/8234232/webrev.00/
>>>>
>>>> And I need a sponsor.
>>>>
>>>> Thanks a lot.
>>>> Best regards,
>>>> Jie
>>>>
>>>>
>>>
>

From leihouyju at gmail.com  Sat Nov 16 05:13:42 2019
From: leihouyju at gmail.com (Haoyu Li)
Date: Sat, 16 Nov 2019 13:13:42 +0800
Subject: RFR: 8220465: Use shadow regions for faster ParallelGC full GCs
In-Reply-To: <26ff7fee-2457-53df-c933-37feb3c87ae9@oracle.com>
References: <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com>
 <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com>
 <400df998-171a-5bbe-9f3e-01af1781afb4@oracle.com>
 <955C4FA4-FC18-446C-851A-0D04A916D88D@oracle.com>
 <4fce596c-a4eb-5da6-1b47-b4e2314de3c5@oracle.com>
 <26ff7fee-2457-53df-c933-37feb3c87ae9@oracle.com>
Message-ID:

Hi Stefan,

Thanks very much for all your reviewing effort! Please feel free to contact
me if there is any problem. Looking forward to hearing from you!

Best Regards,
Haoyu Li

Stefan Johansson wrote on 2019-11-15, 5:21:

> Hi Haoyu,
>
> Thanks for the updates, here are new webrevs:
> Full: http://cr.openjdk.java.net/~sjohanss/8220465/03/
> Inc: http://cr.openjdk.java.net/~sjohanss/8220465/02-03/
>
> I have asked more people to look at the change now, so you might await
> some more feedback.
>
> Thanks,
> Stefan
>
> On 2019-11-12 16:11, Haoyu Li wrote:
> > Hi Stefan,
> >
> > Thanks for your advice!
> >
> >     Nice, I think it would make sense to used cmpxchg in mark_normal() as
> >     well and assert that the returned value is SHADOW.
> >
> > I've changed mark_normal() to use Atomic::cmpxchg and added an
> > assertion. Please find the changes in the attached patchs. Thanks!
> > Best Regards,
> > Haoyu Li,
> >
> > Stefan Johansson wrote on 2019-11-11, 11:10:
> >
> >     Hi Haoyu,
> >
> >     Thanks for the updated patches, I think they look good in general, just
> >     one comment inline below.
> >
> >     Here are some updated webrev:
> >     Full: http://cr.openjdk.java.net/~sjohanss/8220465/02
> >     Inc: http://cr.openjdk.java.net/~sjohanss/8220465/01-02
> >
> >     On 2019-11-06 08:17, Haoyu Li wrote:
> >      > Hi Stefan,
> >      >
> >      > Sorry for the late update.

From igor.ignatyev at oracle.com  Sun Nov 17 19:00:33 2019
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Sun, 17 Nov 2019 11:00:33 -0800
Subject: RFR(S) : 8147017 : Platform.isGraal should be removed
Message-ID: <981118AF-1DAD-4231-9FA6-7A89A46E5EDB@oracle.com>

http://cr.openjdk.java.net/~iignatyev//8147017/webrev.00/index.html
> 16 lines changed: 2 ins; 8 del; 6 mod;

Hi all,

The jdk.test.lib.Platform.isGraal method assumes that a JVM with Graal as
its JIT has 'Graal VM' in its name. That is wrong, and it has caused others
to incorrectly assume that a '-graal' flag exists and must be used to
select the Graal compiler. The patch removes this method and updates its
only meaningful usage, in the TestGCLogMessages test. TestGCLogMessages
should use LogMessageWithLevelC2OrJVMCIOnly only when C2 or Graal is
available, so it has been updated to use the corresponding methods of the
sun.hotspot.code.Compiler class, which requires the WhiteBox API to be
enabled.

JBS: https://bugs.openjdk.java.net/browse/JDK-8147017
webrev: http://cr.openjdk.java.net/~iignatyev//8147017/webrev.00/index.html
testing: tier1 + TestGCLogMessages w/ different JIT configurations

Thanks,
-- Igor

From per.liden at oracle.com  Mon Nov 18 10:15:31 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 18 Nov 2019 11:15:31 +0100
Subject: RFR: 8234312: ZGC: Adjust warmup criteria
Message-ID: <88a2d5d2-4dca-a17e-4411-73700d8593e7@oracle.com>

JDK-8232001 introduced logic which ignores "Metaspace GC Threshold"
collections until the GC has warmed up. While this works in principle, it
also causes some intermittent metaspace test failures. These tests use a
small MaxMetaspaceSize, and aggressively load and unload classes. It's not
completely obvious why these tests started to fail after JDK-8232001.
However, we will be collecting metaspace a little bit later now, which could mean we have more fragmentation and are not able to free as much memory when the GC eventually happens. This patch reverts to the old behavior (we no longer ignore "Metspace GC Threshold" collections until the GC is warm), but instead we only consider the GC to be warm once we've done three "Warmup" GCs. Bug: https://bugs.openjdk.java.net/browse/JDK-8234312 Webrev: http://cr.openjdk.java.net/~pliden/8234312/webrev.0 Testing: Tier1-7 on ZGC. Multiple manual runs of vmTestbase/gc/gctests/LoadUnloadGC (the test that intermittently fails in our CI). /Per From thomas.schatzl at oracle.com Mon Nov 18 14:33:00 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 18 Nov 2019 15:33:00 +0100 Subject: RFR: 8220465: Use shadow regions for faster ParallelGC full GCs In-Reply-To: References: <8ef5b52e-d6fc-3073-5ca7-44c87c1eb981@oracle.com> <92277aab-0578-9e2c-3f4f-55ae1e8c94a9@oracle.com> <400df998-171a-5bbe-9f3e-01af1781afb4@oracle.com> <955C4FA4-FC18-446C-851A-0D04A916D88D@oracle.com> <4fce596c-a4eb-5da6-1b47-b4e2314de3c5@oracle.com> <26ff7fee-2457-53df-c933-37feb3c87ae9@oracle.com> Message-ID: <19ea83db-b3d1-4875-2487-d19e5596d9b3@oracle.com> Hi all, On 16.11.19 06:13, Haoyu Li wrote: > Hi Stefan, > > Thanks very much for all your reviewing effort! Please feel free to contact > me if there is any problem. Looking forward to hearing from you! > > Best Regards, > Haoyu Li > > Stefan Johansson ?2019?11?15??? ??5:21??? > >> Hi Haoyu, >> >> Thanks for the updates, here are new webrevs: >> Full: http://cr.openjdk.java.net/~sjohanss/8220465/03/ >> Inc: http://cr.openjdk.java.net/~sjohanss/8220465/02-03/ >> >> I have asked more people to look at the change now, so you might await >> some more feedback. >> There were a few style issues that I am not all listing below. 
I also started renaming quite a few methods because they were (imo) misnamed, did not capture the intent of the method, and I have the strong suspicion that the same things are named differently at times - at least as far as I understood. That makes reviewing very confusing and needs to be cleaned up.

My initial changes are available at the following link, but it became clear to me that this needs more care:

http://cr.openjdk.java.net/~tschatzl/8220465/webrev.03.suggestions/

As indicated in the URL, these are merely suggestions.

Other notes:

- static const ints can be initialized in the definition (UNUSED, SHADOW, ...); also they should be CamelCased; they are very unspecific too - I added some prefix to distinguish them a bit.

- the documentation about this change is imho lacking.

- It would be nice to explain the idea of shadow regions somewhere, assuming that you know how Parallel GC works. Including the reference to the paper. :)

- some of the comments just show what code (often a single statement) does, not the what and why or the reason why a particular method or member exists. Or they explain only one or the other. E.g.

  "The shadow region array, we use it in a LIFO fashion, so that we can reuse shadow regions for better data locality and utilization"

  - at this point we have no idea what a "shadow region" is and we can't find out easily because it is called "shadow record" or "steal record" elsewhere. Something better could be:

  "Contains currently free shadow regions (assuming we converge on that name). We use it in a LIFO fashion for better data locality and utilization."

- I think there is a missed optimization opportunity in (now) PSParallelCompact::initialize_shadow_regions(). There, the code initializes the "free" region ids to region_at_top+1 to end_region of a particular space. If the top for a given space is at a region boundary (e.g. if a space is empty, which is probably common for one of the survivor spaces), you lose a single region per space.

  One reason might be that the code uses region "0" as sentinel to indicate "there is no shadow region available" in ParCompactionManager::acquire_shadow_region(). This could be fixed by improving the code in PSParallelCompact::initialize_shadow_regions() and using a sentinel region value of (size_t)~0 (as an explicit constant).

  Even if you do not change this, please introduce an explicit constant for this sentinel value. This makes the code more self-explanatory.

- at least in ParallelCompactData::RegionData::try_steal I would add a dirty read of the _shadow_state to avoid the overhead of obviously unsuccessful steal attempts (I do not know about frequencies of those, so ymmv, but probably it would be easiest to add it everywhere). Also, as far as I can tell, all the cmpxchg can/should use memory_order_relaxed to avoid the two full fences on every access.

- not sure about whether "acquire_shadow_region()/release_shadow_region" are good names for "PSParallelCompact::try_pop_shadow_region/push_shadow_region" (or something else). "Acquire"/"release" has a very specific semantic related to a completely different area (memory ordering in MP systems), so we should probably avoid using them. There are other well-used pairs of names to add and remove elements to a container too.

- the changes in PSParallelCompact sometimes use the terms "steal_record", "shadow_record" and "shadow_region" (e.g. _shadow_region_array) interchangeably. Can you give a reason for this? I am good with any (with a preference for "shadow_region" since it gives an idea of the contents while "record" is quite generic), but it makes reading the code harder than necessary.

- the names of the new methods e.g. in PSParallelCompact::RegionData should be more precise; e.g. what does "try_push" want to push? Or "try_steal" steal? Not even the comments for these contain that information, and I believe that by better naming of the methods, we can avoid the comments completely in most cases.
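To make the sentinel and dirty-read notes above concrete, here is a minimal standalone sketch. It is not code from the webrev: it uses std::atomic instead of HotSpot's Atomic API, it is single-threaded for the free list, and every name in it (InvalidShadowRegion, ShadowState, RegionSketch, try_pop_shadow_region) is hypothetical. Whether relaxed ordering is actually sufficient depends on the surrounding synchronization in the real code.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Hypothetical explicit sentinel: "no shadow region available".
// Using ~0 instead of 0 keeps region index 0 usable as a real region.
static const size_t InvalidShadowRegion = ~static_cast<size_t>(0);

// Hypothetical state names, standing in for UNUSED, SHADOW, ...
enum ShadowState : int { Unused = 0, Shadow = 1, Filled = 2 };

struct RegionSketch {
  std::atomic<int> shadow_state{Unused};

  // Try to move this region from Unused to Shadow.
  bool try_steal() {
    // Dirty (relaxed) read first: skip the CAS when it obviously cannot succeed.
    if (shadow_state.load(std::memory_order_relaxed) != Unused) {
      return false;
    }
    int expected = Unused;
    // Relaxed CAS: atomic state transition without additional fences.
    return shadow_state.compare_exchange_strong(expected, Shadow,
                                                std::memory_order_relaxed);
  }
};

// LIFO pop from a list of free shadow region ids (single-threaded sketch).
// Returns InvalidShadowRegion when the list is empty.
inline size_t try_pop_shadow_region(std::vector<size_t>& free_regions) {
  if (free_regions.empty()) {
    return InvalidShadowRegion;
  }
  size_t id = free_regions.back();
  free_regions.pop_back();
  return id;
}
```

With the explicit constant, a caller compares the result against InvalidShadowRegion rather than against a bare 0, and a space whose top sits at a region boundary does not have to give up its first region.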
I *think* the change is good otherwise, but I'm constantly in need of referencing back to the definitions of the members/methods when looking through it, which makes me a bit uneasy. So I would like to ask you to improve above points first before I can give my go-ahead, and then I will look at it again. Thanks, Thomas From erik.osterlund at oracle.com Mon Nov 18 16:29:02 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Mon, 18 Nov 2019 17:29:02 +0100 Subject: RFR: 8234312: ZGC: Adjust warmup criteria In-Reply-To: <88a2d5d2-4dca-a17e-4411-73700d8593e7@oracle.com> References: <88a2d5d2-4dca-a17e-4411-73700d8593e7@oracle.com> Message-ID: Hi Per, Looks good. Thanks, /Erik On 11/18/19 11:15 AM, Per Liden wrote: > JDK-8232001 introduced logic which ignores "Metastace GC Threshold" > collections until the GC has warmed up. While this works in principle, > it also causes some intermittent metaspace test failures. These tests > use a small MaxMetaspaceSize, and aggressively loads and unloads > classes. It's not completely obvious why these tests started to fail > after JDK-8232001. However, we will be collecting metaspace a little > bit later now, which could mean we have more fragmentation and are not > able to free as much memory when the GC eventually happens. > > This patch reverts to the old behavior (we no longer ignore "Metspace > GC Threshold" collections until the GC is warm), but instead we only > consider the GC to be warm once we've done three "Warmup" GCs. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234312 > Webrev: http://cr.openjdk.java.net/~pliden/8234312/webrev.0 > > Testing: Tier1-7 on ZGC. Multiple manual runs of > vmTestbase/gc/gctests/LoadUnloadGC (the test that intermittently fails > in our CI). 
> > /Per From stefan.johansson at oracle.com Mon Nov 18 16:35:53 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 18 Nov 2019 17:35:53 +0100 Subject: RFR: 8141637: Parallelize single threaded heap region iteration during Pre Evacuate Collection Set In-Reply-To: <96d72eba-7ad9-30a0-18d2-944e6cf826be@oracle.com> References: <96d72eba-7ad9-30a0-18d2-944e6cf826be@oracle.com> Message-ID: <266d3b83-db50-3f56-c74c-fe4225006d51@oracle.com> Hi Thomas, Thanks for reviewing. On 2019-11-15 11:35, Thomas Schatzl wrote: > Hi, > > On 15.11.19 10:38, Stefan Johansson wrote: >> Hi, >> >> Please review this fix to parallelize parts of the G1 young gc >> preparation. >> >> Issue: https://bugs.openjdk.java.net/browse/JDK-8141637 >> Webrev: http://cr.openjdk.java.net/~sjohanss/8141637/00/ >> >> Summary >> To sub-phases of the "Pre Evacuate Collection Set" phase, "Region >> Register" and "Prepare Heap Roots" previously did single threaded >> iteration of all regions in the heap. For heaps with a lot of regions >> this become a problem so this change groups these iterations into one >> task that is executed in parallel. >> >> Testing >> Manual performance testing show good results on large heaps and no >> problem on small heaps. Functional tests using mach5 tier 1-4. >> > > A few nits: > > - G1RemSet::prepare_region_for_scannning: for naming consistency I would > prefer "prepare_region_for_scan" I agree, done. > > Also there is now a near-name clash with > prepare_for_scan_heap_roots(uint region_idx). Maybe rename the latter to > "disable_scan_for_region" or something similar (which is a pre-existing > bad name), with a better comment? Fixed. > > - please move the comment for G1RemSet::prepare_region_for_scanning() in > the cpp file to the hpp file. Done. 
> > - existing: in G1RemSet::prepare_region_for_scanning: please add a > comment in the code path for regions in the collection set that "we do > not need to disable scanning for these regions as the default is to not > scan". > > Actually I think it is useful to move the clear_scan_top() call for all > regions to here from G1RemSetScanState::prepare(), which makes it clear > that we reset it for all regions here. (Obviating the need for this > comment). > > Also the remaining reseet of > G1RemSetScanstate::_collection_set_iter_state from > G1RemSetScanstate::prepare could be moved here (or calling a method that > initializes a given region in G1RemSetScanState). Updated prepare_region_for_scan to take these three comments into account, I think this really makes sense. > > - move the declaration of G1RemSet::prepare_region_for_scanning() below > cleanup_after_scan_heap_roots. The comment of > prepare_for_scan_heap_roots() references the cleanup method, as > following the prepare method. I.e. the new method declaration is > confusingly placed inbetween. Done. > > - unnecessary newlines: before > G1PrepareRegionsClosure::humongous_region_is_candidate(), in > G1PrepareEvacuationTask::G1PrepareEvacuationTask() (probably just > collapse the end bracket in the same line as the opening bracket). After > the closing bracket of G1PrepareRregionsClosure::do_heap_region(). > Oups, removed. Updated webrevs: Full: http://cr.openjdk.java.net/~sjohanss/8141637/01/ Inc: http://cr.openjdk.java.net/~sjohanss/8141637/00-01/ Thanks, Stefan > Thanks, > ? 
Thomas From thomas.schatzl at oracle.com Mon Nov 18 17:59:16 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 18 Nov 2019 18:59:16 +0100 Subject: RFR: 8141637: Parallelize single threaded heap region iteration during Pre Evacuate Collection Set In-Reply-To: <266d3b83-db50-3f56-c74c-fe4225006d51@oracle.com> References: <96d72eba-7ad9-30a0-18d2-944e6cf826be@oracle.com> <266d3b83-db50-3f56-c74c-fe4225006d51@oracle.com> Message-ID: <20803d82-e49d-0833-6134-ce881f5e0480@oracle.com> Hi Stefan, On 18.11.19 17:35, Stefan Johansson wrote: > Hi Thomas, > > Thanks for reviewing. > > On 2019-11-15 11:35, Thomas Schatzl wrote: >> Hi, >> >> On 15.11.19 10:38, Stefan Johansson wrote: >>> Hi, >>> >>> Please review this fix to parallelize parts of the G1 young gc >>> preparation. >>> >>> Issue: https://bugs.openjdk.java.net/browse/JDK-8141637 >>> Webrev: http://cr.openjdk.java.net/~sjohanss/8141637/00/ [...] > > Updated webrevs: > Full: http://cr.openjdk.java.net/~sjohanss/8141637/01/ > Inc: http://cr.openjdk.java.net/~sjohanss/8141637/00-01/ > looks good. It would be nice to rename the G1RemSet::prepare_for_scan_heap_roots() method to G1RemSet::exclude_from_scan() as we discussed internally in this change too. It does not seem to warrant an extra CR, unless you have more planned. I would not need a re-review for that rename, but you need to wait for another reviewer anyway.... Thanks, Thomas From sangheon.kim at oracle.com Mon Nov 18 21:31:08 2019 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Mon, 18 Nov 2019 13:31:08 -0800 Subject: RFR(XS): 8232533: G1 uses only a single thread for pretouching the java heap Message-ID: <32e58527-abb7-43c8-719d-b64529abbba6@oracle.com> Hi all, Can I have some reviews for this small patch? G1 initiates only 1 GC thread for faster start-up and then initialize more when we need more GC threads. 
This is also same when we enable +AlwaysPreTouch option, so 1 thread touching all heap situation happens as the CR described. The proposed patch is trying to cap the total worker thread count instead of active worker thread count. And this will make faster start-up as well. CR: https://bugs.openjdk.java.net/browse/JDK-8232533 Webrev: http://cr.openjdk.java.net/~sangheki/8232533/webrev.0/ Testing: hs-tier1 Thanks, Sangheon From mikhailo.seledtsov at oracle.com Mon Nov 18 22:06:39 2019 From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com) Date: Mon, 18 Nov 2019 14:06:39 -0800 Subject: RFR(S) : 8147017 : Platform.isGraal should be removed In-Reply-To: <981118AF-1DAD-4231-9FA6-7A89A46E5EDB@oracle.com> References: <981118AF-1DAD-4231-9FA6-7A89A46E5EDB@oracle.com> Message-ID: <5e1d17af-798f-123f-ef5e-3957b98a8340@oracle.com> Looks good to me, Misha On 11/17/19 11:00 AM, Igor Ignatyev wrote: > http://cr.openjdk.java.net/~iignatyev//8147017/webrev.00/index.html >> 16 lines changed: 2 ins; 8 del; 6 mod; > Hi all, > > jdk.test.lib.Platform.isGraal method assumes that JVM w/ Graal as JIT has 'Graal VM' in its name, which is wrong, and caused other to incorrectly assume that '-graal' flag exist and must be used to select Graal compiler. the patch removes this method and updates its only meaningful usage in TestGCLogMessages test. TestGCLogMessages test should use LogMessageWithLevelC2OrJVMCIOnly only when c2 or graal is available, so it's been updated to use corresponding methods of sun.hotspot.code.Compiler class, which requires WhiteBoxAPI being enabled. 
> > > JBS: https://bugs.openjdk.java.net/browse/JDK-8147017 > webrev: http://cr.openjdk.java.net/~iignatyev//8147017/webrev.00/index.html > testing: tier1 + TestGCLogMessages w/ different JIT configurations > > Thanks, > -- Igor From kim.barrett at oracle.com Mon Nov 18 22:07:48 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 18 Nov 2019 17:07:48 -0500 Subject: RFR (M): 8087198: G1 card refinement: batching, sorting In-Reply-To: References: <34BD822F-20DD-43E3-956F-09622EB62148@oracle.com> <5c700c0a-954e-6187-67b4-cc523cace95c@oracle.com> Message-ID: <387D12CA-4C5A-4DE5-A6E7-ED6739AC7173@oracle.com> > On Nov 13, 2019, at 8:59 PM, Man Cao wrote: > > Thanks for the reviews and testing, very helpful! > > I have addressed all comments, new webrevs: > http://cr.openjdk.java.net/~manc/8087198/webrev.01/ > Incremental: https://cr.openjdk.java.net/~manc/8087198/webrev.00-01.inc/ > > Some tests involving G1AddMetaspaceDependency are crashing in > debug/fastdebug builds, due to > https://bugs.openjdk.java.net/browse/JDK-8234127 and G1UpdateBufferSize=1. > I'm working on fixing this bug for the hashtable. > I also see some Windows build failure on Submit repo, not sure if it is > related to JDK-8234127, could someone find the logs for this run? > Job: mach5-one-manc-JDK-8087198-1-20191113-2315-6689774 > > I also tested the performance for sorting order of the cards on > BigRamTester, below are the CPU time for refinement threads, averaged > across 34 trials with 95% confidence intervals in brackets: > base (no batching): 437656.676 [425374.295, 449939.057] > batching and decreasing order: 424714.529 [412841.728, 436587.330] > batching and increasing order: 459918.294 [448483.304, 471353.284] > >> [?] >> collect_and_clean_cards could use two-finger compaction of the >> _node_buffer in place. (Similar to the SATB buffer filtering.) > Great! I should have thought of this after removing the MemRegions. 
> Now it's doing the two-finger compaction, and no more memcpy and
> ResourceMark.

src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp
 253   size_t clean_cards() {

Though it will work and uses two pointers, that isn't the two-finger compaction algorithm I was suggesting. I intended something based on the algorithm attributed to Edwards (and used by us for SATB buffer filtering; see SATBMarkQueue::apply_filter, which might even be usable as-is with a suitable filter_out function); see Jones & Lins GC book, section 5.3; or Jones et al GC Handbook, section 3.1; or web search for "edwards two-finger compaction". It has the benefit over the proposed algorithm of doing no more and possibly significantly fewer element moves. It doesn't maintain the order of the elements, but we're doing a sort after the compaction.

>> I think this function is poorly named. We're not abandoning the
>> cards. We are instead abandoning the refinement of the cards in part
>> of the buffer. Something like keep_unrefined_cards might be better.
> Renamed to redirty_unrefined_cards.

That's a much better name.

>> This change re-reads top() after the fence.
>> ...
>> I *think* re-read is okay for humongous regions too, but haven't fully
>> convinced myself of that yet.
> I thought about this further and convinced myself it is safe.
> I added some DEBUG_ONLY code to check the two reads of top()
> should return the same value, by using a KVHashtable.
> I'm not sure if such checking code is desirable. Please advise.
> Maybe I should use ResourceHashtable as Ioi suggested in JDK-8234127.
>
> ------------------------------------------------------------------------------
>> I already split this CR into two.
>> From a functional POV hs-tier1-5 pass with the change.
> Thanks for the work! Let's finalize what to do with the DEBUG_ONLY
> KVHashtable first,
> then do another round of testing.

I think the argument for top being stable for unfiltered humongous regions is correct, and it's not worth cluttering the code with the debug-only hashtable.

Some additional detailed comments:

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp
 259     for (int i = _node_buffer_size - 1; i >= static_cast(start); --i) {

(Assuming the sliding algorithm is retained, rather than using Edwards as discussed above) I would prefer keeping the indices as size_t and avoiding conversions (both implicit and explicit), i.e.

  for (size_t i = _node_buffer_size; i > start; /* blank */ ) {
    --i;
    ...
  }

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp
1388   // fence, because top is stable for unfiltered humongous regions, so it must

I'd also like the old/archive cases to be covered by the comment, so future readers don't hit a "but what about..." pause.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp
 299     _node_buffer(reinterpret_cast(BufferNode::make_buffer_from_node(node))),

I would prefer leaving the type alone here and instead using static_cast(_node_buffer[i]) (maybe packaged in a little helper here). But I have a strong dislike for reinterpret_cast (with whatever spelling) where it can reasonably be avoided.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp
 246     qsort(_node_buffer + start_index,

Consider using QuickSort::sort, in utilities/quicksort.hpp. That will likely inline the trivial comparison, where using library qsort might not.
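
For readers unfamiliar with the two-finger compaction referred to earlier in this message, a minimal generic sketch follows. It is an illustration only, not the code in SATBMarkQueue::apply_filter and not the proposed clean_cards(); the function name and the discard predicate are hypothetical. One index scans from the left for holes, the other from the right for elements to keep, and each hole is filled by at most one move.

```cpp
#include <cstddef>

// Compacts buf[0, n) in place so that all retained elements end up in
// buf[0, result). discard(x) returns true for elements to drop.
// Element order is NOT preserved, which is fine if a sort follows.
template <typename T, typename Pred>
size_t two_finger_compact(T* buf, size_t n, Pred discard) {
  size_t left = 0;   // next position to examine from the front
  size_t right = n;  // one past the last unexamined position at the back
  while (left < right) {
    if (!discard(buf[left])) {
      ++left;        // retained element is already in place
      continue;
    }
    // buf[left] is a hole: search from the right for a retained element.
    --right;
    while (left < right && discard(buf[right])) {
      --right;
    }
    if (left < right) {
      buf[left] = buf[right];  // fill the hole with a single move
      ++left;
    }
  }
  return left;       // number of retained elements
}
```

The number of moves is bounded by the number of discarded elements in the front part of the buffer, whereas a sliding compaction may move every retained element.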
------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp 324 G1RefineBufferedCards buffered_cards(node, 325 buffer_size(), 326 total_refined_cards); 327 return buffered_cards.refine(worker_id); Why is worker_id passed as an argument to refine, while others are passed to the constructor? Seems like worker_id could also be passed to the constructor and captured in a member, for use in the one place where it is needed. ------------------------------------------------------------------------------ From kim.barrett at oracle.com Tue Nov 19 06:19:55 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 19 Nov 2019 01:19:55 -0500 Subject: RFR (M): 8233588: Clean up SurvRateGroup In-Reply-To: <1ab7fd39-fc0c-79ee-e925-fc8b88e32177@oracle.com> References: <1ab7fd39-fc0c-79ee-e925-fc8b88e32177@oracle.com> Message-ID: <573C1C2E-FEA7-4F2F-8AFE-5A74560832E8@oracle.com> > On Nov 12, 2019, at 10:23 AM, Thomas Schatzl wrote: > > Hi all, > > can I have some reviews for this change that cleans up the SurvRateGroup class. In particular, while working with it I found that it contains two members that are duplicates of others. > > This removed a few methods, which in turn made some others obsolete. > > Further I tried to improve encapsulation so that not everyone needs to know all details down to SurvRateGroup. > > That's why this change is a bit larger than you'd probably expect, but it contains a significant amount of code deletion! :) > > CR: > https://bugs.openjdk.java.net/browse/JDK-8233588 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8233588/webrev/ > Testing: > hs-tier1-5 > > Thanks, > Thomas Looks good. One pre-existing issue: ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1CollectionSet.cpp 343 if (r->age_in_surv_rate_group() < 0) { [pre-existing] A few lines before we checked r->has_surv_rate_group(). 
If it doesn't have one, should we really be checking the age? Looks like we'd currently assert if we don't have one. I think just making this an "else if" with the preceding "has" check fixes it.

------------------------------------------------------------------------------

From per.liden at oracle.com Tue Nov 19 07:14:01 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 19 Nov 2019 08:14:01 +0100
Subject: RFR: 8234379: ZGC: Do not resize TALBs unless -XX:ResizeTLAB is enabled
Message-ID: <3c35513b-3e8d-1600-610a-bbb4d931be07@oracle.com>

ZGC currently calls ThreadLocalAllocBuffer::resize() regardless of whether -XX:ResizeTLAB is enabled. This causes the following assert to fail:

# Internal Error (open/src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp:130), pid=22060, tid=22069
# assert(ResizeTLAB) failed: Should not call this otherwise

Bug: https://bugs.openjdk.java.net/browse/JDK-8234379
Webrev: http://cr.openjdk.java.net/~pliden/8234379/webrev.0

/Per

From per.liden at oracle.com Tue Nov 19 07:16:26 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 19 Nov 2019 08:16:26 +0100
Subject: RFR: 8234312: ZGC: Adjust warmup criteria
In-Reply-To:
References: <88a2d5d2-4dca-a17e-4411-73700d8593e7@oracle.com>
Message-ID:

Thanks Erik!

/Per

On 11/18/19 5:29 PM, erik.osterlund at oracle.com wrote:
> Hi Per,
>
> Looks good.
>
> Thanks,
> /Erik
>
> On 11/18/19 11:15 AM, Per Liden wrote:
>> JDK-8232001 introduced logic which ignores "Metastace GC Threshold"
>> collections until the GC has warmed up. While this works in principle,
>> it also causes some intermittent metaspace test failures. These tests
>> use a small MaxMetaspaceSize, and aggressively loads and unloads
>> classes. It's not completely obvious why these tests started to fail
>> after JDK-8232001. However, we will be collecting metaspace a little
>> bit later now, which could mean we have more fragmentation and are not
>> able to free as much memory when the GC eventually happens.
>> >> This patch reverts to the old behavior (we no longer ignore "Metspace >> GC Threshold" collections until the GC is warm), but instead we only >> consider the GC to be warm once we've done three "Warmup" GCs. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234312 >> Webrev: http://cr.openjdk.java.net/~pliden/8234312/webrev.0 >> >> Testing: Tier1-7 on ZGC. Multiple manual runs of >> vmTestbase/gc/gctests/LoadUnloadGC (the test that intermittently fails >> in our CI). >> >> /Per > From per.liden at oracle.com Tue Nov 19 07:32:10 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 19 Nov 2019 08:32:10 +0100 Subject: RFR: 8234382: Test tools/javac/processing/model/testgetallmembers/Main.java using too small heap Message-ID: Hi, Please review this one-liner test fix. The test tools/javac/processing/model/testgetallmembers/Main.java is assuming that a 256M heap is enough to hold its live-set, but this is only true under some conditions. There are a number of JVM flags that, when used, can break this assumption. For example, choice of GC, compressed oop, object alignment, and other options affecting the heap layout or allocation strategy. Under ideal conditions, the test is already fairly close to using the whole heap, so it doesn't take that much to push it over the edge. For example, the following combinations all fail: -XX:+UseSerialGC -XX:-UseCompressedOops -XX:+UseParallelGC -XX:-UseCompressedOops -XX:+UseZGC (always has -XX:-UseCompressedOops) -XX:+UseG1GC -XX:-UseCompressedOops -XX:ObjectAlignmentInBytes=16 I suggest we bump the max heap size to something like 512M, to give the test more headroom and make it less sensitive to exact choice of JVM flags. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8234382 Webrev: http://cr.openjdk.java.net/~pliden/8234382/webrev.0 /Per From erik.osterlund at oracle.com Tue Nov 19 08:13:38 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 19 Nov 2019 09:13:38 +0100 Subject: RFR: 8234382: Test tools/javac/processing/model/testgetallmembers/Main.java using too small heap In-Reply-To: References: Message-ID: <8757fad5-88b5-4e28-af8e-01dd457deec9@oracle.com> Hi Per, Looks good. Thanks, /Erik On 2019-11-19 08:32, Per Liden wrote: > Hi, > > Please review this one-liner test fix. > > The test tools/javac/processing/model/testgetallmembers/Main.java is > assuming that a 256M heap is enough to hold its live-set, but this is > only true under some conditions. There are a number of JVM flags that, > when used, can break this assumption. For example, choice of GC, > compressed oop, object alignment, and other options affecting the heap > layout or allocation strategy. Under ideal conditions, the test is > already fairly close to using the whole heap, so it doesn't take that > much to push it over the edge. For example, the following combinations > all fail: > > -XX:+UseSerialGC -XX:-UseCompressedOops > -XX:+UseParallelGC -XX:-UseCompressedOops > -XX:+UseZGC (always has -XX:-UseCompressedOops) > -XX:+UseG1GC -XX:-UseCompressedOops -XX:ObjectAlignmentInBytes=16 > > I suggest we bump the max heap size to something like 512M, to give > the test more headroom and make it less sensitive to exact choice of > JVM flags. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8234382 > Webrev: http://cr.openjdk.java.net/~pliden/8234382/webrev.0 > > /Per From erik.osterlund at oracle.com Tue Nov 19 08:15:14 2019 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Tue, 19 Nov 2019 09:15:14 +0100 Subject: RFR: 8234379: ZGC: Do not resize TALBs unless -XX:ResizeTLAB is enabled In-Reply-To: <3c35513b-3e8d-1600-610a-bbb4d931be07@oracle.com> References: <3c35513b-3e8d-1600-610a-bbb4d931be07@oracle.com> Message-ID: <1acc25b8-7a1b-dcfe-bc5d-4bef5dde0513@oracle.com> Hi Per, Looks good. Thanks, /Erik On 2019-11-19 08:14, Per Liden wrote: > ZGC currently calls ThreadLocalAllocBuffer::resize() regardless of if > -XX:ResizeTLAB is enabled or not. This causes the following assert to > fail: > > #? Internal Error > (open/src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp:130), > pid=22060, tid=22069 > #? assert(ResizeTLAB) failed: Should not call this otherwise > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234379 > Webrev: http://cr.openjdk.java.net/~pliden/8234379/webrev.0 > > /Per From thomas.schatzl at oracle.com Tue Nov 19 08:21:31 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 19 Nov 2019 09:21:31 +0100 Subject: RFR: 8234382: Test tools/javac/processing/model/testgetallmembers/Main.java using too small heap In-Reply-To: References: Message-ID: Hi, On 19.11.19 08:32, Per Liden wrote: > Hi, > > Please review this one-liner test fix. > > The test tools/javac/processing/model/testgetallmembers/Main.java is[...] > I suggest we bump the max heap size to something like 512M, to give the > test more headroom and make it less sensitive to exact choice of JVM flags. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234382 > Webrev: http://cr.openjdk.java.net/~pliden/8234382/webrev.0 looks good. 
Thomas From thomas.schatzl at oracle.com Tue Nov 19 08:22:58 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 19 Nov 2019 09:22:58 +0100 Subject: RFR: 8234379: ZGC: Do not resize TALBs unless -XX:ResizeTLAB is enabled In-Reply-To: <3c35513b-3e8d-1600-610a-bbb4d931be07@oracle.com> References: <3c35513b-3e8d-1600-610a-bbb4d931be07@oracle.com> Message-ID: Hi, On 19.11.19 08:14, Per Liden wrote: > ZGC currently calls ThreadLocalAllocBuffer::resize() regardless of if > -XX:ResizeTLAB is enabled or not. This causes the following assert to fail: > > #? Internal Error > (open/src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp:130), > pid=22060, tid=22069 > #? assert(ResizeTLAB) failed: Should not call this otherwise > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234379 > Webrev: http://cr.openjdk.java.net/~pliden/8234379/webrev.0 > > /Per looks good. Thomas From per.liden at oracle.com Tue Nov 19 08:33:43 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 19 Nov 2019 09:33:43 +0100 Subject: RFR: 8234379: ZGC: Do not resize TALBs unless -XX:ResizeTLAB is enabled In-Reply-To: <1acc25b8-7a1b-dcfe-bc5d-4bef5dde0513@oracle.com> References: <3c35513b-3e8d-1600-610a-bbb4d931be07@oracle.com> <1acc25b8-7a1b-dcfe-bc5d-4bef5dde0513@oracle.com> Message-ID: <83b2101e-628c-d581-319e-c4e5b9b0cd46@oracle.com> Thanks Erik! /Per On 11/19/19 9:15 AM, Erik ?sterlund wrote: > Hi Per, > > Looks good. > > Thanks, > /Erik > > On 2019-11-19 08:14, Per Liden wrote: >> ZGC currently calls ThreadLocalAllocBuffer::resize() regardless of if >> -XX:ResizeTLAB is enabled or not. This causes the following assert to >> fail: >> >> #? Internal Error >> (open/src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp:130), >> pid=22060, tid=22069 >> #? 
assert(ResizeTLAB) failed: Should not call this otherwise >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234379 >> Webrev: http://cr.openjdk.java.net/~pliden/8234379/webrev.0 >> >> /Per > From per.liden at oracle.com Tue Nov 19 08:33:17 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 19 Nov 2019 09:33:17 +0100 Subject: RFR: 8234382: Test tools/javac/processing/model/testgetallmembers/Main.java using too small heap In-Reply-To: <8757fad5-88b5-4e28-af8e-01dd457deec9@oracle.com> References: <8757fad5-88b5-4e28-af8e-01dd457deec9@oracle.com> Message-ID: Thanks Erik! /Per On 11/19/19 9:13 AM, Erik ?sterlund wrote: > Hi Per, > > Looks good. > > Thanks, > /Erik > > On 2019-11-19 08:32, Per Liden wrote: >> Hi, >> >> Please review this one-liner test fix. >> >> The test tools/javac/processing/model/testgetallmembers/Main.java is >> assuming that a 256M heap is enough to hold its live-set, but this is >> only true under some conditions. There are a number of JVM flags that, >> when used, can break this assumption. For example, choice of GC, >> compressed oop, object alignment, and other options affecting the heap >> layout or allocation strategy. Under ideal conditions, the test is >> already fairly close to using the whole heap, so it doesn't take that >> much to push it over the edge. For example, the following combinations >> all fail: >> >> -XX:+UseSerialGC -XX:-UseCompressedOops >> -XX:+UseParallelGC -XX:-UseCompressedOops >> -XX:+UseZGC (always has -XX:-UseCompressedOops) >> -XX:+UseG1GC -XX:-UseCompressedOops -XX:ObjectAlignmentInBytes=16 >> >> I suggest we bump the max heap size to something like 512M, to give >> the test more headroom and make it less sensitive to exact choice of >> JVM flags. 
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234382
>> Webrev: http://cr.openjdk.java.net/~pliden/8234382/webrev.0
>>
>> /Per
>

From thomas.schatzl at oracle.com Tue Nov 19 08:35:37 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 19 Nov 2019 09:35:37 +0100
Subject: RFR(XS): 8232533: G1 uses only a single thread for pretouching the java heap
In-Reply-To: <32e58527-abb7-43c8-719d-b64529abbba6@oracle.com>
References: <32e58527-abb7-43c8-719d-b64529abbba6@oracle.com>
Message-ID: <46a08221-0ae2-7025-911b-899d124a5529@oracle.com>

Hi,

On 18.11.19 22:31, sangheon.kim at oracle.com wrote:
> Hi all,
>
> Can I have some reviews for this small patch?
>
> G1 initiates only 1 GC thread for faster start-up and then initialize
> more when we need more GC threads.
> This is also same when we enable +AlwaysPreTouch option, so 1 thread
> touching all heap situation happens as the CR described.
>
> The proposed patch is trying to cap the total worker thread count
> instead of active worker thread count. And this will make faster
> start-up as well.

The rationale is that, supposedly, a user specifying AlwaysPreTouch wants the best performance and does not care so much about the time it takes to initialize the extra threads. Also initializing the threads first and then doing the work in parallel should be faster than pretouching with only a single thread. :)

I.e. in the case of the CR, it takes 2mins to pretouch the heap - initializing the threads shouldn't be that slow :P

>
> CR: https://bugs.openjdk.java.net/browse/JDK-8232533
> Webrev: http://cr.openjdk.java.net/~sangheki/8232533/webrev.0/
> Testing: hs-tier1

Please fix the copyright date before pushing if you want. I do not need a re-review for this change. Looks good otherwise.
Thanks,
  Thomas

From stefan.johansson at oracle.com Tue Nov 19 09:11:05 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 19 Nov 2019 10:11:05 +0100
Subject: RFR(XS): 8232533: G1 uses only a single thread for pretouching the java heap
In-Reply-To: <46a08221-0ae2-7025-911b-899d124a5529@oracle.com>
References: <32e58527-abb7-43c8-719d-b64529abbba6@oracle.com> <46a08221-0ae2-7025-911b-899d124a5529@oracle.com>
Message-ID: <9020bf46-96a7-79ad-fe8b-9e1cd9f89498@oracle.com>

On 2019-11-19 09:35, Thomas Schatzl wrote:
> Hi,
>
> On 18.11.19 22:31, sangheon.kim at oracle.com wrote:
>> Hi all,
>>
>> Can I have some reviews for this small patch?
>>
>> G1 initiates only 1 GC thread for faster start-up and then initialize
>> more when we need more GC threads.
>> This is also same when we enable +AlwaysPreTouch option, so 1 thread
>> touching all heap situation happens as the CR described.
>>
>> The proposed patch is trying to cap the total worker thread count
>> instead of active worker thread count. And this will make faster
>> start-up as well.
>
>   the rationale is that supposedly if a user is specifying
> AlwaysPreTouch, he wants best performance, and does not care so much
> about the time it takes to initialize the extra threads.
> Also initializing the threads first and then doing the work in parallel
> should be faster than pretouching with only a single thread. :)
>
> I.e. in the case of the CR, it takes 2mins to pretouch the heap -
> initializing the threads shouldn't be that slow :P
>
>>
>> CR: https://bugs.openjdk.java.net/browse/JDK-8232533
>> Webrev: http://cr.openjdk.java.net/~sangheki/8232533/webrev.0/
>> Testing: hs-tier1
>
> Please fix the copyright date before pushing if you want. I do not need
> a re-review for this change. Looks good otherwise.

Looks good to me too,
Stefan

>
> Thanks,
>
Thomas

From stefan.johansson at oracle.com Tue Nov 19 09:23:26 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 19 Nov 2019 10:23:26 +0100
Subject: RFR: 8141637: Parallelize single threaded heap region iteration during Pre Evacuate Collection Set
In-Reply-To: <20803d82-e49d-0833-6134-ce881f5e0480@oracle.com>
References: <96d72eba-7ad9-30a0-18d2-944e6cf826be@oracle.com> <266d3b83-db50-3f56-c74c-fe4225006d51@oracle.com> <20803d82-e49d-0833-6134-ce881f5e0480@oracle.com>
Message-ID: <7feccf49-0e79-204d-d31e-eb7064dc4cc3@oracle.com>

Hi Thomas,

On 2019-11-18 18:59, Thomas Schatzl wrote:
> Hi Stefan,
>
> On 18.11.19 17:35, Stefan Johansson wrote:
>> Hi Thomas,
>>
>> Thanks for reviewing.
>>
>> On 2019-11-15 11:35, Thomas Schatzl wrote:
>>> Hi,
>>>
>>> On 15.11.19 10:38, Stefan Johansson wrote:
>>>> Hi,
>>>>
>>>> Please review this fix to parallelize parts of the G1 young gc
>>>> preparation.
>>>>
>>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8141637
>>>> Webrev: http://cr.openjdk.java.net/~sjohanss/8141637/00/
> [...]
>
>>
>> Updated webrevs:
>> Full: http://cr.openjdk.java.net/~sjohanss/8141637/01/
>> Inc: http://cr.openjdk.java.net/~sjohanss/8141637/00-01/
>>
>
>   looks good.
>
> It would be nice to rename the G1RemSet::prepare_for_scan_heap_roots()
> method to G1RemSet::exclude_from_scan() as we discussed internally in
> this change too. It does not seem to warrant an extra CR, unless you
> have more planned.
>
> I would not need a re-review for that rename, but you need to wait for
> another reviewer anyway....

Not sure how I missed to include that, fixed and here are the new webrevs:
Full: http://cr.openjdk.java.net/~sjohanss/8141637/02
Inc: http://cr.openjdk.java.net/~sjohanss/8141637/01-02

Thanks,
Stefan

>
> Thanks,
>
Thomas > From thomas.schatzl at oracle.com Tue Nov 19 09:25:38 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 19 Nov 2019 10:25:38 +0100 Subject: RFR (M): 8231579: G1's incremental calculation of region elapsed time always uses the same age group for prediction Message-ID: Hi all, can I have reviews for this change that fixes use of the wrong predictors when adding a new mutator region to the collection set as it is retired? G1, through the young remset sampling thread, and the mutator threads when they retire, keep track of a prediction for the time it takes to evacuate the entire young gen. The mutator thread, when it retires a region, updates that value by the current prediction of that retired region: that is where the error occurs: currently, it only ever uses the prediction for the region that has been retired just before GC. Typically a lot of data in this region survives (e.g. 80%+), so the prediction for the overall young gen is heavily inflated. Which means that during mixed gc, a smaller than necessary amount of time is seen as left for evacuating old gen regions, which in turn means that G1 typically does more, shorter than expected mixed gcs. This wastes some throughput, as typically normal young collections can take a much larger eden. This change fixes that by the mutator and the young remset sampling thread not updating the time it takes to copy the contents of eden regions - the predictions for that are fixed anyway after a GC as the predictors for how many objects are expected to be live in a region are not updated while the mutator is running. Only other components of the prediction are. The time to copy the eden regions is later, when G1 finalizes the prediction for the time the young gen will take, added back. 
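To make the inflation described above concrete, here is a minimal sketch and not G1 code: it assumes a hypothetical table of per-age survival rates (in percent) and a hypothetical `Region` record, and compares summing each region's own prediction with applying the rate of the most recently retired region to every region. All names and numbers are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an eden region: how many bytes it holds and
// which age index (slot in the survival-rate table) it belongs to.
struct Region {
  size_t used_bytes;
  size_t age_index;
};

// Predicted bytes to copy when every region uses the survival rate
// (in percent) of its own age index.
static size_t predicted_copied_bytes(const std::vector<Region>& eden,
                                     const std::vector<size_t>& survival_pct) {
  size_t total = 0;
  for (size_t i = 0; i < eden.size(); i++) {
    total += eden[i].used_bytes * survival_pct[eden[i].age_index] / 100;
  }
  return total;
}

// Predicted bytes to copy when one age index is (incorrectly) applied to
// all regions, e.g. the index of the region retired just before the GC.
static size_t predicted_copied_bytes_single_age(const std::vector<Region>& eden,
                                                const std::vector<size_t>& survival_pct,
                                                size_t age_index) {
  size_t total = 0;
  for (size_t i = 0; i < eden.size(); i++) {
    total += eden[i].used_bytes * survival_pct[age_index] / 100;
  }
  return total;
}
```

With four regions of 1,000,000 bytes each and assumed survival rates of 5, 10, 20 and 80 percent, the per-age sum is 1,150,000 bytes, whereas applying the 80 percent rate to all four regions gives 3,200,000 bytes. A prediction inflated like this leaves less of the pause budget for old gen regions in a mixed gc.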
CR: https://bugs.openjdk.java.net/browse/JDK-8231579
Webrev: http://cr.openjdk.java.net/~tschatzl/8231579/webrev/
Testing: hs-tier1-5; local tests showed significant decreases in time
spent in mixed gcs; dev-submit testing: no significant score changes
either with this change in particular or compared to all recent changes

Thanks,
  Thomas

From thomas.schatzl at oracle.com Tue Nov 19 09:27:56 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 19 Nov 2019 10:27:56 +0100
Subject: RFR: 8141637: Parallelize single threaded heap region iteration during Pre Evacuate Collection Set
In-Reply-To: <7feccf49-0e79-204d-d31e-eb7064dc4cc3@oracle.com>
References: <96d72eba-7ad9-30a0-18d2-944e6cf826be@oracle.com> <266d3b83-db50-3f56-c74c-fe4225006d51@oracle.com> <20803d82-e49d-0833-6134-ce881f5e0480@oracle.com> <7feccf49-0e79-204d-d31e-eb7064dc4cc3@oracle.com>
Message-ID:

Hi Stefan,

On 19.11.19 10:23, Stefan Johansson wrote:
> Hi Thomas,
>
> On 2019-11-18 18:59, Thomas Schatzl wrote:
>> Hi Stefan,
>> [...]
>>
>>>
>>> Updated webrevs:
>>> Full: http://cr.openjdk.java.net/~sjohanss/8141637/01/
>>> Inc: http://cr.openjdk.java.net/~sjohanss/8141637/00-01/
>>>
>>
>>   looks good.
>>
>> It would be nice to rename the G1RemSet::prepare_for_scan_heap_roots()
>> method to G1RemSet::exclude_from_scan() as we discussed internally in
>> this change too. It does not seem to warrant an extra CR, unless you
>> have more planned.
>>
>> I would not need a re-review for that rename, but you need to wait for
>> another reviewer anyway....
> Not sure how I missed to include that, fixed and here are the new webrevs:
> Full: http://cr.openjdk.java.net/~sjohanss/8141637/02
> Inc: http://cr.openjdk.java.net/~sjohanss/8141637/01-02

Looks good. Thanks.
Thomas

From per.liden at oracle.com Tue Nov 19 09:59:33 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 19 Nov 2019 10:59:33 +0100
Subject: RFR: 8234379: ZGC: Do not resize TALBs unless -XX:ResizeTLAB is enabled
In-Reply-To:
References: <3c35513b-3e8d-1600-610a-bbb4d931be07@oracle.com>
Message-ID: <531f878c-63a6-283e-28e7-848b10b82721@oracle.com>

Thanks Thomas!

/Per

On 11/19/19 9:22 AM, Thomas Schatzl wrote:
> Hi,
>
> On 19.11.19 08:14, Per Liden wrote:
>> ZGC currently calls ThreadLocalAllocBuffer::resize() regardless of if
>> -XX:ResizeTLAB is enabled or not. This causes the following assert to
>> fail:
>>
>> #  Internal Error
>> (open/src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp:130),
>> pid=22060, tid=22069
>> #  assert(ResizeTLAB) failed: Should not call this otherwise
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234379
>> Webrev: http://cr.openjdk.java.net/~pliden/8234379/webrev.0
>>
>> /Per
>
>   looks good.
>
> Thomas

From per.liden at oracle.com Tue Nov 19 10:02:43 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 19 Nov 2019 11:02:43 +0100
Subject: RFR: 8234382: Test tools/javac/processing/model/testgetallmembers/Main.java using too small heap
In-Reply-To:
References:
Message-ID:

Thanks Thomas!

/Per

On 11/19/19 9:21 AM, Thomas Schatzl wrote:
> Hi,
>
> On 19.11.19 08:32, Per Liden wrote:
>> Hi,
>>
>> Please review this one-liner test fix.
>>
>> The test tools/javac/processing/model/testgetallmembers/Main.java is[...]
>> I suggest we bump the max heap size to something like 512M, to give
>> the test more headroom and make it less sensitive to exact choice of
>> JVM flags.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234382
>> Webrev: http://cr.openjdk.java.net/~pliden/8234382/webrev.0
>
>   looks good.
>
> Thomas

From stefan.johansson at oracle.com Tue Nov 19 10:30:13 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 19 Nov 2019 11:30:13 +0100
Subject: RFR: 8141637: Parallelize single threaded heap region iteration during Pre Evacuate Collection Set
In-Reply-To:
References: <96d72eba-7ad9-30a0-18d2-944e6cf826be@oracle.com> <266d3b83-db50-3f56-c74c-fe4225006d51@oracle.com> <20803d82-e49d-0833-6134-ce881f5e0480@oracle.com> <7feccf49-0e79-204d-d31e-eb7064dc4cc3@oracle.com>
Message-ID:

On 2019-11-19 10:27, Thomas Schatzl wrote:
> Hi Stefan,
>
> On 19.11.19 10:23, Stefan Johansson wrote:
>> Hi Thomas,
>>
>> On 2019-11-18 18:59, Thomas Schatzl wrote:
>>> Hi Stefan,
>>> [...]
>>>
>>>>
>>>> Updated webrevs:
>>>> Full: http://cr.openjdk.java.net/~sjohanss/8141637/01/
>>>> Inc: http://cr.openjdk.java.net/~sjohanss/8141637/00-01/
>>>>
>>>
>>>   looks good.
>>>
>>> It would be nice to rename the
>>> G1RemSet::prepare_for_scan_heap_roots() method to
>>> G1RemSet::exclude_from_scan() as we discussed internally in this
>>> change too. It does not seem to warrant an extra CR, unless you have
>>> more planned.
>>>
>>> I would not need a re-review for that rename, but you need to wait
>>> for another reviewer anyway....
>> Not sure how I missed to include that, fixed and here are the new
>> webrevs:
>> Full: http://cr.openjdk.java.net/~sjohanss/8141637/02
>> Inc: http://cr.openjdk.java.net/~sjohanss/8141637/01-02
>
> Looks good. Thanks.

Thanks Thomas,

I just realized that the assertion added yesterday triggers due to free
regions, so I updated the latest webrevs inline to include a fix for
that, by reverting to an else-if statement. Re-running mach5 now.
Stefan

>
> Thomas
>

From thomas.schatzl at oracle.com Tue Nov 19 15:01:14 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 19 Nov 2019 16:01:14 +0100
Subject: RFR (M): 8233588: Clean up SurvRateGroup
In-Reply-To: <573C1C2E-FEA7-4F2F-8AFE-5A74560832E8@oracle.com>
References: <1ab7fd39-fc0c-79ee-e925-fc8b88e32177@oracle.com> <573C1C2E-FEA7-4F2F-8AFE-5A74560832E8@oracle.com>
Message-ID: <11fb9f65-0bdb-2570-96b2-f8876c955a14@oracle.com>

Hi,

On 19.11.19 07:19, Kim Barrett wrote:
>> On Nov 12, 2019, at 10:23 AM, Thomas Schatzl wrote:
>>
>> Hi all,
>>
>> can I have some reviews for this change that cleans up the SurvRateGroup class. In particular, while working with it I found that it contains two members that are duplicates of others.
>>
>> This removed a few methods, which in turn made some others obsolete.
>> [...]
>
> Looks good.
>
> One pre-existing issue:
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1CollectionSet.cpp
> 343 if (r->age_in_surv_rate_group() < 0) {
>
> [pre-existing]
>
> A few lines before we checked r->has_surv_rate_group(). If it doesn't
> have one, should we really be checking the age? Looks like we'd currently
> assert if we don't have one.
>
> I think just making this an "else if" with the preceding "has" check fixes it.
>
> ------------------------------------------------------------------------------
>

That does not work because the age_in_surv_rate_group() getter will
already assert with a bogus HeapRegion::_age_index, i.e. what this code
actually wants to check.

I was not sure about just removing the whole verification in
G1CollectionSet due to the many existing checks (I could still do that),
but then left it in by introducing a has_valid_age_in_surv_rate() bool
getter.
So fixed in

http://cr.openjdk.java.net/~tschatzl/8233588/webrev.0_to_1/
http://cr.openjdk.java.net/~tschatzl/8233588/webrev.1/

Passed jtreg gc/g1 testing

Thanks,
  Thomas

From sangheon.kim at oracle.com Tue Nov 19 15:02:39 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 19 Nov 2019 07:02:39 -0800
Subject: RFR(XS): 8232533: G1 uses only a single thread for pretouching the java heap
In-Reply-To: <9020bf46-96a7-79ad-fe8b-9e1cd9f89498@oracle.com>
References: <32e58527-abb7-43c8-719d-b64529abbba6@oracle.com> <46a08221-0ae2-7025-911b-899d124a5529@oracle.com> <9020bf46-96a7-79ad-fe8b-9e1cd9f89498@oracle.com>
Message-ID: <0622e559-672b-3a62-6966-c90b4d633971@oracle.com>

Hi Thomas and Stefan,

On 11/19/19 1:11 AM, Stefan Johansson wrote:
>
>
> On 2019-11-19 09:35, Thomas Schatzl wrote:
>> Hi,
>>
>> On 18.11.19 22:31, sangheon.kim at oracle.com wrote:
>>> Hi all,
>>>
>>> Can I have some reviews for this small patch?
>>>
>>> G1 initiates only 1 GC thread for faster start-up and then
>>> initialize more when we need more GC threads.
>>> This is also same when we enable +AlwaysPreTouch option, so 1 thread
>>> touching all heap situation happens as the CR described.
>>>
>>> The proposed patch is trying to cap the total worker thread count
>>> instead of active worker thread count. And this will make faster
>>> start-up as well.
>>
>>   the rationale is that supposedly if a user is specifying
>> AlwaysPreTouch, he wants best performance, and does not care so much
>> about the time it takes to initialize the extra threads.
>> Also initializing the threads first and then doing the work in
>> parallel should be faster than pretouching with only a single thread. :)
>>
>> I.e.
in the case of the CR, it takes 2mins to pretouch the heap -
>> initializing the threads shouldn't be that slow :P
>>
>>>
>>> CR: https://bugs.openjdk.java.net/browse/JDK-8232533
>>> Webrev: http://cr.openjdk.java.net/~sangheki/8232533/webrev.0/
>>> Testing: hs-tier1
>>
>> Please fix the copyright date before pushing if you want. I do not
>> need a re-review for this change. Looks good otherwise.
> Looks good to me too,
> Stefan

Thanks for your review.
I will update the copyright year before pushing.

Thanks,
Sangheon

>>
>> Thanks,
>>   Thomas

From per.liden at oracle.com Tue Nov 19 18:54:45 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 19 Nov 2019 19:54:45 +0100
Subject: RFR: 8234338: ZGC: Improve small heap usage
Message-ID: <2f03fed4-d9c4-7479-754c-7335af6d3981@oracle.com>

When using small heaps (like less than 128M), the heap reserve tends to
take up a relatively large portion of the heap. We have quite a lot of
tests that explicitly set the heap size to small values (like -Xmx8M,
-Xmx16M, etc). Today, these tests often fail with OOME when using ZGC.
While ZGC isn't really that focused on tiny/small heaps, we still want
to make testing of ZGC easy without having to adjust these tests to use
a larger heap.

There are at least two things we can do when ZGC is given a small heap:
1) Dynamically scale the medium ZPage size, and even disable medium
pages all together when using tiny heaps.
2) Stop using per-CPU small pages for allocations, and instead switch to
using a single small page.

With this patch, ZGC can scale down to 8M.

Bug: https://bugs.openjdk.java.net/browse/JDK-8234338
Webrev: http://cr.openjdk.java.net/~pliden/8234338/webrev.0

Testing: Tier1-7 using ZGC. A new small heap test was also added.
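
As a rough illustration of the two heuristics listed above, the following sketch shows one way they could be expressed. It is not the code from the webrev: the function names, the 1/32 scaling factor and the 128M cut-off are assumptions made for this example only. The 2M small page size and 32M maximum medium page size are the usual ZGC values at the time of this thread.

```cpp
#include <cassert>
#include <cstddef>

static const size_t M = 1024 * 1024;
static const size_t kSmallPageSize = 2 * M;       // ZGC small page size
static const size_t kMaxMediumPageSize = 32 * M;  // unscaled medium page size

// Largest power of two that is <= v (returns 1 for v == 0 or v == 1).
static size_t round_down_power_of_2(size_t v) {
  size_t r = 1;
  while (r <= v / 2) {
    r *= 2;
  }
  return r;
}

// Hypothetical scaling rule: let a medium page be at most 1/32 of the max
// heap, capped at 32M. Returns 0 when medium pages should be disabled
// because the result would not be larger than a small page.
static size_t medium_page_size_for(size_t max_heap) {
  size_t candidate = round_down_power_of_2(max_heap / 32);
  if (candidate > kMaxMediumPageSize) {
    candidate = kMaxMediumPageSize;
  }
  return candidate > kSmallPageSize ? candidate : 0;
}

// Hypothetical rule for item 2: only use per-CPU small pages when the heap
// is at least 128M, otherwise share a single small page.
static bool use_per_cpu_small_pages(size_t max_heap) {
  return max_heap >= 128 * M;
}
```

Under these assumed rules an 8M heap gets no medium pages and a single shared small page, a 256M heap gets 8M medium pages, and heaps of 1G or more keep the full 32M medium page size.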
/Per From per.liden at oracle.com Tue Nov 19 18:54:48 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 19 Nov 2019 19:54:48 +0100 Subject: RFR: 8234361: ZGC: Move heuristics code in ZWorker to ZHeuristics Message-ID: This patch applies on top of JDK-8234338, which introduced ZHeuristics, and move the heuristics code in ZWorker into ZHeuristics. No logic changes, just code motion. Bug: https://bugs.openjdk.java.net/browse/JDK-8234361 Webrev: http://cr.openjdk.java.net/~pliden/8234361/webrev.0 /Per From stefan.karlsson at oracle.com Tue Nov 19 19:06:37 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 19 Nov 2019 20:06:37 +0100 Subject: RFR: 8234338: ZGC: Improve small heap usage In-Reply-To: <2f03fed4-d9c4-7479-754c-7335af6d3981@oracle.com> References: <2f03fed4-d9c4-7479-754c-7335af6d3981@oracle.com> Message-ID: Hi Per, Looks good. I would have preferred if the per_cpu_shared_small_page() function were prefixed with use_. I was again confused by that function when reading this for the second time. I'll let you decide if you agree or not. Thanks, StefanK On 2019-11-19 19:54, Per Liden wrote: > When using small heaps (like less than 128M), the heap reserve tends > to take up a relatively large portion of the heap. We have quite a lot > of tests that explicitly set the heap size to small values (like > -Xmx8M, -Xmx16M, etc). Today, these tests often fail with OOME when > using ZGC. While ZGC isn't really that focused on tiny/small heaps, we > still want to make testing of ZGC easy without having to adjust these > tests to use a larger heap. > > There are at least two things we can do when ZGC is given a small heap: > 1) Dynamically scale the medium ZPage size, and even disable medium > pages all together when using tiny heaps. > 2) Stop using per-CPU small pages for allocations, and instead switch > to using a single small page. > > With this patch, ZGC can scale down to 8M. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8234338 > Webrev: http://cr.openjdk.java.net/~pliden/8234338/webrev.0 > > Testing: Tier1-7 using ZGC. A new small heap test was also added. > > /Per From stefan.karlsson at oracle.com Tue Nov 19 19:07:25 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 19 Nov 2019 20:07:25 +0100 Subject: RFR: 8234361: ZGC: Move heuristics code in ZWorker to ZHeuristics In-Reply-To: References: Message-ID: <37e4c4bd-64cd-bf7e-b8cf-12e24755b67b@oracle.com> Looks good. StefanK On 2019-11-19 19:54, Per Liden wrote: > This patch applies on top of JDK-8234338, which introduced > ZHeuristics, and move the heuristics code in ZWorker into ZHeuristics. > No logic changes, just code motion. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234361 > Webrev: http://cr.openjdk.java.net/~pliden/8234361/webrev.0 > > /Per From stefan.karlsson at oracle.com Tue Nov 19 19:21:34 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 19 Nov 2019 20:21:34 +0100 Subject: RFR: 8234312: ZGC: Adjust warmup criteria In-Reply-To: <88a2d5d2-4dca-a17e-4411-73700d8593e7@oracle.com> References: <88a2d5d2-4dca-a17e-4411-73700d8593e7@oracle.com> Message-ID: <80924010-0013-7f85-a98d-b282ad42071e@oracle.com> Looks good. StefanK On 2019-11-18 11:15, Per Liden wrote: > JDK-8232001 introduced logic which ignores "Metastace GC Threshold" > collections until the GC has warmed up. While this works in principle, > it also causes some intermittent metaspace test failures. These tests > use a small MaxMetaspaceSize, and aggressively loads and unloads > classes. It's not completely obvious why these tests started to fail > after JDK-8232001. However, we will be collecting metaspace a little > bit later now, which could mean we have more fragmentation and are not > able to free as much memory when the GC eventually happens. 
> > This patch reverts to the old behavior (we no longer ignore "Metspace > GC Threshold" collections until the GC is warm), but instead we only > consider the GC to be warm once we've done three "Warmup" GCs. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234312 > Webrev: http://cr.openjdk.java.net/~pliden/8234312/webrev.0 > > Testing: Tier1-7 on ZGC. Multiple manual runs of > vmTestbase/gc/gctests/LoadUnloadGC (the test that intermittently fails > in our CI). > > /Per From manc at google.com Tue Nov 19 20:02:01 2019 From: manc at google.com (Man Cao) Date: Tue, 19 Nov 2019 12:02:01 -0800 Subject: RFR (M): 8087198: G1 card refinement: batching, sorting In-Reply-To: <387D12CA-4C5A-4DE5-A6E7-ED6739AC7173@oracle.com> References: <34BD822F-20DD-43E3-956F-09622EB62148@oracle.com> <5c700c0a-954e-6187-67b4-cc523cace95c@oracle.com> <387D12CA-4C5A-4DE5-A6E7-ED6739AC7173@oracle.com> Message-ID: Thanks for the feedback. I will address them soon. One question about casting: > I would prefer leaving the type alone here and instead using > static_cast(_node_buffer[i]) > (maybe packaged in a little helper here). > But I have a strong dislike for reinterpret_cast (with whatever > spelling) where it can reasonably be avoided. There is a problem with G1RemSet::clean_card_before_refine(CardValue*& card_ptr), that it needs to modify the card_ptr. In order to make the code clean for Edward's two-finger compaction, the code needs to call clean_card_before_refine(_node_buffer[i]) and able to modify _node_buffer[i] in-place. If we have "void** _node_buffer", then it cannot do static_cast(_node_buffer[i]) or static_cast(&_node_buffer[i]). I found a workaround: Change to: G1RemSet::clean_card_before_refine(CardValue** card_ptr_addr); Then: void* card_addr = &_node_buffer[i]; clean_card_before_refine(static_cast(card_addr)) However, I think this is basically reinterpret_cast(&_node_buffer[i]). Any suggestion on a better casting approach? 
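
For illustration only, and not code from the patch: a minimal sketch of the casting question above, using stand-in definitions. `CardValue` is modeled as a byte type, `clean_card_before_refine` is a simplified stand-in that may redirect the pointer it receives, and the helper name `clean_card_in_buffer` is hypothetical. The helper copies the element out with `static_cast`, lets the callee update it, and stores it back, which avoids `reinterpret_cast` at the cost of an extra load and store per element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef uint8_t CardValue;  // stand-in for the card table entry type

static CardValue g_cards[4];  // stand-in card table

// Simplified stand-in for a function that takes the card pointer by
// reference and may update it. Returns false if the card should be dropped.
static bool clean_card_before_refine(CardValue*& card_ptr) {
  if (card_ptr == &g_cards[1]) {
    card_ptr = &g_cards[2];  // pretend the card was redirected
  }
  return card_ptr != &g_cards[3];
}

// Hypothetical helper: static_cast<CardValue*>(node_buffer[i]) yields a
// temporary, which cannot bind to CardValue*&, so copy the element out,
// let the callee update the copy, and write the result back.
static bool clean_card_in_buffer(void** node_buffer, size_t i) {
  CardValue* card_ptr = static_cast<CardValue*>(node_buffer[i]);
  bool keep = clean_card_before_refine(card_ptr);
  node_buffer[i] = card_ptr;
  return keep;
}
```

Whether this is preferable to a single `reinterpret_cast` at the call site is a judgment call.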
-Man From per.liden at oracle.com Tue Nov 19 20:27:17 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 19 Nov 2019 21:27:17 +0100 Subject: RFR: 8234338: ZGC: Improve small heap usage In-Reply-To: References: <2f03fed4-d9c4-7479-754c-7335af6d3981@oracle.com> Message-ID: <2016dfc9-6bef-98fb-25a3-83156b461e1f@oracle.com> On 11/19/19 8:06 PM, Stefan Karlsson wrote: > Hi Per, > > Looks good. > > I would have preferred if the per_cpu_shared_small_page() function were > prefixed with use_. I was again confused by that function when reading > this for the second time. I'll let you decide if you agree or not. Will fix. Thanks for reviewing, Stefan! /Per > > Thanks, > StefanK > > On 2019-11-19 19:54, Per Liden wrote: >> When using small heaps (like less than 128M), the heap reserve tends >> to take up a relatively large portion of the heap. We have quite a lot >> of tests that explicitly set the heap size to small values (like >> -Xmx8M, -Xmx16M, etc). Today, these tests often fail with OOME when >> using ZGC. While ZGC isn't really that focused on tiny/small heaps, we >> still want to make testing of ZGC easy without having to adjust these >> tests to use a larger heap. >> >> There are at least two things we can do when ZGC is given a small heap: >> 1) Dynamically scale the medium ZPage size, and even disable medium >> pages all together when using tiny heaps. >> 2) Stop using per-CPU small pages for allocations, and instead switch >> to using a single small page. >> >> With this patch, ZGC can scale down to 8M. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234338 >> Webrev: http://cr.openjdk.java.net/~pliden/8234338/webrev.0 >> >> Testing: Tier1-7 using ZGC. A new small heap test was also added. 
>> >> /Per > From per.liden at oracle.com Tue Nov 19 20:27:30 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 19 Nov 2019 21:27:30 +0100 Subject: RFR: 8234361: ZGC: Move heuristics code in ZWorker to ZHeuristics In-Reply-To: <37e4c4bd-64cd-bf7e-b8cf-12e24755b67b@oracle.com> References: <37e4c4bd-64cd-bf7e-b8cf-12e24755b67b@oracle.com> Message-ID: <3df65502-5234-1e05-bdd2-649ed9789f1d@oracle.com> Thanks Stefan! /Per On 11/19/19 8:07 PM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2019-11-19 19:54, Per Liden wrote: >> This patch applies on top of JDK-8234338, which introduced >> ZHeuristics, and move the heuristics code in ZWorker into ZHeuristics. >> No logic changes, just code motion. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234361 >> Webrev: http://cr.openjdk.java.net/~pliden/8234361/webrev.0 >> >> /Per > From per.liden at oracle.com Tue Nov 19 20:27:36 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 19 Nov 2019 21:27:36 +0100 Subject: RFR: 8234312: ZGC: Adjust warmup criteria In-Reply-To: <80924010-0013-7f85-a98d-b282ad42071e@oracle.com> References: <88a2d5d2-4dca-a17e-4411-73700d8593e7@oracle.com> <80924010-0013-7f85-a98d-b282ad42071e@oracle.com> Message-ID: Thanks Stefan! /Per On 11/19/19 8:21 PM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2019-11-18 11:15, Per Liden wrote: >> JDK-8232001 introduced logic which ignores "Metastace GC Threshold" >> collections until the GC has warmed up. While this works in principle, >> it also causes some intermittent metaspace test failures. These tests >> use a small MaxMetaspaceSize, and aggressively loads and unloads >> classes. It's not completely obvious why these tests started to fail >> after JDK-8232001. However, we will be collecting metaspace a little >> bit later now, which could mean we have more fragmentation and are not >> able to free as much memory when the GC eventually happens. 
>>
>> This patch reverts to the old behavior (we no longer ignore "Metspace
>> GC Threshold" collections until the GC is warm), but instead we only
>> consider the GC to be warm once we've done three "Warmup" GCs.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234312
>> Webrev: http://cr.openjdk.java.net/~pliden/8234312/webrev.0
>>
>> Testing: Tier1-7 on ZGC. Multiple manual runs of
>> vmTestbase/gc/gctests/LoadUnloadGC (the test that intermittently fails
>> in our CI).
>>
>> /Per
>

From erik.osterlund at oracle.com Tue Nov 19 20:46:55 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 19 Nov 2019 21:46:55 +0100
Subject: RFR: 8234338: ZGC: Improve small heap usage
In-Reply-To: <2f03fed4-d9c4-7479-754c-7335af6d3981@oracle.com>
References: <2f03fed4-d9c4-7479-754c-7335af6d3981@oracle.com>
Message-ID:

Hi Per,

Looks good.

Thanks,
/Erik

> On 19 Nov 2019, at 19:54, Per Liden wrote:
>
> When using small heaps (like less than 128M), the heap reserve tends to take up a relatively large portion of the heap. We have quite a lot of tests that explicitly set the heap size to small values (like -Xmx8M, -Xmx16M, etc). Today, these tests often fail with OOME when using ZGC. While ZGC isn't really that focused on tiny/small heaps, we still want to make testing of ZGC easy without having to adjust these tests to use a larger heap.
>
> There are at least two things we can do when ZGC is given a small heap:
> 1) Dynamically scale the medium ZPage size, and even disable medium pages all together when using tiny heaps.
> 2) Stop using per-CPU small pages for allocations, and instead switch to using a single small page.
>
> With this patch, ZGC can scale down to 8M.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8234338
> Webrev: http://cr.openjdk.java.net/~pliden/8234338/webrev.0
>
> Testing: Tier1-7 using ZGC. A new small heap test was also added.
>
> /Per

From erik.osterlund at oracle.com Tue Nov 19 20:48:27 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 19 Nov 2019 21:48:27 +0100
Subject: RFR: 8234361: ZGC: Move heuristics code in ZWorker to ZHeuristics
In-Reply-To:
References:
Message-ID: <14BF1448-2D62-455F-80F5-03A5BEF3A880@oracle.com>

Hi Per,

Looks good.

Thanks,
/Erik

> On 19 Nov 2019, at 19:54, Per Liden wrote:
>
> This patch applies on top of JDK-8234338, which introduced ZHeuristics, and move the heuristics code in ZWorker into ZHeuristics. No logic changes, just code motion.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8234361
> Webrev: http://cr.openjdk.java.net/~pliden/8234361/webrev.0
>
> /Per

From per.liden at oracle.com Tue Nov 19 20:59:11 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 19 Nov 2019 21:59:11 +0100
Subject: RFR: 8234338: ZGC: Improve small heap usage
In-Reply-To:
References: <2f03fed4-d9c4-7479-754c-7335af6d3981@oracle.com>
Message-ID: <2da97bd2-8d67-1383-a023-72b98b38b7ac@oracle.com>

Thanks Erik!

/Per

On 11/19/19 9:46 PM, Erik Österlund wrote:
> Hi Per,
>
> Looks good.
>
> Thanks,
> /Erik
>
>> On 19 Nov 2019, at 19:54, Per Liden wrote:
>>
>> When using small heaps (like less than 128M), the heap reserve tends to take up a relatively large portion of the heap. We have quite a lot of tests that explicitly set the heap size to small values (like -Xmx8M, -Xmx16M, etc). Today, these tests often fail with OOME when using ZGC. While ZGC isn't really that focused on tiny/small heaps, we still want to make testing of ZGC easy without having to adjust these tests to use a larger heap.
>>
>> There are at least two things we can do when ZGC is given a small heap:
>> 1) Dynamically scale the medium ZPage size, and even disable medium pages all together when using tiny heaps.
>> 2) Stop using per-CPU small pages for allocations, and instead switch to using a single small page.
>>
>> With this patch, ZGC can scale down to 8M.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234338
>> Webrev: http://cr.openjdk.java.net/~pliden/8234338/webrev.0
>>
>> Testing: Tier1-7 using ZGC. A new small heap test was also added.
>>
>> /Per
>

From per.liden at oracle.com Tue Nov 19 20:59:19 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 19 Nov 2019 21:59:19 +0100
Subject: RFR: 8234361: ZGC: Move heuristics code in ZWorker to ZHeuristics
In-Reply-To: <14BF1448-2D62-455F-80F5-03A5BEF3A880@oracle.com>
References: <14BF1448-2D62-455F-80F5-03A5BEF3A880@oracle.com>
Message-ID: <28a30288-ec65-b37c-89d1-2996a1e8f750@oracle.com>

Thanks Erik!

/Per

On 11/19/19 9:48 PM, Erik Österlund wrote:
> Hi Per,
>
> Looks good.
>
> Thanks,
> /Erik
>
>> On 19 Nov 2019, at 19:54, Per Liden wrote:
>>
>> This patch applies on top of JDK-8234338, which introduced ZHeuristics, and move the heuristics code in ZWorker into ZHeuristics. No logic changes, just code motion.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234361
>> Webrev: http://cr.openjdk.java.net/~pliden/8234361/webrev.0
>>
>> /Per
>

From kim.barrett at oracle.com Tue Nov 19 21:21:02 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 19 Nov 2019 16:21:02 -0500
Subject: RFR (M): 8233588: Clean up SurvRateGroup
In-Reply-To: <11fb9f65-0bdb-2570-96b2-f8876c955a14@oracle.com>
References: <1ab7fd39-fc0c-79ee-e925-fc8b88e32177@oracle.com> <573C1C2E-FEA7-4F2F-8AFE-5A74560832E8@oracle.com> <11fb9f65-0bdb-2570-96b2-f8876c955a14@oracle.com>
Message-ID: <9C357F79-782A-42B3-BCDF-A4161D2A2DBD@oracle.com>

> On Nov 19, 2019, at 10:01 AM, Thomas Schatzl wrote:
> So fixed in
>
> http://cr.openjdk.java.net/~tschatzl/8233588/webrev.0_to_1/
> http://cr.openjdk.java.net/~tschatzl/8233588/webrev.1/
>
> Passed jtreg gc/g1 testing
>
> Thanks,
> Thomas

Looks good.
From kim.barrett at oracle.com Tue Nov 19 21:31:12 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 19 Nov 2019 16:31:12 -0500 Subject: RFR (M): 8228609: G1 copy cost prediction uses used vs. actual copied bytes In-Reply-To: <96c45857-03b3-40c4-ca90-2ec7bc2fa2d8@oracle.com> References: <3A799B0C-76B1-4145-A000-1071672BD566@oracle.com> <96c45857-03b3-40c4-ca90-2ec7bc2fa2d8@oracle.com> Message-ID: > On Nov 12, 2019, at 4:06 AM, Thomas Schatzl wrote: > > Hi Kim, > > sorry for the late reply - I have been working on updating this, but other things went in-between. Sorry. And then I failed to keep track of the update? > [?] > Webrev: > http://cr.openjdk.java.net/~tschatzl/8228609/webrev.0_to_1/ (diff) > http://cr.openjdk.java.net/~tschatzl/8228609/webrev.1/ (full) > Testing: > hs-tier1-5 > > Thanks, > Thomas Looks good. From kim.barrett at oracle.com Tue Nov 19 21:58:07 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 19 Nov 2019 16:58:07 -0500 Subject: RFR (M): 8087198: G1 card refinement: batching, sorting In-Reply-To: References: <34BD822F-20DD-43E3-956F-09622EB62148@oracle.com> <5c700c0a-954e-6187-67b4-cc523cace95c@oracle.com> <387D12CA-4C5A-4DE5-A6E7-ED6739AC7173@oracle.com> Message-ID: <37161B48-8AD1-4887-A56D-4905BD8DC159@oracle.com> > On Nov 19, 2019, at 3:02 PM, Man Cao wrote: > > Thanks for the feedback. I will address them soon. > > One question about casting: > I would prefer leaving the type alone here and instead using > static_cast(_node_buffer[i]) > (maybe packaged in a little helper here). > But I have a strong dislike for reinterpret_cast (with whatever > spelling) where it can reasonably be avoided. > > There is a problem with G1RemSet::clean_card_before_refine(CardValue*& card_ptr), > that it needs to modify the card_ptr. In order to make the code clean for Edward's > two-finger compaction, the code needs to call > clean_card_before_refine(_node_buffer[i]) and able to modify _node_buffer[i] in-place. 
> > If we have "void** _node_buffer", then it cannot do > static_cast(_node_buffer[i]) > or > static_cast(&_node_buffer[i]). > > I found a workaround: > Change to: > G1RemSet::clean_card_before_refine(CardValue** card_ptr_addr); > Then: > void* card_addr = &_node_buffer[i]; > clean_card_before_refine(static_cast(card_addr)) > > However, I think this is basically reinterpret_cast(&_node_buffer[i]). > Any suggestion on a better casting approach? > > -Man Ick! I forgot about clean_card_before_refine possibly updating. Given that, I think the various cures are worse; just go with the reinterpret_cast you had originally. From kim.barrett at oracle.com Tue Nov 19 22:01:41 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 19 Nov 2019 17:01:41 -0500 Subject: RFR (M): 8087198: G1 card refinement: batching, sorting In-Reply-To: <37161B48-8AD1-4887-A56D-4905BD8DC159@oracle.com> References: <34BD822F-20DD-43E3-956F-09622EB62148@oracle.com> <5c700c0a-954e-6187-67b4-cc523cace95c@oracle.com> <387D12CA-4C5A-4DE5-A6E7-ED6739AC7173@oracle.com> <37161B48-8AD1-4887-A56D-4905BD8DC159@oracle.com> Message-ID: <4D5605AD-7F3E-4AB4-8A13-5AA18DC0AB40@oracle.com> > On Nov 19, 2019, at 4:58 PM, Kim Barrett wrote: > >> On Nov 19, 2019, at 3:02 PM, Man Cao wrote: >> >> Thanks for the feedback. I will address them soon. >> >> One question about casting: >> I would prefer leaving the type alone here and instead using >> static_cast(_node_buffer[i]) >> (maybe packaged in a little helper here). >> But I have a strong dislike for reinterpret_cast (with whatever >> spelling) where it can reasonably be avoided. >> >> There is a problem with G1RemSet::clean_card_before_refine(CardValue*& card_ptr), >> that it needs to modify the card_ptr. In order to make the code clean for Edward's >> two-finger compaction, the code needs to call >> clean_card_before_refine(_node_buffer[i]) and able to modify _node_buffer[i] in-place. 
>> >> If we have "void** _node_buffer", then it cannot do >> static_cast(_node_buffer[i]) >> or >> static_cast(&_node_buffer[i]). >> >> I found a workaround: >> Change to: >> G1RemSet::clean_card_before_refine(CardValue** card_ptr_addr); >> Then: >> void* card_addr = &_node_buffer[i]; >> clean_card_before_refine(static_cast(card_addr)) >> >> However, I think this is basically reinterpret_cast(&_node_buffer[i]). >> Any suggestion on a better casting approach? >> >> -Man > > Ick! I forgot about clean_card_before_refine possibly updating. > Given that, I think the various cures are worse; just go with the > reinterpret_cast you had originally. Of course, there is JDK-8225409 :) From per.liden at oracle.com Tue Nov 19 23:11:36 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 20 Nov 2019 00:11:36 +0100 Subject: RFR: 8234437: Remove CollectedHeap::safe_object_iterate() Message-ID: <175f6542-2372-18c2-a051-9c1bfff58aaa@oracle.com> With CMS now removed, there is no longer any GC where CollectedHeap::object_iterate() and CollectedHeap::safe_object_iterate() do different things. Hence, we can remove CollectedHeap::safe_object_iterate() and let all uses of it shift over to use CollectedHeap::object_iterate(). Bug: https://bugs.openjdk.java.net/browse/JDK-8234437 Webrev: http://cr.openjdk.java.net/~pliden/8234437/webrev.0 /Per From per.liden at oracle.com Tue Nov 19 23:24:53 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 20 Nov 2019 00:24:53 +0100 Subject: RFR: 8234438: Remove some CMS leftovers Message-ID: <0c4d0f05-0878-79db-9684-6c065d560487@oracle.com> The following files has some CMS leftovers that can be removed. I'm sure there are more things like this here and there. I just happened to run across these while working on a different patch. 
src/hotspot/share/gc/shared/space.cpp src/hotspot/share/gc/shared/space.hpp src/hotspot/share/gc/shared/space.inline.hpp src/hotspot/share/memory/freeList.hpp src/hotspot/share/memory/iterator.hpp Bug: https://bugs.openjdk.java.net/browse/JDK-8234438 Webrev: http://cr.openjdk.java.net/~pliden/8234438/webrev.0 /Per From sangheon.kim at oracle.com Tue Nov 19 23:33:43 2019 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Tue, 19 Nov 2019 15:33:43 -0800 Subject: RFR (M): 8233588: Clean up SurvRateGroup In-Reply-To: <9C357F79-782A-42B3-BCDF-A4161D2A2DBD@oracle.com> References: <1ab7fd39-fc0c-79ee-e925-fc8b88e32177@oracle.com> <573C1C2E-FEA7-4F2F-8AFE-5A74560832E8@oracle.com> <11fb9f65-0bdb-2570-96b2-f8876c955a14@oracle.com> <9C357F79-782A-42B3-BCDF-A4161D2A2DBD@oracle.com> Message-ID: Hi Thomas, On 11/19/19 1:21 PM, Kim Barrett wrote: >> On Nov 19, 2019, at 10:01 AM, Thomas Schatzl wrote: >> So fixed in >> >> http://cr.openjdk.java.net/~tschatzl/8233588/webrev.0_to_1/ >> http://cr.openjdk.java.net/~tschatzl/8233588/webrev.1/ >> >> Passed jtreg gc/g1 testing >> >> Thanks, >> Thomas > Looks good. Looks good to me too. Nice cleanup. If you are interested updating the copyright year, survRateGroup.cpp needs to be updated. :) I don't need another webrev for this. Thanks, Sangheon > From manc at google.com Wed Nov 20 02:26:23 2019 From: manc at google.com (Man Cao) Date: Tue, 19 Nov 2019 18:26:23 -0800 Subject: RFR (M): 8087198: G1 card refinement: batching, sorting In-Reply-To: <4D5605AD-7F3E-4AB4-8A13-5AA18DC0AB40@oracle.com> References: <34BD822F-20DD-43E3-956F-09622EB62148@oracle.com> <5c700c0a-954e-6187-67b4-cc523cace95c@oracle.com> <387D12CA-4C5A-4DE5-A6E7-ED6739AC7173@oracle.com> <37161B48-8AD1-4887-A56D-4905BD8DC159@oracle.com> <4D5605AD-7F3E-4AB4-8A13-5AA18DC0AB40@oracle.com> Message-ID: Hi all, Thanks! 

I have addressed all comments:
Full: http://cr.openjdk.java.net/~manc/8087198/webrev.02/
Incremental: http://cr.openjdk.java.net/~manc/8087198/webrev.01-02.inc/
Tested on submit repo and stress tested locally with and without the KVHashtable.
Removing DEBUG_ONLY KVHashtable fixed the Windows build errors on submit repo.

Some responses below.

> The assert looks safe (and the code valid) because in case of a
> safepoint (e.g. remark) the card is re-evaluated with the potentially
> new top anyway.
> I.e. there can not be a safepoint with potential removal of the
> humongous regions between cleaning and actual refining. Other old gen
> region's top never change.

I also added a comment that cleaning and refining of a card cannot span across a safepoint.

> Though it will work and uses two pointers, that isn't the two-finger
> compaction algorithm I was suggesting. I intended something based on
> the algorithm attributed to Edwards (and used by us for SATB buffer
> filtering; see SATBMarkQueue::apply_filter, which might even be usable
> as-is with a suitable filter_out function); see Jones & Lins GC book,
> section 5.3; or Jones et al GC Handbook, section 3.1; or web search
> for "edwards two-finger compaction".
> It has the benefit over the proposed algorithm of doing no more and
> possibly significantly fewer element moves. It doesn't maintain the
> order of the elements, but we're doing a sort after the compaction.

Thanks for the clarification. Now it uses this algorithm, but I needed to rewrite the code a bit because the hot card cache can replace the buffer's element in-place.
I also changed G1RemSet::clean_card_before_refine() to take a "CardValue**" parameter instead of "CardValue*&" to make it more obvious that it can modify the card pointer.

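
For readers following along, here is a minimal standalone sketch of the two-finger compaction idea discussed above. It is not the code in the webrev: `two_finger_compact`, the `std::vector` buffer and the `filter_out` callback shape are assumptions made for illustration only. The callback receives a pointer to the slot so that it may update the entry in place, in the same spirit as passing "CardValue**".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative only: compacts buf so that all retained entries end up in
// buf[0, n) and returns n. Element order is not preserved.
// filter_out(T* slot) returns true if the entry should be discarded; it may
// also modify *slot in place. Each entry is passed to filter_out exactly once.
template <typename T, typename FilterOut>
std::size_t two_finger_compact(std::vector<T>& buf, FilterOut filter_out) {
  std::size_t src = 0;           // low finger, moves upward
  std::size_t dst = buf.size();  // high finger, moves downward
  while (src < dst) {
    if (!filter_out(&buf[src])) {
      ++src;                     // keeper is already in place
      continue;
    }
    // buf[src] is discarded: search from the top for a keeper to fill the hole.
    while (src < --dst) {
      if (!filter_out(&buf[dst])) {
        buf[src++] = buf[dst];   // one move per hole filled
        break;
      }
    }
    // If no keeper was found, dst == src and the outer loop ends.
  }
  assert(src <= buf.size());
  return src;
}
```

A caller would then sort buf[0, n) afterwards, since the compaction itself does not keep the original order.
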
-Man From kim.barrett at oracle.com Wed Nov 20 03:46:20 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 19 Nov 2019 22:46:20 -0500 Subject: RFR: 8234437: Remove CollectedHeap::safe_object_iterate() In-Reply-To: <175f6542-2372-18c2-a051-9c1bfff58aaa@oracle.com> References: <175f6542-2372-18c2-a051-9c1bfff58aaa@oracle.com> Message-ID: > On Nov 19, 2019, at 6:11 PM, Per Liden wrote: > > With CMS now removed, there is no longer any GC where CollectedHeap::object_iterate() and CollectedHeap::safe_object_iterate() do different things. Hence, we can remove CollectedHeap::safe_object_iterate() and let all uses of it shift over to use CollectedHeap::object_iterate(). > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234437 > Webrev: http://cr.openjdk.java.net/~pliden/8234437/webrev.0 > > /Per Looks good. From kim.barrett at oracle.com Wed Nov 20 03:53:32 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 19 Nov 2019 22:53:32 -0500 Subject: RFR: 8234438: Remove some CMS leftovers In-Reply-To: <0c4d0f05-0878-79db-9684-6c065d560487@oracle.com> References: <0c4d0f05-0878-79db-9684-6c065d560487@oracle.com> Message-ID: <5432024E-2547-4973-930F-1712CA652EF0@oracle.com> > On Nov 19, 2019, at 6:24 PM, Per Liden wrote: > > The following files has some CMS leftovers that can be removed. I'm sure there are more things like this here and there. I just happened to run across these while working on a different patch. > > src/hotspot/share/gc/shared/space.cpp > src/hotspot/share/gc/shared/space.hpp > src/hotspot/share/gc/shared/space.inline.hpp > src/hotspot/share/memory/freeList.hpp > src/hotspot/share/memory/iterator.hpp > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234438 > Webrev: http://cr.openjdk.java.net/~pliden/8234438/webrev.0 > > /Per Looks good. 
From per.liden at oracle.com Wed Nov 20 07:04:43 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 20 Nov 2019 08:04:43 +0100 Subject: RFR: 8234438: Remove some CMS leftovers In-Reply-To: <5432024E-2547-4973-930F-1712CA652EF0@oracle.com> References: <0c4d0f05-0878-79db-9684-6c065d560487@oracle.com> <5432024E-2547-4973-930F-1712CA652EF0@oracle.com> Message-ID: Thanks Kim! /Per On 11/20/19 4:53 AM, Kim Barrett wrote: >> On Nov 19, 2019, at 6:24 PM, Per Liden wrote: >> >> The following files has some CMS leftovers that can be removed. I'm sure there are more things like this here and there. I just happened to run across these while working on a different patch. >> >> src/hotspot/share/gc/shared/space.cpp >> src/hotspot/share/gc/shared/space.hpp >> src/hotspot/share/gc/shared/space.inline.hpp >> src/hotspot/share/memory/freeList.hpp >> src/hotspot/share/memory/iterator.hpp >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234438 >> Webrev: http://cr.openjdk.java.net/~pliden/8234438/webrev.0 >> >> /Per > > Looks good. > From per.liden at oracle.com Wed Nov 20 07:06:34 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 20 Nov 2019 08:06:34 +0100 Subject: RFR: 8234437: Remove CollectedHeap::safe_object_iterate() In-Reply-To: References: <175f6542-2372-18c2-a051-9c1bfff58aaa@oracle.com> Message-ID: <57fca097-77c6-78d7-2f8d-d73e21c85b01@oracle.com> Thanks Kim! /Per On 11/20/19 4:46 AM, Kim Barrett wrote: >> On Nov 19, 2019, at 6:11 PM, Per Liden wrote: >> >> With CMS now removed, there is no longer any GC where CollectedHeap::object_iterate() and CollectedHeap::safe_object_iterate() do different things. Hence, we can remove CollectedHeap::safe_object_iterate() and let all uses of it shift over to use CollectedHeap::object_iterate(). >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234437 >> Webrev: http://cr.openjdk.java.net/~pliden/8234437/webrev.0 >> >> /Per > > Looks good. 
> From stefan.johansson at oracle.com Wed Nov 20 08:41:36 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 20 Nov 2019 09:41:36 +0100 Subject: RFR: 8234438: Remove some CMS leftovers In-Reply-To: References: <0c4d0f05-0878-79db-9684-6c065d560487@oracle.com> <5432024E-2547-4973-930F-1712CA652EF0@oracle.com> Message-ID: Looks good to me too, StefanJ On 2019-11-20 08:04, Per Liden wrote: > Thanks Kim! > > /Per > > On 11/20/19 4:53 AM, Kim Barrett wrote: >>> On Nov 19, 2019, at 6:24 PM, Per Liden wrote: >>> >>> The following files has some CMS leftovers that can be removed. I'm >>> sure there are more things like this here and there. I just happened >>> to run across these while working on a different patch. >>> >>> src/hotspot/share/gc/shared/space.cpp >>> src/hotspot/share/gc/shared/space.hpp >>> src/hotspot/share/gc/shared/space.inline.hpp >>> src/hotspot/share/memory/freeList.hpp >>> src/hotspot/share/memory/iterator.hpp >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234438 >>> Webrev: http://cr.openjdk.java.net/~pliden/8234438/webrev.0 >>> >>> /Per >> >> Looks good. >> From stefan.johansson at oracle.com Wed Nov 20 08:46:24 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 20 Nov 2019 09:46:24 +0100 Subject: RFR: 8234437: Remove CollectedHeap::safe_object_iterate() In-Reply-To: <57fca097-77c6-78d7-2f8d-d73e21c85b01@oracle.com> References: <175f6542-2372-18c2-a051-9c1bfff58aaa@oracle.com> <57fca097-77c6-78d7-2f8d-d73e21c85b01@oracle.com> Message-ID: Thanks for cleaning up Per, Looks good, Stefan On 2019-11-20 08:06, Per Liden wrote: > Thanks Kim! > > /Per > > On 11/20/19 4:46 AM, Kim Barrett wrote: >>> On Nov 19, 2019, at 6:11 PM, Per Liden wrote: >>> >>> With CMS now removed, there is no longer any GC where >>> CollectedHeap::object_iterate() and >>> CollectedHeap::safe_object_iterate() do different things. 
Hence, we >>> can remove CollectedHeap::safe_object_iterate() and let all uses of >>> it shift over to use CollectedHeap::object_iterate(). >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234437 >>> Webrev: http://cr.openjdk.java.net/~pliden/8234437/webrev.0 >>> >>> /Per >> >> Looks good. >> From per.liden at oracle.com Wed Nov 20 09:02:46 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 20 Nov 2019 10:02:46 +0100 Subject: RFR: 8234437: Remove CollectedHeap::safe_object_iterate() In-Reply-To: References: <175f6542-2372-18c2-a051-9c1bfff58aaa@oracle.com> <57fca097-77c6-78d7-2f8d-d73e21c85b01@oracle.com> Message-ID: <167a6159-a39b-c580-bc94-2702f2cdce91@oracle.com> Thanks Stefan! /Per On 11/20/19 9:46 AM, Stefan Johansson wrote: > Thanks for cleaning up Per, > > Looks good, > Stefan > > On 2019-11-20 08:06, Per Liden wrote: >> Thanks Kim! >> >> /Per >> >> On 11/20/19 4:46 AM, Kim Barrett wrote: >>>> On Nov 19, 2019, at 6:11 PM, Per Liden wrote: >>>> >>>> With CMS now removed, there is no longer any GC where >>>> CollectedHeap::object_iterate() and >>>> CollectedHeap::safe_object_iterate() do different things. Hence, we >>>> can remove CollectedHeap::safe_object_iterate() and let all uses of >>>> it shift over to use CollectedHeap::object_iterate(). >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234437 >>>> Webrev: http://cr.openjdk.java.net/~pliden/8234437/webrev.0 >>>> >>>> /Per >>> >>> Looks good. >>> From per.liden at oracle.com Wed Nov 20 09:02:50 2019 From: per.liden at oracle.com (Per Liden) Date: Wed, 20 Nov 2019 10:02:50 +0100 Subject: RFR: 8234438: Remove some CMS leftovers In-Reply-To: References: <0c4d0f05-0878-79db-9684-6c065d560487@oracle.com> <5432024E-2547-4973-930F-1712CA652EF0@oracle.com> Message-ID: <50fee267-812e-eeb8-0fa3-56e229d24e6a@oracle.com> Thanks Stefan! /Per On 11/20/19 9:41 AM, Stefan Johansson wrote: > Looks good to me too, > StefanJ > > On 2019-11-20 08:04, Per Liden wrote: >> Thanks Kim! 
>> >> /Per >> >> On 11/20/19 4:53 AM, Kim Barrett wrote: >>>> On Nov 19, 2019, at 6:24 PM, Per Liden wrote: >>>> >>>> The following files has some CMS leftovers that can be removed. I'm >>>> sure there are more things like this here and there. I just happened >>>> to run across these while working on a different patch. >>>> >>>> src/hotspot/share/gc/shared/space.cpp >>>> src/hotspot/share/gc/shared/space.hpp >>>> src/hotspot/share/gc/shared/space.inline.hpp >>>> src/hotspot/share/memory/freeList.hpp >>>> src/hotspot/share/memory/iterator.hpp >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234438 >>>> Webrev: http://cr.openjdk.java.net/~pliden/8234438/webrev.0 >>>> >>>> /Per >>> >>> Looks good. >>> From stefan.johansson at oracle.com Wed Nov 20 10:02:38 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 20 Nov 2019 11:02:38 +0100 Subject: RFR (S): 8233998: New young regions registered too early in collection set In-Reply-To: References: Message-ID: Hi Thomas, On 2019-11-12 16:24, Thomas Schatzl wrote: > Hi, > > ? can I have reviews for this change that changes the place in which > new mutator regions are published in the collection set list? > > Previously a new eden region has been published before some data that > would be read by the young gen sampling thread could be visible. > > This change simply does the member updates before adding the regions to > the collection set. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8233998 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8233998/webrev/ Looks good, StefanJ > Testing: > hs-tier1-5 with other similar patches, specjvm2008 specjvm.validation > with -XX:G1ConcRefinementServiceIntervalMillis=1 (that one causes issues > quickly if you assert that the remembered sets for regions only grows) > > Thanks, > ? 
Thomas

From stefan.johansson at oracle.com Wed Nov 20 10:42:02 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 20 Nov 2019 11:42:02 +0100
Subject: RFR (M): 8233919: Incrementally calculate the occupied cards in a heap region remembered set
In-Reply-To: <23b30b41-b109-4d05-606f-fa6a87a07897@oracle.com>
References: <23b30b41-b109-4d05-606f-fa6a87a07897@oracle.com>
Message-ID: <16491a20-6f2a-0d70-42c2-6c23c3a9e407@oracle.com>

Hi Thomas,

On 2019-11-12 16:24, Thomas Schatzl wrote:
> Hi all,
>
>   may I have reviews for this change that ultimately makes sure that
> the number of occupied cards in a remembered set is only growing by
> providing a per-OtherRegionsTable count that is atomically updated when
> adding a remembered set entry.
>
> Note that this count may not be completely accurate due to races when
> deleting a PerRegionTable (which is a known issue) from an
> OtherRegionsTable; but that is no different than before.
>
> This helps improving the predictions in the young gen remset sampling
> thread, and increase the performance of getting the occupancy count.
>
> Based on JDK-8233997, and JDK-8233998 also out for review.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8233919
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8233919/webrev/

I like this change and it looks good in general, just one small comment:

src/hotspot/share/gc/g1/heapRegionRemSet.cpp
---
 247     bool added = prt->add_reference(from);
 248     Atomic::add(num_added_by_coarsening + (added ? 1 : 0), &_num_occupied, memory_order_relaxed);

I would prefer:
    if (prt->add_reference(from)) {
      num_added_by_coarsening++;
    }
    Atomic::add...

If you disagree, leave it as is.
---

Thanks,
Stefan

> Testing:
> hs-tier1-5,
>
> Thanks,
>
Thomas

From thomas.schatzl at oracle.com Wed Nov 20 10:49:51 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 20 Nov 2019 11:49:51 +0100
Subject: RFR (S): 8233998: New young regions registered too early in collection set
In-Reply-To:
References:
Message-ID: <6dc8783f-24ea-903f-a5a0-c2cc9df3b748@oracle.com>

Hi,

On 20.11.19 11:02, Stefan Johansson wrote:
> Hi Thomas,
>
> On 2019-11-12 16:24, Thomas Schatzl wrote:
>> Hi,
>>
>>    can I have reviews for this change that changes the place in which
>> new mutator regions are published in the collection set list?
>>
>> Previously a new eden region has been published before some data that
>> would be read by the young gen sampling thread could be visible.
>>
>> This change simply does the member updates before adding the regions
>> to the collection set.
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8233998
>> Webrev:
>> http://cr.openjdk.java.net/~tschatzl/8233998/webrev/
> Looks good,
> StefanJ

thanks for your review.

Thomas

From thomas.schatzl at oracle.com Wed Nov 20 10:51:28 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 20 Nov 2019 11:51:28 +0100
Subject: RFR (M): 8233588: Clean up SurvRateGroup
In-Reply-To:
References: <1ab7fd39-fc0c-79ee-e925-fc8b88e32177@oracle.com> <573C1C2E-FEA7-4F2F-8AFE-5A74560832E8@oracle.com> <11fb9f65-0bdb-2570-96b2-f8876c955a14@oracle.com> <9C357F79-782A-42B3-BCDF-A4161D2A2DBD@oracle.com>
Message-ID: <1939164e-6ac8-56b7-8860-4708e886c2b5@oracle.com>

Hi Sangheon, Kim,

On 20.11.19 00:33, sangheon.kim at oracle.com wrote:
> Hi Thomas,
>
> On 11/19/19 1:21 PM, Kim Barrett wrote:
>>> On Nov 19, 2019, at 10:01 AM, Thomas Schatzl
>>> wrote:
>>> So fixed in
>>>
>>> http://cr.openjdk.java.net/~tschatzl/8233588/webrev.0_to_1/
>>> http://cr.openjdk.java.net/~tschatzl/8233588/webrev.1/
>>>
>>> Passed jtreg gc/g1 testing
>>>
>>> Thanks,
>>>   Thomas
>> Looks good.
> Looks good to me too.
> Nice cleanup.

thanks for your reviews.
I already fixed the copyright year in the patch :) Thomas From stefan.johansson at oracle.com Wed Nov 20 11:24:27 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 20 Nov 2019 12:24:27 +0100 Subject: RFR (M): 8231579: G1's incremental calculation of region elapsed time always uses the same age group for prediction In-Reply-To: References: Message-ID: <6bb0be27-c32c-f2c0-b21e-b03802590cf4@oracle.com> Hi Thomas, On 2019-11-19 10:25, Thomas Schatzl wrote: > Hi all, > > ? can I have reviews for this change that fixes use of the wrong > predictors when adding a new mutator region to the collection set as it > is retired? > > G1, through the young remset sampling thread, and the mutator threads > when they retire, keep track of a prediction for the time it takes to > evacuate the entire young gen. > > The mutator thread, when it retires a region, updates that value by the > current prediction of that retired region: that is where the error > occurs: currently, it only ever uses the prediction for the region that > has been retired just before GC. > Typically a lot of data in this region survives (e.g. 80%+), so the > prediction for the overall young gen is heavily inflated. > Which means that during mixed gc, a smaller than necessary amount of > time is seen as left for evacuating old gen regions, which in turn means > that G1 typically does more, shorter than expected mixed gcs. > > This wastes some throughput, as typically normal young collections can > take a much larger eden. > > This change fixes that by the mutator and the young remset sampling > thread not updating the time it takes to copy the contents of eden > regions - the predictions for that are fixed anyway after a GC as the > predictors for how many objects are expected to be live in a region are > not updated while the mutator is running. Only other components of the > prediction are. 
> > The time to copy the eden regions is later, when G1 finalizes the > prediction for the time the young gen will take, added back. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8231579 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8231579/webrev/ Looks good, just some variable naming that I think should be updated: src/hotspot/share/gc/g1/g1CollectionSet.cpp --- 248 double old_elapsed_time_ms = hr->predicted_non_copy_time_ms(); 249 double new_region_elapsed_time_ms = predict_region_non_copy_time_ms(hr); 250 double non_copy_time_ms_diff = new_region_elapsed_time_ms - old_elapsed_time_ms; 251 hr->set_predicted_non_copy_time_ms(new_region_elapsed_time_ms); 252 _inc_predicted_non_copy_time_ms_diff += non_copy_time_ms_diff; I think the local variables should change to reflect the new naming, something like "old_non_copy_time" and "new_non_copy_time". --- Thanks, Stefan > Testing: > hs-tier1-5; local tests showed significant decreases in time spent in > mixed gcs; dev-submit testing: no significant score changes either with > this change in particular or compared to all recent changes > > Thanks, > ? Thomas From thomas.schatzl at oracle.com Wed Nov 20 12:49:49 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 20 Nov 2019 13:49:49 +0100 Subject: RFR (M): 8231579: G1's incremental calculation of region elapsed time always uses the same age group for prediction In-Reply-To: <6bb0be27-c32c-f2c0-b21e-b03802590cf4@oracle.com> References: <6bb0be27-c32c-f2c0-b21e-b03802590cf4@oracle.com> Message-ID: <3fab7724-302f-faab-40bf-b95df0de5b05@oracle.com> Hi Stefan, On 20.11.19 12:24, Stefan Johansson wrote: > Hi Thomas, > > On 2019-11-19 10:25, Thomas Schatzl wrote: >> Hi all, >> >> ?? can I have reviews for this change that fixes use of the wrong >> predictors when adding a new mutator region to the collection set as >> it is retired? 
>> >> G1, through the young remset sampling thread, and the mutator threads >> when they retire, keep track of a prediction for the time it takes to >> evacuate the entire young gen. >> >> The mutator thread, when it retires a region, updates that value by >> the current prediction of that retired region: that is where the error >> occurs: currently, it only ever uses the prediction for the region >> that has been retired just before GC. >> Typically a lot of data in this region survives (e.g. 80%+), so the >> prediction for the overall young gen is heavily inflated. >> Which means that during mixed gc, a smaller than necessary amount of >> time is seen as left for evacuating old gen regions, which in turn >> means that G1 typically does more, shorter than expected mixed gcs. >> >> This wastes some throughput, as typically normal young collections can >> take a much larger eden. >> >> This change fixes that by the mutator and the young remset sampling >> thread not updating the time it takes to copy the contents of eden >> regions - the predictions for that are fixed anyway after a GC as the >> predictors for how many objects are expected to be live in a region >> are not updated while the mutator is running. Only other components of >> the prediction are. >> >> The time to copy the eden regions is later, when G1 finalizes the >> prediction for the time the young gen will take, added back. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8231579 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8231579/webrev/ > Looks good, just some variable naming that I think should be updated: > src/hotspot/share/gc/g1/g1CollectionSet.cpp > --- > ?248?? double old_elapsed_time_ms = hr->predicted_non_copy_time_ms(); > ?249?? double new_region_elapsed_time_ms = > predict_region_non_copy_time_ms(hr); > ?250?? double non_copy_time_ms_diff = new_region_elapsed_time_ms - > old_elapsed_time_ms; > ?251?? hr->set_predicted_non_copy_time_ms(new_region_elapsed_time_ms); > ?252?? 
_inc_predicted_non_copy_time_ms_diff += non_copy_time_ms_diff;
>
> I think the local variables should change to reflect the new naming,
> something like "old_non_copy_time" and "new_non_copy_time".

you are right. Fixed in

http://cr.openjdk.java.net/~tschatzl/8231579/webrev.0_to_1/ (diff)
http://cr.openjdk.java.net/~tschatzl/8231579/webrev.1/ (full)

Thanks,
Thomas

From stefan.johansson at oracle.com Wed Nov 20 13:29:58 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 20 Nov 2019 14:29:58 +0100
Subject: RFR (M): 8227739: Merge cost predictions for scanning cards and log buffer entries
In-Reply-To:
References:
Message-ID:

Hi Thomas,

Sorry for taking so long to get to this review.

On 2019-10-22 20:26, Thomas Schatzl wrote:
> Hi all,
>
>   can I have reviews for this change that aligns the cost predictions
> to the way we do evacuations, i.e. that we first drop all remembered
> sets onto the card table, and only a fraction of that will be scanned as
> introduced by JDK-8213108.
>
> This code adds all the predictions for ratios etc to align to that code
> in our prediction model too.
>
> After this change (and all previous) changes just sent out for review,
> mostly JDK-8228609 (which is a prerequisite for this change),
> predictions are a bit (noticeably) better than before :)
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8227739
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8227739/webrev/

Looks good in general, just some small comments:

src/hotspot/share/gc/g1/g1Analytics.cpp
---
 255   if (for_young_gc || _mixed_cost_per_card_merge_ms_seq->num() < 3)

We have a few of these "seq->num() < 3" checks, what do you think about adding a helper for those? Something like, ready_for_prediction(seq) and do:
for_young_gc || !ready_for_prediction(_mixed_cost_per_card_merge_ms_seq)
---

src/hotspot/share/gc/g1/g1Policy.cpp
---
 725     if (total_cards_merged > 10) {
 ...
738 if (total_cards_scanned > 10) { Kind of pre-existing, but do you know why we have this limit of 10 in these cases. Would be nice to add a comment about it and maybe add a constant with some descriptive name. --- Thanks, Stefan > Testing: > hs-tier1-5, perf testing, pause time keeping improves a little > > Thanks, > ? Thomas From stefan.johansson at oracle.com Wed Nov 20 13:43:17 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 20 Nov 2019 14:43:17 +0100 Subject: RFR (M): 8231579: G1's incremental calculation of region elapsed time always uses the same age group for prediction In-Reply-To: <3fab7724-302f-faab-40bf-b95df0de5b05@oracle.com> References: <6bb0be27-c32c-f2c0-b21e-b03802590cf4@oracle.com> <3fab7724-302f-faab-40bf-b95df0de5b05@oracle.com> Message-ID: On 2019-11-20 13:49, Thomas Schatzl wrote: > Hi Stefan, > > On 20.11.19 12:24, Stefan Johansson wrote: >> Hi Thomas, >> >> On 2019-11-19 10:25, Thomas Schatzl wrote: >>> Hi all, >>> >>> ?? can I have reviews for this change that fixes use of the wrong >>> predictors when adding a new mutator region to the collection set as >>> it is retired? >>> >>> G1, through the young remset sampling thread, and the mutator threads >>> when they retire, keep track of a prediction for the time it takes to >>> evacuate the entire young gen. >>> >>> The mutator thread, when it retires a region, updates that value by >>> the current prediction of that retired region: that is where the >>> error occurs: currently, it only ever uses the prediction for the >>> region that has been retired just before GC. >>> Typically a lot of data in this region survives (e.g. 80%+), so the >>> prediction for the overall young gen is heavily inflated. >>> Which means that during mixed gc, a smaller than necessary amount of >>> time is seen as left for evacuating old gen regions, which in turn >>> means that G1 typically does more, shorter than expected mixed gcs. 
>>> >>> This wastes some throughput, as typically normal young collections >>> can take a much larger eden. >>> >>> This change fixes that by the mutator and the young remset sampling >>> thread not updating the time it takes to copy the contents of eden >>> regions - the predictions for that are fixed anyway after a GC as the >>> predictors for how many objects are expected to be live in a region >>> are not updated while the mutator is running. Only other components >>> of the prediction are. >>> >>> The time to copy the eden regions is later, when G1 finalizes the >>> prediction for the time the young gen will take, added back. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8231579 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8231579/webrev/ >> Looks good, just some variable naming that I think should be updated: >> src/hotspot/share/gc/g1/g1CollectionSet.cpp >> --- >> ??248?? double old_elapsed_time_ms = hr->predicted_non_copy_time_ms(); >> ??249?? double new_region_elapsed_time_ms = >> predict_region_non_copy_time_ms(hr); >> ??250?? double non_copy_time_ms_diff = new_region_elapsed_time_ms - >> old_elapsed_time_ms; >> ??251?? hr->set_predicted_non_copy_time_ms(new_region_elapsed_time_ms); >> ??252?? _inc_predicted_non_copy_time_ms_diff += non_copy_time_ms_diff; >> >> I think the local variables should change to reflect the new naming, >> something like "old_non_copy_time" and "new_non_copy_time". > > ?you are right. Fixed in > > http://cr.openjdk.java.net/~tschatzl/8231579/webrev.0_to_1/ (diff) > http://cr.openjdk.java.net/~tschatzl/8231579/webrev.1/ (full) Looks good. Thanks, Stefan > > Thanks, > ? 
Thomas

From thomas.schatzl at oracle.com Wed Nov 20 15:26:04 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 20 Nov 2019 16:26:04 +0100
Subject: RFR (M): 8227739: Merge cost predictions for scanning cards and log buffer entries
In-Reply-To:
References:
Message-ID: <6f66ee24-112d-032d-a28e-1a08850a9268@oracle.com>

Hi,

On 20.11.19 14:29, Stefan Johansson wrote:
> Hi Thomas,
>
> Sorry for taking so long to get to this review.
>
> On 2019-10-22 20:26, Thomas Schatzl wrote:
>> Hi all,
>>
>>    can I have reviews for this change that aligns the cost predictions
>> to the way we do evacuations, i.e. that we first drop all remembered
>> sets onto the card table, and only a fraction of that will be scanned
>> as introduced by JDK-8213108.
>>
>> This code adds all the predictions for ratios etc to align to that
>> code in our prediction model too.
>>
>> After this change (and all previous) changes just sent out for review,
>> mostly JDK-8228609 (which is a prerequisite for this change),
>> predictions are a bit (noticeably) better than before :)
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8227739
>> Webrev:
>> http://cr.openjdk.java.net/~tschatzl/8227739/webrev/
> Looks good in general, just some small comments:
> src/hotspot/share/gc/g1/g1Analytics.cpp
> ---
> 255   if (for_young_gc || _mixed_cost_per_card_merge_ms_seq->num() < 3)
>
> We have a few of these "seq->num() < 3" checks, what do you think about
> adding a helper for those? Something like, ready_for_prediction(seq) and
> do:
> for_young_gc || !ready_for_prediction(_mixed_cost_per_card_merge_ms_seq)
> ---
>
> src/hotspot/share/gc/g1/g1Policy.cpp
> ---
>  725     if (total_cards_merged > 10) {
>  ...
>  738     if (total_cards_scanned > 10) {
>
> Kind of pre-existing, but do you know why we have this limit of 10 in
> these cases. Would be nice to add a comment about it and maybe add a
> constant with some descriptive name.
> ---

Fixed.
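
As a rough illustration of the helper suggested in the quoted review, something along these lines would replace the repeated "seq->num() < 3" checks. This is a standalone sketch, not the code in the webrev: `SampleSeq`, `MinSamplesForPrediction` and `use_young_predictor` are made-up names, and only `ready_for_prediction` and the threshold of 3 come from the thread.

```cpp
#include <cassert>

// Hypothetical stand-in for the sequence type; only num() matters here.
class SampleSeq {
  int _num = 0;
public:
  void add(double /* sample */) { ++_num; }
  int num() const { return _num; }
};

// Assumed name for the literal 3 used in the checks under review.
const int MinSamplesForPrediction = 3;

// True once the sequence holds enough samples to base a prediction on.
static bool ready_for_prediction(const SampleSeq* seq) {
  return seq->num() >= MinSamplesForPrediction;
}

// Mirrors the shape of the reviewed condition:
//   for_young_gc || !ready_for_prediction(mixed_seq)
static bool use_young_predictor(bool for_young_gc, const SampleSeq* mixed_seq) {
  return for_young_gc || !ready_for_prediction(mixed_seq);
}
```

The same pattern would apply to the "> 10" card limits: a named constant keeps the magic number in one place, whatever value turns out to be right.
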
I do not know the history about the particular values, and only took them over from existing code. I guess these values are just guesses one way or another. I will file an RFE to look into this. There is already one for the 1.1 in predict_object_copy_time_ms_during_cm(). http://cr.openjdk.java.net/~tschatzl/8227739/webrev.0_to_1/ (diff) http://cr.openjdk.java.net/~tschatzl/8227739/webrev.1/ (full) Thanks, Thomas From igor.ignatyev at oracle.com Thu Nov 21 00:27:42 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 20 Nov 2019 16:27:42 -0800 Subject: RFR(S) : 8147017 : Platform.isGraal should be removed In-Reply-To: <5e1d17af-798f-123f-ef5e-3957b98a8340@oracle.com> References: <981118AF-1DAD-4231-9FA6-7A89A46E5EDB@oracle.com> <5e1d17af-798f-123f-ef5e-3957b98a8340@oracle.com> Message-ID: @Misha, thanks for your review. @list, can I get a 2nd review from a Reviewer? -- Igor > On Nov 18, 2019, at 2:06 PM, mikhailo.seledtsov at oracle.com wrote: > > Looks good to me, > > Misha > > On 11/17/19 11:00 AM, Igor Ignatyev wrote: >> http://cr.openjdk.java.net/~iignatyev//8147017/webrev.00/index.html >>> 16 lines changed: 2 ins; 8 del; 6 mod; >> Hi all, >> >> jdk.test.lib.Platform.isGraal method assumes that JVM w/ Graal as JIT has 'Graal VM' in its name, which is wrong, and caused other to incorrectly assume that '-graal' flag exist and must be used to select Graal compiler. the patch removes this method and updates its only meaningful usage in TestGCLogMessages test. TestGCLogMessages test should use LogMessageWithLevelC2OrJVMCIOnly only when c2 or graal is available, so it's been updated to use corresponding methods of sun.hotspot.code.Compiler class, which requires WhiteBoxAPI being enabled. 
>> >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8147017 >> webrev: http://cr.openjdk.java.net/~iignatyev//8147017/webrev.00/index.html >> testing: tier1 + TestGCLogMessages w/ different JIT configurations >> >> Thanks, >> -- Igor From vladimir.kozlov at oracle.com Thu Nov 21 01:54:34 2019 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 20 Nov 2019 17:54:34 -0800 Subject: RFR(S) : 8147017 : Platform.isGraal should be removed In-Reply-To: References: <981118AF-1DAD-4231-9FA6-7A89A46E5EDB@oracle.com> <5e1d17af-798f-123f-ef5e-3957b98a8340@oracle.com> Message-ID: <3d0ddee5-ab54-3742-c053-d9cd74a93cb8@oracle.com> Reviewed. Good. Thanks, Vladimir K On 11/20/19 4:27 PM, Igor Ignatyev wrote: > @Misha, > > thanks for your review. > > @list, > can I get a 2nd review from a Reviewer? > > -- Igor > >> On Nov 18, 2019, at 2:06 PM, mikhailo.seledtsov at oracle.com wrote: >> >> Looks good to me, >> >> Misha >> >> On 11/17/19 11:00 AM, Igor Ignatyev wrote: >>> http://cr.openjdk.java.net/~iignatyev//8147017/webrev.00/index.html >>>> 16 lines changed: 2 ins; 8 del; 6 mod; >>> Hi all, >>> >>> jdk.test.lib.Platform.isGraal method assumes that JVM w/ Graal as JIT has 'Graal VM' in its name, which is wrong, and caused other to incorrectly assume that '-graal' flag exist and must be used to select Graal compiler. the patch removes this method and updates its only meaningful usage in TestGCLogMessages test. TestGCLogMessages test should use LogMessageWithLevelC2OrJVMCIOnly only when c2 or graal is available, so it's been updated to use corresponding methods of sun.hotspot.code.Compiler class, which requires WhiteBoxAPI being enabled. 
>>> >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8147017 >>> webrev: http://cr.openjdk.java.net/~iignatyev//8147017/webrev.00/index.html >>> testing: tier1 + TestGCLogMessages w/ different JIT configurations >>> >>> Thanks, >>> -- Igor > From igor.ignatyev at oracle.com Thu Nov 21 02:27:30 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 20 Nov 2019 18:27:30 -0800 Subject: RFR(S) : 8147017 : Platform.isGraal should be removed In-Reply-To: <3d0ddee5-ab54-3742-c053-d9cd74a93cb8@oracle.com> References: <981118AF-1DAD-4231-9FA6-7A89A46E5EDB@oracle.com> <5e1d17af-798f-123f-ef5e-3957b98a8340@oracle.com> <3d0ddee5-ab54-3742-c053-d9cd74a93cb8@oracle.com> Message-ID: Hi Vladimir, thanks for your review, pushed. -- Igor > On Nov 20, 2019, at 5:54 PM, Vladimir Kozlov wrote: > > Reviewed. Good. > > Thanks, > Vladimir K > > On 11/20/19 4:27 PM, Igor Ignatyev wrote: >> @Misha, >> thanks for your review. >> @list, >> can I get a 2nd review from a Reviewer? >> -- Igor >>> On Nov 18, 2019, at 2:06 PM, mikhailo.seledtsov at oracle.com wrote: >>> >>> Looks good to me, >>> >>> Misha >>> >>> On 11/17/19 11:00 AM, Igor Ignatyev wrote: >>>> http://cr.openjdk.java.net/~iignatyev//8147017/webrev.00/index.html >>>>> 16 lines changed: 2 ins; 8 del; 6 mod; >>>> Hi all, >>>> >>>> jdk.test.lib.Platform.isGraal method assumes that JVM w/ Graal as JIT has 'Graal VM' in its name, which is wrong, and caused other to incorrectly assume that '-graal' flag exist and must be used to select Graal compiler. the patch removes this method and updates its only meaningful usage in TestGCLogMessages test. TestGCLogMessages test should use LogMessageWithLevelC2OrJVMCIOnly only when c2 or graal is available, so it's been updated to use corresponding methods of sun.hotspot.code.Compiler class, which requires WhiteBoxAPI being enabled. 
>>>> >>>> >>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8147017 >>>> webrev: http://cr.openjdk.java.net/~iignatyev//8147017/webrev.00/index.html >>>> testing: tier1 + TestGCLogMessages w/ different JIT configurations >>>> >>>> Thanks, >>>> -- Igor From per.liden at oracle.com Thu Nov 21 09:32:10 2019 From: per.liden at oracle.com (Per Liden) Date: Thu, 21 Nov 2019 10:32:10 +0100 Subject: RFR: 8234543: ZGC: Parallel pre-touch Message-ID: <47dfbfe2-996c-771d-66f2-27683cb458b1@oracle.com> When using -XX:+AlwaysPreTouch, ZGC is currently doing single threaded pre-touch. This patch makes this a parallel operation. This improves startup time, especially when using large heaps. For example, when using a 3TB heap (-XX:+AlwaysPreTouch -Xms3T -Xmx3T), startup time is improved by about 30x. Bug: https://bugs.openjdk.java.net/browse/JDK-8234543 Webrev: http://cr.openjdk.java.net/~pliden/8234543/webrev.0 /Per From thomas.schatzl at oracle.com Thu Nov 21 10:41:53 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 21 Nov 2019 11:41:53 +0100 Subject: RFR (M): 8233919: Incrementally calculate the occupied cards in a heap region remembered set In-Reply-To: <16491a20-6f2a-0d70-42c2-6c23c3a9e407@oracle.com> References: <23b30b41-b109-4d05-606f-fa6a87a07897@oracle.com> <16491a20-6f2a-0d70-42c2-6c23c3a9e407@oracle.com> Message-ID: Hi, On 20.11.19 11:42, Stefan Johansson wrote: > Hi Thomas, > > On 2019-11-12 16:24, Thomas Schatzl wrote: >> Hi all, >> >> ?? may I have reviews for this change that ultimately makes sure that >> the number of occupied cards in a remembered set is only growing by >> providing a per-OtherRegionsTable count that is atomically updated >> when adding a remembered set entry. >> >> Note that this count may not be completely accurate due to races when >> deleting a PerRegionTable (which is a known issue) from an >> OtherRegionsTable; but that is no different than before. 
>>
>> This helps improve the predictions in the young gen remset sampling
>> thread, and increases the performance of getting the occupancy count.
>>
>> Based on JDK-8233997, and JDK-8233998 also out for review.
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8233919
>> Webrev:
>> http://cr.openjdk.java.net/~tschatzl/8233919/webrev/
> I like this change and it looks good in general, just one small comment:
> src/hotspot/share/gc/g1/heapRegionRemSet.cpp
> ---
>  247   bool added = prt->add_reference(from);
>  248   Atomic::add(num_added_by_coarsening + (added ? 1 : 0),
> &_num_occupied, memory_order_relaxed);
>
> I would prefer:
> if (prt->add_reference(from)) {
>   num_added_by_coarsening++;
> }
> Atomic::add...
>
> If you disagree, leave it as is.

Fixed in

http://cr.openjdk.java.net/~tschatzl/8233919/webrev.0_to_1/
http://cr.openjdk.java.net/~tschatzl/8233919/webrev.1/

Thanks for your review,
  Thomas

From per.liden at oracle.com Thu Nov 21 11:11:12 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 21 Nov 2019 12:11:12 +0100
Subject: RFR: 8234573: ZGC: Enable ZVerifyMarking by default in debug builds
Message-ID:

Just like we have ZVerifyRoots enabled by default in debug builds, I
think we should also enable ZVerifyMarking, since it's fairly inexpensive.

Bug: https://bugs.openjdk.java.net/browse/JDK-8234573
Webrev: http://cr.openjdk.java.net/~pliden/8234573/webrev.0

/Per

From thomas.schatzl at oracle.com Thu Nov 21 11:15:45 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 21 Nov 2019 12:15:45 +0100
Subject: RFR: 8234573: ZGC: Enable ZVerifyMarking by default in debug builds
In-Reply-To:
References:
Message-ID: <012384ce-bb65-a246-cb15-1a98a5455ff8@oracle.com>

Hi,

On 21.11.19 12:11, Per Liden wrote:
> Just like we have ZVerifyRoots enabled by default in debug builds, I
> think we should also enable ZVerifyMarking, since it's fairly inexpensive.
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8234573 > Webrev: http://cr.openjdk.java.net/~pliden/8234573/webrev.0 > > /Per looks good and trivial. Thomas From stefan.karlsson at oracle.com Thu Nov 21 11:27:43 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 21 Nov 2019 12:27:43 +0100 Subject: RFR: 8234573: ZGC: Enable ZVerifyMarking by default in debug builds In-Reply-To: References: Message-ID: <8ab61322-0de5-5e49-e8cf-f95f223b34cc@oracle.com> Looks good. StefanK On 2019-11-21 12:11, Per Liden wrote: > Just like we have ZVerifyRoots enabled by default in debug builds, I > think we should also enabled ZVerifyMarking, since it's fairly inexpensive. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234573 > Webrev: http://cr.openjdk.java.net/~pliden/8234573/webrev.0 > > /Per From per.liden at oracle.com Thu Nov 21 11:33:35 2019 From: per.liden at oracle.com (Per Liden) Date: Thu, 21 Nov 2019 12:33:35 +0100 Subject: RFR: 8234573: ZGC: Enable ZVerifyMarking by default in debug builds In-Reply-To: <8ab61322-0de5-5e49-e8cf-f95f223b34cc@oracle.com> References: <8ab61322-0de5-5e49-e8cf-f95f223b34cc@oracle.com> Message-ID: <4dc84243-731a-cab0-7189-cb487f1ee5c3@oracle.com> Thanks Stefan! /Per On 11/21/19 12:27 PM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2019-11-21 12:11, Per Liden wrote: >> Just like we have ZVerifyRoots enabled by default in debug builds, I >> think we should also enabled ZVerifyMarking, since it's fairly >> inexpensive. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234573 >> Webrev: http://cr.openjdk.java.net/~pliden/8234573/webrev.0 >> >> /Per From thomas.schatzl at oracle.com Thu Nov 21 12:22:30 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 21 Nov 2019 13:22:30 +0100 Subject: RFR (S): 8234574: Rename prediction methods in G1Analytics Message-ID: <1390bd21-236f-d826-5d4c-a12a44f6bef1@oracle.com> Hi all, could you review this change that renames the prediction methods of G1Analytics (and wrappers around them). This has been requested during an internal review of JDK-8227434. This change has been split out to avoid re-reviewing later code already reviewed but not pushed yet due to missing dependencies and it's easier to review without functional changes mixed in. Based on JDK-8233588. CR: https://bugs.openjdk.java.net/browse/JDK-8234574 Webrev: http://cr.openjdk.java.net/~tschatzl/8234574/webrev/ Testing: local compilation (this is a mechanical IDE supported rename of three methods) Thanks, Thomas From stefan.karlsson at oracle.com Thu Nov 21 13:24:11 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 21 Nov 2019 14:24:11 +0100 Subject: RFR: 8234010: ZGC: Change ZResurrection to use Atomic::load/store Message-ID: <674bc3ce-d48f-d9d7-4ce1-d2fcf144246e@oracle.com> Hi all, Please review this patch to change ZResurrection to use Atomic::load and Atomic::store. https://cr.openjdk.java.net/~stefank/8234010/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8234010 Previously, ZResurrection::is_blocked() and ZResurrection::unblock() used loadload and storestore barriers to synchronize between the GC and mutator load barriers. JDK-8230661 changed so that we always perform a handshake before the ZResurrection::unblock() call. After that change we can rely on the handshake to perform the necessary synchronization, and we can change the implementation to use Atomic::load and Atomic::store. 
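
To illustrate the idea only, here is a minimal sketch in standard C++ rather than HotSpot code. The class and method names are hypothetical stand-ins, and std::memory_order_relaxed is assumed here to approximate the plain atomic accesses that Atomic::load and Atomic::store provide. The point is just that the flag itself needs no loadload/storestore fences once something else (in the patch, the handshake) orders the accesses.

```cpp
#include <atomic>

// Hypothetical stand-in for a "resurrection blocked" flag.
// This is NOT the HotSpot ZResurrection class, only a sketch of the pattern.
class ResurrectionFlag {
  std::atomic<bool> _blocked{false};

public:
  // GC side: set before reference processing starts.
  void block() {
    _blocked.store(true, std::memory_order_relaxed);
  }

  // GC side: assumed to be called only after a handshake with all
  // mutator threads, so no extra fence is issued here.
  void unblock() {
    _blocked.store(false, std::memory_order_relaxed);
  }

  // Mutator side: read by the (hypothetical) load barrier slow path.
  bool is_blocked() const {
    return _blocked.load(std::memory_order_relaxed);
  }
};
```

In this sketch, correctness of the ordering rests entirely on the assumption that the caller performs the handshake before calling unblock(); the flag accesses themselves only guarantee atomicity.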
Tested with tier1-7 Thanks, StefanK From stefan.karlsson at oracle.com Thu Nov 21 13:28:29 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 21 Nov 2019 14:28:29 +0100 Subject: RFR: 8234009: ZGC: Move resurrection unblock to before the _unload.purge() call Message-ID: <18bf9548-2aa1-689e-420c-bbaeb5813d1c@oracle.com> Hi all, Please review this patch to move the resurrection to before the _unload.purge() call. https://cr.openjdk.java.net/~stefank/8234009/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8234009 After JDK-8230661 we are guaranteed that no mutator holds a weak oop containing a "dead" object when the ZResurrection::unblock() call happens. Therefore, it doesn't matter if the load barrier that the thread executes runs the code guarded by ZResurrection::is_blocked() or the "non-blocked" part, when ZResurrection::unblock() changes the state. As long as the ZResurrection::unblock() call happens after the handshake, we are good to go. Today, we perform the purging and deletion of metadata in _unload.purge(), and call ZResurrection:unblock() after that. There's no need to delay the unblocking to after the purge. We have the opportunity to shrink the resurrection-blocked window by moving the ZResurrection::unblock() call to before the call to _unload.purge(), but still after the handshake. Thanks, StefanK From thomas.schatzl at oracle.com Thu Nov 21 14:09:06 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 21 Nov 2019 15:09:06 +0100 Subject: RFR (S): 8234586: Rename survRateGroup.?pp files to g1SurvRateGroup.?pp Message-ID: <045ab345-0008-43d6-6917-c1d3aee1b9d7@oracle.com> Hi all, can I have reviews for this rename of the survRateGroup* files to g1SurvRateGroup* to follow the naming convention of (most) other G1 specific files? Based on JDK-8233588. 

CR:
https://bugs.openjdk.java.net/browse/JDK-8234586
Webrev:
http://cr.openjdk.java.net/~tschatzl/8234586/webrev/
Testing:
local compilation

Thanks,
  Thomas

From thomas.schatzl at oracle.com Thu Nov 21 14:11:38 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 21 Nov 2019 15:11:38 +0100
Subject: RFR (S): 8234587: Rename the SurvRateGroup class to G1SurvRateGroup
Message-ID: <75560f8e-7d5a-a9df-0489-d37885f2be1a@oracle.com>

Hi all,

  can I have reviews for this addition of a "G1" prefix for the name of
the SurvRateGroup class after moving it into correctly named files just
earlier?

Based on JDK-8234587.

CR:
https://bugs.openjdk.java.net/browse/JDK-8234587
Webrev:
http://cr.openjdk.java.net/~tschatzl/8234587/webrev/
Testing:
local compilation

Thanks,
  Thomas

From stefan.johansson at oracle.com Thu Nov 21 14:26:00 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Thu, 21 Nov 2019 15:26:00 +0100
Subject: RFR (M): 8227434: G1 predictions may over/underflow with high variance input
In-Reply-To: <8112fc27-69a4-0249-de00-54e907ee38e4@oracle.com>
References: <8112fc27-69a4-0249-de00-54e907ee38e4@oracle.com>
Message-ID:

Hi Thomas,

On 2019-11-12 15:10, Thomas Schatzl wrote:
> Hi all,
>
>   can I have reviews for this change that tries to fix possible
> underflows and overflows in our predictor use in case there is high
> variance input?
>
> I did not analyze for every case whether the issue actually happened,
> but changed the get_new_prediction() calls to something I believe is
> appropriate for the given sequence. Cases where there has already been
> some clamping going on were obvious of course.
>
> It's a bit boring to review...
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8227434
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8227434/webrev/
Looks good in general. Thomas and I discussed some naming suggestions
offline, but we are going to do that as a separate RFE. See JDK-8234574.
Thanks, Stefan > Testing: > hs-tier1-5 > > > Thanks, > ? Thomas From leo.korinth at oracle.com Thu Nov 21 18:12:53 2019 From: leo.korinth at oracle.com (Leo Korinth) Date: Thu, 21 Nov 2019 19:12:53 +0100 Subject: RFR: 8233029: Obsolete flag GCTaskTimeStampEntries Message-ID: <3b1ebefc-6643-4071-7e82-40472a7ad51e@oracle.com> Hi, When I changed ParallelGC to use WorkGangs instead of GCTasks, I unfortunately missed to remove (thereafter unused) command line option GCTaskTimeStampEntries. Here is a fix for that. CSR: https://bugs.openjdk.java.net/browse/JDK-8234396 Bug: https://bugs.openjdk.java.net/browse/JDK-8233029 Webrev: http://cr.openjdk.java.net/~lkorinth/8233029/00 Testing: tier 1-3 Thanks, Leo From kim.barrett at oracle.com Thu Nov 21 18:20:36 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 21 Nov 2019 13:20:36 -0500 Subject: RFR (S): 8234586: Rename survRateGroup.?pp files to g1SurvRateGroup.?pp In-Reply-To: <045ab345-0008-43d6-6917-c1d3aee1b9d7@oracle.com> References: <045ab345-0008-43d6-6917-c1d3aee1b9d7@oracle.com> Message-ID: <34380D46-927F-4BF5-A017-39B92CA2C60C@oracle.com> > On Nov 21, 2019, at 9:09 AM, Thomas Schatzl wrote: > > Hi all, > > can I have reviews for this rename of the survRateGroup* files to g1SurvRateGroup* to follow the naming convention of (most) other G1 specific files? > > Based on JDK-8233588. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8234586 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8234586/webrev/ > Testing: > local compilation > > Thanks, > Thomas Looks good, and trivial. 
From kim.barrett at oracle.com Thu Nov 21 18:24:31 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 21 Nov 2019 13:24:31 -0500 Subject: RFR (S): 8234587: Rename the SurvRateGroup class to G1SurvRateGroup In-Reply-To: <75560f8e-7d5a-a9df-0489-d37885f2be1a@oracle.com> References: <75560f8e-7d5a-a9df-0489-d37885f2be1a@oracle.com> Message-ID: <40921A01-0861-4072-BAB9-2D9A17A31466@oracle.com> > On Nov 21, 2019, at 9:11 AM, Thomas Schatzl wrote: > > Hi all, > > can I have reviews for this addition of a "G1" prefix for the name of the SurvRateGroup class after moving it into correctly named files jusst earlier? > > Based on JDK-8234587. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8234587 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8234587/webrev/ > Testing: > local compilation > > Thanks, > Thomas Looks good. From kim.barrett at oracle.com Thu Nov 21 18:34:21 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 21 Nov 2019 13:34:21 -0500 Subject: RFR: 8233029: Obsolete flag GCTaskTimeStampEntries In-Reply-To: <3b1ebefc-6643-4071-7e82-40472a7ad51e@oracle.com> References: <3b1ebefc-6643-4071-7e82-40472a7ad51e@oracle.com> Message-ID: <62DA5172-F020-4F0A-A9BF-8BF2AA48BD1A@oracle.com> > On Nov 21, 2019, at 1:12 PM, Leo Korinth wrote: > > Hi, > > When I changed ParallelGC to use WorkGangs instead of GCTasks, I unfortunately missed to remove (thereafter unused) command line option GCTaskTimeStampEntries. Here is a fix for that. > > CSR: > https://bugs.openjdk.java.net/browse/JDK-8234396 > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8233029 > > Webrev: > http://cr.openjdk.java.net/~lkorinth/8233029/00 > > Testing: > tier 1-3 > > Thanks, > Leo ------------------------------------------------------------------------------ test/hotspot/jtreg/gc/parallel/TestPrintGCDetailsVerbose.java With the removal of -XX:GCTaskTimeStampEntries=1 from the second @run line, we now have two identical @run lines. 
That second @run with the additional option was added by JDK-8177963
"Parallel GC fails fast when per-thread task log overflows". Seems
like it's no longer relevant with the change to use WorkGang. So I
think the duplicate test can just be removed, rather than looking for
some other option(s) to make it different.

------------------------------------------------------------------------------

Other than that, looks good. I don't need a new webrev for removal of
the now duplicate test @run line.

From thomas.schatzl at oracle.com Thu Nov 21 18:35:38 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 21 Nov 2019 19:35:38 +0100
Subject: RFR: 8233029: Obsolete flag GCTaskTimeStampEntries
In-Reply-To: <62DA5172-F020-4F0A-A9BF-8BF2AA48BD1A@oracle.com>
References: <3b1ebefc-6643-4071-7e82-40472a7ad51e@oracle.com> <62DA5172-F020-4F0A-A9BF-8BF2AA48BD1A@oracle.com>
Message-ID: <1eac0243-c009-fbbd-da2e-17ebfa39f73d@oracle.com>

Hi,

On 21.11.19 19:34, Kim Barrett wrote:
>> On Nov 21, 2019, at 1:12 PM, Leo Korinth wrote:
>>
>> Hi,
>>
>> When I changed ParallelGC to use WorkGangs instead of GCTasks, I unfortunately missed to remove (thereafter unused) command line option GCTaskTimeStampEntries. Here is a fix for that.
>>
>> CSR:
>> https://bugs.openjdk.java.net/browse/JDK-8234396
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8233029
>>
>> Webrev:
>> http://cr.openjdk.java.net/~lkorinth/8233029/00
>>
[...]
> ------------------------------------------------------------------------------
> test/hotspot/jtreg/gc/parallel/TestPrintGCDetailsVerbose.java
>
> With the removal of -XX:GCTaskTimeStampEntries=1 from the second @run
> line, we now have two identical @run lines.
>
[...]
> ------------------------------------------------------------------------------
>
> Other than that, looks good. I don't need a new webrev for removal of
> the now duplicate test @run line.
> +1 Thomas From sangheon.kim at oracle.com Thu Nov 21 18:41:47 2019 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Thu, 21 Nov 2019 10:41:47 -0800 Subject: RFR (S): 8234587: Rename the SurvRateGroup class to G1SurvRateGroup In-Reply-To: <75560f8e-7d5a-a9df-0489-d37885f2be1a@oracle.com> References: <75560f8e-7d5a-a9df-0489-d37885f2be1a@oracle.com> Message-ID: Hi Thomas, On 11/21/19 6:11 AM, Thomas Schatzl wrote: > Hi all, > > ? can I have reviews for this addition of a "G1" prefix for the name > of the SurvRateGroup class after moving it into correctly named files > jusst earlier? > > Based on JDK-8234587. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8234587 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8234587/webrev/ Looks good. Thanks, Sangheon > Testing: > local compilation > > Thanks, > ? Thomas From leo.korinth at oracle.com Thu Nov 21 18:42:23 2019 From: leo.korinth at oracle.com (Leo Korinth) Date: Thu, 21 Nov 2019 19:42:23 +0100 Subject: RFR (S): 8234587: Rename the SurvRateGroup class to G1SurvRateGroup In-Reply-To: <75560f8e-7d5a-a9df-0489-d37885f2be1a@oracle.com> References: <75560f8e-7d5a-a9df-0489-d37885f2be1a@oracle.com> Message-ID: On 21/11/2019 15:11, Thomas Schatzl wrote: > Hi all, > > ? can I have reviews for this addition of a "G1" prefix for the name of > the SurvRateGroup class after moving it into correctly named files jusst > earlier? Looks good /Leo > > Based on JDK-8234587. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8234587 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8234587/webrev/ > Testing: > local compilation > > Thanks, > ? 
Thomas From leo.korinth at oracle.com Thu Nov 21 18:59:52 2019 From: leo.korinth at oracle.com (Leo Korinth) Date: Thu, 21 Nov 2019 19:59:52 +0100 Subject: RFR: 8233029: Obsolete flag GCTaskTimeStampEntries In-Reply-To: <1eac0243-c009-fbbd-da2e-17ebfa39f73d@oracle.com> References: <3b1ebefc-6643-4071-7e82-40472a7ad51e@oracle.com> <62DA5172-F020-4F0A-A9BF-8BF2AA48BD1A@oracle.com> <1eac0243-c009-fbbd-da2e-17ebfa39f73d@oracle.com> Message-ID: On 21/11/2019 19:35, Thomas Schatzl wrote: > Hi, > > On 21.11.19 19:34, Kim Barrett wrote: >>> On Nov 21, 2019, at 1:12 PM, Leo Korinth wrote: >>> >>> Hi, >>> >>> When I changed ParallelGC to use WorkGangs instead of GCTasks, I >>> unfortunately missed to remove (thereafter unused) command line >>> option GCTaskTimeStampEntries. Here is a fix for that. >>> >>> CSR: >>> https://bugs.openjdk.java.net/browse/JDK-8234396 >>> >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8233029 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~lkorinth/8233029/00 >>> > [...] >> ------------------------------------------------------------------------------ >> >> test/hotspot/jtreg/gc/parallel/TestPrintGCDetailsVerbose.java >> >> With the removal of -XX:GCTaskTimeStampEntries=1 from the second @run >> line, we now have two identical @run lines. >> [...] oops >> ------------------------------------------------------------------------------ >> >> >> Other than that, looks good.? I don't need a new webrev for removal of >> the now duplicate test @run line. >> > > +1 > > Thomas Thanks for your reviews Kim and Thomas. /Leo From sangheon.kim at oracle.com Thu Nov 21 19:01:20 2019 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Thu, 21 Nov 2019 11:01:20 -0800 Subject: RFR (S): 8234574: Rename prediction methods in G1Analytics In-Reply-To: <1390bd21-236f-d826-5d4c-a12a44f6bef1@oracle.com> References: <1390bd21-236f-d826-5d4c-a12a44f6bef1@oracle.com> Message-ID: Hi Thomas, On 11/21/19 4:22 AM, Thomas Schatzl wrote: > Hi all, > > ? 
could you review this change that renames the prediction methods of
> G1Analytics (and wrappers around them). This has been requested during
> an internal review of JDK-8227434.
>
> This change has been split out to avoid re-reviewing later code
> already reviewed but not pushed yet due to missing dependencies and
> it's easier to review without functional changes mixed in.
>
> Based on JDK-8233588.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8234574
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8234574/webrev/
Renaming looks good.

If you are interested in updating the copyright year,
test_g1Predictions.cpp needs to be updated.
I don't need a new webrev for this.

Thanks,
Sangheon

> Testing:
> local compilation (this is a mechanical IDE supported rename of three
> methods)
>
> Thanks,
>   Thomas

From kim.barrett at oracle.com Thu Nov 21 20:35:11 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 21 Nov 2019 15:35:11 -0500
Subject: RFR (M): 8087198: G1 card refinement: batching, sorting
In-Reply-To:
References: <34BD822F-20DD-43E3-956F-09622EB62148@oracle.com> <5c700c0a-954e-6187-67b4-cc523cace95c@oracle.com> <387D12CA-4C5A-4DE5-A6E7-ED6739AC7173@oracle.com> <37161B48-8AD1-4887-A56D-4905BD8DC159@oracle.com> <4D5605AD-7F3E-4AB4-8A13-5AA18DC0AB40@oracle.com>
Message-ID: <7F106122-9DE0-428E-8C83-0E1F443D3905@oracle.com>

> On Nov 19, 2019, at 9:26 PM, Man Cao wrote:
>
> Hi all,
>
> Thanks! I have addressed all comments:
> Full: http://cr.openjdk.java.net/~manc/8087198/webrev.02/
> Incremental: http://cr.openjdk.java.net/~manc/8087198/webrev.01-02.inc/
>
>
> I also changed G1RemSet::clean_card_before_refine() to take a "CardValue**"
> parameter instead of "CardValue*&" to make it more obvious that it can modify
> the card pointer.

Thanks for that change; that seems to have made the code nicer.

This is looking pretty good. Just a couple of possible improvements and questions.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp
 247     QuickSort::sort(&_node_buffer[start_index],
 ...
 250                     true);

I *think* a false value for idempotent is better here. We don't care
about reordering of equal values, so there's no correctness issue
either way. A true value will avoid unnecessary swaps of equal
entries, at the cost of an extra comparison. The comparison is very
cheap, so if equal entries were somewhat common that could be a win.
But because of the dirty-card-based filtering, I think equal entries
are uncommon, making the extra comparisons wasteful.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp
 289   bool refine_cleaned_cards(size_t start_index) {
 290     for (size_t i = start_index; i < _node_buffer_size; ++i) {
 291       if (SuspendibleThreadSet::should_yield()) {
 292         redirty_unrefined_cards(i);
 293         _node->set_index(i);
 294         return false;
 295       }
 296       _g1rs->refine_card_concurrently(_node_buffer[i], _worker_id);
 297       (*_total_refined_cards)++;
 298     }
 299     _node->set_index(_node_buffer_size);
 300     return true;
 301   }

It would be better to bulk increment *_total_refined_cards at the end,
rather than on each iteration. Maybe something like this:

  bool refine_cleaned_cards(size_t start_index) {
    bool result = true;
    size_t i = start_index;
    for ( ; i < _node_buffer_size; ++i) {
      if (SuspendibleThreadSet::should_yield()) {
        redirty_unrefined_cards(i);
        result = false;
        break;
      }
      _g1rs->refine_card_concurrently(_node_buffer[i], _worker_id);
    }
    _node->set_index(i);
    *_total_refined_cards += i - start_index;
    return result;
  }

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp
 321   bool refine() {
 322     size_t first_clean_index = clean_cards();

Is it worth checking whether the cleaned buffer is now empty, skipping
the rest altogether if so?
I don't know how often that actually happens, and the rest
isn't *that* expensive in the empty buffer case. So I'm going to
guess it's not worth the extra test, but something to consider.

------------------------------------------------------------------------------

From stefan.karlsson at oracle.com Thu Nov 21 21:08:54 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 21 Nov 2019 13:08:54 -0800 (PST)
Subject: RFR: 8234602: ZGC: Windows compile error in ZHeuristic
Message-ID: <27cf412a-a499-60c6-aef9-de84c4ac037f@oracle.com>

Hi all,

Please review this trivial fix to silence a Windows compiler warning
about narrowing a size_t to int.

https://cr.openjdk.java.net/~stefank/8234602/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8234602

Thanks,
StefanK

From sangheon.kim at oracle.com Thu Nov 21 22:37:21 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Thu, 21 Nov 2019 14:37:21 -0800
Subject: RFR (M): 8233306: Sort members in G1's HeapRegion after removal of Space dependency
In-Reply-To: <412699eb-1ae0-edd8-1acd-ce7673872101@oracle.com>
References: <568f7bca-3c39-f554-b557-953e5f7f157c@oracle.com> <412699eb-1ae0-edd8-1acd-ce7673872101@oracle.com>
Message-ID: <31d867dd-5855-ad37-92ca-fb23a762ccd3@oracle.com>

Hi Thomas,

On 11/14/19 4:42 AM, Thomas Schatzl wrote:
> Hi Stefan,
>
> On 13.11.19 10:17, Stefan Johansson wrote:
>> Hi Thomas,
>>
>> On 2019-10-31 14:47, Thomas Schatzl wrote:
>>> Hi all,
>>>
>>>   after the change to HeapRegion in JDK-8233306 the declaration of
>>> the HeapRegion class is a bit messed up (merging G1ContiguousSpace,
>>> adding a few members needed from ContiguousSpace).
>>>
>>> This change tries to fix this as much as possible by shuffling
>>> around stuff (i.e. grouping allocation related methods, evacuation
>>> related methods, some helper pointers in HeapRegion, etc).
>>>
>>> Depends on JDK-8189737 also out for review.
>>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8233306 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8233306/webrev/ >> Looks good, >> Stefan >> > > ? thanks for your review. > > Fyi, there has been one merge issue with latest NUMA changes: in > heapRegion.cpp, in the initializer list of HeapRegion::HeapRegion, > NUMA added a _node_index member at the end. This caused the merge > logic to bail out because the context of the source hunk and the > current code did not exactly match. > > I updated the webrev. The updated webrev looks good. Thanks, Sangheon > > Thanks, > ? Thomas From thomas.schatzl at oracle.com Fri Nov 22 08:58:29 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 22 Nov 2019 09:58:29 +0100 Subject: RFR (M): 8233306: Sort members in G1's HeapRegion after removal of Space dependency In-Reply-To: <31d867dd-5855-ad37-92ca-fb23a762ccd3@oracle.com> References: <568f7bca-3c39-f554-b557-953e5f7f157c@oracle.com> <412699eb-1ae0-edd8-1acd-ce7673872101@oracle.com> <31d867dd-5855-ad37-92ca-fb23a762ccd3@oracle.com> Message-ID: <3778a8b9-8df4-3114-88da-d4682b3faabd@oracle.com> Hi Sangheon, On 21.11.19 23:37, sangheon.kim at oracle.com wrote: > Hi Thomas, > > On 11/14/19 4:42 AM, Thomas Schatzl wrote: >> Hi Stefan, >> [...] >> ? thanks for your review. >> >> Fyi, there has been one merge issue with latest NUMA changes: in >> heapRegion.cpp, in the initializer list of HeapRegion::HeapRegion, >> NUMA added a _node_index member at the end. This caused the merge >> logic to bail out because the context of the source hunk and the >> current code did not exactly match. >> >> I updated the webrev. > The updated webrev looks good. > > Thanks, > Sangheon thanks for your review. 
Thomas From stefan.johansson at oracle.com Fri Nov 22 09:18:44 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 22 Nov 2019 10:18:44 +0100 Subject: RFR (S): 8234586: Rename survRateGroup.?pp files to g1SurvRateGroup.?pp In-Reply-To: <34380D46-927F-4BF5-A017-39B92CA2C60C@oracle.com> References: <045ab345-0008-43d6-6917-c1d3aee1b9d7@oracle.com> <34380D46-927F-4BF5-A017-39B92CA2C60C@oracle.com> Message-ID: <3f67db06-0138-0e4a-5ee2-311cca91fd60@oracle.com> On 2019-11-21 19:20, Kim Barrett wrote: >> On Nov 21, 2019, at 9:09 AM, Thomas Schatzl wrote: >> >> Hi all, >> >> can I have reviews for this rename of the survRateGroup* files to g1SurvRateGroup* to follow the naming convention of (most) other G1 specific files? >> >> Based on JDK-8233588. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8234586 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8234586/webrev/ >> Testing: >> local compilation >> >> Thanks, >> Thomas > > Looks good, and trivial. > +1 From thomas.schatzl at oracle.com Fri Nov 22 09:20:23 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 22 Nov 2019 10:20:23 +0100 Subject: RFR (M): 8227739: Merge cost predictions for scanning cards and log buffer entries In-Reply-To: <6f66ee24-112d-032d-a28e-1a08850a9268@oracle.com> References: <6f66ee24-112d-032d-a28e-1a08850a9268@oracle.com> Message-ID: Hi, ping for a second review. It holds up a long chain of fixes depending on it (that have partially been fully reviewed). Thanks! Thomas On 20.11.19 16:26, Thomas Schatzl wrote: > Hi, > > On 20.11.19 14:29, Stefan Johansson wrote: >> Hi Thomas, >> >> Sorry for taking so long to get to this review. >> >> On 2019-10-22 20:26, Thomas Schatzl wrote: >>> Hi all, >>> >>> can I have reviews for this change that aligns the cost >>> predictions to the way we do evacuations, i.e. that we first drop all >>> remembered sets onto the card table, and only a fraction of that will >>> be scanned as introduced by JDK-8213108.
>>> >>> This code adds all the predictions for ratios etc to align to that >>> code in our prediction model too. >>> >>> After this change (and all previous) changes just sent out for >>> review, mostly JDK-8228609 (which is a prerequisite for this change), >>> predictions are a bit (noticeably) better than before :) >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8227739 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8227739/webrev/ >> Looks good in general, just some small comments: >> src/hotspot/share/gc/g1/g1Analytics.cpp >> --- >> 255 if (for_young_gc || _mixed_cost_per_card_merge_ms_seq->num() < 3) >> >> We have a few of these "seq->num() < 3" checks, what do you think >> about adding a helper for those? Something like, >> ready_for_prediction(seq) and do: >> for_young_gc || !ready_for_prediction(_mixed_cost_per_card_merge_ms_seq) >> --- >> >> src/hotspot/share/gc/g1/g1Policy.cpp >> --- >> 725 if (total_cards_merged > 10) { >> ... >> 738 if (total_cards_scanned > 10) { >> >> Kind of pre-existing, but do you know why we have this limit of 10 in >> these cases? Would be nice to add a comment about it and maybe add a >> constant with some descriptive name. >> --- > > Fixed. I do not know the history about the particular values, and only > took them over from existing code. I guess these values are just guesses > one way or another. > > I will file an RFE to look into this. There is already one for the 1.1 > in predict_object_copy_time_ms_during_cm(). > > http://cr.openjdk.java.net/~tschatzl/8227739/webrev.0_to_1/ (diff) > http://cr.openjdk.java.net/~tschatzl/8227739/webrev.1/ (full) > > Thanks, >
Thomas From thomas.schatzl at oracle.com Fri Nov 22 09:21:59 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 22 Nov 2019 10:21:59 +0100 Subject: RFR (M): 8227434: G1 predictions may over/underflow with high variance input In-Reply-To: References: <8112fc27-69a4-0249-de00-54e907ee38e4@oracle.com> Message-ID: <5dbdaed6-f5d9-eb9e-c708-3d67619ef5af@oracle.com> Hi all, ping for second review. Thanks, Thomas On 21.11.19 15:26, Stefan Johansson wrote: > Hi Thomas, > > On 2019-11-12 15:10, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change that tries to fix possible >> underflows and overflows in our predictor use in case there is high >> variance input? >> >> I did not analyze for every case whether the issue actually happened, >> but changed the get_new_prediction() calls to something I believe is >> appropriate for the given sequence. Cases where there has already been >> some clamping going on were obvious of course. >> >> It's a bit boring to review... >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8227434 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8227434/webrev/ > Looks good in general, me and Thomas discussed some naming suggestions > offline, but we are going to do that as a separate RFE. See JDK-8234574. > > Thanks, > Stefan > > >> Testing: >> hs-tier1-5 >> >> >> Thanks, >> Thomas From stefan.johansson at oracle.com Fri Nov 22 09:22:05 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 22 Nov 2019 10:22:05 +0100 Subject: RFR (S): 8234574: Rename prediction methods in G1Analytics In-Reply-To: References: <1390bd21-236f-d826-5d4c-a12a44f6bef1@oracle.com> Message-ID: <0d88f0af-cee8-7ef5-ca72-7fef296b71f8@oracle.com> On 2019-11-21 20:01, sangheon.kim at oracle.com wrote: > Hi Thomas, > > On 11/21/19 4:22 AM, Thomas Schatzl wrote: >> Hi all, >> >> could you review this change that renames the prediction methods of >> G1Analytics (and wrappers around them).
This has been requested during >> an internal review of JDK-8227434. >> >> This change has been split out to avoid re-reviewing later code >> already reviewed but not pushed yet due to missing dependencies and >> it's easier to review without functional changes mixed in. >> >> Based on JDK-8233588. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8234574 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8234574/webrev/ > Renaming looks good. > > If you are interested updating the copyright year, > test_g1Predictions.cpp needs to be updated. > I don't need a new webrev for this. Looks good to me too, Stefan > > Thanks, > Sangheon > > >> Testing: >> local compilation (this is a mechanical IDE supported rename of three >> methods) >> >> Thanks, >> Thomas > From stefan.johansson at oracle.com Fri Nov 22 09:25:02 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 22 Nov 2019 10:25:02 +0100 Subject: RFR (M): 8227739: Merge cost predictions for scanning cards and log buffer entries In-Reply-To: <6f66ee24-112d-032d-a28e-1a08850a9268@oracle.com> References: <6f66ee24-112d-032d-a28e-1a08850a9268@oracle.com> Message-ID: Hi Thomas, On 2019-11-20 16:26, Thomas Schatzl wrote: > Hi, > > On 20.11.19 14:29, Stefan Johansson wrote: >> Hi Thomas, >> >> Sorry for taking so long to get to this review. >> >> On 2019-10-22 20:26, Thomas Schatzl wrote: >>> Hi all, >>> >>> can I have reviews for this change that aligns the cost >>> predictions to the way we do evacuations, i.e. that we first drop all >>> remembered sets onto the card table, and only a fraction of that will >>> be scanned as introduced by JDK-8213108. >>> >>> This code adds all the predictions for ratios etc to align to that >>> code in our prediction model too.
>>> >>> After this change (and all previous) changes just sent out for >>> review, mostly JDK-8228609 (which is a prerequisite for this change), >>> predictions are a bit (noticeably) better than before :) >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8227739 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8227739/webrev/ >> Looks good in general, just some small comments: >> src/hotspot/share/gc/g1/g1Analytics.cpp >> --- >> 255 if (for_young_gc || _mixed_cost_per_card_merge_ms_seq->num() < 3) >> >> We have a few of these "seq->num() < 3" checks, what do you think >> about adding a helper for those? Something like, >> ready_for_prediction(seq) and do: >> for_young_gc || !ready_for_prediction(_mixed_cost_per_card_merge_ms_seq) >> --- >> >> src/hotspot/share/gc/g1/g1Policy.cpp >> --- >> 725 if (total_cards_merged > 10) { >> ... >> 738 if (total_cards_scanned > 10) { >> >> Kind of pre-existing, but do you know why we have this limit of 10 in >> these cases? Would be nice to add a comment about it and maybe add a >> constant with some descriptive name. >> --- > > Fixed. I do not know the history about the particular values, and only > took them over from existing code. I guess these values are just guesses > one way or another. > > I will file an RFE to look into this. There is already one for the 1.1 > in predict_object_copy_time_ms_during_cm(). > > http://cr.openjdk.java.net/~tschatzl/8227739/webrev.0_to_1/ (diff) > http://cr.openjdk.java.net/~tschatzl/8227739/webrev.1/ (full) This looks good, Stefan > > Thanks, >
Thomas From thomas.schatzl at oracle.com Fri Nov 22 09:42:26 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 22 Nov 2019 10:42:26 +0100 Subject: RFR (S): 8234574: Rename prediction methods in G1Analytics In-Reply-To: <0d88f0af-cee8-7ef5-ca72-7fef296b71f8@oracle.com> References: <1390bd21-236f-d826-5d4c-a12a44f6bef1@oracle.com> <0d88f0af-cee8-7ef5-ca72-7fef296b71f8@oracle.com> Message-ID: Hi Sangheon, Stefan, On 22.11.19 10:22, Stefan Johansson wrote: > > > On 2019-11-21 20:01, sangheon.kim at oracle.com wrote: >> Hi Thomas, >> >> On 11/21/19 4:22 AM, Thomas Schatzl wrote: >>> Hi all, >>> >>> could you review this change that renames the prediction methods of >>> G1Analytics (and wrappers around them). This has been requested >>> during an internal review of JDK-8227434. >>> >>> This change has been split out to avoid re-reviewing later code >>> already reviewed but not pushed yet due to missing dependencies and >>> it's easier to review without functional changes mixed in. >>> >>> Based on JDK-8233588. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8234574 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8234574/webrev/ >> Renaming looks good. >> >> If you are interested updating the copyright year, >> test_g1Predictions.cpp needs to be updated. >> I don't need a new webrev for this. > Looks good to me too, > Stefan >> >> Thanks, >> Sangheon Thanks for your reviews. I updated the copyright date and regenerated in place. Thomas From erik.osterlund at oracle.com Fri Nov 22 09:44:18 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Fri, 22 Nov 2019 10:44:18 +0100 Subject: RFR: 8234543: ZGC: Parallel pre-touch In-Reply-To: <47dfbfe2-996c-771d-66f2-27683cb458b1@oracle.com> References: <47dfbfe2-996c-771d-66f2-27683cb458b1@oracle.com> Message-ID: <1989a4ec-bbf0-ade2-6520-a57fcf1013ba@oracle.com> Hi Per, Seems reasonable to me.
Thanks, /Erik On 11/21/19 10:32 AM, Per Liden wrote: > When using -XX:+AlwaysPreTouch, ZGC is currently doing single threaded > pre-touch. This patch makes this a parallel operation. This improves > startup time, especially when using large heaps. For example, when > using a 3TB heap (-XX:+AlwaysPreTouch -Xms3T -Xmx3T), startup time is > improved by about 30x. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234543 > Webrev: http://cr.openjdk.java.net/~pliden/8234543/webrev.0 > > /Per From per.liden at oracle.com Fri Nov 22 09:49:58 2019 From: per.liden at oracle.com (Per Liden) Date: Fri, 22 Nov 2019 10:49:58 +0100 Subject: RFR: 8234543: ZGC: Parallel pre-touch In-Reply-To: <1989a4ec-bbf0-ade2-6520-a57fcf1013ba@oracle.com> References: <47dfbfe2-996c-771d-66f2-27683cb458b1@oracle.com> <1989a4ec-bbf0-ade2-6520-a57fcf1013ba@oracle.com> Message-ID: Thanks Erik! /Per On 11/22/19 10:44 AM, erik.osterlund at oracle.com wrote: > Hi Per, > > Seems reasonable to me. > > Thanks, > /Erik > > On 11/21/19 10:32 AM, Per Liden wrote: >> When using -XX:+AlwaysPreTouch, ZGC is currently doing single threaded >> pre-touch. This patch makes this a parallel operation. This improves >> startup time, especially when using large heaps. For example, when >> using a 3TB heap (-XX:+AlwaysPreTouch -Xms3T -Xmx3T), startup time is >> improved by about 30x. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234543 >> Webrev: http://cr.openjdk.java.net/~pliden/8234543/webrev.0 >> >> /Per > From thomas.schatzl at oracle.com Fri Nov 22 09:57:13 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 22 Nov 2019 10:57:13 +0100 Subject: RFR (S): 8234179: Move HeapRegion::_recorded_rs_length/_predicted_elapsed_time_ms into G1CollectionSet Message-ID: <4e22f801-827e-ef58-1c0a-297c2e6250e8@oracle.com> Hi all, can I have reviews for this change that moves two members (_recorded_rs_length/_predicted_elapsed_time_ms) of HeapRegion into G1CollectionSet that is the only user of these members? This also allows more efficient allocation of this information in the future, as these are only needed for eden regions. CR: https://bugs.openjdk.java.net/browse/JDK-8234179 Webrev: http://cr.openjdk.java.net/~tschatzl/8234179/webrev/ Testing: hs-tier1-5 Thanks, Thomas From erik.osterlund at oracle.com Fri Nov 22 11:07:28 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Fri, 22 Nov 2019 12:07:28 +0100 Subject: RFR: 8234602: ZGC: Windows compile error in ZHeuristic In-Reply-To: <27cf412a-a499-60c6-aef9-de84c4ac037f@oracle.com> References: <27cf412a-a499-60c6-aef9-de84c4ac037f@oracle.com> Message-ID: <6c213977-f2f5-9f3c-ea80-e117729ad774@oracle.com> Hi Stefan, Looks good and trivial. Thanks, /Erik On 11/21/19 10:08 PM, Stefan Karlsson wrote: > Hi all, > > Please review this trivial fix to silence a windows compiler warning > about narrowing a size_t to int. 
> > https://cr.openjdk.java.net/~stefank/8234602/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8234602 > > Thanks, > StefanK From erik.osterlund at oracle.com Fri Nov 22 11:08:40 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Fri, 22 Nov 2019 12:08:40 +0100 Subject: RFR: 8234010: ZGC: Change ZResurrection to use Atomic::load/store In-Reply-To: <674bc3ce-d48f-d9d7-4ce1-d2fcf144246e@oracle.com> References: <674bc3ce-d48f-d9d7-4ce1-d2fcf144246e@oracle.com> Message-ID: <8454976c-7c69-a127-8b46-b095be7ded82@oracle.com> Hi Stefan, Looks good. Thanks, /Erik On 11/21/19 2:24 PM, Stefan Karlsson wrote: > Hi all, > > Please review this patch to change ZResurrection to use Atomic::load > and Atomic::store. > > https://cr.openjdk.java.net/~stefank/8234010/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8234010 > > Previously, ZResurrection::is_blocked() and ZResurrection::unblock() > used loadload and storestore barriers to synchronize between the GC > and mutator load barriers. > > JDK-8230661 changed so that we always perform a handshake before the > ZResurrection::unblock() call. > > After that change we can rely on the handshake to perform the > necessary synchronization, and we can change the implementation to use > Atomic::load and Atomic::store. > > Tested with tier1-7 > > Thanks, > StefanK From erik.osterlund at oracle.com Fri Nov 22 11:09:31 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Fri, 22 Nov 2019 12:09:31 +0100 Subject: RFR: 8234009: ZGC: Move resurrection unblock to before the _unload.purge() call In-Reply-To: <18bf9548-2aa1-689e-420c-bbaeb5813d1c@oracle.com> References: <18bf9548-2aa1-689e-420c-bbaeb5813d1c@oracle.com> Message-ID: <9a149951-c07a-44d7-7338-3d2f91746c43@oracle.com> Hi Stefan, Looks good. Thanks, /Erik On 11/21/19 2:28 PM, Stefan Karlsson wrote: > Hi all, > > Please review this patch to move the resurrection to before the > _unload.purge() call. 
> > https://cr.openjdk.java.net/~stefank/8234009/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8234009 > > After JDK-8230661 we are guaranteed that no mutator holds a weak oop > containing a "dead" object when the ZResurrection::unblock() call > happens. > > Therefore, it doesn't matter if the load barrier that the thread > executes runs the code guarded by ZResurrection::is_blocked() or the > "non-blocked" part, when ZResurrection::unblock() changes the state. > As long as the ZResurrection::unblock() call happens after the > handshake, we are good to go. > > Today, we perform the purging and deletion of metadata in > _unload.purge(), and call ZResurrection::unblock() after that. There's > no need to delay the unblocking to after the purge. We have the > opportunity to shrink the resurrection-blocked window by moving the > ZResurrection::unblock() call to before the call to _unload.purge(), > but still after the handshake. > > Thanks, > StefanK From stefan.karlsson at oracle.com Fri Nov 22 11:15:45 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 22 Nov 2019 12:15:45 +0100 Subject: RFR: 8234009: ZGC: Move resurrection unblock to before the _unload.purge() call In-Reply-To: <9a149951-c07a-44d7-7338-3d2f91746c43@oracle.com> References: <18bf9548-2aa1-689e-420c-bbaeb5813d1c@oracle.com> <9a149951-c07a-44d7-7338-3d2f91746c43@oracle.com> Message-ID: Thanks, Erik. StefanK On 2019-11-22 12:09, erik.osterlund at oracle.com wrote: > Hi Stefan, > > Looks good. > > Thanks, > /Erik > > On 11/21/19 2:28 PM, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to move the resurrection to before the >> _unload.purge() call. >> >> https://cr.openjdk.java.net/~stefank/8234009/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8234009 >> >> After JDK-8230661 we are guaranteed that no mutator holds a weak oop >> containing a "dead" object when the ZResurrection::unblock() call >> happens.
>> >> Therefore, it doesn't matter if the load barrier that the thread >> executes runs the code guarded by ZResurrection::is_blocked() or the >> "non-blocked" part, when ZResurrection::unblock() changes the state. >> As long as the ZResurrection::unblock() call happens after the >> handshake, we are good to go. >> >> Today, we perform the purging and deletion of metadata in >> _unload.purge(), and call ZResurrection::unblock() after that. There's >> no need to delay the unblocking to after the purge. We have the >> opportunity to shrink the resurrection-blocked window by moving the >> ZResurrection::unblock() call to before the call to _unload.purge(), >> but still after the handshake. >> >> Thanks, >> StefanK > From stefan.karlsson at oracle.com Fri Nov 22 11:16:09 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 22 Nov 2019 12:16:09 +0100 Subject: RFR: 8234010: ZGC: Change ZResurrection to use Atomic::load/store In-Reply-To: <8454976c-7c69-a127-8b46-b095be7ded82@oracle.com> References: <674bc3ce-d48f-d9d7-4ce1-d2fcf144246e@oracle.com> <8454976c-7c69-a127-8b46-b095be7ded82@oracle.com> Message-ID: <57428c4b-0460-ca93-9e09-1da166b8ebab@oracle.com> Thanks, Erik. StefanK On 2019-11-22 12:08, erik.osterlund at oracle.com wrote: > Hi Stefan, > > Looks good. > > Thanks, > /Erik > > On 11/21/19 2:24 PM, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to change ZResurrection to use Atomic::load >> and Atomic::store. >> >> https://cr.openjdk.java.net/~stefank/8234010/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8234010 >> >> Previously, ZResurrection::is_blocked() and ZResurrection::unblock() >> used loadload and storestore barriers to synchronize between the GC >> and mutator load barriers. >> >> JDK-8230661 changed so that we always perform a handshake before the >> ZResurrection::unblock() call.
>> >> After that change we can rely on the handshake to perform the >> necessary synchronization, and we can change the implementation to use >> Atomic::load and Atomic::store. >> >> Tested with tier1-7 >> >> Thanks, >> StefanK > From stefan.karlsson at oracle.com Fri Nov 22 11:16:23 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 22 Nov 2019 12:16:23 +0100 Subject: RFR: 8234602: ZGC: Windows compile error in ZHeuristic In-Reply-To: <6c213977-f2f5-9f3c-ea80-e117729ad774@oracle.com> References: <27cf412a-a499-60c6-aef9-de84c4ac037f@oracle.com> <6c213977-f2f5-9f3c-ea80-e117729ad774@oracle.com> Message-ID: <0098bad0-4492-487a-0321-bc1cc499f63c@oracle.com> Thanks, Erik. StefanK On 2019-11-22 12:07, erik.osterlund at oracle.com wrote: > Hi Stefan, > > Looks good and trivial. > > Thanks, > /Erik > > On 11/21/19 10:08 PM, Stefan Karlsson wrote: >> Hi all, >> >> Please review this trivial fix to silence a windows compiler warning >> about narrowing a size_t to int. >> >> https://cr.openjdk.java.net/~stefank/8234602/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8234602 >> >> Thanks, >> StefanK > From leo.korinth at oracle.com Fri Nov 22 11:23:20 2019 From: leo.korinth at oracle.com (Leo Korinth) Date: Fri, 22 Nov 2019 12:23:20 +0100 Subject: RFR (M): 8227434: G1 predictions may over/underflow with high variance input In-Reply-To: <5dbdaed6-f5d9-eb9e-c708-3d67619ef5af@oracle.com> References: <8112fc27-69a4-0249-de00-54e907ee38e4@oracle.com> <5dbdaed6-f5d9-eb9e-c708-3d67619ef5af@oracle.com> Message-ID: <6d64cc59-a14e-48e8-d319-b3f9348e8304@oracle.com> On 22/11/2019 10:21, Thomas Schatzl wrote: > Hi all, > > ping for second review. Looks good /Leo > > Thanks, > Thomas > > On 21.11.19 15:26, Stefan Johansson wrote: >> Hi Thomas, >> >> On 2019-11-12 15:10, Thomas Schatzl wrote: >>> Hi all, >>> >>>
can I have reviews for this change that tries to fix possible >>> underflows and overflows in our predictor use in case there is high >>> variance input? >>> >>> I did not analyze for every case whether the issue actually happened, >>> but changed the get_new_prediction() calls to something I believe is >>> appropriate for the given sequence. Cases where there has already >>> been some clamping going on were obvious of course. >>> >>> It's a bit boring to review... >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8227434 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8227434/webrev/ >> Looks good in general, me and Thomas discussed some naming suggestions >> offline, but we are going to do that as a separate RFE. See JDK-8234574. >> >> Thanks, >> Stefan >> >> >>> Testing: >>> hs-tier1-5 >>> >>> >>> Thanks, >>> Thomas > From per.liden at oracle.com Fri Nov 22 12:43:18 2019 From: per.liden at oracle.com (Per Liden) Date: Fri, 22 Nov 2019 13:43:18 +0100 Subject: RFR: 8234602: ZGC: Windows compile error in ZHeuristic In-Reply-To: <27cf412a-a499-60c6-aef9-de84c4ac037f@oracle.com> References: <27cf412a-a499-60c6-aef9-de84c4ac037f@oracle.com> Message-ID: <9416e43a-8bff-c452-df27-d9e4278437ee@oracle.com> Looks good! /Per On 11/21/19 10:08 PM, Stefan Karlsson wrote: > Hi all, > > Please review this trivial fix to silence a windows compiler warning > about narrowing a size_t to int. > > https://cr.openjdk.java.net/~stefank/8234602/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8234602 > > Thanks, > StefanK From per.liden at oracle.com Fri Nov 22 12:48:12 2019 From: per.liden at oracle.com (Per Liden) Date: Fri, 22 Nov 2019 13:48:12 +0100 Subject: RFR: 8234010: ZGC: Change ZResurrection to use Atomic::load/store In-Reply-To: <674bc3ce-d48f-d9d7-4ce1-d2fcf144246e@oracle.com> References: <674bc3ce-d48f-d9d7-4ce1-d2fcf144246e@oracle.com> Message-ID: <84bb5dfb-88c1-bd90-2ee6-f0657b1a1ef5@oracle.com> Looks good!
/Per On 11/21/19 2:24 PM, Stefan Karlsson wrote: > Hi all, > > Please review this patch to change ZResurrection to use Atomic::load and > Atomic::store. > > https://cr.openjdk.java.net/~stefank/8234010/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8234010 > > Previously, ZResurrection::is_blocked() and ZResurrection::unblock() > used loadload and storestore barriers to synchronize between the GC and > mutator load barriers. > > JDK-8230661 changed so that we always perform a handshake before the > ZResurrection::unblock() call. > > After that change we can rely on the handshake to perform the necessary > synchronization, and we can change the implementation to use > Atomic::load and Atomic::store. > > Tested with tier1-7 > > Thanks, > StefanK From thomas.schatzl at oracle.com Fri Nov 22 12:58:52 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 22 Nov 2019 13:58:52 +0100 Subject: RFR (S): 8233998: New young regions registered too early in collection set In-Reply-To: <6dc8783f-24ea-903f-a5a0-c2cc9df3b748@oracle.com> References: <6dc8783f-24ea-903f-a5a0-c2cc9df3b748@oracle.com> Message-ID: <1b0f8128-9508-716e-0160-b0625116376c@oracle.com> Hi, ping for a second review :) Thomas On 20.11.19 11:49, Thomas Schatzl wrote: > Hi, > > On 20.11.19 11:02, Stefan Johansson wrote: >> Hi Thomas, >> >> On 2019-11-12 16:24, Thomas Schatzl wrote: >>> Hi, >>> >>> can I have reviews for this change that changes the place in which >>> new mutator regions are published in the collection set list? >>> >>> Previously a new eden region has been published before some data that >>> would be read by the young gen sampling thread could be visible. >>> >>> This change simply does the member updates before adding the regions >>> to the collection set. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8233998 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8233998/webrev/ >> Looks good, >> StefanJ > > thanks for your review.
> > Thomas From thomas.schatzl at oracle.com Fri Nov 22 12:59:32 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 22 Nov 2019 13:59:32 +0100 Subject: RFR (M): 8233919: Incrementally calculate the occupied cards in a heap region remembered set In-Reply-To: <23b30b41-b109-4d05-606f-fa6a87a07897@oracle.com> References: <23b30b41-b109-4d05-606f-fa6a87a07897@oracle.com> Message-ID: <9adf400e-eb92-5892-c113-02b3ac29826a@oracle.com> Hi, ping for a second review :P Thomas On 12.11.19 16:24, Thomas Schatzl wrote: > Hi all, > > may I have reviews for this change that ultimately makes sure that > the number of occupied cards in a remembered set is only growing by > providing a per-OtherRegionsTable count that is atomically updated when > adding a remembered set entry. > > Note that this count may not be completely accurate due to races when > deleting a PerRegionTable (which is a known issue) from an > OtherRegionsTable; but that is no different than before. > > This helps improving the predictions in the young gen remset sampling > thread, and increase the performance of getting the occupancy count. > > Based on JDK-8233997, and JDK-8233998 also out for review. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8233919 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8233919/webrev/ > Testing: > hs-tier1-5, > > Thanks, >
Thomas From thomas.schatzl at oracle.com Fri Nov 22 13:00:41 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 22 Nov 2019 14:00:41 +0100 Subject: RFR (M): 8231579: G1's incremental calculation of region elapsed time always uses the same age group for prediction In-Reply-To: References: <6bb0be27-c32c-f2c0-b21e-b03802590cf4@oracle.com> <3fab7724-302f-faab-40bf-b95df0de5b05@oracle.com> Message-ID: <499678fc-980d-d010-ba7a-e48f5b1fff00@oracle.com> Hi Stefan, On 20.11.19 14:43, Stefan Johansson wrote: > > > On 2019-11-20 13:49, Thomas Schatzl wrote: >> Hi Stefan, >> >> On 20.11.19 12:24, Stefan Johansson wrote: >>> Hi Thomas, >>> [...] >>>> >>>> CR: >>>> https://bugs.openjdk.java.net/browse/JDK-8231579 >>>> Webrev: >>>> http://cr.openjdk.java.net/~tschatzl/8231579/webrev/ >>> Looks good, just some variable naming that I think should be updated: >>> src/hotspot/share/gc/g1/g1CollectionSet.cpp >>> --- >>> 248 double old_elapsed_time_ms = hr->predicted_non_copy_time_ms(); >>> 249 double new_region_elapsed_time_ms = >>> predict_region_non_copy_time_ms(hr); >>> 250 double non_copy_time_ms_diff = new_region_elapsed_time_ms - >>> old_elapsed_time_ms; >>> 251 hr->set_predicted_non_copy_time_ms(new_region_elapsed_time_ms); >>> 252 _inc_predicted_non_copy_time_ms_diff += non_copy_time_ms_diff; >>> >>> I think the local variables should change to reflect the new naming, >>> something like "old_non_copy_time" and "new_non_copy_time". >> >> you are right. Fixed in >> >> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.0_to_1/ (diff) >> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.1/ (full) > Looks good. > > Thanks, > Stefan thanks for your review.
Thomas From per.liden at oracle.com Fri Nov 22 13:08:43 2019 From: per.liden at oracle.com (Per Liden) Date: Fri, 22 Nov 2019 14:08:43 +0100 Subject: RFR: 8234009: ZGC: Move resurrection unblock to before the _unload.purge() call In-Reply-To: <18bf9548-2aa1-689e-420c-bbaeb5813d1c@oracle.com> References: <18bf9548-2aa1-689e-420c-bbaeb5813d1c@oracle.com> Message-ID: <6135c64e-eb28-72af-cfe2-91b92c0854e6@oracle.com> Looks good! /Per On 11/21/19 2:28 PM, Stefan Karlsson wrote: > Hi all, > > Please review this patch to move the resurrection to before the > _unload.purge() call. > > https://cr.openjdk.java.net/~stefank/8234009/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8234009 > > After JDK-8230661 we are guaranteed that no mutator holds a weak oop > containing a "dead" object when the ZResurrection::unblock() call happens. > > Therefore, it doesn't matter if the load barrier that the thread > executes runs the code guarded by ZResurrection::is_blocked() or the > "non-blocked" part, when ZResurrection::unblock() changes the state. As > long as the ZResurrection::unblock() call happens after the handshake, > we are good to go. > > Today, we perform the purging and deletion of metadata in > _unload.purge(), and call ZResurrection::unblock() after that. There's no > need to delay the unblocking to after the purge. We have the opportunity > to shrink the resurrection-blocked window by moving the > ZResurrection::unblock() call to before the call to _unload.purge(), but > still after the handshake. > > Thanks, > StefanK From per.liden at oracle.com Fri Nov 22 13:09:47 2019 From: per.liden at oracle.com (Per Liden) Date: Fri, 22 Nov 2019 14:09:47 +0100 Subject: RFR: 8234573: ZGC: Enable ZVerifyMarking by default in debug builds In-Reply-To: <012384ce-bb65-a246-cb15-1a98a5455ff8@oracle.com> References: <012384ce-bb65-a246-cb15-1a98a5455ff8@oracle.com> Message-ID: <586efcb0-17b4-52cf-08bd-b0fc17dc20a1@oracle.com> Thanks Thomas!
/Per On 11/21/19 12:15 PM, Thomas Schatzl wrote: > Hi, > > On 21.11.19 12:11, Per Liden wrote: >> Just like we have ZVerifyRoots enabled by default in debug builds, I >> think we should also enable ZVerifyMarking, since it's fairly >> inexpensive. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234573 >> Webrev: http://cr.openjdk.java.net/~pliden/8234573/webrev.0 >> >> /Per > > looks good and trivial. > > Thomas From stefan.karlsson at oracle.com Fri Nov 22 13:15:58 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 22 Nov 2019 14:15:58 +0100 Subject: RFR: 8234009: ZGC: Move resurrection unblock to before the _unload.purge() call In-Reply-To: <6135c64e-eb28-72af-cfe2-91b92c0854e6@oracle.com> References: <18bf9548-2aa1-689e-420c-bbaeb5813d1c@oracle.com> <6135c64e-eb28-72af-cfe2-91b92c0854e6@oracle.com> Message-ID: <18b3436e-29a0-b6a7-2d56-da0dfe058cbd@oracle.com> Thanks, Per. StefanK On 2019-11-22 14:08, Per Liden wrote: > Looks good! > > /Per > > On 11/21/19 2:28 PM, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to move the resurrection to before the >> _unload.purge() call. >> >> https://cr.openjdk.java.net/~stefank/8234009/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8234009 >> >> After JDK-8230661 we are guaranteed that no mutator holds a weak oop >> containing a "dead" object when the ZResurrection::unblock() call >> happens. >> >> Therefore, it doesn't matter if the load barrier that the thread >> executes runs the code guarded by ZResurrection::is_blocked() or the >> "non-blocked" part, when ZResurrection::unblock() changes the state. >> As long as the ZResurrection::unblock() call happens after the >> handshake, we are good to go. >> >> Today, we perform the purging and deletion of metadata in >> _unload.purge(), and call ZResurrection::unblock() after that. There's >> no need to delay the unblocking to after the purge.
We have the >> opportunity to shrink the resurrection-blocked window by moving the >> ZResurrection::unblock() call to before the call to _unload.purge(), >> but still after the handshake. >> >> Thanks, >> StefanK From per.liden at oracle.com Fri Nov 22 14:03:18 2019 From: per.liden at oracle.com (Per Liden) Date: Fri, 22 Nov 2019 15:03:18 +0100 Subject: RFR: 8234654: ZGC: Only disarm NMethods when marking/relocating code roots Message-ID: <46674d77-9ae7-82e5-0572-77fc44f0c8c7@oracle.com> ZRootIterator will currently always try to disarm on-stack NMethods. Strictly speaking, we should only do this when marking/relocating code roots, not when e.g. iterating the heap. Bug: https://bugs.openjdk.java.net/browse/JDK-8234654 Webrev: http://cr.openjdk.java.net/~pliden/8234654/webrev.0 /Per From stefan.karlsson at oracle.com Fri Nov 22 14:06:38 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 22 Nov 2019 15:06:38 +0100 Subject: RFR: 8234602: ZGC: Windows compile error in ZHeuristic In-Reply-To: <9416e43a-8bff-c452-df27-d9e4278437ee@oracle.com> References: <27cf412a-a499-60c6-aef9-de84c4ac037f@oracle.com> <9416e43a-8bff-c452-df27-d9e4278437ee@oracle.com> Message-ID: <5965ffc2-9a7c-7e55-c17c-5b9012e5311c@oracle.com> Thanks, Per. StefanK On 2019-11-22 13:43, Per Liden wrote: > Looks good! > > /Per > > On 11/21/19 10:08 PM, Stefan Karlsson wrote: >> Hi all, >> >> Please review this trivial fix to silence a windows compiler warning >> about narrowing a size_t to int. 
>> >> https://cr.openjdk.java.net/~stefank/8234602/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8234602 >> >> Thanks, >> StefanK From stefan.karlsson at oracle.com Fri Nov 22 14:07:32 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 22 Nov 2019 15:07:32 +0100 Subject: RFR: 8234010: ZGC: Change ZResurrection to use Atomic::load/store In-Reply-To: <84bb5dfb-88c1-bd90-2ee6-f0657b1a1ef5@oracle.com> References: <674bc3ce-d48f-d9d7-4ce1-d2fcf144246e@oracle.com> <84bb5dfb-88c1-bd90-2ee6-f0657b1a1ef5@oracle.com> Message-ID: Thanks, Per. StefanK On 2019-11-22 13:48, Per Liden wrote: > Looks good! > > /Per > > On 11/21/19 2:24 PM, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to change ZResurrection to use Atomic::load >> and Atomic::store. >> >> https://cr.openjdk.java.net/~stefank/8234010/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8234010 >> >> Previously, ZResurrection::is_blocked() and ZResurrection::unblock() >> used loadload and storestore barriers to synchronize between the GC >> and mutator load barriers. >> >> JDK-8230661 changed so that we always perform a handshake before the >> ZResurrection::unblock() call. >> >> After that change we can rely on the handshake to perform the >> necessary synchronization, and we can change the implementation to use >> Atomic::load and Atomic::store. >> >> Tested with tier1-7 >> >> Thanks, >> StefanK From erik.osterlund at oracle.com Fri Nov 22 14:35:18 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Fri, 22 Nov 2019 15:35:18 +0100 Subject: RFR: 8234654: ZGC: Only disarm NMethods when marking/relocating code roots In-Reply-To: <46674d77-9ae7-82e5-0572-77fc44f0c8c7@oracle.com> References: <46674d77-9ae7-82e5-0572-77fc44f0c8c7@oracle.com> Message-ID: <30503437-ee40-a924-d7fa-8ed5787adf52@oracle.com> Hi Per, Looks good. Thanks, /Erik On 11/22/19 3:03 PM, Per Liden wrote: > ZRootIterator will currently always try to disarm on-stack NMethods. 
> Strictly speaking, we should only do this when marking/relocating code > roots, not when e.g. iterating the heap. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234654 > Webrev: http://cr.openjdk.java.net/~pliden/8234654/webrev.0 > > /Per From per.liden at oracle.com Fri Nov 22 14:38:17 2019 From: per.liden at oracle.com (Per Liden) Date: Fri, 22 Nov 2019 15:38:17 +0100 Subject: RFR: 8234654: ZGC: Only disarm NMethods when marking/relocating code roots In-Reply-To: <30503437-ee40-a924-d7fa-8ed5787adf52@oracle.com> References: <46674d77-9ae7-82e5-0572-77fc44f0c8c7@oracle.com> <30503437-ee40-a924-d7fa-8ed5787adf52@oracle.com> Message-ID: Thanks Erik! /Per On 11/22/19 3:35 PM, erik.osterlund at oracle.com wrote: > Hi Per, > > Looks good. > > Thanks, > /Erik > > On 11/22/19 3:03 PM, Per Liden wrote: >> ZRootIterator will currently always try to disarm on-stack NMethods. >> Strictly speaking, we should only do this when marking/relocating code >> roots, not when e.g. iterating the heap. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234654 >> Webrev: http://cr.openjdk.java.net/~pliden/8234654/webrev.0 >> >> /Per > From leo.korinth at oracle.com Fri Nov 22 16:37:52 2019 From: leo.korinth at oracle.com (Leo Korinth) Date: Fri, 22 Nov 2019 17:37:52 +0100 Subject: RFR (S): 8234179: Move HeapRegion::_recorded_rs_length/_predicted_elapsed_time_ms into G1CollectionSet In-Reply-To: <4e22f801-827e-ef58-1c0a-297c2e6250e8@oracle.com> References: <4e22f801-827e-ef58-1c0a-297c2e6250e8@oracle.com> Message-ID: <08b639cb-7444-3de2-e1f9-c6b12d8bbecc@oracle.com> On 22/11/2019 10:57, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that moves two members > (_recorded_rs_length/_predicted_elapsed_time_ms) of HeapRegion into > G1CollectionSet that is the only user of these members? > > This also allows more efficient allocation of this information in the > future, as these are only needed for eden regions.
Looks good in general, the array _inc_collection_set_stats should be initialized though. /Leo > > CR: > https://bugs.openjdk.java.net/browse/JDK-8234179 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8234179/webrev/ > Testing: > hs-tier1-5 > > Thanks, > Thomas From manc at google.com Fri Nov 22 22:57:15 2019 From: manc at google.com (Man Cao) Date: Fri, 22 Nov 2019 14:57:15 -0800 Subject: RFR (M): 8087198: G1 card refinement: batching, sorting In-Reply-To: <7F106122-9DE0-428E-8C83-0E1F443D3905@oracle.com> References: <34BD822F-20DD-43E3-956F-09622EB62148@oracle.com> <5c700c0a-954e-6187-67b4-cc523cace95c@oracle.com> <387D12CA-4C5A-4DE5-A6E7-ED6739AC7173@oracle.com> <37161B48-8AD1-4887-A56D-4905BD8DC159@oracle.com> <4D5605AD-7F3E-4AB4-8A13-5AA18DC0AB40@oracle.com> <7F106122-9DE0-428E-8C83-0E1F443D3905@oracle.com> Message-ID: Thanks for the suggestions. I agree with all of them and addressed them: https://cr.openjdk.java.net/~manc/8087198/webrev.03/ https://cr.openjdk.java.net/~manc/8087198/webrev.02-03.inc/ I have rerun correctness tests and they passed. src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp > 321 bool refine() { > 322 size_t first_clean_index = clean_cards(); > Is it worth checking whether the cleaned buffer is now empty, skipping > the rest altogether if so? I don't know how often that actually > happens, and the rest isn't *that* expensive in the empty buffer case. > So I'm going to guess not worth the extra test, but something to consider. I did a quick experiment to count the percentage of completely discarded buffers versus all refined buffers, using the default parameters (+G1UseAdaptiveConcRefinement, G1UpdateBufferSize=256). In some cases the percentage is not small: BigRamTester: 10-30% during the phase of filling up the hashmap. Close to zero during the phase of randomly accessing the hashmap. DaCapo h2: could be 20-30% just before the full GC between two iterations. Close to zero during the iteration. So I added the check.
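A minimal sketch of that early exit, written as a standalone model and not taken from the webrev. `RefineOutcome`, `clean_cards` and `refine` below are hypothetical stand-ins for the code in g1DirtyCardQueue.cpp, cards are modelled as plain ints, and an already-clean card is modelled as 0:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result record, used only in this sketch.
struct RefineOutcome {
  bool did_work;    // false when the whole buffer was discarded
  size_t refined;   // number of cards that went on to refinement
};

// Moves already-clean entries (modelled as 0) to the front of the buffer
// and returns the index of the first card that still needs refinement.
static size_t clean_cards(std::vector<int>& buffer) {
  auto first_dirty = std::stable_partition(
      buffer.begin(), buffer.end(), [](int card) { return card == 0; });
  return static_cast<size_t>(first_dirty - buffer.begin());
}

static RefineOutcome refine(std::vector<int>& buffer) {
  const size_t start = clean_cards(buffer);
  if (start == buffer.size()) {
    // Every entry was discarded: skip sorting and refinement altogether.
    return RefineOutcome{false, 0};
  }
  // Sort only the cards that survived cleaning.
  std::sort(buffer.begin() + static_cast<std::ptrdiff_t>(start), buffer.end());
  // Count the refined cards in bulk instead of once per iteration.
  return RefineOutcome{true, buffer.size() - start};
}
```

In this model a fully discarded buffer costs one pass over the entries and nothing else, which is the case the percentages above are measuring.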
With the future change for the epoch synchronization for JDK-8226731, we definitely want to avoid unnecessary epoch synchronization as much as possible. -Man On Thu, Nov 21, 2019 at 12:37 PM Kim Barrett wrote: > > On Nov 19, 2019, at 9:26 PM, Man Cao wrote: > > > > Hi all, > > > > Thanks! I have addressed all comments: > > Full: http://cr.openjdk.java.net/~manc/8087198/webrev.02/ > > Incremental: http://cr.openjdk.java.net/~manc/8087198/webrev.01-02.inc/ > > > > > > I also changed G1RemSet::clean_card_before_refine() to take a > "CardValue**" > > parameter instead of "CardValue*&" to make it more obvious that it can > > modify > > the card pointer. > > Thanks for that change; that seems to have made the code nicer. > > This is looking pretty good. Just a couple of possible improvements > and questions. > > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp > 247 QuickSort::sort(&_node_buffer[start_index], > ... > 250 true); > > I *think* a false value for idempotent is better here. > > We don't care about reordering of equal values, so there's no > correctness issue either way. A true value will avoid unnecessary > swaps of equal entries, at the cost of an extra comparison. The > comparison is very cheap, so if equal entries were somewhat common > that could be a win. But because of the dirty-card-based filtering, I > think equal entries are uncommon, making the extra comparisons > wasteful. 
> > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp > 289 bool refine_cleaned_cards(size_t start_index) { > 290 for (size_t i = start_index; i < _node_buffer_size; ++i) { > 291 if (SuspendibleThreadSet::should_yield()) { > 292 redirty_unrefined_cards(i); > 293 _node->set_index(i); > 294 return false; > 295 } > 296 _g1rs->refine_card_concurrently(_node_buffer[i], _worker_id); > 297 (*_total_refined_cards)++; > > 298 } > 299 _node->set_index(_node_buffer_size); > 300 return true; > 301 } > > It would be better to bulk increment *_total_refined_cards at the end, > rather than on each iteration. Maybe something like this: > > bool refine_cleaned_cards(size_t start_index) { > bool result = true; > size_t i = start_index; > for ( ; i < _node_buffer_size; ++i) { > if (SuspendibleThreadSet::should_yield()) { > redirty_unrefined_cards(i); > result = false; > break; > } > _g1rs->refine_card_concurrently(_node_buffer[i], _worker_id); > } > _node->set_index(i); > *_total_refined_cards += i - start_index; > return result; > } > > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp > 321 bool refine() { > 322 size_t first_clean_index = clean_cards(); > > Is it worth checking whether the cleaned buffer is now empty, skipping > the rest altogether if so? I don't know how often that actually > happens, and the rest isn't *that* expensive in the empty buffer case. > So I'm going to guess not worth the extra test, but something to consider. 
> > > ------------------------------------------------------------------------------ > > From kim.barrett at oracle.com Sat Nov 23 00:54:17 2019 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 22 Nov 2019 19:54:17 -0500 Subject: RFR (M): 8087198: G1 card refinement: batching, sorting In-Reply-To: References: <34BD822F-20DD-43E3-956F-09622EB62148@oracle.com> <5c700c0a-954e-6187-67b4-cc523cace95c@oracle.com> <387D12CA-4C5A-4DE5-A6E7-ED6739AC7173@oracle.com> <37161B48-8AD1-4887-A56D-4905BD8DC159@oracle.com> <4D5605AD-7F3E-4AB4-8A13-5AA18DC0AB40@oracle.com> <7F106122-9DE0-428E-8C83-0E1F443D3905@oracle.com> Message-ID: <95A3BD36-3840-43BD-A598-CF2E3D41A83A@oracle.com> > On Nov 22, 2019, at 5:57 PM, Man Cao wrote: > > Thanks for the suggestions. I agree with all of them and addressed them: > https://cr.openjdk.java.net/~manc/8087198/webrev.03/ > https://cr.openjdk.java.net/~manc/8087198/webrev.02-03.inc/ Looks good. From stefan.johansson at oracle.com Mon Nov 25 10:21:59 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 25 Nov 2019 11:21:59 +0100 Subject: RFR (M): 8233919: Incrementally calculate the occupied cards in a heap region remembered set In-Reply-To: References: <23b30b41-b109-4d05-606f-fa6a87a07897@oracle.com> <16491a20-6f2a-0d70-42c2-6c23c3a9e407@oracle.com> Message-ID: <07e02dde-adb0-7b28-6901-008f4371618d@oracle.com> On 2019-11-21 11:41, Thomas Schatzl wrote: > Hi, > > On 20.11.19 11:42, Stefan Johansson wrote: >> Hi Thomas, >> >> On 2019-11-12 16:24, Thomas Schatzl wrote: >>> Hi all, >>> >>> may I have reviews for this change that ultimately makes sure that >>> the number of occupied cards in a remembered set is only growing by >>> providing a per-OtherRegionsTable count that is atomically updated >>> when adding a remembered set entry.
>>> >>> Note that this count may not be completely accurate due to races when >>> deleting a PerRegionTable (which is a known issue) from an >>> OtherRegionsTable; but that is no different than before. >>> >>> This helps improve the predictions in the young gen remset sampling >>> thread, and increases the performance of getting the occupancy count. >>> >>> Based on JDK-8233997, and JDK-8233998 also out for review. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8233919 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8233919/webrev/ >> I like this change and it looks good in general, just one small comment: >> src/hotspot/share/gc/g1/heapRegionRemSet.cpp >> --- >> 247 bool added = prt->add_reference(from); >> 248 Atomic::add(num_added_by_coarsening + (added ? 1 : 0), >> &_num_occupied, memory_order_relaxed); >> >> I would prefer: >> if (prt->add_reference(from)) { >> num_added_by_coarsening++; >> } >> Atomic::add... >> >> If you disagree, leave it as is. > > Fixed in > http://cr.openjdk.java.net/~tschatzl/8233919/webrev.0_to_1/ > http://cr.openjdk.java.net/~tschatzl/8233919/webrev.1/ > Thanks Thomas! Looks good, Stefan > Thanks for your review, > Thomas > > From per.liden at oracle.com Mon Nov 25 11:21:30 2019 From: per.liden at oracle.com (Per Liden) Date: Mon, 25 Nov 2019 12:21:30 +0100 Subject: RFR: 8234619: ZGC: gc/z/TestSmallHeap.java failure with Out Of Memory Message-ID: The test gc/z/TestSmallHeap.java sometimes fails when run with -Xcomp. As it turns out, about 120K worth of additional interned Strings are created when running with -Xcomp. On an 8M heap, this means there's not enough room left for the 512K array allocation this test does. This patch simply disables this test when using -Xcomp.
Bug: https://bugs.openjdk.java.net/browse/JDK-8234619 Webrev: http://cr.openjdk.java.net/~pliden/8234619/webrev.0 Testing: Manual runs with gc/z/TestSmallHeap.java /Per From thomas.schatzl at oracle.com Mon Nov 25 11:23:00 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 25 Nov 2019 12:23:00 +0100 Subject: RFR: 8234619: ZGC: gc/z/TestSmallHeap.java failure with Out Of Memory In-Reply-To: References: Message-ID: <179fd370-028a-500b-153b-a1e023b609e9@oracle.com> Hi, On 25.11.19 12:21, Per Liden wrote: > The test gc/z/TestSmallHeap.java sometimes fails when run with > -Xcomp. As it turns out, about 120K worth of additional interned Strings > are created when running with -Xcomp. On an 8M heap, this means there's not > enough room left for the 512K array allocation this test does. This > patch simply disables this test when using -Xcomp. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234619 > Webrev: http://cr.openjdk.java.net/~pliden/8234619/webrev.0 > > Testing: Manual runs with gc/z/TestSmallHeap.java > > /Per looks good. Thomas From per.liden at oracle.com Mon Nov 25 11:26:17 2019 From: per.liden at oracle.com (Per Liden) Date: Mon, 25 Nov 2019 12:26:17 +0100 Subject: RFR: 8234619: ZGC: gc/z/TestSmallHeap.java failure with Out Of Memory In-Reply-To: <179fd370-028a-500b-153b-a1e023b609e9@oracle.com> References: <179fd370-028a-500b-153b-a1e023b609e9@oracle.com> Message-ID: Thanks Thomas! /Per On 11/25/19 12:23 PM, Thomas Schatzl wrote: > Hi, > > On 25.11.19 12:21, Per Liden wrote: >> The test gc/z/TestSmallHeap.java sometimes fails when run with >> -Xcomp. As it turns out, about 120K worth of additional interned Strings >> are created when running with -Xcomp. On an 8M heap, this means there's not >> enough room left for the 512K array allocation this test does. This >> patch simply disables this test when using -Xcomp.
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234619 >> Webrev: http://cr.openjdk.java.net/~pliden/8234619/webrev.0 >> >> Testing: Manual runs with gc/z/TestSmallHeap.java >> >> /Per > > looks good. > > Thomas From thomas.schatzl at oracle.com Mon Nov 25 13:34:02 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 25 Nov 2019 14:34:02 +0100 Subject: RFR (S): 8234179: Move HeapRegion::_recorded_rs_length/_predicted_elapsed_time_ms into G1CollectionSet In-Reply-To: <08b639cb-7444-3de2-e1f9-c6b12d8bbecc@oracle.com> References: <4e22f801-827e-ef58-1c0a-297c2e6250e8@oracle.com> <08b639cb-7444-3de2-e1f9-c6b12d8bbecc@oracle.com> Message-ID: <4737b30d-f5cc-d075-aaa7-bbf8edf4f2fb@oracle.com> Hi Leo, On 22.11.19 17:37, Leo Korinth wrote: > On 22/11/2019 10:57, Thomas Schatzl wrote: >> Hi all, >> >> can I have reviews for this change that moves two members >> (_recorded_rs_length/_predicted_elapsed_time_ms) of HeapRegion into >> G1CollectionSet that is the only user of these members? >> >> This also allows more efficient allocation of this information in the >> future, as these are only needed for eden regions. > > Looks good in general, the array _inc_collection_set_stats should be > initialized though. > Good idea. http://cr.openjdk.java.net/~tschatzl/8234179/webrev.0_to_1 http://cr.openjdk.java.net/~tschatzl/8234179/webrev.1 Now initializes the array with known bad values so that the asserts will catch any use-before-init situations.
Passes hs-tier1-5 Thanks, Thomas From per.liden at oracle.com Mon Nov 25 14:10:31 2019 From: per.liden at oracle.com (Per Liden) Date: Mon, 25 Nov 2019 15:10:31 +0100 Subject: RFR: 8234543: ZGC: Parallel pre-touch In-Reply-To: <47dfbfe2-996c-771d-66f2-27683cb458b1@oracle.com> References: <47dfbfe2-996c-771d-66f2-27683cb458b1@oracle.com> Message-ID: I noticed that we didn't have a test for this on ZGC, so I added one: Updated webrev: http://cr.openjdk.java.net/~pliden/8234543/webrev.1 /Per On 11/21/19 10:32 AM, Per Liden wrote: > When using -XX:+AlwaysPreTouch, ZGC is currently doing single-threaded > pre-touch. This patch makes this a parallel operation. This improves > startup time, especially when using large heaps. For example, when using > a 3TB heap (-XX:+AlwaysPreTouch -Xms3T -Xmx3T), startup time is improved > by about 30x. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234543 > Webrev: http://cr.openjdk.java.net/~pliden/8234543/webrev.0 > > /Per From leo.korinth at oracle.com Mon Nov 25 14:17:31 2019 From: leo.korinth at oracle.com (Leo Korinth) Date: Mon, 25 Nov 2019 15:17:31 +0100 Subject: RFR (S): 8234179: Move HeapRegion::_recorded_rs_length/_predicted_elapsed_time_ms into G1CollectionSet In-Reply-To: <4737b30d-f5cc-d075-aaa7-bbf8edf4f2fb@oracle.com> References: <4e22f801-827e-ef58-1c0a-297c2e6250e8@oracle.com> <08b639cb-7444-3de2-e1f9-c6b12d8bbecc@oracle.com> <4737b30d-f5cc-d075-aaa7-bbf8edf4f2fb@oracle.com> Message-ID: <19d933ff-1178-be71-79fc-f2dc0decd66a@oracle.com> On 25/11/2019 14:34, Thomas Schatzl wrote: > Hi Leo, > > On 22.11.19 17:37, Leo Korinth wrote: >> On 22/11/2019 10:57, Thomas Schatzl wrote: >>> Hi all, >>> >>> can I have reviews for this change that moves two members >>> (_recorded_rs_length/_predicted_elapsed_time_ms) of HeapRegion into >>> G1CollectionSet that is the only user of these members? >>> >>> This also allows more efficient allocation of this information in the >>> future, as these are only needed for eden regions.
>> >> Looks good in general, the array _inc_collection_set_stats should be >> initialized though. >> > > Good idea. > > http://cr.openjdk.java.net/~tschatzl/8234179/webrev.0_to_1 > http://cr.openjdk.java.net/~tschatzl/8234179/webrev.1 > > Now initializes the array with known bad values so that the asserts will > catch any use-before-init situations Thanks Thomas, it still looks good! /Leo > > Passes hs-tier1-5 > > Thanks, > Thomas From stefan.johansson at oracle.com Mon Nov 25 19:50:23 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Mon, 25 Nov 2019 20:50:23 +0100 Subject: RFR (S): 8234179: Move HeapRegion::_recorded_rs_length/_predicted_elapsed_time_ms into G1CollectionSet In-Reply-To: <4737b30d-f5cc-d075-aaa7-bbf8edf4f2fb@oracle.com> References: <4e22f801-827e-ef58-1c0a-297c2e6250e8@oracle.com> <08b639cb-7444-3de2-e1f9-c6b12d8bbecc@oracle.com> <4737b30d-f5cc-d075-aaa7-bbf8edf4f2fb@oracle.com> Message-ID: Hi Thomas, > On 25 Nov 2019, at 14:34, Thomas Schatzl wrote: > > Hi Leo, > > On 22.11.19 17:37, Leo Korinth wrote: >> On 22/11/2019 10:57, Thomas Schatzl wrote: >>> Hi all, >>> >>> can I have reviews for this change that moves two members (_recorded_rs_length/_predicted_elapsed_time_ms) of HeapRegion into G1CollectionSet that is the only user of these members? >>> >>> This also allows more efficient allocation of this information in the future, as these are only needed for eden regions. >> Looks good in general, the array _inc_collection_set_stats should be initialized though. > > Good idea. > > http://cr.openjdk.java.net/~tschatzl/8234179/webrev.0_to_1 > http://cr.openjdk.java.net/~tschatzl/8234179/webrev.1 > Nice cleanup, looks good. Thanks, Stefan > Now initializes the array with known bad values so that the asserts will catch any use-before-init situations.
> > Passes hs-tier1-5 > > Thanks, > Thomas From zgu at redhat.com Mon Nov 25 20:35:13 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 25 Nov 2019 15:35:13 -0500 Subject: [14] RFR 8230765: Implement nmethod barrier for x86_32 platforms Message-ID: <932a10af-0f0e-067e-778d-fe853f01288c@redhat.com> Hi all, Please review this implementation of nmethod barrier for x86_32 platforms. x86_32 implementation mirrors x86_64's. The only difference is where it reads nmethod disarmed value. Unlike 64-bits, 32-bits platform does not have a dedicated register for current thread. So that it is cheaper to read disarmed value from global location than from per-thread GC data. Currently, only Shenandoah GC uses the implementation for its concurrent class unloading. This implementation, along with Shenandoah concurrent class unloading, has been baked in shenandoah/jdk repo for some time now, they are ready for integration. Bug: https://bugs.openjdk.java.net/browse/JDK-8230765 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8230765/webrev.01/ Test: hotspot_gc with x86_64 and x86_32 JVM on Linux Submit test. Thanks, -Zhengyu From rkennke at redhat.com Mon Nov 25 21:07:50 2019 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 25 Nov 2019 22:07:50 +0100 Subject: [14] RFR 8230765: Implement nmethod barrier for x86_32 platforms In-Reply-To: <932a10af-0f0e-067e-778d-fe853f01288c@redhat.com> References: <932a10af-0f0e-067e-778d-fe853f01288c@redhat.com> Message-ID: <3329cf09-55df-12ab-e40c-f8f9e480a1ce@redhat.com> It looks good to me, thanks! Roman > Hi all, > > Please review this implementation of nmethod barrier for x86_32 platforms. > > x86_32 implementation mirrors x86_64's. The only difference is where it > reads nmethod disarmed value. > > Unlike 64-bits, 32-bits platform does not have a dedicated register for > current thread. So that it is cheaper to read disarmed value from global > location than from per-thread GC data. 
> > Currently, only Shenandoah GC uses the implementation for its concurrent > class unloading. This implementation, along with Shenandoah concurrent > class unloading, has been baked in shenandoah/jdk repo for some time > now, they are ready for integration. > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8230765 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8230765/webrev.01/ > > > Test: > hotspot_gc with x86_64 and x86_32 JVM on Linux > Submit test. > > Thanks, > > -Zhengyu > From sangheon.kim at oracle.com Mon Nov 25 21:22:18 2019 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Mon, 25 Nov 2019 13:22:18 -0800 Subject: RFR (M): 8233919: Incrementally calculate the occupied cards in a heap region remembered set In-Reply-To: References: <23b30b41-b109-4d05-606f-fa6a87a07897@oracle.com> <16491a20-6f2a-0d70-42c2-6c23c3a9e407@oracle.com> Message-ID: Hi Thomas, On 11/21/19 2:41 AM, Thomas Schatzl wrote: > Hi, > > On 20.11.19 11:42, Stefan Johansson wrote: >> Hi Thomas, >> >> On 2019-11-12 16:24, Thomas Schatzl wrote: >>> Hi all, >>> >>> may I have reviews for this change that ultimately makes sure >>> that the number of occupied cards in a remembered set is only >>> growing by providing a per-OtherRegionsTable count that is >>> atomically updated when adding a remembered set entry. >>> >>> Note that this count may not be completely accurate due to races >>> when deleting a PerRegionTable (which is a known issue) from an >>> OtherRegionsTable; but that is no different than before. >>> >>> This helps improve the predictions in the young gen remset >>> sampling thread, and increases the performance of getting the >>> occupancy count. >>> >>> Based on JDK-8233997, and JDK-8233998 also out for review.
>>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8233919 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8233919/webrev/ >> I like this change and it looks good in general, just one small comment: >> src/hotspot/share/gc/g1/heapRegionRemSet.cpp >> --- >> 247 bool added = prt->add_reference(from); >> 248 Atomic::add(num_added_by_coarsening + (added ? 1 : 0), >> &_num_occupied, memory_order_relaxed); >> >> I would prefer: >> if (prt->add_reference(from)) { >> num_added_by_coarsening++; >> } >> Atomic::add... >> >> If you disagree, leave it as is. > > Fixed in > http://cr.openjdk.java.net/~tschatzl/8233919/webrev.0_to_1/ > http://cr.openjdk.java.net/~tschatzl/8233919/webrev.1/ Webrev.1 looks good in general. ======================== g1CollectionSet.cpp 250 assert(old_rs_length <= new_rs_length, 251 "Remembered set sizes must increase (changed from " SIZE_FORMAT " to " SIZE_FORMAT " region %u type %s)", 252 old_rs_length, new_rs_length, hr->hrm_index(), hr->get_short_type_str()); - I feel 'must increase' reads like 'old_rs_length < new_rs_length'. If you don't agree leave it as is. :) ======================== heapRegionRemSet.cpp 200 Atomic::inc(&_num_occupied, memory_order_relaxed); - I already asked Thomas offline. He said the atomic operation is not necessary in this version, but it will be necessary for a future patch when the lock is removed. Thanks, Sangheon > > Thanks for your review, >
Thomas > > From sangheon.kim at oracle.com Mon Nov 25 21:57:01 2019 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Mon, 25 Nov 2019 13:57:01 -0800 Subject: RFR (S): 8233998: New young regions registered too early in collection set In-Reply-To: <1b0f8128-9508-716e-0160-b0625116376c@oracle.com> References: <6dc8783f-24ea-903f-a5a0-c2cc9df3b748@oracle.com> <1b0f8128-9508-716e-0160-b0625116376c@oracle.com> Message-ID: <7bd1bdb2-c1e3-5a6b-e92c-196d1b9a0f08@oracle.com> Hi Thomas, On 11/22/19 4:58 AM, Thomas Schatzl wrote: > Hi, > > ping for a second review :) > > Thomas > > On 20.11.19 11:49, Thomas Schatzl wrote: >> Hi, >> >> On 20.11.19 11:02, Stefan Johansson wrote: >>> Hi Thomas, >>> >>> On 2019-11-12 16:24, Thomas Schatzl wrote: >>>> Hi, >>>> >>>> can I have reviews for this change that changes the place in >>>> which new mutator regions are published in the collection set list? >>>> >>>> Previously a new eden region was published before some data >>>> that would be read by the young gen sampling thread could be visible. >>>> >>>> This change simply does the member updates before adding the >>>> regions to the collection set. >>>> >>>> CR: >>>> https://bugs.openjdk.java.net/browse/JDK-8233998 >>>> Webrev: >>>> http://cr.openjdk.java.net/~tschatzl/8233998/webrev/ Looks good to me too. Thanks, Sangheon >>> Looks good, >>> StefanJ >> >> thanks for your review. >> >> Thomas > From zgu at redhat.com Mon Nov 25 22:01:15 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 25 Nov 2019 17:01:15 -0500 Subject: [14] RFR 8230765: Implement nmethod barrier for x86_32 platforms In-Reply-To: <3329cf09-55df-12ab-e40c-f8f9e480a1ce@redhat.com> References: <932a10af-0f0e-067e-778d-fe853f01288c@redhat.com> <3329cf09-55df-12ab-e40c-f8f9e480a1ce@redhat.com> Message-ID: Thanks, Roman. -Zhengyu On 11/25/19 4:07 PM, Roman Kennke wrote: > It looks good to me, thanks!
> > Roman > > >> Hi all, >> >> Please review this implementation of nmethod barrier for x86_32 platforms. >> >> x86_32 implementation mirrors x86_64's. The only difference is where it >> reads nmethod disarmed value. >> >> Unlike 64-bits, 32-bits platform does not have a dedicated register for >> current thread. So that it is cheaper to read disarmed value from global >> location than from per-thread GC data. >> >> Currently, only Shenandoah GC uses the implementation for its concurrent >> class unloading. This implementation, along with Shenandoah concurrent >> class unloading, has been baked in shenandoah/jdk repo for some time >> now, they are ready for integration. >> >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8230765 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8230765/webrev.01/ >> >> >> Test: >> hotspot_gc with x86_64 and x86_32 JVM on Linux >> Submit test. >> >> Thanks, >> >> -Zhengyu >> > From kdnilsen at amazon.com Tue Nov 26 04:36:04 2019 From: kdnilsen at amazon.com (Nilsen, Kelvin) Date: Tue, 26 Nov 2019 04:36:04 +0000 Subject: RFC: 8234440: ZGC: Change relocation set log level from debug to info Message-ID: <635B4BF3-B2D9-448F-B7A7-294B1A8FE02E@amazon.com> Hello list members, I am new to the OpenJDK effort and am looking for an opportunity to start with a relatively simple patch. The item mentioned in this subject header looks like something I might be able to manage. Is it ok if I work on this item? Alternatively, feel free to suggest some other initial problem report for me to begin with. I will coordinate with Paul Hohensee (hohensee at amazon.com) and/or Bernd Mathiske (mathiske at amazon.com) to commit the patch after appropriate reviews. My background: I have worked twenty years on a clean-room Java virtual machine but this is the first time I am contributing to the OpenJDK effort. The last four years, I have been working on gcc support for Power architecture. Thanks.
From thomas.schatzl at oracle.com Tue Nov 26 09:01:37 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 26 Nov 2019 10:01:37 +0100 Subject: RFR (S): 8233998: New young regions registered too early in collection set In-Reply-To: <7bd1bdb2-c1e3-5a6b-e92c-196d1b9a0f08@oracle.com> References: <6dc8783f-24ea-903f-a5a0-c2cc9df3b748@oracle.com> <1b0f8128-9508-716e-0160-b0625116376c@oracle.com> <7bd1bdb2-c1e3-5a6b-e92c-196d1b9a0f08@oracle.com> Message-ID: <435f4d99-42b1-29d5-8f3c-43da0a640990@oracle.com> Hi Sangheon, On 25.11.19 22:57, sangheon.kim at oracle.com wrote: > Hi Thomas, > > On 11/22/19 4:58 AM, Thomas Schatzl wrote: >> Hi, >> >> ping for a second review :) >> >> Thomas >> >> On 20.11.19 11:49, Thomas Schatzl wrote: >>> Hi, >>> >>> On 20.11.19 11:02, Stefan Johansson wrote: >>>> Hi Thomas, >>>> [...] >>>>> >>>>> CR: >>>>> https://bugs.openjdk.java.net/browse/JDK-8233998 >>>>> Webrev: >>>>> http://cr.openjdk.java.net/~tschatzl/8233998/webrev/ > Looks good to me too. > > Thanks, > Sangheon thanks for your review. Thomas From thomas.schatzl at oracle.com Tue Nov 26 09:04:11 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 26 Nov 2019 10:04:11 +0100 Subject: RFR (M): 8233919: Incrementally calculate the occupied cards in a heap region remembered set In-Reply-To: References: <23b30b41-b109-4d05-606f-fa6a87a07897@oracle.com> <16491a20-6f2a-0d70-42c2-6c23c3a9e407@oracle.com> Message-ID: Hi Sangheon, thanks for looking at this. On 25.11.19 22:22, sangheon.kim at oracle.com wrote: > Hi Thomas, > > On 11/21/19 2:41 AM, Thomas Schatzl wrote: >> Hi, >> >> On 20.11.19 11:42, Stefan Johansson wrote: >>> Hi Thomas, >>> >>> On 2019-11-12 16:24, Thomas Schatzl wrote: >>>> Hi all, [...]>>> >>> I would prefer: >>> if (prt->add_reference(from)) { >>> num_added_by_coarsening++; >>> } >>> Atomic::add... >>> >>> If you disagree, leave it as is.
>> >> Fixed in >> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.0_to_1/ >> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.1/ > Webrev.1 looks good in general. > > ======================== > g1CollectionSet.cpp > 250 assert(old_rs_length <= new_rs_length, > 251 "Remembered set sizes must increase (changed from " > SIZE_FORMAT " to " SIZE_FORMAT " region %u type %s)", > 252 old_rs_length, new_rs_length, hr->hrm_index(), > hr->get_short_type_str()); > - I feel 'must increase' like 'old_rs_length < new_rs_length'. If you > don't agree leave it as is. :) > > ======================== > heapRegionRemSet.cpp > 200 Atomic::inc(&_num_occupied, memory_order_relaxed); > - I already asked to Thomas offline. He said Atomic operation is not > necessary in this version but it is necessary for future patch when the > lock is removed. http://cr.openjdk.java.net/~tschatzl/8233919/webrev.1_to_2/ (diff) http://cr.openjdk.java.net/~tschatzl/8233919/webrev.2/ (full) Fixes the comment by changing the comment text to: 251 "Remembered set decreased (changed from " SIZE_FORMAT " to " SIZE_FORMAT " region %u type %s)", Thanks, Thomas From thomas.schatzl at oracle.com Tue Nov 26 09:05:24 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 26 Nov 2019 10:05:24 +0100 Subject: RFR (S): 8234179: Move HeapRegion::_recorded_rs_length/_predicted_elapsed_time_ms into G1CollectionSet In-Reply-To: References: <4e22f801-827e-ef58-1c0a-297c2e6250e8@oracle.com> <08b639cb-7444-3de2-e1f9-c6b12d8bbecc@oracle.com> <4737b30d-f5cc-d075-aaa7-bbf8edf4f2fb@oracle.com> Message-ID: Hi Leo, Stefan, On 25.11.19 20:50, Stefan Johansson wrote: > Hi Thomas, >
>> On 25 Nov 2019, at 14:34, Thomas Schatzl wrote: >> >> Hi Leo, >> >> On 22.11.19 17:37, Leo Korinth wrote: >>> On 22/11/2019 10:57, Thomas Schatzl wrote: >>>> Hi all, >>>> >>>> can I have reviews for this change that moves two members (_recorded_rs_length/_predicted_elapsed_time_ms) of HeapRegion into G1CollectionSet that is the only user of these members? >>>> >>>> This also allows more efficient allocation of this information in the future, as these are only needed for eden regions. >>> Looks good in general, the array _inc_collection_set_stats should be initialized though. >> >> Good idea. >> >> http://cr.openjdk.java.net/~tschatzl/8234179/webrev.0_to_1 >> http://cr.openjdk.java.net/~tschatzl/8234179/webrev.1 >> > Nice cleanup, looks good. > thanks for your reviews. Thomas From per.liden at oracle.com Tue Nov 26 09:47:41 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 26 Nov 2019 10:47:41 +0100 Subject: RFC: 8234440: ZGC: Change relocation set log level from debug to info In-Reply-To: <635B4BF3-B2D9-448F-B7A7-294B1A8FE02E@amazon.com> References: <635B4BF3-B2D9-448F-B7A7-294B1A8FE02E@amazon.com> Message-ID: <1b3094cd-7d6b-87ee-c308-cda42e683e48@oracle.com> Hi Kelvin, On 11/26/19 5:36 AM, Nilsen, Kelvin wrote: > Hello list members, > > I am new to the OpenJDK effort and am looking for an opportunity to > start with a relatively simple patch. The item mentioned in this > subject header looks like something I might be able to manage. > > Is it ok if I work on this item? I already have a patch for this particular item, but haven't sent it out for review yet (the change grew a bit). In general, when an item is assigned to someone in JIRA, it usually means that that person is working on it. It could also be good to know that GC enhancements (as opposed to bugs) always have fix version set to "tbd" until they are pushed, so "tbd" is not an indication that no one is working on it. > > Alternatively, feel free to suggest some other initial problem report > for me to begin with.
If you're looking for starter bugs/enhancements to fix, you can look for the "starter" label in JIRA. Those are usually a good place to start for people that are new to the code base. For example: https://bugs.openjdk.java.net/issues/?jql=project%20%3D%20jdk%20and%20component%20%3D%20hotspot%20and%20status%20in%20(new%2C%20open)%20and%20labels%20in%20(starter) cheers, Per > > I will coordinate with Paul Hohensee (hohensee at amazon.com > ) and/or Bernd Mathiske (mathiske at amazon.com > ) to commit the patch after appropriate reviews. > > My background: I have worked twenty years on a clean-room Java virtual > machine but this is the first time I am contributing to the OpenJDK > effort. The last four years, I have been working on gcc support for > Power architecture. > > Thanks. > From per.liden at oracle.com Tue Nov 26 10:00:10 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 26 Nov 2019 11:00:10 +0100 Subject: RFC: 8234440: ZGC: Change relocation set log level from debug to info In-Reply-To: <1b3094cd-7d6b-87ee-c308-cda42e683e48@oracle.com> References: <635B4BF3-B2D9-448F-B7A7-294B1A8FE02E@amazon.com> <1b3094cd-7d6b-87ee-c308-cda42e683e48@oracle.com> Message-ID: On 11/26/19 10:47 AM, Per Liden wrote: > always have fix version set to "tbd" until they are pushed Make that "often have fix version set to "tbd" until they are pushed". The exact workflow can differ depending on who you ask. Some like to set the fix version when they start working on it, or before they push, etc.
/Per From erik.osterlund at oracle.com Tue Nov 26 10:32:02 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Tue, 26 Nov 2019 11:32:02 +0100 Subject: [14] RFR 8230765: Implement nmethod barrier for x86_32 platforms In-Reply-To: <932a10af-0f0e-067e-778d-fe853f01288c@redhat.com> References: <932a10af-0f0e-067e-778d-fe853f01288c@redhat.com> Message-ID: <8191daa9-0dfb-99a7-8433-25e3d9378082@oracle.com> Hi Zhengyu, Nice to see nmethod entry barriers added for one more platform configuration. So now you use a global instead of a thread-local on 32-bit x86, as it doesn't have a Thread register. That makes sense. I wonder though if that difference has spread too far into the runtime code. It does sting a bit in my eyes to read the seemingly unnecessary #ifdef _LP64 macros in BarrierSetNMethod. In particular, when computing the BarrierSetNMethod::disarmed_value(), it seems like it would work for everyone to simply read the global value, instead of doing it only sometimes, and sometimes reading the thread-local. Here is a patch with my proposed cleanup: http://cr.openjdk.java.net/~eosterlund/8230765/webrev.02/ Incremental: http://cr.openjdk.java.net/~eosterlund/8230765/webrev.01_02/ Thanks, /Erik On 11/25/19 9:35 PM, Zhengyu Gu wrote: > Hi all, > > Please review this implementation of nmethod barrier for x86_32 > platforms. > > x86_32 implementation mirrors x86_64's. The only difference is where > it reads nmethod disarmed value. > > Unlike 64-bits, 32-bits platform does not have a dedicated register > for current thread. So that it is cheaper to read disarmed value from > global location than from per-thread GC data. > > Currently, only Shenandoah GC uses the implementation for its > concurrent class unloading. This implementation, along with Shenandoah > concurrent class unloading, has been baked in shenandoah/jdk repo for > some time now, they are ready for integration.
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8230765 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8230765/webrev.01/ > > > Test: > hotspot_gc with x86_64 and x86_32 JVM on Linux > Submit test. > > Thanks, > > -Zhengyu > From erik.osterlund at oracle.com Tue Nov 26 10:57:56 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Tue, 26 Nov 2019 11:57:56 +0100 Subject: RFR: 8234619: ZGC: gc/z/TestSmallHeap.java failure with Out Of Memory In-Reply-To: References: Message-ID: <9af050b9-1e04-2469-7b5b-390dffc308fb@oracle.com> Hi Per, Looks great. Thanks, /Erik On 11/25/19 12:21 PM, Per Liden wrote: > The test gc/z/TestSmallHeap.java sometimes fails when run with > -Xcomp. As it turns out, about 120K worth of additional interned Strings > are created when running with -Xcomp. On an 8M heap, this means there's not > enough room left for the 512K array allocation this test does. This > patch simply disables this test when using -Xcomp. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234619 > Webrev: http://cr.openjdk.java.net/~pliden/8234619/webrev.0 > > Testing: Manual runs with gc/z/TestSmallHeap.java > > /Per From stefan.karlsson at oracle.com Tue Nov 26 11:07:20 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 26 Nov 2019 12:07:20 +0100 Subject: RFR: 8234798: Build failure after atomic changes in JDK-8234563 Message-ID: Hi all, Please review this trivial patch to fix a Shenandoah on Windows build breakage.
https://cr.openjdk.java.net/~stefank/8234798/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8234798 Thanks, StefanK From thomas.schatzl at oracle.com Tue Nov 26 12:23:13 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 26 Nov 2019 13:23:13 +0100 Subject: RFR: 8234798: Build failure after atomic changes in JDK-8234563 In-Reply-To: References: Message-ID: <94a9fda3-7ce5-0693-dd9f-864cc40ec495@oracle.com> Hi, On 26.11.19 12:07, Stefan Karlsson wrote: > Hi all, > > Please review this trivial patch to fix a Shenandoah on Windows build > breakage. > > https://cr.openjdk.java.net/~stefank/8234798/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8234798 > > Thanks, > StefanK looks good to me and trivial enough. Thomas From thomas.schatzl at oracle.com Tue Nov 26 12:28:07 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 26 Nov 2019 13:28:07 +0100 Subject: RFR (M): 8227434: G1 predictions may over/underflow with high variance input In-Reply-To: <6d64cc59-a14e-48e8-d319-b3f9348e8304@oracle.com> References: <8112fc27-69a4-0249-de00-54e907ee38e4@oracle.com> <5dbdaed6-f5d9-eb9e-c708-3d67619ef5af@oracle.com> <6d64cc59-a14e-48e8-d319-b3f9348e8304@oracle.com> Message-ID: Hi Leo, On 22.11.19 12:23, Leo Korinth wrote: > On 22/11/2019 10:21, Thomas Schatzl wrote: >> Hi all, >> >> ping for second review. > > Looks good > > /Leo thanks for your review.
Thomas From thomas.schatzl at oracle.com Tue Nov 26 12:29:08 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 26 Nov 2019 13:29:08 +0100 Subject: RFR (S): 8234586: Rename survRateGroup.?pp files to g1SurvRateGroup.?pp In-Reply-To: <3f67db06-0138-0e4a-5ee2-311cca91fd60@oracle.com> References: <045ab345-0008-43d6-6917-c1d3aee1b9d7@oracle.com> <34380D46-927F-4BF5-A017-39B92CA2C60C@oracle.com> <3f67db06-0138-0e4a-5ee2-311cca91fd60@oracle.com> Message-ID: <32ef40d0-3c98-538f-679f-3e75c5685b3d@oracle.com> Hi Kim, Stefan, On 22.11.19 10:18, Stefan Johansson wrote: > > > On 2019-11-21 19:20, Kim Barrett wrote: >>> On Nov 21, 2019, at 9:09 AM, Thomas Schatzl >>> wrote: >>> >>> Hi all, >>> >>> can I have reviews for this rename of the survRateGroup* files to >>> g1SurvRateGroup* to follow the naming convention of (most) other G1 >>> specific files? >>> >>> Based on JDK-8233588. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8234586 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8234586/webrev/ >>> Testing: >>> local compilation >>> >>> Thanks, >>> Thomas >> >> Looks good, and trivial. >> > +1 thanks for your reviews. Thomas From rkennke at redhat.com Tue Nov 26 12:38:20 2019 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 26 Nov 2019 13:38:20 +0100 Subject: RFR: 8234798: Build failure after atomic changes in JDK-8234563 In-Reply-To: <94a9fda3-7ce5-0693-dd9f-864cc40ec495@oracle.com> References: <94a9fda3-7ce5-0693-dd9f-864cc40ec495@oracle.com> Message-ID: <446ec6be-3521-b7f5-cf40-9c459a4cbedb@redhat.com> Yes, looks good. Thanks! Roman > Hi, > > On 26.11.19 12:07, Stefan Karlsson wrote: >> Hi all, >> >> Please review this trivial patch to fix a Shenandoah on Windows build >> breakage. >> >> https://cr.openjdk.java.net/~stefank/8234798/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8234798 >> >> Thanks, >> StefanK > > looks good to me and trivial enough.
> > Thomas > From zgu at redhat.com Tue Nov 26 12:38:53 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 26 Nov 2019 07:38:53 -0500 Subject: RFR: 8234798: Build failure after atomic changes in JDK-8234563 In-Reply-To: References: Message-ID: <3e6edf75-f057-18d8-7604-e282fbe63a75@redhat.com> Looks good and trivial. Thanks, -Zhengyu On 11/26/19 6:07 AM, Stefan Karlsson wrote: > Hi all, > > Please review this trivial patch to fix a Shenandoah on Windows build > breakage. > > https://cr.openjdk.java.net/~stefank/8234798/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8234798 > > Thanks, > StefanK > From stefan.karlsson at oracle.com Tue Nov 26 12:49:12 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 26 Nov 2019 13:49:12 +0100 Subject: RFR: 8234798: Build failure after atomic changes in JDK-8234563 In-Reply-To: References: Message-ID: Thanks, all. I'll go ahead and push this. StefanK On 2019-11-26 12:07, Stefan Karlsson wrote: > Hi all, > > Please review this trivial patch to fix a Shenandoah on Windows build > breakage. > > https://cr.openjdk.java.net/~stefank/8234798/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8234798 > > Thanks, > StefanK From zgu at redhat.com Tue Nov 26 13:24:08 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 26 Nov 2019 08:24:08 -0500 Subject: [14] RFR 8230765: Implement nmethod barrier for x86_32 platforms In-Reply-To: <8191daa9-0dfb-99a7-8433-25e3d9378082@oracle.com> References: <932a10af-0f0e-067e-778d-fe853f01288c@redhat.com> <8191daa9-0dfb-99a7-8433-25e3d9378082@oracle.com> Message-ID: <8232be6d-1942-ab42-b72a-6f92bafd70f9@redhat.com> Hi Erik, Thanks for the review and the suggestion. On 11/26/19 5:32 AM, erik.osterlund at oracle.com wrote: > Hi Zhengyu, > > Nice to see nmethod entry barriers added for one more platform configuration. > So now you use a global instead of a thread-local on 32-bit x86, as it > doesn't have a Thread register. > That makes sense.
I wonder though if that difference has spread too far > into the runtime code. It does > sting a bit in my eyes to read the seemingly unnecessary #ifdef _LP64 > macros in BarrierSetNMethod. > > In particular, when computing the BarrierSetNMethod::disarmed_value(), > it seems like it would work > for everyone to simply read the global value, instead of doing it only > sometimes, and sometimes reading > the thread-local. > > Here is a patch with my proposed cleanup: > http://cr.openjdk.java.net/~eosterlund/8230765/webrev.02/ > > Incremental: > http://cr.openjdk.java.net/~eosterlund/8230765/webrev.01_02/ > Yes, this is indeed a much cleaner approach. I will take your proposed cleanup and run it through submit. -Zhengyu > Thanks, > /Erik > > On 11/25/19 9:35 PM, Zhengyu Gu wrote: >> Hi all, >> >> Please review this implementation of nmethod barrier for x86_32 >> platforms. >> >> x86_32 implementation mirrors x86_64's. The only difference is where >> it reads nmethod disarmed value. >> >> Unlike 64-bits, 32-bits platform does not have a dedicated register >> for current thread. So that it is cheaper to read disarmed value from >> global location than from per-thread GC data. >> >> Currently, only Shenandoah GC uses the implementation for its >> concurrent class unloading. This implementation, along with Shenandoah >> concurrent class unloading, has been baked in shenandoah/jdk repo for >> some time now, they are ready for integration. >> >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8230765 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8230765/webrev.01/ >> >> >> Test: >> hotspot_gc with x86_64 and x86_32 JVM on Linux >> Submit test.
>> >> Thanks, >> >> -Zhengyu >> > From rkennke at redhat.com Tue Nov 26 13:48:19 2019 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 26 Nov 2019 14:48:19 +0100 Subject: RFR: 8234768: Shenandoah: Streamline enqueueing runtime barriers Message-ID: Shenandoah's runtime barriers for SATB pre-date the GC interface, and have been very coarsely fitted into the new GC interfaces. It leaves room for improvements: - Rename methods to make more sense - Group enqueueing barrier methods together - Make them inlinable, including the ultimate enqueue() method - Avoid barriers on DEST_UNINITIALIZED pre-barriers - Benefit from static resolution of decorators when possible, and don't generate barriers at all in those cases As a bonus, add SATB and traversal store-value barriers to native oop stores. This is not doing anything now, but will enable concurrent roots scanning in the near future. Bug: https://bugs.openjdk.java.net/browse/JDK-8234768 Webrev: http://cr.openjdk.java.net/~rkennke/JDK-8234768/webrev.00/ Can I please get a review? Thanks, Roman From stefan.johansson at oracle.com Tue Nov 26 13:58:41 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 26 Nov 2019 14:58:41 +0100 Subject: RFR: 8141637: Parallelize single threaded heap region iteration during Pre Evacuate Collection Set In-Reply-To: References: <96d72eba-7ad9-30a0-18d2-944e6cf826be@oracle.com> <266d3b83-db50-3f56-c74c-fe4225006d51@oracle.com> <20803d82-e49d-0833-6134-ce881f5e0480@oracle.com> <7feccf49-0e79-204d-d31e-eb7064dc4cc3@oracle.com> Message-ID: <32f8239b-6d88-22ef-a305-8abdf2e33664@oracle.com> Hi all, Can I please have a second review? Re-basing on the latest had some minor conflicts due to the Atomic cleanups.
Here are new webrevs: Full: http://cr.openjdk.java.net/~sjohanss/8141637/03 Inc: http://cr.openjdk.java.net/~sjohanss/8141637/02-03 Thanks, Stefan On 2019-11-19 11:30, Stefan Johansson wrote: > On 2019-11-19 10:27, Thomas Schatzl wrote: >> Hi Stefan, >> >> On 19.11.19 10:23, Stefan Johansson wrote: >>> Hi Thomas, >>> >>> On 2019-11-18 18:59, Thomas Schatzl wrote: >>>> Hi Stefan, >>>> [...] >>>> >>>>> >>>>> Updated webrevs: >>>>> Full: http://cr.openjdk.java.net/~sjohanss/8141637/01/ >>>>> Inc: http://cr.openjdk.java.net/~sjohanss/8141637/00-01/ >>>>> >>>> >>>> looks good. >>>> >>>> It would be nice to rename the >>>> G1RemSet::prepare_for_scan_heap_roots() method to >>>> G1RemSet::exclude_from_scan() as we discussed internally in this >>>> change too. It does not seem to warrant an extra CR, unless you have >>>> more planned. >>>> >>>> I would not need a re-review for that rename, but you need to wait >>>> for another reviewer anyway.... >>> Not sure how I missed including that, fixed and here are the new >>> webrevs: >>> Full: http://cr.openjdk.java.net/~sjohanss/8141637/02 >>> Inc: http://cr.openjdk.java.net/~sjohanss/8141637/01-02 >> >> Looks good. Thanks. > Thanks Thomas, I just realized the assertion added yesterday triggers > due to free regions, so I updated the latest webrevs in place to include a > fix for that, by reverting back to an else-if statement. > > Re-running mach5 now. > > Stefan > >> >> Thomas >> From zgu at redhat.com Tue Nov 26 14:00:11 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 26 Nov 2019 09:00:11 -0500 Subject: RFR: 8234768: Shenandoah: Streamline enqueueing runtime barriers In-Reply-To: References: Message-ID: Nice cleanup and looks good to me. Thanks, -Zhengyu On 11/26/19 8:48 AM, Roman Kennke wrote: > Shenandoah's runtime barriers for SATB pre-date the GC interface, and > have been very coarsely fitted into the new GC interfaces.
It leaves room > for improvements: > - Rename methods to make more sense > - Group enqueueing barrier methods together > - Make them inlinable, including the ultimate enqueue() method > - Avoid barriers on DEST_UNINITIALIZED pre-barriers > - Benefit from static resolution of decorators when possible, and don't > generate barriers at all in those cases > > As a bonus, add SATB and traversal store-value barriers to native oop > stores. This is not doing anything now, but will enable concurrent roots > scanning in the near future. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8234768 > Webrev: > http://cr.openjdk.java.net/~rkennke/JDK-8234768/webrev.00/ > > Can I please get a review? > > Thanks, > Roman > From thomas.schatzl at oracle.com Tue Nov 26 14:24:35 2019 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 26 Nov 2019 15:24:35 +0100 Subject: RFR: 8141637: Parallelize single threaded heap region iteration during Pre Evacuate Collection Set In-Reply-To: <32f8239b-6d88-22ef-a305-8abdf2e33664@oracle.com> References: <96d72eba-7ad9-30a0-18d2-944e6cf826be@oracle.com> <266d3b83-db50-3f56-c74c-fe4225006d51@oracle.com> <20803d82-e49d-0833-6134-ce881f5e0480@oracle.com> <7feccf49-0e79-204d-d31e-eb7064dc4cc3@oracle.com> <32f8239b-6d88-22ef-a305-8abdf2e33664@oracle.com> Message-ID: Hi, On 26.11.19 14:58, Stefan Johansson wrote: > Hi all, > > Can I please have a second review. Re-basing on the latest had some > minor conflicts due to the Atomic cleanups. Here are new webrevs: > Full: http://cr.openjdk.java.net/~sjohanss/8141637/03 > Inc: http://cr.openjdk.java.net/~sjohanss/8141637/02-03 > > Thanks, > Stefan still looks good. 
Thomas From erik.osterlund at oracle.com Tue Nov 26 14:58:35 2019 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Tue, 26 Nov 2019 15:58:35 +0100 Subject: RFR: 8234543: ZGC: Parallel pre-touch In-Reply-To: References: <47dfbfe2-996c-771d-66f2-27683cb458b1@oracle.com> Message-ID: <9517db20-aa04-fc77-f74c-f87c6f323a93@oracle.com> Hi Per, Still good. Thanks, /Erik On 11/25/19 3:10 PM, Per Liden wrote: > I noticed that we didn't have a test for this on ZGC, so I added one: > > Updated webrev: http://cr.openjdk.java.net/~pliden/8234543/webrev.1 > > /Per > > On 11/21/19 10:32 AM, Per Liden wrote: >> When using -XX:+AlwaysPreTouch, ZGC is currently doing single >> threaded pre-touch. This patch makes this a parallel operation. This >> improves startup time, especially when using large heaps. For >> example, when using a 3TB heap (-XX:+AlwaysPreTouch -Xms3T -Xmx3T), >> startup time is improved by about 30x. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234543 >> Webrev: http://cr.openjdk.java.net/~pliden/8234543/webrev.0 >> >> /Per From stefan.johansson at oracle.com Tue Nov 26 15:01:53 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 26 Nov 2019 16:01:53 +0100 Subject: RFR: 8165443: Free Collection Set serial phase takes very long on large heaps Message-ID: <94565e78-149c-b8b5-ff17-752481e0e36e@oracle.com> Hi, Please review this fix to improve freeing the collection set when there are a lot of regions. Issue: https://bugs.openjdk.java.net/browse/JDK-8165443 Webrev: http://cr.openjdk.java.net/~sjohanss/8165443/00/ Summary For heaps with many regions, freeing the collection set takes a considerable amount of time, and the big bottleneck is freeing regions and creating the free list. The approach to fix the problem is to split the collection set freeing and the free list rebuild into two different phases.
This makes it easier to parallelize the free collection set phase. To work around sorting regions into the free list, we now rebuild the free list from scratch, handing chunks of the heap to different workers and then appending the results together. This change removes the simple heuristic around how many workers to use for the collection set freeing and instead just caps the number of workers at the number of regions in the collection set. I did this because I saw no improvement using a work chunk size. I also added a new event and logging for the collection set freeing that captures the whole parallel time for each worker. I left the old events and logging for how much time has been spent working on young vs old regions, but I'm not sure how useful this information is in the logs. I get that it is needed for predictions, but maybe we can skip the logs and events? The current version keeps them, but I'm very willing to remove them if others agree. Testing Manual performance testing on various benchmarks shows good results for heaps with a lot of regions and no regression for heaps with few regions. Functional tests using mach5 tier 1-5. Thanks, Stefan From sangheon.kim at oracle.com Tue Nov 26 15:41:26 2019 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Tue, 26 Nov 2019 07:41:26 -0800 Subject: RFR (M): 8233919: Incrementally calculate the occupied cards in a heap region remembered set In-Reply-To: References: <23b30b41-b109-4d05-606f-fa6a87a07897@oracle.com> <16491a20-6f2a-0d70-42c2-6c23c3a9e407@oracle.com> Message-ID: Hi Thomas, On 11/26/19 1:04 AM, Thomas Schatzl wrote: > Hi Sangheon, > > thanks for looking at this. > > On 25.11.19 22:22, sangheon.kim at oracle.com wrote: >> Hi Thomas, >> >> On 11/21/19 2:41 AM, Thomas Schatzl wrote: >>> Hi, >>> >>> On 20.11.19 11:42, Stefan Johansson wrote: >>>> Hi Thomas, >>>> >>>> On 2019-11-12 16:24, Thomas Schatzl wrote: >>>>> Hi all, > [...]>>> >>>> I would prefer: >>>> if (prt->add_reference(from)) { >>>>
num_added_by_coarsening++; >>>> } >>>> Atomic::add... >>>> >>>> If you disagree, leave it as is. >>> >>> Fixed in >>> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.0_to_1/ >>> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.1/ >> Webrev.1 looks good in general. >> >> ======================== >> g1CollectionSet.cpp >> 250 assert(old_rs_length <= new_rs_length, >> 251 "Remembered set sizes must increase (changed from " >> SIZE_FORMAT " to " SIZE_FORMAT " region %u type %s)", >> 252 old_rs_length, new_rs_length, hr->hrm_index(), >> hr->get_short_type_str()); >> - I feel 'must increase' like 'old_rs_length < new_rs_length'. If >> you don't agree leave it as is. :) >> >> ======================== >> heapRegionRemSet.cpp >> 200 Atomic::inc(&_num_occupied, memory_order_relaxed); >> - I already asked to Thomas offline. He said Atomic operation is not >> necessary in this version but it is necessary for future patch when >> the lock is removed. > > http://cr.openjdk.java.net/~tschatzl/8233919/webrev.1_to_2/ (diff) > http://cr.openjdk.java.net/~tschatzl/8233919/webrev.2/ (full) > > Fixes the comment by changing the comment text to: > > 251 "Remembered set decreased (changed from " SIZE_FORMAT " > to " SIZE_FORMAT " region %u type %s)", > Looks good. Thanks, Sangheon > Thanks, > Thomas From stefan.karlsson at oracle.com Tue Nov 26 17:24:09 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 26 Nov 2019 18:24:09 +0100 Subject: RFR: 8234822: Limit ZGC jtreg-support to Windows 2019 Server Message-ID: <1170772c-8194-7db0-81dd-4b729486c799@oracle.com> Hi all, Please review this patch that makes sure we don't try to run ZGC on older Windows versions. https://cr.openjdk.java.net/~stefank/8234822/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8234822 It causes @requires vm.gc.Z to filter out the test if run on an older Windows version.
It's a bit of a hack, since it checks against a hard-coded Windows version. It makes the implementation easy, and doesn't drag in new dependencies to sun.hotspot.gc.GC, but it's not as precise as it could be. I'd prefer to push the code in webrev.01, but if I get push-back an alternative would be to extend GCConfig and add ZGC specific checks, and add suitable code to ZGC:
+bool GCConfig::is_gc_supported_on_os(CollectedHeap::Name name) {
+  if (name != CollectedHeap::Z) {
+    return true;
+  }
+
+#if INCLUDE_ZGC
+  if (ZInitialize::is_os_supported()) {
+    return true;
+  }
+#endif
+
+  return false;
+}
and call that from GCConfig::is_gc_supported. Thanks, StefanK From per.liden at oracle.com Tue Nov 26 17:38:15 2019 From: per.liden at oracle.com (Per Liden) Date: Tue, 26 Nov 2019 18:38:15 +0100 Subject: RFR: 8234543: ZGC: Parallel pre-touch In-Reply-To: <9517db20-aa04-fc77-f74c-f87c6f323a93@oracle.com> References: <47dfbfe2-996c-771d-66f2-27683cb458b1@oracle.com> <9517db20-aa04-fc77-f74c-f87c6f323a93@oracle.com> Message-ID: Thanks Erik! /Per On 11/26/19 3:58 PM, erik.osterlund at oracle.com wrote: > Hi Per, > > Still good. > > Thanks, > /Erik > > On 11/25/19 3:10 PM, Per Liden wrote: >> I noticed that we didn't have a test for this on ZGC, so I added one: >> >> Updated webrev: http://cr.openjdk.java.net/~pliden/8234543/webrev.1 >> >> /Per >> >> On 11/21/19 10:32 AM, Per Liden wrote: >>> When using -XX:+AlwaysPreTouch, ZGC is currently doing single >>> threaded pre-touch. This patch makes this a parallel operation. This >>> improves startup time, especially when using large heaps. For >>> example, when using a 3TB heap (-XX:+AlwaysPreTouch -Xms3T -Xmx3T), >>> startup time is improved by about 30x.
>>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234543 >>> Webrev: http://cr.openjdk.java.net/~pliden/8234543/webrev.0 >>> >>> /Per > From zgu at redhat.com Tue Nov 26 18:07:07 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 26 Nov 2019 13:07:07 -0500 Subject: [14] RFR 8230765: Implement nmethod barrier for x86_32 platforms In-Reply-To: <8232be6d-1942-ab42-b72a-6f92bafd70f9@redhat.com> References: <932a10af-0f0e-067e-778d-fe853f01288c@redhat.com> <8191daa9-0dfb-99a7-8433-25e3d9378082@oracle.com> <8232be6d-1942-ab42-b72a-6f92bafd70f9@redhat.com> Message-ID: <1eb4f382-e76f-0688-df26-906b3f62c63f@redhat.com> Hi Erik, On 11/26/19 8:24 AM, Zhengyu Gu wrote: >> Here is a patch with my proposed cleanup: >> http://cr.openjdk.java.net/~eosterlund/8230765/webrev.02/ >> >> Incremental: >> http://cr.openjdk.java.net/~eosterlund/8230765/webrev.01_02/ >> > > Yes, this is indeed a much cleaner approach. I will take your proposed > cleanup and run it through submit. I took your patch. There is just one little hiccup: the compiler expects intptr_t instead of int* on x86_32; the fix is straightforward:
  __ push(tmp);
- __ movptr(tmp, bs_nm->disarmed_value_address());
+ __ movptr(tmp, (intptr_t)bs_nm->disarmed_value_address());
  Address disarmed_addr(tmp, 0);
  __ align(4);
  __ cmpl(disarmed_addr, 0);
Full webrev: http://cr.openjdk.java.net/~zgu/JDK-8230765/webrev.02/ and the patch passed submit tests. Okay to push? Thanks, -Zhengyu > > -Zhengyu > > >> Thanks, >> /Erik >> >> On 11/25/19 9:35 PM, Zhengyu Gu wrote: >>> Hi all, >>> >>> Please review this implementation of nmethod barrier for x86_32 >>> platforms. >>> >>> x86_32 implementation mirrors x86_64's. The only difference is where >>> it reads nmethod disarmed value. >>> >>> Unlike 64-bits, 32-bits platform does not have a dedicated register >>> for current thread. So that it is cheaper to read disarmed value from >>> global location than from per-thread GC data.
>>> >>> Currently, only Shenandoah GC uses the implementation for its >>> concurrent class unloading. This implementation, along with >>> Shenandoah concurrent class unloading, has been baked in >>> shenandoah/jdk repo for some time now, they are ready for integration. >>> >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8230765 >>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8230765/webrev.01/ >>> >>> >>> Test: >>> hotspot_gc with x86_64 and x86_32 JVM on Linux >>> Submit test. >>> >>> Thanks, >>> >>> -Zhengyu >>> >> From igor.ignatyev at oracle.com Tue Nov 26 18:13:33 2019 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 26 Nov 2019 10:13:33 -0800 Subject: RFR: 8234822: Limit ZGC jtreg-support to Windows 2019 Server In-Reply-To: <1170772c-8194-7db0-81dd-4b729486c799@oracle.com> References: <1170772c-8194-7db0-81dd-4b729486c799@oracle.com> Message-ID: Hi Stefan, it looks reasonable to me. could you please add a TODO comment saying that the check isn't as precise as it can be? feel free to ignore this comment, but given sun.hotspot.gc.GC is an enum, I'd suggest you override the isSupported() method in Z instead of checking 'this != Z' in supportsOSVersion(), so you will get something like this:
> diff -r 5bda975bc9ea test/lib/sun/hotspot/gc/GC.java
> --- a/test/lib/sun/hotspot/gc/GC.java Mon Nov 25 17:02:08 2019 -0800
> +++ b/test/lib/sun/hotspot/gc/GC.java Tue Nov 26 10:08:48 2019 -0800
> @@ -37,7 +37,20 @@
>      Parallel(2),
>      G1(3),
>      Epsilon(4),
> -    Z(5),
> +    Z(5) {
> +        public boolean isSelected() {
> +            if (!WB.isGCSelected(Z.name)) {
> +                return false;
> +            }
> +
> +            String osName = System.getProperty("os.name");
> +            if (!osName.startsWith("Windows")) {
> +                return true;
> +            }
> +            return osName.equals("Windows Server 2019");
> +        }
> +
> +    },
or you can define s.h.gc.GC:supportsOSVersion() which always returns true and override supportsOSVersion() in Z to check os.name.
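The second alternative (a default that always returns true, overridden only by the Z constant) could look roughly like the sketch below. It uses a hypothetical stand-in enum, not the real sun.hotspot.gc.GC: the name `Gc`, the `osName` parameter, and the set of constants are assumptions made for illustration, and the hard-coded "Windows Server 2019" comparison simply mirrors the diff above.

```java
// Hypothetical stand-in for sun.hotspot.gc.GC. Names and signatures are
// illustrative assumptions, not the actual test-library code.
enum Gc {
    Parallel(2),
    G1(3),
    Epsilon(4),
    Z(5) {
        // Constant-specific override: only Z restricts the OS version.
        @Override
        boolean supportsOsVersion(String osName) {
            if (!osName.startsWith("Windows")) {
                return true;
            }
            // Hard-coded version check, same comparison as in the diff above.
            return osName.equals("Windows Server 2019");
        }
    };

    private final int id;

    Gc(int id) {
        this.id = id;
    }

    int id() {
        return id;
    }

    // Default: a collector is assumed to work on every OS version.
    boolean supportsOsVersion(String osName) {
        return true;
    }
}
```

Taking the OS name as a parameter (instead of reading System.getProperty("os.name") inside the method) is only a choice for this sketch; it keeps the check easy to exercise without touching system properties.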
Thanks, -- Igor > On Nov 26, 2019, at 9:24 AM, Stefan Karlsson wrote: > > Hi all, > > Please review this patch that makes sure we don't try to run ZGC on older Windows versions. > > https://cr.openjdk.java.net/~stefank/8234822/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8234822 > > It causes @requires vm.gc.Z to filter out the test if run on older Windows version. > > It's a bit of a hack, since it checks against a hard-coded Windows version. It makes the implementation easy, and doesn't drag in new dependencies to sun.hotspot.gc.GC, but it's not as precise as it could be. I'd prefer to push the code in webrev.01, but if I get push-back an alternative would be to extend GCConfig and add ZGC specific checks, and add suitable code to ZGC: > > +bool GCConfig::is_gc_supported_on_os(CollectedHeap::Name name) { > + if (name != CollectedHeap::Z) { > + return true; > + } > + > +#if INCLUDE_ZGC > + if (ZInitialize::is_os_supported()) { > + return true; > + } > +#endif > + > + return false; > +} > > and call that from GCConfig::is_gc_supported. > > Thanks, > StefanK From erik.osterlund at oracle.com Tue Nov 26 18:43:00 2019 From: erik.osterlund at oracle.com (Erik Österlund) Date: Tue, 26 Nov 2019 19:43:00 +0100 Subject: [14] RFR 8230765: Implement nmethod barrier for x86_32 platforms In-Reply-To: <1eb4f382-e76f-0688-df26-906b3f62c63f@redhat.com> References: <1eb4f382-e76f-0688-df26-906b3f62c63f@redhat.com> Message-ID: <13594682-4B7F-41D9-9D4B-84419BADC408@oracle.com> Hi Zhengyu, Looks good; ship it. Thanks /Erik > On 26 Nov 2019, at 19:07, Zhengyu Gu wrote: > > Hi Erik, > > On 11/26/19 8:24 AM, Zhengyu Gu wrote: >>> Here is a patch with my proposed cleanup: >>> http://cr.openjdk.java.net/~eosterlund/8230765/webrev.02/ >>> >>> Incremental: >>> http://cr.openjdk.java.net/~eosterlund/8230765/webrev.01_02/ >>> >> Yes, this indeed a much cleaner approach. I will take your proposed cleanup and run through submit. > > I took your patch.
There is just one little hiccup: compiler expects intptr_t instead of int* on x86_32, the fix is straightforward. > > __ push(tmp); > - __ movptr(tmp, bs_nm->disarmed_value_address()); > + __ movptr(tmp, (intptr_t)bs_nm->disarmed_value_address()); > Address disarmed_addr(tmp, 0); > __ align(4); > __ cmpl(disarmed_addr, 0); > > > Full webrev: http://cr.openjdk.java.net/~zgu/JDK-8230765/webrev.02/ > > and patch passed submit tests. > > Okay to push? > > Thanks, > > -Zhengyu > > >> -Zhengyu >>> Thanks, >>> /Erik >>> >>>> On 11/25/19 9:35 PM, Zhengyu Gu wrote: >>>>> Hi all, >>>>> >>>>> Please review this implementation of nmethod barrier for x86_32 platforms. >>>>> >>>>> x86_32 implementation mirrors x86_64's. The only difference is where it reads nmethod disarmed value. >>>>> >>>>> Unlike 64-bits, 32-bits platform does not have a dedicated register for current thread. So that it is cheaper to read disarmed value from global location than from per-thread GC data. >>>>> >>>>> Currently, only Shenandoah GC uses the implementation for its concurrent class unloading. This implementation, along with Shenandoah concurrent class unloading, has been baked in shenandoah/jdk repo for some time now, they are ready for integration. >>>>> >>>>> >>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8230765 >>>>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8230765/webrev.01/ >>>>> >>>>> >>>>> Test: >>>>> hotspot_gc with x86_64 and x86_32 JVM on Linux >>>>> Submit test. >>>>> >>>>> Thanks, >>>>> >>>>> -Zhengyu >>>>> >>> > From zgu at redhat.com Tue Nov 26 20:46:15 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 26 Nov 2019 15:46:15 -0500 Subject: [14] RFR 8230765: Implement nmethod barrier for x86_32 platforms In-Reply-To: <13594682-4B7F-41D9-9D4B-84419BADC408@oracle.com> References: <1eb4f382-e76f-0688-df26-906b3f62c63f@redhat.com> <13594682-4B7F-41D9-9D4B-84419BADC408@oracle.com> Message-ID: On 11/26/19 1:43 PM, Erik Österlund wrote: > Hi Zhengyu, > > Looks good; ship it. Pushed.
Thanks, -Zhengyu > > Thanks > /Erik > >> On 26 Nov 2019, at 19:07, Zhengyu Gu wrote: >> >> Hi Erik, >> >> On 11/26/19 8:24 AM, Zhengyu Gu wrote: >>>> Here is a patch with my proposed cleanup: >>>> http://cr.openjdk.java.net/~eosterlund/8230765/webrev.02/ >>>> >>>> Incremental: >>>> http://cr.openjdk.java.net/~eosterlund/8230765/webrev.01_02/ >>>> >>> Yes, this indeed a much cleaner approach. I will take your proposed cleanup and run through submit. >> >> I took your patch. There is just one little hiccup: compiler expects intptr_t instead of int* on x86_32, the fix is straightforward. >> >> __ push(tmp); >> - __ movptr(tmp, bs_nm->disarmed_value_address()); >> + __ movptr(tmp, (intptr_t)bs_nm->disarmed_value_address()); >> Address disarmed_addr(tmp, 0); >> __ align(4); >> __ cmpl(disarmed_addr, 0); >> >> >> Full webrev: http://cr.openjdk.java.net/~zgu/JDK-8230765/webrev.02/ >> >> and patch passed submit tests. >> >> Okay to push? >> >> Thanks, >> >> -Zhengyu >> >> >>> -Zhengyu >>>> Thanks, >>>> /Erik >>>> >>>>> On 11/25/19 9:35 PM, Zhengyu Gu wrote: >>>>>> Hi all, >>>>>> >>>>>> Please review this implementation of nmethod barrier for x86_32 platforms. >>>>>> >>>>>> x86_32 implementation mirrors x86_64's. The only difference is where it reads nmethod disarmed value. >>>>>> >>>>>> Unlike 64-bits, 32-bits platform does not have a dedicated register for current thread. So that it is cheaper to read disarmed value from global location than from per-thread GC data. >>>>>> >>>>>> Currently, only Shenandoah GC uses the implementation for its concurrent class unloading. This implementation, along with Shenandoah concurrent class unloading, has been baked in shenandoah/jdk repo for some time now, they are ready for integration. >>>>>> >>>>>> >>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8230765 >>>>>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8230765/webrev.01/ >>>>>> >>>>>> >>>>>> Test: >>>>>> hotspot_gc with x86_64 and x86_32 JVM on Linux >>>>>> Submit test.
>>>>>> >>>>>> Thanks, >>>>>> >>>>>> -Zhengyu >>>>>> >>>> >> > From leihouyju at gmail.com Wed Nov 27 02:15:32 2019 From: leihouyju at gmail.com (Haoyu Li) Date: Wed, 27 Nov 2019 10:15:32 +0800 Subject: RFR: 8220465: Use shadow regions for faster ParallelGC full GCs Message-ID: Hi Thomas, Thanks for your constructive suggestions! I've addressed the issues you mentioned, and the updated patches are attached. Please find the details below. > - static const ints can be initialized in the definition (UNUSED, > SHADOW, ...); also they should be CamelCased; they are very unspecific > too - I added some prefix to distinguish them a bit. > > Now these values of _shadow_state are initialized in the definition and CamelCased. I have also changed their names to be more specific: a region is *UnusedRegion* when untouched, and will become *NormalRegion* if processed with the original parallel full GC algorithm; if an idle GC thread steals an unavailable region to process it with the help of a shadow region, the thread will mark it as *ShadowRegion*, *FilledShadow*, and *CopiedShadow* in sequence. > - the documentation about this change is imho lacking. > > - It would be nice to explain the idea of shadow regions somewhere > assuming that you know how parallel works. Including the reference to > the paper. :) > > - some of the comments just show what code (often a single statement) > does, not the what and why or the reason why a particular method or > member exists. Or explains one or the other. > > E.g. > > "The shadow region array, we use it in a LIFO fashion, so that we can > reuse shadow regions for better data locality and utilization" > > - at this point we have no idea what a "shadow region" is and we > can't find out easily because it is called "shadow record" or "steal > record" elsewhere. > > Something better could be: > > "Contains currently free shadow regions (assuming we converge on that > name). We use it in a LIFO fashion for better data locality and > utilization."
> > Thanks for pointing out the lack of illustration. I've added several paragraphs to demonstrate the main idea, the typical workflow, and the source paper of shadow region optimization in the comments of ParallelCompact::initialize_shadow_region(). Besides, I've also checked other comments in the patch and made them more precise. > - I think there is a missed optimization opportunity in (now) > PSParallelCompact::initialize_shadow_regions(). There, the code > initializes the "free" region ids to region_at_top+1 to end_region of a > particular space. > > If the top for a given space is at a region boundary (e.g. if a space is > empty, which is probably common for one of the survivor spaces), you > loose a single region per space. > > One reason might that the code uses region "0" as sentinel to indicate > "there is no shadow region available" in > ParCompactionManager::acquire_shadow_region(). > > This could be fixed by improving the code in > PSParallelCompact::initialize_shadow_regions() and use a sentinel region > value of (size_t)~0 (as an explicit constant). > > Even if you do not change this, please introduce an explicit constant > for this sentinel value. This makes the code more self-explanatory. > > Sorry for the misleading +1 operation. The +1 can be safely removed. The sentinel value 0 does not cause this design because the first region (in old space) cannot be a shadow region. > - at least in ParallelCompactData::RegionData::try_steal I would add a > dirty read of the _shadow_state to avoid the overhead of obviously > unsuccessful steal attempts (I do not know about frequencies of those, > so ymmv, but probably it would be easiest to add it everywhere). > > Also all the cmpxchg can/should use memory_order_relaxed to avoid the > two full fences every time accessed as far as I can tell. > > Excellent suggestions! 
I didn't consider the performance factors in these atomic instructions before, and you're right that GC threads may suffer many failures when they first try to get shadow regions. Changing the memory order to memory_order_relaxed is also helpful. > - not sure about whether "acquire_shadow_region()/release_shadow_region" > are good names for > "PSParallelCompact::try_pop_shadow_region/push_shadow_region" (or > something else). > > "Acquire"/"release" has a very specific semantic related to a completely > different area (memory ordering in MP systems), so we should probably > avoid using them. There are other well-used pairs of names to add and > remove elements to a container too. > > Naming functions with "Acquire" and "Release" is indeed misleading. I've changed these functions to push/pop_shadow_region, push/pop_shadow_region_mt_safe, and remove_all_shadow_regions in PSParallelCompact to fit the underlying LIFO _shadow_region_array. Do you think these names are appropriate now? > - the changes in PSParallelCompact sometimes use the terms > "steal_record", "shadow_record" and "shadow_region" (e.g. > _shadow_region_array) interchangeably. > > Can you give a reason for this? I am good with any (with a preference > for "shadow_region" since it gives an idea of the contents while > "record" is quite generic), but it makes reading the code harder than > necessary. > > Sorry for the inconsistency between variable names. I introduced steal_record to record the index of the next shadow region, so that a GC thread could seek shadow regions from the last point instead of the beginning. To make the code more specific, I've changed the variable _shadow_record to _next_shadow_region and moved the code in PSParallelCompact::initialize_shadow_record to PSParallelCompact::initialize_shadow_regions. > - the names of the new methods e.g. in PSParallelCompact::RegionData > should be more precise; e.g. please add what does "try_push" wants to > push? Or "try_steal" steal?
> Not even the comments for these contain that information, and I believe > that by better naming of the methods, we can avoid the comments > completely in most cases. > > Sorry for the vague code. These five atomic interfaces intend to change the _shadow_state of the current region to reflect the collection process, not to push or steal anything. I've changed try_push to mark_normal and try_steal to mark_shadow, respectively. The _shadow_state and the return value of these functions can help the collector to determine 1) whether a region should be collected by the shadow region optimization and 2) if the data in a shadow region are ready to be copied back to the corresponding heap region. Thanks again for your valuable reviews. If there are any further problems, please contact me at any time. I was also wondering could you please CC the following mails to me? There seem some problems with my email, and I didn't receive your last mail until I searched the mail lists in OpenJDK website. Thanks very much! Best Regards, Haoyu Li, Institute of Parallel and Distributed Systems(IPADS), School of Software, Shanghai Jiao Tong University -------------- next part -------------- A non-text attachment was scrubbed... Name: shadow-region-v5.patch Type: text/x-patch Size: 29986 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: shadow-region-incr.patch Type: text/x-patch Size: 23414 bytes Desc: not available URL: From erik.osterlund at oracle.com Wed Nov 27 07:56:25 2019 From: erik.osterlund at oracle.com (=?utf-8?Q?Erik_=C3=96sterlund?=) Date: Wed, 27 Nov 2019 08:56:25 +0100 Subject: RFR: 8234822: Limit ZGC jtreg-support to Windows 2019 Server In-Reply-To: <1170772c-8194-7db0-81dd-4b729486c799@oracle.com> References: <1170772c-8194-7db0-81dd-4b729486c799@oracle.com> Message-ID: Hi Stefan, I think your proposal is good enough. Getting a filter in place is urgent. 
Making the filter precise is not, and can easily be tweaked later on. Looks good. Thanks, /Erik > On 26 Nov 2019, at 18:25, Stefan Karlsson wrote: > > Hi all, > > Please review this patch that makes sure we don't try to run ZGC on older Windows versions. > > https://cr.openjdk.java.net/~stefank/8234822/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8234822 > > It causes @requires vm.gc.Z to filter out the test if run on older Windows version. > > It's a bit of a hack, since it checks against a hard-coded Windows version. It makes the implementation easy, and doesn't drag in new dependencies to sun.hotspot.gc.GC, but it's not as precise as it could be. I'd prefer to push the code in webrev.01, but if I get push-back an alternative would be to extend GCConfig and add ZGC specific checks, and add suitable code to ZGC: > > +bool GCConfig::is_gc_supported_on_os(CollectedHeap::Name name) { > + if (name != CollectedHeap::Z) { > + return true; > + } > + > +#if INCLUDE_ZGC > + if (ZInitialize::is_os_supported()) { > + return true; > + } > +#endif > + > + return false; > +} > > and call that from GCConfig::is_gc_supported. > > Thanks, > StefanK From leo.korinth at oracle.com Wed Nov 27 09:47:48 2019 From: leo.korinth at oracle.com (Leo Korinth) Date: Wed, 27 Nov 2019 10:47:48 +0100 Subject: RFR: 8141637: Parallelize single threaded heap region iteration during Pre Evacuate Collection Set In-Reply-To: References: <96d72eba-7ad9-30a0-18d2-944e6cf826be@oracle.com> <266d3b83-db50-3f56-c74c-fe4225006d51@oracle.com> <20803d82-e49d-0833-6134-ce881f5e0480@oracle.com> <7feccf49-0e79-204d-d31e-eb7064dc4cc3@oracle.com> <32f8239b-6d88-22ef-a305-8abdf2e33664@oracle.com> Message-ID: On 26/11/2019 15:24, Thomas Schatzl wrote: > Hi, > > On 26.11.19 14:58, Stefan Johansson wrote: >> Hi all, >> >> Can I please have a second review. Re-basing on the latest had some >> minor conflicts due to the Atomic cleanups.
Here are new webrevs: >> Full: http://cr.openjdk.java.net/~sjohanss/8141637/03 >> Inc: http://cr.openjdk.java.net/~sjohanss/8141637/02-03 >> >> Thanks, >> Stefan > > still looks good. > > Thomas Looks good to me too. Thanks, Leo From stefan.johansson at oracle.com Wed Nov 27 11:16:12 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 27 Nov 2019 12:16:12 +0100 Subject: RFR: 8141637: Parallelize single threaded heap region iteration during Pre Evacuate Collection Set In-Reply-To: References: <96d72eba-7ad9-30a0-18d2-944e6cf826be@oracle.com> <266d3b83-db50-3f56-c74c-fe4225006d51@oracle.com> <20803d82-e49d-0833-6134-ce881f5e0480@oracle.com> <7feccf49-0e79-204d-d31e-eb7064dc4cc3@oracle.com> <32f8239b-6d88-22ef-a305-8abdf2e33664@oracle.com> Message-ID: Thanks Thomas and Leo, Stefan On 2019-11-27 10:47, Leo Korinth wrote: > > > On 26/11/2019 15:24, Thomas Schatzl wrote: >> Hi, >> >> On 26.11.19 14:58, Stefan Johansson wrote: >>> Hi all, >>> >>> Can I please have a second review. Re-basing on the latest had some >>> minor conflicts due to the Atomic cleanups. Here are new webrevs: >>> Full: http://cr.openjdk.java.net/~sjohanss/8141637/03 >>> Inc: http://cr.openjdk.java.net/~sjohanss/8141637/02-03 >>> >>> Thanks, >>> Stefan >> >> still looks good. >> >> Thomas > > Looks good to me too. > > Thanks, > Leo From stefan.johansson at oracle.com Wed Nov 27 13:22:23 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Wed, 27 Nov 2019 14:22:23 +0100 Subject: RFR: 8220465: Use shadow regions for faster ParallelGC full GCs In-Reply-To: References: Message-ID: <7ac0dde2-a32d-a641-d5ed-88c9f5fd8157@oracle.com> Hi Haoyu, I've quickly looked through the changes and they look good in general, the renaming makes the code easier to follow. One small nit, the paragraph referring to your paper starts with "More more details", which I guess should be changed to "For more details".
Here are updated webrevs: Full: http://cr.openjdk.java.net/~sjohanss/8220465/04/ Inc: http://cr.openjdk.java.net/~sjohanss/8220465/03-04/ Cheers, Stefan On 2019-11-27 03:15, Haoyu Li wrote: > Hi Thomas, > > Thanks for your constructive suggestions! I've addressed the issues you > mentioned, and the updated patches are attached. Please find the details > below. > >> - static const ints can be initialized in the definition (UNUSED, >> SHADOW, ...); also they should be CamelCased; they are very unspecific >> too - I added some prefix to distinguish them a bit. >> >> Now these values of _shadow_state are initilized in the definition and > CamelCased. I have also changed their names to ba more specific: a region > is *UnusedRegion* when untouched, and will become *NormalRegion* if > processed with the original parallel full GC algorithm; if an idle GC > thread steals an unavailable region to process it with the help of a shadow > region, the thread will mark it to *ShadowRegion*, *FilledShadow*, and > *CopiedShadow* in sequence. > > - the documentation about this change is imho lacking. >> >> - It would be nice to explain the idea of shadow regions somewhere >> assuming that you know how parallel works. Including the reference to >> the paper. :) >> >> - some of the comments just show what code (often a single statement) >> does, not the what and why or the reason why a particular method or >> member exists. Or explains one or the other. >> >> E.g. >> >> "The shadow region array, we use it in a LIFO fashion, so that we can >> reuse shadow regions for better data locality and utilization" >> >> - at this point we have no idea what a "shadow region" is and we >> can't find out easily because it is called "shadow record" or "steal >> record" elsewhere. >> >> Something better could be: >> >> "Contains currently free shadow regions (assuming we converge on that >> name). We use it in a LIFO fashion for better data locality and >> utilization." 
>> >> Thanks for pointing out the lack of illustration. I've added several > paragraphs to demonstrate the main idea, the typical workflow, and the > source paper of shadow region optimization in the comments of > ParallelCompact::initialize_shadow_region(). Besides, I've also checked > other comments in the patch and made them more precise. > > >> - I think there is a missed optimization opportunity in (now) >> PSParallelCompact::initialize_shadow_regions(). There, the code >> initializes the "free" region ids to region_at_top+1 to end_region of a >> particular space. >> >> If the top for a given space is at a region boundary (e.g. if a space is >> empty, which is probably common for one of the survivor spaces), you >> loose a single region per space. >> >> One reason might that the code uses region "0" as sentinel to indicate >> "there is no shadow region available" in >> ParCompactionManager::acquire_shadow_region(). >> >> This could be fixed by improving the code in >> PSParallelCompact::initialize_shadow_regions() and use a sentinel region >> value of (size_t)~0 (as an explicit constant). >> >> Even if you do not change this, please introduce an explicit constant >> for this sentinel value. This makes the code more self-explanatory. >> >> Sorry for the misleading +1 operation. The +1 can be safely removed. The > sentinel value 0 does not cause this design because the first region (in > old space) cannot be a shadow region. > > >> - at least in ParallelCompactData::RegionData::try_steal I would add a >> dirty read of the _shadow_state to avoid the overhead of obviously >> unsuccessful steal attempts (I do not know about frequencies of those, >> so ymmv, but probably it would be easiest to add it everywhere). >> >> Also all the cmpxchg can/should use memory_order_relaxed to avoid the >> two full fences every time accessed as far as I can tell. >> >> Excellent suggestions! 
I didn't consider the performance factors in these > atomic instructions before, and you're right that GC threads may suffer > many failures the first time getting shadow regions. Changing the memory > order to memory_order_relaxed is also helpful. > > >> - not sure about whether "acquire_shadow_region()/release_shadow_region" >> are good names for >> "PSParallelCompact::try_pop_shadow_region/push_shadow_region" (or >> something else). >> >> "Acquire"/"release" has a very specific semantic related to a completely >> different area (memory ordering in MP systems), so we should probably >> avoid using them. There are other well-used pairs of names to add and >> remove elements to a container too. >> >> Naming functions with "Acuire" and "Release" is indeed misleading. I've > changed these functions to push/pop_shadow_region, > push/pop_shadow_region_mt_safe, and remove_all_shadow_regions in > PSParallelCompact to fit the underlying LIFO _shadow_region_array. Do you > think these names are appropriate now? > > >> - the changes in PSParallelCompact sometimes use the terms >> "steal_record", "shadow_record" and "shadow_region" (e.g. >> _shadow_region_array) interchangeably. >> >> Can you give a reason for this? I am good with any (with a preference >> for "shadow_region" since it gives an idea of the contents while >> "record" is quite generic), but it makes reading the code harder than >> necessary. >> >> Sorry for the inconsistency between variable names. I introduced > steal_record to record the index of the next shadow region, so that a GC > thread could seek shadow regions from the last point instead of the > beginning. To make the code more specific, I've changed the variable > _shadow_record to _next_shadow_region and moved the code in > PSParallelCompact::initialize_shadow_record to > PSParallelCompact::initialize_shadow_regions. > > >> - the names of the new methods e.g. in PSParallelCompact::RegionData >> should be more precise; e.g. 
please add what does "try_push" wants to >> push? Or "try_steal" steal? >> Not even the comments for these contain that information, and I believe >> that by better naming of the methods, we can avoid the comments >> completely in most cases. >> >> Sorry for the vague code. These five atomic interfaces intend to change > the _shadow_state of the current region to reflect the collection process, > not to push or steal anything. I've changed try_push to mark_normal and > try_steal to mark_shadow, respectively. The _shadow_state and the return > value of these functions can help the collector to determine 1) whether a > region should be collected by the shadow region optimization and 2) if the > data in a shadow region are ready to be copied back to the corresponding > heap region. > > Thanks again for your valuable reviews. If there are any further problems, > please contact me at any time. I was also wondering could you please CC the > following mails to me? There seem some problems with my email, and I didn't > receive your last mail until I searched the mail lists in OpenJDK website. > Thanks very much! > > Best Regards, > Haoyu Li, > Institute of Parallel and Distributed Systems(IPADS), > School of Software, > Shanghai Jiao Tong University > From zgu at redhat.com Wed Nov 27 14:17:26 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 27 Nov 2019 09:17:26 -0500 Subject: [14] RFR 8228720: Shenandoah: Implementation of concurrent class unloading Message-ID: <2b6697ac-68a1-a398-9030-d8517bd45c31@redhat.com> Shenandoah concurrent class unloading has been baked in shenandoah/jdk repo for quite some time, it is ready for integration into 14. The implementation is similar to ZGC's with added complexity, due to additional GC modes other than concurrent mode. E.g. degenerated GC and full GC. This patch only contains Shenandoah specific changes, the shared part has been upstreamed under JDK-8230765 to support x86_32 platforms. 
A few key points: 1) Available on x86_64 and x86_32 platforms with Shenandoah concurrent GC (not yet with traversal GC) 2) Concurrent class unloading is enabled by default with Shenandoah concurrent GC for every GC cycle. Class unloading can be disabled with -XX:-ClassUnloading/-XX:-ClassUnloadingWithConcurrentMark and frequency can be changed via experimental flag -XX:ShenandoahRefProcFrequency=n 3) For degenerated GC and full GC, class unloading falls back to STW. Bug: https://bugs.openjdk.java.net/browse/JDK-8228720 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8228720/webrev.02 Test: hotspot_gc_shenandoah (fastdebug and release) with x86_64 and x86_32 JVM on Linux. Thanks, -Zhengyu From stefan.karlsson at oracle.com Wed Nov 27 14:29:32 2019 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 27 Nov 2019 15:29:32 +0100 Subject: RFR: 8234822: Limit ZGC jtreg-support to Windows 2019 Server In-Reply-To: References: <1170772c-8194-7db0-81dd-4b729486c799@oracle.com> Message-ID: On 2019-11-26 19:13, Igor Ignatyev wrote: > Hi Stefan, > > it looks reasonable to me. could you please add a TODO comment saying that the check isn't precise as it can be? Sure. 
> > feel free to ignore this comment, but given sun.hotspot.gc.GC is a enum, I'd suggest you to override isSupported() method in Z instead of checking 'this != Z' in supportsOSVersion() so you will get smth like this: > >> diff -r 5bda975bc9ea test/lib/sun/hotspot/gc/GC.java >> --- a/test/lib/sun/hotspot/gc/GC.java Mon Nov 25 17:02:08 2019 -0800 >> +++ b/test/lib/sun/hotspot/gc/GC.java Tue Nov 26 10:08:48 2019 -0800 >> @@ -37,7 +37,20 @@ >> Parallel(2), >> G1(3), >> Epsilon(4), >> - Z(5), >> + Z(5) { >> + public boolean isSelected() { >> + if (!WB.isGCSelected(Z.name)) { >> + return false; >> + } >> + >> + String osName = System.getProperty("os.name"); >> + if (!osName.startsWith("Windows")) { >> + return true; >> + } >> + return osName.equals("Windows Server 2019"); >> + } >> + >> + }, > > or you can define s.h.gc.GC:supportsOSVersion() which always returns true and override supportsOSVersion() in Z to check os.name. > Thanks for the suggestion. If others strongly agree that this is nicer, I'll do it, otherwise I'll go with the somewhat smaller patch in the webrev. Thanks, StefanK > Thanks, > -- Igor > > >> On Nov 26, 2019, at 9:24 AM, Stefan Karlsson wrote: >> >> Hi all, >> >> Please review this patch that makes sure we don't try to run ZGC on older Windows versions. >> >> https://cr.openjdk.java.net/~stefank/8234822/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8234822 >> >> It causes @requires vm.gc.Z to filter out the test if run on older Windows version. >> >> It's a bit of a hack, since it checks against a hard-coded Windows version. It makes the implementation easy, and doesn't drag in new dependencies to sun.hotspot.gc.GC, but it's not as precise as it could be. 
I'd prefer to push the code in webrev.01, but if I get push-back an alternative would be to extend GCConfig and add ZGC specific checks, and add suitable code to ZGC: >> >> +bool GCConfig::is_gc_supported_on_os(CollectedHeap::Name name) { >> + if (name != CollectedHeap::Z) { >> + return true; >> + } >> + >> +#if INCLUDE_ZGC >> + if (ZInitialize::is_os_supported()) { >> + return true; >> + } >> +#endif >> + >> + return false; >> +} >> >> and call that from GCConfig::is_gc_supported. >> >> Thanks, >> StefanK > From rkennke at redhat.com Wed Nov 27 16:15:05 2019 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 27 Nov 2019 17:15:05 +0100 Subject: [14] RFR 8228720: Shenandoah: Implementation of concurrent class unloading In-Reply-To: <2b6697ac-68a1-a398-9030-d8517bd45c31@redhat.com> References: <2b6697ac-68a1-a398-9030-d8517bd45c31@redhat.com> Message-ID: Hi Zhengyu, this is great work! I looked over the changeset and couldn't find anything to complain about. Thumbs up! Thanks, Roman > Shenandoah concurrent class unloading has been baked in shenandoah/jdk > repo for quite some time, it is ready for integration into 14. > > The implementation is similar to ZGC's with added complexity, due to > additional GC modes other than concurrent mode. E.g. degenerated GC and > full GC. > > This patch only contains Shenandoah specific changes, the shared part > has been upstreamed under JDK-8230765 to support x86_32 platforms. > > A few key points: > > 1) Available on x86_64 and x86_32 platforms with Shenandoah concurrent > GC (not yet with traversal GC) > > 2) Concurrent class unloading is enabled by default with Shenandoah > concurrent GC for every GC cycle. Class unloading can be disabled with > -XX:-ClassUnloading/-XX:-ClassUnloadingWithConcurrentMark and frequency > can be changed via experimental flag -XX:ShenandoahRefProcFrequency=n > > 3) For degenerated GC and full GC, class unloading falls back to STW. 
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8228720 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8228720/webrev.02 > > Test: > hotspot_gc_shenandoah (fastdebug and release) with x86_64 and x86_32 > JVM on Linux. > > Thanks, > > -Zhengyu > From zgu at redhat.com Wed Nov 27 16:25:43 2019 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 27 Nov 2019 11:25:43 -0500 Subject: [14] RFR 8228720: Shenandoah: Implementation of concurrent class unloading In-Reply-To: References: <2b6697ac-68a1-a398-9030-d8517bd45c31@redhat.com> Message-ID: Thanks, Roman. -Zhengyu On 11/27/19 11:15 AM, Roman Kennke wrote: > Hi Zhengyu, > > this is great work! > > I looked over the changeset and couldn't find anything to complain > about. Thumbs up! > > Thanks, > Roman > >> Shenandoah concurrent class unloading has been baked in shenandoah/jdk >> repo for quite some time, it is ready for integration into 14. >> >> The implementation is similar to ZGC's with added complexity, due to >> additional GC modes other than concurrent mode. E.g. degenerated GC and >> full GC. >> >> This patch only contains Shenandoah specific changes, the shared part >> has been upstreamed under JDK-8230765 to support x86_32 platforms. >> >> A few key points: >> >> 1) Available on x86_64 and x86_32 platforms with Shenandoah concurrent >> GC (not yet with traversal GC) >> >> 2) Concurrent class unloading is enabled by default with Shenandoah >> concurrent GC for every GC cycle. Class unloading can be disabled with >> -XX:-ClassUnloading/-XX:-ClassUnloadingWithConcurrentMark and frequency >> can be changed via experimental flag -XX:ShenandoahRefProcFrequency=n >> >> 3) For degenerated GC and full GC, class unloading falls back to STW. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8228720 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8228720/webrev.02 >> >> Test: >> hotspot_gc_shenandoah (fastdebug and release) with x86_64 and x86_32 >> JVM on Linux.
>>
>> Thanks,
>>
>> -Zhengyu
>>
>

From kim.barrett at oracle.com Wed Nov 27 22:43:36 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 27 Nov 2019 17:43:36 -0500
Subject: RFR (M): 8231579: G1's incremental calculation of region elapsed time always uses the same age group for prediction
In-Reply-To: <3fab7724-302f-faab-40bf-b95df0de5b05@oracle.com>
References: <6bb0be27-c32c-f2c0-b21e-b03802590cf4@oracle.com> <3fab7724-302f-faab-40bf-b95df0de5b05@oracle.com>
Message-ID: <0F26F96D-1861-4956-9F64-C217D8B7CD6A@oracle.com>

> On Nov 20, 2019, at 7:49 AM, Thomas Schatzl wrote:
> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.0_to_1/ (diff)
> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.1/ (full)
>
> Thanks,
> Thomas

Looks good.

From kim.barrett at oracle.com Wed Nov 27 22:51:45 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 27 Nov 2019 17:51:45 -0500
Subject: RFR (M): 8231579: G1's incremental calculation of region elapsed time always uses the same age group for prediction
In-Reply-To: <0F26F96D-1861-4956-9F64-C217D8B7CD6A@oracle.com>
References: <6bb0be27-c32c-f2c0-b21e-b03802590cf4@oracle.com> <3fab7724-302f-faab-40bf-b95df0de5b05@oracle.com> <0F26F96D-1861-4956-9F64-C217D8B7CD6A@oracle.com>
Message-ID:

> On Nov 27, 2019, at 5:43 PM, Kim Barrett wrote:
>
>> On Nov 20, 2019, at 7:49 AM, Thomas Schatzl wrote:
>> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.0_to_1/ (diff)
>> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.1/ (full)
>>
>> Thanks,
>> Thomas
>
> Looks good.

Oops, I did have one comment:

src/hotspot/share/gc/g1/g1Policy.cpp
 136   G1YoungLengthPredictor(bool during_cm,

After the other changes to that class, the during_cm constructor
argument seems to no longer be used.

From kim.barrett at oracle.com Wed Nov 27 23:12:17 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 27 Nov 2019 18:12:17 -0500
Subject: RFR (M): 8227739: Merge cost predictions for scanning cards and log buffer entries
In-Reply-To: <6f66ee24-112d-032d-a28e-1a08850a9268@oracle.com>
References: <6f66ee24-112d-032d-a28e-1a08850a9268@oracle.com>
Message-ID: <7B21E3CD-4264-481C-B455-B240A45D1631@oracle.com>

> On Nov 20, 2019, at 10:26 AM, Thomas Schatzl wrote:
>
> http://cr.openjdk.java.net/~tschatzl/8227739/webrev.0_to_1/ (diff)
> http://cr.openjdk.java.net/~tschatzl/8227739/webrev.1/ (full)

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1Policy.cpp
 760     _analytics->report_card_merge_to_scan_ratio(merge_to_scan_ratio, this_pause_was_young_only);

misindentation.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1HotCardCache.hpp
  87   bool _cache_wrapped_around;

Although always updated to the same value, it's still written by
multiple threads, so should be "atomic" in the sense we're presently
using, e.g. declare the variable volatile, and consider using
Atomic::store for the write.
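
[Editorial sketch, not part of the original message.] The point about a flag that several threads set to the same value can be illustrated outside the JVM sources. The following is a minimal standalone sketch, assuming `std::atomic` with relaxed ordering as a stand-in for HotSpot's `volatile` plus `Atomic::store`; the class `WrapTrackingCache` and the helper `wrapped_after_concurrent_inserts` are hypothetical names and do not reproduce `G1HotCardCache`.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a fixed-size cache whose insertion index can run
// past the end. This is NOT the HotSpot G1HotCardCache implementation.
class WrapTrackingCache {
public:
  explicit WrapTrackingCache(std::size_t size)
    : _size(size), _next_index(0), _wrapped_around(false) {
    assert(size > 0);
  }

  // May be called by many threads at once.
  void insert() {
    std::size_t index = _next_index.fetch_add(1, std::memory_order_relaxed);
    if (index >= _size) {
      // Several threads can reach this line, but all of them write `true`.
      // Making the flag atomic keeps the racing writes well-defined; relaxed
      // ordering is assumed to be enough because no other data is published
      // through this flag.
      _wrapped_around.store(true, std::memory_order_relaxed);
    }
  }

  bool wrapped_around() const {
    return _wrapped_around.load(std::memory_order_relaxed);
  }

private:
  const std::size_t _size;
  std::atomic<std::size_t> _next_index;
  std::atomic<bool> _wrapped_around;
};

// Runs `num_threads` writers, each inserting `per_thread` entries, and reports
// whether the cache wrapped. The flag is read only after all writers joined.
bool wrapped_after_concurrent_inserts(std::size_t cache_size,
                                      unsigned num_threads,
                                      std::size_t per_thread) {
  WrapTrackingCache cache(cache_size);
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < num_threads; ++t) {
    workers.emplace_back([&cache, per_thread] {
      for (std::size_t i = 0; i < per_thread; ++i) {
        cache.insert();
      }
    });
  }
  for (std::thread& w : workers) {
    w.join();
  }
  return cache.wrapped_around();
}
```

With four writers inserting 100 entries each, a cache of size 8 reports that it wrapped and a cache of size 1000 does not. The flag is read only after the writers have joined, which mirrors the situation where there are no concurrent readers.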

------------------------------------------------------------------------------

From thomas.schatzl at oracle.com Thu Nov 28 11:13:58 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 28 Nov 2019 12:13:58 +0100
Subject: RFR (M): 8227739: Merge cost predictions for scanning cards and log buffer entries
In-Reply-To: <7B21E3CD-4264-481C-B455-B240A45D1631@oracle.com>
References: <6f66ee24-112d-032d-a28e-1a08850a9268@oracle.com> <7B21E3CD-4264-481C-B455-B240A45D1631@oracle.com>
Message-ID: <221df4ba221a56e2384135321361ca1bf90866a8.camel@oracle.com>

Hi,

On Wed, 2019-11-27 at 18:12 -0500, Kim Barrett wrote:
> > On Nov 20, 2019, at 10:26 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> >
> > http://cr.openjdk.java.net/~tschatzl/8227739/webrev.0_to_1/ (diff)
> > http://cr.openjdk.java.net/~tschatzl/8227739/webrev.1/ (full)
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1Policy.cpp
>  760     _analytics->report_card_merge_to_scan_ratio(merge_to_scan_ratio, this_pause_was_young_only);
>
> misindentation.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1HotCardCache.hpp
>   87   bool _cache_wrapped_around;
>
> Although always updated to the same value, it's still written by
> multiple threads, so should be "atomic" in the sense we're presently
> using, e.g. declare the variable volatile, and consider using
> Atomic::store for the write.
>
> ------------------------------------------------------------------------------

Fixed in:

http://cr.openjdk.java.net/~tschatzl/8227739/webrev.1_to_2/ (diff)
http://cr.openjdk.java.net/~tschatzl/8227739/webrev.2/ (full)

Thanks,
  Thomas

>

From leihouyju at gmail.com Thu Nov 28 13:27:30 2019
From: leihouyju at gmail.com (Haoyu Li)
Date: Thu, 28 Nov 2019 21:27:30 +0800
Subject: RFR: 8220465: Use shadow regions for faster ParallelGC full GCs
In-Reply-To: <7ac0dde2-a32d-a641-d5ed-88c9f5fd8157@oracle.com>
References: <7ac0dde2-a32d-a641-d5ed-88c9f5fd8157@oracle.com>
Message-ID:

Hi Stefan,

Thanks for your review. I've checked the comments again and updated some more comments to make it more precise. Please find the attached patches.

Best Regards,
Haoyu Li


Stefan Johansson wrote on Wed, Nov 27, 2019 at 9:23 PM:

> Hi Haoyu,
>
> I've quickly looked through the changes and they look good in general,
> the renaming makes the code easier to follow.
>
> One small nit, the paragraph referring to your paper starts with "More
> more details", which I guess should be changed to "For more details".
>
> Here are updated webrevs:
> Full: http://cr.openjdk.java.net/~sjohanss/8220465/04/
> Inc: http://cr.openjdk.java.net/~sjohanss/8220465/03-04/
>
> Cheers,
> Stefan
>
> On 2019-11-27 03:15, Haoyu Li wrote:
> > Hi Thomas,
> >
> > Thanks for your constructive suggestions! I've addressed the issues you
> > mentioned, and the updated patches are attached. Please find the details
> > below.
> >
> >> - static const ints can be initialized in the definition (UNUSED,
> >> SHADOW, ...); also they should be CamelCased; they are very unspecific
> >> too - I added some prefix to distinguish them a bit.
> >>
> >> Now these values of _shadow_state are initialized in the definition and
> > CamelCased. I have also changed their names to be more specific: a region
> > is *UnusedRegion* when untouched, and will become *NormalRegion* if
> > processed with the original parallel full GC algorithm; if an idle GC
> > thread steals an unavailable region to process it with the help of a shadow
> > region, the thread will mark it to *ShadowRegion*, *FilledShadow*, and
> > *CopiedShadow* in sequence.
> >
> > - the documentation about this change is imho lacking.
> >>
> >> - It would be nice to explain the idea of shadow regions somewhere
> >> assuming that you know how parallel works. Including the reference to
> >> the paper. :)
> >>
> >> - some of the comments just show what code (often a single statement)
> >> does, not the what and why or the reason why a particular method or
> >> member exists. Or explains one or the other.
> >>
> >> E.g.
> >>
> >> "The shadow region array, we use it in a LIFO fashion, so that we can
> >> reuse shadow regions for better data locality and utilization"
> >>
> >> - at this point we have no idea what a "shadow region" is and we
> >> can't find out easily because it is called "shadow record" or "steal
> >> record" elsewhere.
> >>
> >> Something better could be:
> >>
> >> "Contains currently free shadow regions (assuming we converge on that
> >> name). We use it in a LIFO fashion for better data locality and
> >> utilization."
> >>
> >> Thanks for pointing out the lack of illustration. I've added several
> > paragraphs to demonstrate the main idea, the typical workflow, and the
> > source paper of shadow region optimization in the comments of
> > ParallelCompact::initialize_shadow_region(). Besides, I've also checked
> > other comments in the patch and made them more precise.
> >
> >
> >> - I think there is a missed optimization opportunity in (now)
> >> PSParallelCompact::initialize_shadow_regions(). There, the code
> >> initializes the "free" region ids to region_at_top+1 to end_region of a
> >> particular space.
> >>
> >> If the top for a given space is at a region boundary (e.g. if a space is
> >> empty, which is probably common for one of the survivor spaces), you
> >> lose a single region per space.
> >>
> >> One reason might be that the code uses region "0" as sentinel to indicate
> >> "there is no shadow region available" in
> >> ParCompactionManager::acquire_shadow_region().
> >>
> >> This could be fixed by improving the code in
> >> PSParallelCompact::initialize_shadow_regions() and use a sentinel region
> >> value of (size_t)~0 (as an explicit constant).
> >>
> >> Even if you do not change this, please introduce an explicit constant
> >> for this sentinel value. This makes the code more self-explanatory.
> >>
> >> Sorry for the misleading +1 operation. The +1 can be safely removed. The
> > sentinel value 0 does not cause this design because the first region (in
> > old space) cannot be a shadow region.
> >
> >
> >> - at least in ParallelCompactData::RegionData::try_steal I would add a
> >> dirty read of the _shadow_state to avoid the overhead of obviously
> >> unsuccessful steal attempts (I do not know about frequencies of those,
> >> so ymmv, but probably it would be easiest to add it everywhere).
> >>
> >> Also all the cmpxchg can/should use memory_order_relaxed to avoid the
> >> two full fences every time accessed as far as I can tell.
> >>
> >> Excellent suggestions! I didn't consider the performance factors in these
> > atomic instructions before, and you're right that GC threads may suffer
> > many failures the first time getting shadow regions. Changing the memory
> > order to memory_order_relaxed is also helpful.
> >
> >
> >> - not sure about whether "acquire_shadow_region()/release_shadow_region"
> >> are good names for
> >> "PSParallelCompact::try_pop_shadow_region/push_shadow_region" (or
> >> something else).
> >>
> >> "Acquire"/"release" has a very specific semantic related to a completely
> >> different area (memory ordering in MP systems), so we should probably
> >> avoid using them. There are other well-used pairs of names to add and
> >> remove elements to a container too.
> >>
> >> Naming functions with "Acquire" and "Release" is indeed misleading. I've
> > changed these functions to push/pop_shadow_region,
> > push/pop_shadow_region_mt_safe, and remove_all_shadow_regions in
> > PSParallelCompact to fit the underlying LIFO _shadow_region_array. Do you
> > think these names are appropriate now?
> >
> >
> >> - the changes in PSParallelCompact sometimes use the terms
> >> "steal_record", "shadow_record" and "shadow_region" (e.g.
> >> _shadow_region_array) interchangeably.
> >>
> >> Can you give a reason for this? I am good with any (with a preference
> >> for "shadow_region" since it gives an idea of the contents while
> >> "record" is quite generic), but it makes reading the code harder than
> >> necessary.
> >>
> >> Sorry for the inconsistency between variable names. I introduced
> > steal_record to record the index of the next shadow region, so that a GC
> > thread could seek shadow regions from the last point instead of the
> > beginning. To make the code more specific, I've changed the variable
> > _shadow_record to _next_shadow_region and moved the code in
> > PSParallelCompact::initialize_shadow_record to
> > PSParallelCompact::initialize_shadow_regions.
> >
> >
> >> - the names of the new methods e.g. in PSParallelCompact::RegionData
> >> should be more precise; e.g. please add what does "try_push" wants to
> >> push? Or "try_steal" steal?
> >> Not even the comments for these contain that information, and I believe
> >> that by better naming of the methods, we can avoid the comments
> >> completely in most cases.
> >>
> >> Sorry for the vague code. These five atomic interfaces intend to change
> > the _shadow_state of the current region to reflect the collection process,
> > not to push or steal anything. I've changed try_push to mark_normal and
> > try_steal to mark_shadow, respectively. The _shadow_state and the return
> > value of these functions can help the collector to determine 1) whether a
> > region should be collected by the shadow region optimization and 2) if the
> > data in a shadow region are ready to be copied back to the corresponding
> > heap region.
> >
> > Thanks again for your valuable reviews. If there are any further problems,
> > please contact me at any time. I was also wondering could you please CC the
> > following mails to me? There seem to be some problems with my email, and I didn't
> > receive your last mail until I searched the mail lists in OpenJDK website.
> > Thanks very much!
> >
> > Best Regards,
> > Haoyu Li,
> > Institute of Parallel and Distributed Systems(IPADS),
> > School of Software,
> > Shanghai Jiao Tong University
> >
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shadow-region-incr.patch
Type: text/x-patch
Size: 3509 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shadow-region-v6.patch
Type: text/x-patch
Size: 30353 bytes
Desc: not available
URL:

From stefan.karlsson at oracle.com Thu Nov 28 13:57:50 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Thu, 28 Nov 2019 14:57:50 +0100
Subject: RFR: 8234543: ZGC: Parallel pre-touch
In-Reply-To: <47dfbfe2-996c-771d-66f2-27683cb458b1@oracle.com>
References: <47dfbfe2-996c-771d-66f2-27683cb458b1@oracle.com>
Message-ID: <94f76eab-09e6-f618-b50b-8185d2c2829c@oracle.com>

Looks good.

StefanK

On 2019-11-21 10:32, Per Liden wrote:
> When using -XX:+AlwaysPreTouch, ZGC is currently doing single threaded
> pre-touch. This patch makes this a parallel operation. This improves
> startup time, especially when using large heaps. For example, when using
> a 3TB heap (-XX:+AlwaysPreTouch -Xms3T -Xmx3T), startup time is improved
> by about 30x.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8234543
> Webrev: http://cr.openjdk.java.net/~pliden/8234543/webrev.0
>
> /Per

From per.liden at oracle.com Thu Nov 28 14:00:50 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 28 Nov 2019 15:00:50 +0100
Subject: RFR: 8234543: ZGC: Parallel pre-touch
In-Reply-To: <94f76eab-09e6-f618-b50b-8185d2c2829c@oracle.com>
References: <47dfbfe2-996c-771d-66f2-27683cb458b1@oracle.com> <94f76eab-09e6-f618-b50b-8185d2c2829c@oracle.com>
Message-ID:

Thanks Stefan!

/Per

On 11/28/19 2:57 PM, Stefan Karlsson wrote:
> Looks good.
>
> StefanK
>
> On 2019-11-21 10:32, Per Liden wrote:
>> When using -XX:+AlwaysPreTouch, ZGC is currently doing single threaded
>> pre-touch. This patch makes this a parallel operation. This improves
>> startup time, especially when using large heaps. For example, when
>> using a 3TB heap (-XX:+AlwaysPreTouch -Xms3T -Xmx3T), startup time is
>> improved by about 30x.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234543
>> Webrev: http://cr.openjdk.java.net/~pliden/8234543/webrev.0
>>
>> /Per

From thomas.schatzl at oracle.com Thu Nov 28 17:39:32 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 28 Nov 2019 18:39:32 +0100
Subject: RFR (M): 8231579: G1's incremental calculation of region elapsed time always uses the same age group for prediction
In-Reply-To:
References: <6bb0be27-c32c-f2c0-b21e-b03802590cf4@oracle.com> <3fab7724-302f-faab-40bf-b95df0de5b05@oracle.com> <0F26F96D-1861-4956-9F64-C217D8B7CD6A@oracle.com>
Message-ID: <30950ef6-6d42-ac47-30e1-8e0d9a4edf61@oracle.com>

Hi Kim,

  thanks for your review.

On 27.11.19 23:51, Kim Barrett wrote:
>> On Nov 27, 2019, at 5:43 PM, Kim Barrett wrote:
>>
>>> On Nov 20, 2019, at 7:49 AM, Thomas Schatzl wrote:
>>> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.0_to_1/ (diff)
>>> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.1/ (full)
>>>
>>> Thanks,
>>> Thomas
>>
>> Looks good.
>
> Oops, I did have one comment:
>
> src/hotspot/share/gc/g1/g1Policy.cpp
>  136   G1YoungLengthPredictor(bool during_cm,
>
> After the other changes to that class, the during_cm constructor
> argument seems to no longer be used.
>

Fixed in:

http://cr.openjdk.java.net/~tschatzl/8231579/webrev.1_to_2/ (diff)
http://cr.openjdk.java.net/~tschatzl/8231579/webrev.2/ (full)

Thanks,
  Thomas

From kim.barrett at oracle.com Thu Nov 28 19:22:36 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 28 Nov 2019 14:22:36 -0500
Subject: RFR (M): 8227739: Merge cost predictions for scanning cards and log buffer entries
In-Reply-To: <221df4ba221a56e2384135321361ca1bf90866a8.camel@oracle.com>
References: <6f66ee24-112d-032d-a28e-1a08850a9268@oracle.com> <7B21E3CD-4264-481C-B455-B240A45D1631@oracle.com> <221df4ba221a56e2384135321361ca1bf90866a8.camel@oracle.com>
Message-ID: <33D631EA-8734-4FDC-82E6-8BA395EA85FC@oracle.com>

> On Nov 28, 2019, at 6:13 AM, Thomas Schatzl wrote:
>
> http://cr.openjdk.java.net/~tschatzl/8227739/webrev.1_to_2/ (diff)
> http://cr.openjdk.java.net/~tschatzl/8227739/webrev.2/ (full)
>
> Thanks,
> Thomas

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1HotCardCache.cpp
  74     // This does not need an atomic update. Racing threads may at most write the
  75     // same value.
  76     if (index == _hot_cache_size) {
  77       Atomic::store(&_cache_wrapped_around, true);

That comment seems like it has some wording problems. It's also about
the store, so perhaps should be between lines 76 and 77.
Maybe reword something like

  Can use relaxed store because all racing threads are writing the same
  value and there aren't any concurrent readers.

------------------------------------------------------------------------------

Looks good. I don't need a new webrev for futzing with the comment.

From kim.barrett at oracle.com Thu Nov 28 19:25:10 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 28 Nov 2019 14:25:10 -0500
Subject: RFR (M): 8231579: G1's incremental calculation of region elapsed time always uses the same age group for prediction
In-Reply-To: <30950ef6-6d42-ac47-30e1-8e0d9a4edf61@oracle.com>
References: <6bb0be27-c32c-f2c0-b21e-b03802590cf4@oracle.com> <3fab7724-302f-faab-40bf-b95df0de5b05@oracle.com> <0F26F96D-1861-4956-9F64-C217D8B7CD6A@oracle.com> <30950ef6-6d42-ac47-30e1-8e0d9a4edf61@oracle.com>
Message-ID: <740CF987-D657-48F3-BDBC-414D338F72D6@oracle.com>

> On Nov 28, 2019, at 12:39 PM, Thomas Schatzl wrote:
>
> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.1_to_2/ (diff)
> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.2/ (full)
>
> Thanks,
> Thomas

Looks good.

From stefan.johansson at oracle.com Fri Nov 29 08:29:03 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Fri, 29 Nov 2019 09:29:03 +0100
Subject: RFR (M): 8231579: G1's incremental calculation of region elapsed time always uses the same age group for prediction
In-Reply-To: <30950ef6-6d42-ac47-30e1-8e0d9a4edf61@oracle.com>
References: <6bb0be27-c32c-f2c0-b21e-b03802590cf4@oracle.com> <3fab7724-302f-faab-40bf-b95df0de5b05@oracle.com> <0F26F96D-1861-4956-9F64-C217D8B7CD6A@oracle.com> <30950ef6-6d42-ac47-30e1-8e0d9a4edf61@oracle.com>
Message-ID: <3A2FE376-E3F3-4825-8C42-8313C5D09C57@oracle.com>

Hi Thomas,

> On 28 Nov 2019, at 18:39, Thomas Schatzl wrote:
>
> Hi Kim,
>
> thanks for your review.
>
> On 27.11.19 23:51, Kim Barrett wrote:
>>> On Nov 27, 2019, at 5:43 PM, Kim Barrett wrote:
>>>
>>>> On Nov 20, 2019, at 7:49 AM, Thomas Schatzl wrote:
>>>> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.0_to_1/ (diff)
>>>> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.1/ (full)
>>>>
>>>> Thanks,
>>>> Thomas
>>>
>>> Looks good.
>> Oops, I did have one comment:
>> src/hotspot/share/gc/g1/g1Policy.cpp
>>  136   G1YoungLengthPredictor(bool during_cm,
>> After the other changes to that class, the during_cm constructor
>> argument seems to no longer be used.
>
> Fixed in:
>
> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.1_to_2/ (diff)
> http://cr.openjdk.java.net/~tschatzl/8231579/webrev.2/ (full)

Still good,
Stefan

>
> Thanks,
> Thomas

From stefan.johansson at oracle.com Fri Nov 29 08:31:30 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Fri, 29 Nov 2019 09:31:30 +0100
Subject: RFR (M): 8227739: Merge cost predictions for scanning cards and log buffer entries
In-Reply-To: <33D631EA-8734-4FDC-82E6-8BA395EA85FC@oracle.com>
References: <6f66ee24-112d-032d-a28e-1a08850a9268@oracle.com> <7B21E3CD-4264-481C-B455-B240A45D1631@oracle.com> <221df4ba221a56e2384135321361ca1bf90866a8.camel@oracle.com> <33D631EA-8734-4FDC-82E6-8BA395EA85FC@oracle.com>
Message-ID: <1D5F7A44-C248-419A-9B40-B497B946ECE2@oracle.com>

> On 28 Nov 2019, at 20:22, Kim Barrett wrote:
>
>> On Nov 28, 2019, at 6:13 AM, Thomas Schatzl wrote:
>>
>> http://cr.openjdk.java.net/~tschatzl/8227739/webrev.1_to_2/ (diff)
>> http://cr.openjdk.java.net/~tschatzl/8227739/webrev.2/ (full)
>>
>> Thanks,
>> Thomas
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1HotCardCache.cpp
>   74     // This does not need an atomic update. Racing threads may at most write the
>   75     // same value.
>   76     if (index == _hot_cache_size) {
>   77       Atomic::store(&_cache_wrapped_around, true);
>
> That comment seems like it has some wording problems. It's also about
> the store, so perhaps should be between lines 76 and 77. Maybe reword
> something like
>
>   Can use relaxed store because all racing threads are writing the same
>   value and there aren't any concurrent readers.
> ------------------------------------------------------------------------------
>
> Looks good. I don't need a new webrev for futzing with the comment.
>

Looks good to me too and no need for further webrevs.

Stefan

From thomas.schatzl at oracle.com Fri Nov 29 08:56:46 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 29 Nov 2019 09:56:46 +0100
Subject: RFR (M): 8227739: Merge cost predictions for scanning cards and log buffer entries
In-Reply-To: <1D5F7A44-C248-419A-9B40-B497B946ECE2@oracle.com>
References: <6f66ee24-112d-032d-a28e-1a08850a9268@oracle.com> <7B21E3CD-4264-481C-B455-B240A45D1631@oracle.com> <221df4ba221a56e2384135321361ca1bf90866a8.camel@oracle.com> <33D631EA-8734-4FDC-82E6-8BA395EA85FC@oracle.com> <1D5F7A44-C248-419A-9B40-B497B946ECE2@oracle.com>
Message-ID: <278d00c8-42d1-910f-6954-a0da17a92f10@oracle.com>

Hi Kim, Stefan,

On 29.11.19 09:31, Stefan Johansson wrote:
>
>
>> On 28 Nov 2019, at 20:22, Kim Barrett wrote:
>>
>>> On Nov 28, 2019, at 6:13 AM, Thomas Schatzl wrote:
>>>
>>> http://cr.openjdk.java.net/~tschatzl/8227739/webrev.1_to_2/ (diff)
>>> http://cr.openjdk.java.net/~tschatzl/8227739/webrev.2/ (full)
[...]
>>
>> That comment seems like it has some wording problems. It's also about
>> the store, so perhaps should be between lines 76 and 77. Maybe reword
>> something like
>>
>>   Can use relaxed store because all racing threads are writing the same
>>   value and there aren't any concurrent readers.
>> ------------------------------------------------------------------------------
>>
>> Looks good. I don't need a new webrev for futzing with the comment.
>>
>
> Looks good to me too and no need for further webrevs.
>
> Stefan
>

thanks for your reviews.

  Thomas

From thomas.schatzl at oracle.com Fri Nov 29 09:25:32 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 29 Nov 2019 10:25:32 +0100
Subject: RFR (M): 8233919: Incrementally calculate the occupied cards in a heap region remembered set
In-Reply-To:
References: <23b30b41-b109-4d05-606f-fa6a87a07897@oracle.com> <16491a20-6f2a-0d70-42c2-6c23c3a9e407@oracle.com>
Message-ID: <83dc575a-a994-a6e8-22d4-4dc621f29eac@oracle.com>

Hi all,

  thanks for your reviews - however I unfortunately found a small bug in
the current implementation that I would like to have fixed.

In particular, when adding sparse cards the code always incremented
num_occupied when SparsePRT::add_card returned true. However it also
returns true if the card is a duplicate, i.e. has already been found,
which means that we should not increase the count. The actual difference
is rather small as most of the duplicates are filtered out by the
FromCardCache, but it occurs.

The following patch fixes this. The change looks big, but is basically
reshuffling code to accommodate the use of SparsePRTEntry::AddCardResult
in SparsePRT::add_card.

http://cr.openjdk.java.net/~tschatzl/8233919/webrev.2_to_3/ (diff)
http://cr.openjdk.java.net/~tschatzl/8233919/webrev.3/ (full)

Thanks, and sorry for the inconvenience,
  Thomas

On 26.11.19 16:41, sangheon.kim at oracle.com wrote:
> Hi Thomas,
>
> On 11/26/19 1:04 AM, Thomas Schatzl wrote:
>> Hi Sangheon,
>>
>>   thanks for looking at this.
>>
>> On 25.11.19 22:22, sangheon.kim at oracle.com wrote:
>>> Hi Thomas,
>>>
>>> On 11/21/19 2:41 AM, Thomas Schatzl wrote:
>>>> Hi,
>>>>
>>>> On 20.11.19 11:42, Stefan Johansson wrote:
>>>>> Hi Thomas,
>>>>>
>>>>> On 2019-11-12 16:24, Thomas Schatzl wrote:
>>>>>> Hi all,
>> [...]
>>>
>>>>> I would prefer:
>>>>> if (prt->add_reference(from)) {
>>>>>   num_added_by_coarsening++;
>>>>> }
>>>>> Atomic::add...
>>>>>
>>>>> If you disagree, leave it as is.
>>>> >>>> Fixed in >>>> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.0_to_1/ >>>> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.1/ >>> Webrev.1 looks good in general. >>> >>> ======================== >>> g1CollectionSet.cpp >>> ??250?? assert(old_rs_length <= new_rs_length, >>> ??251????????? "Remembered set sizes must increase (changed from " >>> SIZE_FORMAT " to " SIZE_FORMAT " region %u type %s)", >>> ??252????????? old_rs_length, new_rs_length, hr->hrm_index(), >>> hr->get_short_type_str()); >>> ??- I feel 'must increase' like 'old_rs_length < new_rs_length'. If >>> you don't agree leave it as is. :) >>> >>> ======================== >>> heapRegionRemSet.cpp >>> ??200???????? Atomic::inc(&_num_occupied, memory_order_relaxed); >>> - I already asked to Thomas offline. He said Atomic operation is not >>> necessary in this version but it is necessary for future patch when >>> the lock is removed. >> >> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.1_to_2/ (diff) >> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.2/ (full) >> >> Fixes the comment by changing the comment text to: >> >> ?251????????? "Remembered set decreased (changed from " SIZE_FORMAT " >> to " SIZE_FORMAT " region %u type %s)", >> > Looks good. > > Thanks, > Sangheon > > >> Thanks, >> ? Thomas > From stefan.johansson at oracle.com Fri Nov 29 09:32:01 2019 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 29 Nov 2019 01:32:01 -0800 (PST) Subject: RFR: 8220465: Use shadow regions for faster ParallelGC full GCs In-Reply-To: References: <7ac0dde2-a32d-a641-d5ed-88c9f5fd8157@oracle.com> Message-ID: <88ECD3FD-485D-4633-B4EE-DF10BC5F3AFB@oracle.com> Hi Haoyu, Looks good, here are the updated webrevs: Full: http://cr.openjdk.java.net/~sjohanss/8220465/05/ Inc: http://cr.openjdk.java.net/~sjohanss/8220465/04-05/ Thanks, Stefan > 28 nov. 2019 kl. 14:27 skrev Haoyu Li : > > Hi Stefan, > > Thanks for your reviewing. 
I've checked the comments again and updated some more comments to make it more precise. Please find the attached patches. > > Best Regards, > Haoyu Li > > > Stefan Johansson ?2019?11?27??? ??9:23??? > Hi Haoyu, > > I've quickly looked through the changes and they look good in general, > the renaming makes the code easier to follow. > > One small knit, the paragraph referring your paper starts with "More > more details", which I guess should be changed to "For more detials". > > Here are updated webrevs: > Full: http://cr.openjdk.java.net/~sjohanss/8220465/04/ > Inc: http://cr.openjdk.java.net/~sjohanss/8220465/03-04/ > > Cheers, > Stefan > > On 2019-11-27 03:15, Haoyu Li wrote: > > Hi Thomas, > > > > Thanks for your constructive suggestions! I've addressed the issues you > > mentioned, and the updated patches are attached. Please find the details > > below. > > > >> - static const ints can be initialized in the definition (UNUSED, > >> SHADOW, ...); also they should be CamelCased; they are very unspecific > >> too - I added some prefix to distinguish them a bit. > >> > >> Now these values of _shadow_state are initilized in the definition and > > CamelCased. I have also changed their names to ba more specific: a region > > is *UnusedRegion* when untouched, and will become *NormalRegion* if > > processed with the original parallel full GC algorithm; if an idle GC > > thread steals an unavailable region to process it with the help of a shadow > > region, the thread will mark it to *ShadowRegion*, *FilledShadow*, and > > *CopiedShadow* in sequence. > > > > - the documentation about this change is imho lacking. > >> > >> - It would be nice to explain the idea of shadow regions somewhere > >> assuming that you know how parallel works. Including the reference to > >> the paper. :) > >> > >> - some of the comments just show what code (often a single statement) > >> does, not the what and why or the reason why a particular method or > >> member exists. 
Or explains one or the other. > >> > >> E.g. > >> > >> "The shadow region array, we use it in a LIFO fashion, so that we can > >> reuse shadow regions for better data locality and utilization" > >> > >> - at this point we have no idea what a "shadow region" is and we > >> can't find out easily because it is called "shadow record" or "steal > >> record" elsewhere. > >> > >> Something better could be: > >> > >> "Contains currently free shadow regions (assuming we converge on that > >> name). We use it in a LIFO fashion for better data locality and > >> utilization." > >> > >> Thanks for pointing out the lack of illustration. I've added several > > paragraphs to demonstrate the main idea, the typical workflow, and the > > source paper of shadow region optimization in the comments of > > ParallelCompact::initialize_shadow_region(). Besides, I've also checked > > other comments in the patch and made them more precise. > > > > > >> - I think there is a missed optimization opportunity in (now) > >> PSParallelCompact::initialize_shadow_regions(). There, the code > >> initializes the "free" region ids to region_at_top+1 to end_region of a > >> particular space. > >> > >> If the top for a given space is at a region boundary (e.g. if a space is > >> empty, which is probably common for one of the survivor spaces), you > >> loose a single region per space. > >> > >> One reason might that the code uses region "0" as sentinel to indicate > >> "there is no shadow region available" in > >> ParCompactionManager::acquire_shadow_region(). > >> > >> This could be fixed by improving the code in > >> PSParallelCompact::initialize_shadow_regions() and use a sentinel region > >> value of (size_t)~0 (as an explicit constant). > >> > >> Even if you do not change this, please introduce an explicit constant > >> for this sentinel value. This makes the code more self-explanatory. > >> > >> Sorry for the misleading +1 operation. The +1 can be safely removed. 
The > > sentinel value 0 does not cause this design because the first region (in > > old space) cannot be a shadow region. > > > > > >> - at least in ParallelCompactData::RegionData::try_steal I would add a > >> dirty read of the _shadow_state to avoid the overhead of obviously > >> unsuccessful steal attempts (I do not know about frequencies of those, > >> so ymmv, but probably it would be easiest to add it everywhere). > >> > >> Also all the cmpxchg can/should use memory_order_relaxed to avoid the > >> two full fences every time accessed as far as I can tell. > >> > >> Excellent suggestions! I didn't consider the performance factors in these > > atomic instructions before, and you're right that GC threads may suffer > > many failures the first time getting shadow regions. Changing the memory > > order to memory_order_relaxed is also helpful. > > > > > >> - not sure about whether "acquire_shadow_region()/release_shadow_region" > >> are good names for > >> "PSParallelCompact::try_pop_shadow_region/push_shadow_region" (or > >> something else). > >> > >> "Acquire"/"release" has a very specific semantic related to a completely > >> different area (memory ordering in MP systems), so we should probably > >> avoid using them. There are other well-used pairs of names to add and > >> remove elements to a container too. > >> > >> Naming functions with "Acuire" and "Release" is indeed misleading. I've > > changed these functions to push/pop_shadow_region, > > push/pop_shadow_region_mt_safe, and remove_all_shadow_regions in > > PSParallelCompact to fit the underlying LIFO _shadow_region_array. Do you > > think these names are appropriate now? > > > > > >> - the changes in PSParallelCompact sometimes use the terms > >> "steal_record", "shadow_record" and "shadow_region" (e.g. > >> _shadow_region_array) interchangeably. > >> > >> Can you give a reason for this? 
I am good with any (with a preference
> >> for "shadow_region" since it gives an idea of the contents while
> >> "record" is quite generic), but it makes reading the code harder than
> >> necessary.
> >>
> >> Sorry for the inconsistency between variable names. I introduced
> > steal_record to record the index of the next shadow region, so that a GC
> > thread could seek shadow regions from the last point instead of the
> > beginning. To make the code more specific, I've changed the variable
> > _shadow_record to _next_shadow_region and moved the code in
> > PSParallelCompact::initialize_shadow_record to
> > PSParallelCompact::initialize_shadow_regions.
> >
> >
> >> - the names of the new methods e.g. in PSParallelCompact::RegionData
> >> should be more precise; e.g. what does "try_push" want to
> >> push? Or "try_steal" steal?
> >> Not even the comments for these contain that information, and I believe
> >> that by better naming of the methods, we can avoid the comments
> >> completely in most cases.
> >>
> >> Sorry for the vague code. These five atomic interfaces intend to change
> > the _shadow_state of the current region to reflect the collection process,
> > not to push or steal anything. I've changed try_push to mark_normal and
> > try_steal to mark_shadow, respectively. The _shadow_state and the return
> > value of these functions can help the collector to determine 1) whether a
> > region should be collected by the shadow region optimization and 2) if the
> > data in a shadow region are ready to be copied back to the corresponding
> > heap region.
> >
> > Thanks again for your valuable reviews. If there are any further problems,
> > please contact me at any time. I was also wondering whether you could CC the
> > following mails to me. There seem to be some problems with my email, and I didn't
> > receive your last mail until I searched the mailing lists on the OpenJDK website.
> > Thanks very much!
> >
> > Best Regards,
> > Haoyu Li,
> > Institute of Parallel and Distributed Systems (IPADS),
> > School of Software,
> > Shanghai Jiao Tong University
> >
>

From thomas.schatzl at oracle.com Fri Nov 29 10:55:28 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 29 Nov 2019 11:55:28 +0100
Subject: RFR: 8165443: Free Collection Set serial phase takes very long on large heaps
In-Reply-To: <94565e78-149c-b8b5-ff17-752481e0e36e@oracle.com>
References: <94565e78-149c-b8b5-ff17-752481e0e36e@oracle.com>
Message-ID:

Hi,

On 26.11.19 16:01, Stefan Johansson wrote:
> Hi,
>
> Please review this fix to improve freeing the collection set when having
> a lot of regions.
>
> Issue: https://bugs.openjdk.java.net/browse/JDK-8165443
> Webrev: http://cr.openjdk.java.net/~sjohanss/8165443/00/
>

  some initial comments:

- g1CollectionSet::iterate_part_from: s/lenght/length

- HeapRegion::handle_evacuation_failed() -> handle_evacuation_failure()

- heapRegionManager.hpp:176: s/region/regions

- could the FreeRegionList refactoring be factored out? Not insisting on
this, but might decrease the webrev quite a bit.

- G1FreeCollectionSetTask::G1FreeCollectionSetClosure::EventForRegion:
maybe rename to JFREventForRegion to make it more clear that this is for
JFR events.

- I would prefer that if a change modifies timing code, move it to use
Ticks internally, i.e. in case of ...::TimerForRegion use Ticks/Tickspan
as time base, not double, to slowly move to using Ticks everywhere.

Same with the double's for _young_time and _non_young_time in the area.

Also G1CollectedHeap::free_collection_set() should be changed imho.

(Yes, this introduces some ugly Tickspan::seconds() * 1000.0 calls, but
they are probably easier to clean up later).

- there is imho no need for the G1FreeCollectionSetClosure::stats()
getter as it is only ever used locally and is trivial. Please at least
make it private :)

- not sure how I feel about not calling the destructor for the worker's
FreeCSetStats.
While it is empty I would still recommend calling it
before freeing the containing array.

- same with the local FreeRegionLists in ~G1RebuildFreeListTask.

- HeapRegionManager::is_available() is (mostly) meant as an internal
function, but due to assert'ing I was forced to make it public (probably
should have a non-public version and a public one that is only available
with assertions).
The use in G1RebuildFreeListTask::work() kind of violates this idea (and
sorry for not mentioning it anywhere in the code).

Maybe move the entire freelist rebuild task into HeapRegionManager where
it imho fits much better? HeapRegionManager is and should be the "owner"
of the free list.

The call to HeapRegion::unlink_from_list() can probably be made earlier,
or see below.

- another change I do not really like is the difference between
"abandoning" the free list (and then later clearing the HeapRegion links
in parallel) and "removing" the free list. While it makes sense from a
performance POV, I would be happier if we could get away without
introducing this tiny semantic difference.

An option would be a FreeRegionList::add_to_tail that
just overwrites the links. This would remove the need for the new
"abandon" and other support methods in a few places and hide the
ugliness there.

What do you think?

- you mentioned that you did not look into balancing the work for
the rebuild free list action, but please at least limit the number of
workers in the G1RebuildFreeListTask to the number of regions in the heap.

(Move the chunk sizing outside of the G1RebuildFreeListTask for that.)

This sounds comical, but the number of threads available on current
machines is quickly approaching small heap sizes...

- instead of workers()->run_task() please use
G1CollectedHeap::run_task(). That also removes some timing code for you
around these places :)

- in the log, I believe that the "(Non-)Young Free collection set"
"phases" should be indented one more place.

I.e.
GC(0)    Free Collection Set: 0.0ms
GC(0)      Serial Free Collection Set: 0.0ms
GC(0)      Parallel Free Collection Set (ms): Min: ...
GC(0)      Young Free Collection Set (ms): Min:  ...
GC(0)      Non-Young Free Collection Set (ms): skipped

should be

GC(0)    Free Collection Set: 0.0ms
GC(0)      Serial Free Collection Set: 0.0ms
GC(0)      Parallel Free Collection Set (ms): Min: ...
GC(0)        Young Free Collection Set (ms): Min:  ...
GC(0)        Non-Young Free Collection Set (ms): skipped

imo.

Thanks,
  Thomas

From thomas.schatzl at oracle.com Fri Nov 29 11:03:36 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 29 Nov 2019 12:03:36 +0100
Subject: RFR: 8165443: Free Collection Set serial phase takes very long on large heaps
In-Reply-To:
References: <94565e78-149c-b8b5-ff17-752481e0e36e@oracle.com>
Message-ID:

Hi again,

  also, I am tempted to suggest to remove the update-methods for the
internal FreeCSetStats class. Since they are all single-use and very
local, I would not be against removing them and making the members public.

Thanks,
  Thomas

On 29.11.19 11:55, Thomas Schatzl wrote:
> Hi,
>
> On 26.11.19 16:01, Stefan Johansson wrote:
>> Hi,
>>
>> Please review this fix to improve freeing the collection set when
>> having a lot of regions.
>>
>> Issue: https://bugs.openjdk.java.net/browse/JDK-8165443
>> Webrev: http://cr.openjdk.java.net/~sjohanss/8165443/00/
>>
>
>   some initial comments:
>
> - g1CollectionSet::iterate_part_from: s/lenght/length
>
> - HeapRegion::handle_evacuation_failed() -> handle_evacuation_failure()
>
> - heapRegionManager.hpp:176: s/region/regions
>
> - could the FreeRegionList refactoring be factored out? Not insisting on
> this, but might decrease the webrev quite a bit.
>
> - G1FreeCollectionSetTask::G1FreeCollectionSetClosure::EventForRegion:
> maybe rename to JFREventForRegion to make it more clear that this is for
> JFR events.
>
> - I would prefer that if a change modifies timing code, move it to use
> Ticks internally, i.e.
in case of ...::TimerForRegion use Ticks/Tickspan
> as time base, not double, to slowly move to using Ticks everywhere.
>
> Same with the double's for _young_time and _non_young_time in the area.
>
> Also G1CollectedHeap::free_collection_set() should be changed imho.
>
> (Yes, this introduces some ugly Tickspan::seconds() * 1000.0 calls, but
> they are probably easier to clean up later).
>
> - there is imho no need for the G1FreeCollectionSetClosure::stats()
> getter as it is only ever used locally and is trivial. Please at least
> make it private :)
>
> - not sure how I feel about not calling the destructor for the worker's
> FreeCSetStats. While it is empty I would still recommend calling it
> before freeing the containing array.
>
> - same with the local FreeRegionLists in ~G1RebuildFreeListTask.
>
> - HeapRegionManager::is_available() is (mostly) meant as an internal
> function, but due to assert'ing I was forced to make it public (probably
> should have a non-public version and a public one that is only available
> with assertions).
> The use in G1RebuildFreeListTask::work() kind of violates this idea (and
> sorry for not mentioning it anywhere in the code).
>
> Maybe move the entire freelist rebuild task into HeapRegionManager where
> it imho fits much better? HeapRegionManager is and should be the "owner"
> of the free list.
>
> The call to HeapRegion::unlink_from_list() can probably be made earlier,
> or see below.
>
> - another change I do not really like is the difference between
> "abandoning" the free list (and then later clearing the HeapRegion links
> in parallel) and "removing" the free list. While it makes sense from a
> performance POV, I would be happier if we could get away without
> introducing this tiny semantic difference.
>
> An option would be a FreeRegionList::add_to_tail that
> just overwrites the links. This would remove the need for the new
> "abandon" and other support methods in a few places and hide the
> ugliness there.
>
> What do you think?
>
> - you mentioned that you did not look into balancing the work for
> the rebuild free list action, but please at least limit the number of
> workers in the G1RebuildFreeListTask to the number of regions in the heap.
>
> (Move the chunk sizing outside of the G1RebuildFreeListTask for that.)
>
> This sounds comical, but the number of threads available on current
> machines is quickly approaching small heap sizes...
>
> - instead of workers()->run_task() please use
> G1CollectedHeap::run_task(). That also removes some timing code for you
> around these places :)
>
> - in the log, I believe that the "(Non-)Young Free collection set"
> "phases" should be indented one more place.
>
> I.e.
>
> GC(0)    Free Collection Set: 0.0ms
> GC(0)      Serial Free Collection Set: 0.0ms
> GC(0)      Parallel Free Collection Set (ms): Min: ...
> GC(0)      Young Free Collection Set (ms): Min:  ...
> GC(0)      Non-Young Free Collection Set (ms): skipped
>
> should be
>
> GC(0)    Free Collection Set: 0.0ms
> GC(0)      Serial Free Collection Set: 0.0ms
> GC(0)      Parallel Free Collection Set (ms): Min: ...
> GC(0)        Young Free Collection Set (ms): Min:  ...
> GC(0)        Non-Young Free Collection Set (ms): skipped
>
> imo.
>
> Thanks,
>   Thomas

From stefan.johansson at oracle.com Fri Nov 29 13:38:13 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Fri, 29 Nov 2019 14:38:13 +0100
Subject: RFR (M): 8233919: Incrementally calculate the occupied cards in a heap region remembered set
In-Reply-To: <83dc575a-a994-a6e8-22d4-4dc621f29eac@oracle.com>
References: <23b30b41-b109-4d05-606f-fa6a87a07897@oracle.com> <16491a20-6f2a-0d70-42c2-6c23c3a9e407@oracle.com> <83dc575a-a994-a6e8-22d4-4dc621f29eac@oracle.com>
Message-ID: <9eb19105-c1f3-2592-a1a9-7afb1546feff@oracle.com>

Looks good,
StefanJ

On 2019-11-29 10:25, Thomas Schatzl wrote:
> Hi all,
>
>
thanks for your reviews - however I unfortunately found a small bug
> in the current implementation that I would like to have fixed.
>
> In particular, when adding sparse cards the code always incremented
> num_occupied when SparsePRT::add_card returned true. However it
> also returns true if the card is a duplicate, i.e. has already been found,
> which means that we should not increase the count.
>
> The actual difference is rather small as most of the duplicates are
> filtered out by the FromCardCache, but it occurs.
>
> The following patch fixes this. The change looks big, but is basically
> reshuffling code to accommodate the use of SparsePRTEntry::AddCardResult
> in SparsePRT::add_card.
>
> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.2_to_3/ (diff)
> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.3/ (full)
>
> Thanks, and sorry for the inconvenience,
>   Thomas
>
> On 26.11.19 16:41, sangheon.kim at oracle.com wrote:
>> Hi Thomas,
>>
>> On 11/26/19 1:04 AM, Thomas Schatzl wrote:
>>> Hi Sangheon,
>>>
>>>   thanks for looking at this.
>>>
>>> On 25.11.19 22:22, sangheon.kim at oracle.com wrote:
>>>> Hi Thomas,
>>>>
>>>> On 11/21/19 2:41 AM, Thomas Schatzl wrote:
>>>>> Hi,
>>>>>
>>>>> On 20.11.19 11:42, Stefan Johansson wrote:
>>>>>> Hi Thomas,
>>>>>>
>>>>>> On 2019-11-12 16:24, Thomas Schatzl wrote:
>>>>>>> Hi all,
>>> [...]
>>>
>>>>>> I would prefer:
>>>>>> if (prt->add_reference(from)) {
>>>>>>    num_added_by_coarsening++;
>>>>>> }
>>>>>> Atomic::add...
>>>>>>
>>>>>> If you disagree, leave it as is.
>>>>>
>>>>> Fixed in
>>>>> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.0_to_1/
>>>>> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.1/
>>>> Webrev.1 looks good in general.
>>>>
>>>> ========================
>>>> g1CollectionSet.cpp
>>>>   250   assert(old_rs_length <= new_rs_length,
>>>>   251          "Remembered set sizes must increase (changed from "
>>>> SIZE_FORMAT " to " SIZE_FORMAT " region %u type %s)",
>>>>   252
old_rs_length, new_rs_length, hr->hrm_index(),
>>>> hr->get_short_type_str());
>>>>   - I feel 'must increase' reads like 'old_rs_length < new_rs_length'. If
>>>> you don't agree leave it as is. :)
>>>>
>>>> ========================
>>>> heapRegionRemSet.cpp
>>>>   200         Atomic::inc(&_num_occupied, memory_order_relaxed);
>>>> - I already asked Thomas offline. He said the Atomic operation is not
>>>> necessary in this version but it will be necessary for a future patch when
>>>> the lock is removed.
>>>
>>> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.1_to_2/ (diff)
>>> http://cr.openjdk.java.net/~tschatzl/8233919/webrev.2/ (full)
>>>
>>> Fixes the comment by changing the comment text to:
>>>
>>>  251          "Remembered set decreased (changed from " SIZE_FORMAT "
>>> to " SIZE_FORMAT " region %u type %s)",
>>>
>> Looks good.
>>
>> Thanks,
>> Sangheon
>>
>>
>>> Thanks,
>>>   Thomas
>>
>

From leihouyju at gmail.com Sat Nov 30 11:52:42 2019
From: leihouyju at gmail.com (Haoyu Li)
Date: Sat, 30 Nov 2019 19:52:42 +0800
Subject: RFR: 8220465: Use shadow regions for faster ParallelGC full GCs
In-Reply-To: <88ECD3FD-485D-4633-B4EE-DF10BC5F3AFB@oracle.com>
References: <7ac0dde2-a32d-a641-d5ed-88c9f5fd8157@oracle.com> <88ECD3FD-485D-4633-B4EE-DF10BC5F3AFB@oracle.com>
Message-ID:

Hi Stefan,

Thanks for your review. If there are any further problems in the code,
please feel free to contact me! I am always more than happy to improve the
code.

Best Regards,
Haoyu Li


Stefan Johansson wrote on 2019-11-29 at 5:32:

> Hi Haoyu,
>
> Looks good, here are the updated webrevs:
> Full: http://cr.openjdk.java.net/~sjohanss/8220465/05/
> Inc: http://cr.openjdk.java.net/~sjohanss/8220465/04-05/
>
> Thanks,
> Stefan
>
>
> > On 28 Nov 2019, at 14:27, Haoyu Li wrote:
> >
> > Hi Stefan,
> >
> > Thanks for your review. I've checked the comments again and updated
> > some more comments to make them more precise. Please find the attached
> > patches.
> >
> > Best Regards,
> > Haoyu Li
> >
> >
> > Stefan Johansson wrote on 2019-11-27 at 9:23:
> > Hi Haoyu,
> >
> > I've quickly looked through the changes and they look good in general,
> > the renaming makes the code easier to follow.
> >
> > One small nit, the paragraph referring to your paper starts with "More
> > more details", which I guess should be changed to "For more details".
> >
> > Here are updated webrevs:
> > Full: http://cr.openjdk.java.net/~sjohanss/8220465/04/
> > Inc: http://cr.openjdk.java.net/~sjohanss/8220465/03-04/
> >
> > Cheers,
> > Stefan
> >
> > On 2019-11-27 03:15, Haoyu Li wrote:
> > > Hi Thomas,
> > >
> > > Thanks for your constructive suggestions! I've addressed the issues you
> > > mentioned, and the updated patches are attached. Please find the details
> > > below.
> > >
> > >> - static const ints can be initialized in the definition (UNUSED,
> > >> SHADOW, ...); also they should be CamelCased; they are very unspecific
> > >> too - I added some prefix to distinguish them a bit.
> > >>
> > >> Now these values of _shadow_state are initialized in the definition and
> > > CamelCased. I have also changed their names to be more specific: a region
> > > is *UnusedRegion* when untouched, and will become *NormalRegion* if
> > > processed with the original parallel full GC algorithm; if an idle GC
> > > thread steals an unavailable region to process it with the help of a shadow
> > > region, the thread will mark it as *ShadowRegion*, *FilledShadow*, and
> > > *CopiedShadow* in sequence.
> > >
> > > - the documentation about this change is imho lacking.
> > >>
> > >> - It would be nice to explain the idea of shadow regions somewhere
> > >> assuming that you know how parallel works. Including the reference to
> > >> the paper. :)
> > >>
> > >> - some of the comments just show what code (often a single statement)
> > >> does, not the what and why or the reason why a particular method or
> > >> member exists. Or explains one or the other.
> > >>
> > >> E.g.
> > >>
> > >> "The shadow region array, we use it in a LIFO fashion, so that we can
> > >> reuse shadow regions for better data locality and utilization"
> > >>
> > >> - at this point we have no idea what a "shadow region" is and we
> > >> can't find out easily because it is called "shadow record" or "steal
> > >> record" elsewhere.
> > >>
> > >> Something better could be:
> > >>
> > >> "Contains currently free shadow regions (assuming we converge on that
> > >> name). We use it in a LIFO fashion for better data locality and
> > >> utilization."
> > >>
> > >> Thanks for pointing out the lack of illustration. I've added several
> > > paragraphs to demonstrate the main idea, the typical workflow, and the
> > > source paper of shadow region optimization in the comments of
> > > ParallelCompact::initialize_shadow_region(). Besides, I've also checked
> > > other comments in the patch and made them more precise.
> > >
> > >
> > >> - I think there is a missed optimization opportunity in (now)
> > >> PSParallelCompact::initialize_shadow_regions(). There, the code
> > >> initializes the "free" region ids to region_at_top+1 to end_region of a
> > >> particular space.
> > >>
> > >> If the top for a given space is at a region boundary (e.g. if a space is
> > >> empty, which is probably common for one of the survivor spaces), you
> > >> lose a single region per space.
> > >>
> > >> One reason might be that the code uses region "0" as sentinel to indicate
> > >> "there is no shadow region available" in
> > >> ParCompactionManager::acquire_shadow_region().
> > >>
> > >> This could be fixed by improving the code in
> > >> PSParallelCompact::initialize_shadow_regions() and using a sentinel region
> > >> value of (size_t)~0 (as an explicit constant).
> > >>
> > >> Even if you do not change this, please introduce an explicit constant
> > >> for this sentinel value. This makes the code more self-explanatory.
> > >>
> > >> Sorry for the misleading +1 operation.
The +1 can be safely removed. The sentinel value 0 is not the reason for
> > > this design, because the first region (in old space) cannot be a shadow
> > > region.
> > >
> > >
> > >> - at least in ParallelCompactData::RegionData::try_steal I would add a
> > >> dirty read of the _shadow_state to avoid the overhead of obviously
> > >> unsuccessful steal attempts (I do not know about frequencies of those,
> > >> so ymmv, but probably it would be easiest to add it everywhere).
> > >>
> > >> Also all the cmpxchg can/should use memory_order_relaxed to avoid the
> > >> two full fences every time accessed as far as I can tell.
> > >>
> > >> Excellent suggestions! I didn't consider the performance factors in these
> > > atomic instructions before, and you're right that GC threads may suffer
> > > many failures the first time getting shadow regions. Changing the memory
> > > order to memory_order_relaxed is also helpful.
> > >
> > >
> > >> - not sure about whether "acquire_shadow_region()/release_shadow_region"
> > >> are good names for
> > >> "PSParallelCompact::try_pop_shadow_region/push_shadow_region" (or
> > >> something else).
> > >>
> > >> "Acquire"/"release" has a very specific semantic related to a completely
> > >> different area (memory ordering in MP systems), so we should probably
> > >> avoid using them. There are other well-used pairs of names to add and
> > >> remove elements to a container too.
> > >>
> > >> Naming functions with "Acquire" and "Release" is indeed misleading. I've
> > > changed these functions to push/pop_shadow_region,
> > > push/pop_shadow_region_mt_safe, and remove_all_shadow_regions in
> > > PSParallelCompact to fit the underlying LIFO _shadow_region_array. Do you
> > > think these names are appropriate now?
> > >
> > >
> > >> - the changes in PSParallelCompact sometimes use the terms
> > >> "steal_record", "shadow_record" and "shadow_region" (e.g.
> > >> _shadow_region_array) interchangeably.
> > >>
> > >> Can you give a reason for this? I am good with any (with a preference
> > >> for "shadow_region" since it gives an idea of the contents while
> > >> "record" is quite generic), but it makes reading the code harder than
> > >> necessary.
> > >>
> > >> Sorry for the inconsistency between variable names. I introduced
> > > steal_record to record the index of the next shadow region, so that a GC
> > > thread could seek shadow regions from the last point instead of the
> > > beginning. To make the code more specific, I've changed the variable
> > > _shadow_record to _next_shadow_region and moved the code in
> > > PSParallelCompact::initialize_shadow_record to
> > > PSParallelCompact::initialize_shadow_regions.
> > >
> > >
> > >> - the names of the new methods e.g. in PSParallelCompact::RegionData
> > >> should be more precise; e.g. what does "try_push" want to
> > >> push? Or "try_steal" steal?
> > >> Not even the comments for these contain that information, and I believe
> > >> that by better naming of the methods, we can avoid the comments
> > >> completely in most cases.
> > >>
> > >> Sorry for the vague code. These five atomic interfaces intend to change
> > > the _shadow_state of the current region to reflect the collection process,
> > > not to push or steal anything. I've changed try_push to mark_normal and
> > > try_steal to mark_shadow, respectively. The _shadow_state and the return
> > > value of these functions can help the collector to determine 1) whether a
> > > region should be collected by the shadow region optimization and 2) if the
> > > data in a shadow region are ready to be copied back to the corresponding
> > > heap region.
> > >
> > > Thanks again for your valuable reviews. If there are any further problems,
> > > please contact me at any time. I was also wondering whether you could CC the
> > > following mails to me.
There seem to be some problems with my email, and I didn't
> > > receive your last mail until I searched the mailing lists on the OpenJDK
> > > website. Thanks very much!
> > >
> > > Best Regards,
> > > Haoyu Li,
> > > Institute of Parallel and Distributed Systems (IPADS),
> > > School of Software,
> > > Shanghai Jiao Tong University
> > >
> >
> >
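
The _shadow_state transitions and the explicit sentinel discussed in the
8220465 review can be sketched as follows. This is only a minimal,
self-contained illustration, not code from the patch: it assumes std::atomic
as a stand-in for HotSpot's Atomic::cmpxchg, and the class names RegionState
and ShadowRegionPool, the methods mark_filled() and mark_copied(), and the
constant InvalidShadow are hypothetical. Only the state names, mark_normal and
mark_shadow are taken from the mails above.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for the per-region shadow state.
class RegionState {
public:
  enum State {
    UnusedRegion,  // not yet claimed by any GC thread
    NormalRegion,  // processed by the original parallel full GC algorithm
    ShadowRegion,  // claimed by an idle thread that fills a shadow region
    FilledShadow,  // shadow region filled, not yet copied back
    CopiedShadow   // shadow contents copied back to the heap region
  };

  RegionState() : _shadow_state(UnusedRegion) {}

  // Each call succeeds for exactly one caller and only from the expected state.
  bool mark_normal() { return transition(UnusedRegion, NormalRegion); }
  bool mark_shadow() { return transition(UnusedRegion, ShadowRegion); }
  bool mark_filled() { return transition(ShadowRegion, FilledShadow); }  // assumed name
  bool mark_copied() { return transition(FilledShadow, CopiedShadow); }  // assumed name

  int state() const { return _shadow_state.load(std::memory_order_relaxed); }

private:
  bool transition(int from, int to) {
    // Dirty read first: skip the compare-and-swap when it cannot succeed.
    if (_shadow_state.load(std::memory_order_relaxed) != from) {
      return false;
    }
    // Relaxed ordering, as suggested in the review, instead of the default
    // sequentially consistent exchange.
    return _shadow_state.compare_exchange_strong(from, to,
                                                 std::memory_order_relaxed);
  }

  std::atomic<int> _shadow_state;
};

// Hypothetical LIFO pool of free shadow regions with an explicit sentinel.
class ShadowRegionPool {
public:
  // Explicit constant for "there is no shadow region available".
  static const std::size_t InvalidShadow = ~static_cast<std::size_t>(0);

  void push(std::size_t region) {
    assert(region != InvalidShadow);
    std::lock_guard<std::mutex> guard(_lock);
    _free.push_back(region);
  }

  // Returns the most recently pushed region, or InvalidShadow if empty.
  std::size_t pop() {
    std::lock_guard<std::mutex> guard(_lock);
    if (_free.empty()) {
      return InvalidShadow;
    }
    std::size_t region = _free.back();
    _free.pop_back();
    return region;
  }

private:
  std::mutex _lock;
  std::vector<std::size_t> _free;
};
```

With a sentinel of ~0 instead of 0, region index 0 remains a valid pool entry,
which is the point of the review comment about an explicit constant. The mutex
is a simplification chosen for the sketch; the thread only says that an
mt-safe push/pop variant exists, not how it is implemented.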