From zhongwei.yao at linaro.org Thu Feb 1 01:26:40 2018
From: zhongwei.yao at linaro.org (Zhongwei Yao)
Date: Thu, 1 Feb 2018 09:26:40 +0800
Subject: [aarch64-port-dev ] RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler
In-Reply-To: <730a2120-cec3-18e8-4a0f-700289adfc7c@redhat.com>
References: <730a2120-cec3-18e8-4a0f-700289adfc7c@redhat.com>
Message-ID:

Updated, it is at http://cr.openjdk.java.net/~zyao/8196064/ArrObj.java

On 31 January 2018 at 22:57, Andrew Haley wrote:
> On 30/01/18 03:30, Zhongwei Yao wrote:
>> My micro-benchmark case
> Where is it? Please put it in http://cr.openjdk.java.net/~zyao/8196064/.
>
>> gives about 8% performance improvement with this patch.
>
> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd.
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

--
Best regards,
Zhongwei

From aph at redhat.com Thu Feb 1 15:36:34 2018
From: aph at redhat.com (Andrew Haley)
Date: Thu, 1 Feb 2018 15:36:34 +0000
Subject: RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler
In-Reply-To:
References:
Message-ID: <4cdcb619-754f-e6fc-bc18-67df9b892248@redhat.com>

On 30/01/18 03:30, Zhongwei Yao wrote:
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8196064
>
> Webrev:
> http://cr.openjdk.java.net/~zyao/8196064/webrev.00
>
> This patch merges adjacent load/store into ldp/stp in macro-assembler
> as discussed in previous thread:
> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-December/027957.html
>
> My micro-benchmark case gives about 8% performance improvement with this patch.
>
> And this patch is based on commit be48be2 due to recent build failure
> on jdk/hs master, and it passes all jtreg tests.
>
> Please help to review it.

Very nice.  You've done a good job.  This merging opportunity happens
a lot, and it reduces not only the size of compiled code but also the
size of the interpreter and the generated stubs.  It's definitely
worth doing.
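For readers who have not looked at the webrev, the kind of merge being discussed can be sketched with a toy model. This is not the HotSpot MacroAssembler code: the names `ToyAssembler`, `Insn`, `can_merge` and `merge_demo` are hypothetical, only 64-bit loads are handled, and the sketch assumes the usual A64 encoding limit for a 64-bit `ldp` (a signed 7-bit immediate scaled by 8, so byte offsets from -512 to 504).

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy model only: hypothetical names, not the HotSpot MacroAssembler API.
struct Insn {
    std::string op;   // "ldr" or "ldp"
    int rt;           // first destination register
    int rt2;          // second destination register, -1 for ldr
    int base;         // base address register
    int64_t offset;   // byte offset from the base register
};

class ToyAssembler {
public:
    // Emit a 64-bit load. If the previously emitted instruction is a load
    // from the neighbouring 8-byte slot off the same base register,
    // rewrite that instruction into a single ldp instead.
    void ldr(int rt, int base, int64_t offset) {
        if (!code_.empty() && can_merge(code_.back(), rt, base, offset)) {
            Insn& prev = code_.back();
            if (offset > prev.offset) {
                prev = Insn{"ldp", prev.rt, rt, base, prev.offset};
            } else {
                prev = Insn{"ldp", rt, prev.rt, base, offset};
            }
            return;
        }
        code_.push_back(Insn{"ldr", rt, -1, base, offset});
    }

    const std::vector<Insn>& code() const { return code_; }

private:
    static bool can_merge(const Insn& prev, int rt, int base, int64_t offset) {
        if (prev.op != "ldr" || prev.base != base) return false;
        if (prev.rt == rt) return false;    // ldp needs two distinct targets
        if (prev.rt == base) return false;  // first load overwrote the base
        int64_t diff = offset - prev.offset;
        if (diff != 8 && diff != -8) return false;  // slots must be adjacent
        int64_t low = diff > 0 ? prev.offset : offset;
        return low % 8 == 0 && low >= -512 && low <= 504;
    }

    std::vector<Insn> code_;
};

// Feed a short load sequence through the toy assembler.
// Register number 31 stands in for sp here.
std::vector<Insn> merge_demo() {
    ToyAssembler masm;
    masm.ldr(17, 31, 0);
    masm.ldr(1, 31, 16);   // not adjacent to offset 0, stays an ldr
    masm.ldr(18, 31, 8);   // adjacent to offset 16, becomes ldp x18, x1
    masm.ldr(10, 31, 24);
    masm.ldr(3, 31, 32);   // adjacent to offset 24, becomes ldp x10, x3
    return masm.code();
}
```

Because the model only ever looks at the single most recent instruction, a load that lands between two otherwise adjacent loads prevents them from pairing. A real implementation would also have to refuse to merge across a label or branch target, which this sketch does not model.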
before:
ArrObj.testArrObjCopy  avgt  20  3.075 ± 0.002  us/op

after:
ArrObj.testArrObjCopy  avgt  20  2.779 ± 0.004  us/op

So, some improvement.

One thing I noticed:

Compiled method (c2) 6812 627 4 org.linaro.benchmarks.ArrObj::testArrObjCopy (32 bytes)

 ;; B44: # B45 <- B43 Freq: 40.1247

  0x000003ff94b0db50: ldr x17, [sp]
  0x000003ff94b0db54: ldp x18, x1, [sp,#8]
  0x000003ff94b0db58: ldp x10, x3, [sp,#24]
  0x000003ff94b0db5c: ldp x4, x11, [sp,#40]
  0x000003ff94b0db60: ldr x6, [sp,#56]

It seems to me like there could be four ldp instructions here.  There
aren't (as far as I can see) because they're in an unfortunate order, so
the MacroAssembler doesn't see the opportunity:

 ;; B44: # B45 <- B43 Freq: 39.2717

  0x000003ffad17bbf8: ldr x17, [sp]
  0x000003ffad17bbfc: ldr x1, [sp,#16]
  0x000003ffad17bc00: ldr x18, [sp,#8]
  0x000003ffad17bc04: ldr x10, [sp,#24]
  0x000003ffad17bc08: ldr x3, [sp,#32]
  0x000003ffad17bc0c: ldr x4, [sp,#40]
  0x000003ffad17bc10: ldr x11, [sp,#48]
  0x000003ffad17bc14: ldr x6, [sp,#56]

This is OK: I don't think we want to make MacroAssembler even more
complicated in order to handle occasional out-of-order spills.

Also, last_membar and last_ldst seem to be doing the same thing.  I
think that you could simply have a single last_insn field rather than
both last_membar and last_ldst, and then inspect the instruction to
see if it's a membar or a ld/st.

Finally, please add BLOCK_COMMENTs in merge_ldst, like this:

    if (!is_store) {
      BLOCK_COMMENT("merged ldr pair");
      if (sz == 8) {
        ldp(rt_low, rt_high, adr_p);
      } else {
        ldpw(rt_low, rt_high, adr_p);
      }
    } else {
      BLOCK_COMMENT("merged str pair");
      if (sz == 8) {
        stp(rt_low, rt_high, adr_p);
      } else {
        stpw(rt_low, rt_high, adr_p);
      }
    }

Thanks.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From vladimir.kozlov at oracle.com Thu Feb 1 23:02:37 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 1 Feb 2018 15:02:37 -0800
Subject: 8u152 sigsegv in CounterDecay::do_method during VMThead safepoint processing
In-Reply-To:
References:
Message-ID: <79848ed2-9358-6af4-30c3-b9d8050216d2@oracle.com>

Hi Vitaly,

I would suggest to file a bug. I looked through our bugs DB and did not find anything similar except
8156721 which you pointed.

Based on disassembler the problem happened on first instruction:

   static void do_method(Method* m) {
     MethodCounters* mcs = m->method_counters();

   0x82000000: mov 0x18(%rdi),%rcx
   0x82000004: push %rbp
   0x82000005: mov %rsp,%rbp
   0x82000008: test %rcx,%rcx
   0x8200000b: je 0x82000035

RDI=0x0000001a00190005

Which means Method* m pointer is corrupted/incorrect (but not 0).

CounterDecay::do_method() is called from InstanceKlass::methods_do() which has a fix in JDK 9 to
process only loaded classes:

https://bugs.openjdk.java.net/browse/JDK-8042727
http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/2c2aa6de8f60#l4.6

That is the only related changes I found. May be it is a different problem. CCing to runtime group.

Regards,
Vladimir

On 1/22/18 7:36 AM, Vitaly Davidovich wrote:
> Hi all,
>
> Are there any known issues with this method crashing the JVM?  Here's a (slightly redacted) snippet
> from the hs_err log:
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x00002b14765b7210, pid=140880, tid=0x00002b149a643700
> #
> # JRE version: Java(TM) SE Runtime Environment (8.0_152-b16) (build 1.8.0_152-b16)
> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.152-b16 mixed mode linux-amd64 compressed oops)
> # Problematic frame:
> # V  [libjvm.so+0x49c210]  CounterDecay::do_method(Method*)+0x0
> #
> # Core dump written. Default location: or core.140880
> #
> # If you would like to submit a bug report, please visit:
> #   http://bugreport.java.com/bugreport/crash.jsp
> #
>
> ---------------  T H R E A D  ---------------
>
> Current thread (0x00002b147cb12800):  VMThread [stack: 0x00002b149a543000,0x00002b149a644000]
> [id=140909]
>
> siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: 0x0000001a0019001d
>
> Registers:
> RAX=0x0000000000000000, RBX=0x0000000000000001, RCX=0x00002b156839ca18, RDX=0x00002b156799fc68
> RSP=0x00002b149a6429b8, RBP=0x00002b149a6429e0, RSI=0x00002b14765b7210, RDI=0x0000001a00190005
> R8 =0x0000000000000010, R9 =0x0000000000000001, R10=0x0000000000000000, R11=0x0000000000000001
> R12=0x0000000000000007, R13=0x00000007c03c8428, R14=0x00002b14765b7210, R15=0x0000000000000000
> RIP=0x00002b14765b7210, EFLAGS=0x0000000000010202, CSGSFS=0x0000000000000033, ERR=0x0000000000000004
>   TRAPNO=0x000000000000000e
>
> Top of Stack: (sp=0x00002b149a6429b8)
> 0x00002b149a6429b8:   00002b147675e83d 0000000000000060
> 0x00002b149a6429c8:   00002b149a642a10 0000000000000000
> 0x00002b149a6429d8:   0000000000000000 00002b149a642a00
> 0x00002b149a6429e8:   00002b14765b3cd1 40590bbbbbbbbbbc
> 0x00002b149a6429f8:   00002b14770db352 00002b149a642a50
> 0x00002b149a642a08:   00002b1476adfbd6 00000000018f0100
> 0x00002b149a642a18:   0000000000000000 00002b149a642a40
> 0x00002b149a642a28:   00002b1476ba2100 431bde82d7b634db
> 0x00002b149a642a38:   00002b14f9ce9800 431bde82d7b634db
> 0x00002b149a642a48:   00002b14f9ce9800 00002b149a642b00
> 0x00002b149a642a58:   00002b1476ae08a6 00002b14770e6140
> 0x00002b149a642a68:   00002b149a642aa0 00002b147709ef83
> 0x00002b149a642a78:   0000000000ae12f0 000000307cb12800
> 0x00002b149a642a88:   0000000000000000 00000040000003e8
> 0x00002b149a642a98:   0000001a0000001a 00002b14ce44b580
> 0x00002b149a642aa8:   00002b1478ccda09 00002b1478cbb5d0
> 0x00002b149a642ab8:   00002b1400000000 00002b14ce44b5d0
> 0x00002b149a642ac8:   00002b14ce44b580 00002b14770db3d8
> 0x00002b149a642ad8:   0000000000000000 0000000000000000
> 0x00002b149a642ae8:   00002b14770db3d8 00002b147cb12800
> 0x00002b149a642af8:   00002b14770e5950 00002b149a642ca0
> 0x00002b149a642b08:   00002b1476bf22ef 00002b149a642b20
> 0x00002b149a642b18:   00002b149a642c30 00002b149a642b28
> 0x00002b149a642b28:   6e69747563657845 65706f204d562067
> 0x00002b149a642b38:   203a6e6f69746172 6c6f43636e493147
> 0x00002b149a642b48:   506e6f697463656c 6e6f640065737561
> 0x00002b149a642b58:   6e6f64206e6f0065 0000000000000065
> 0x00002b149a642b68:   0000001577100ce0 0000000000000000
> 0x00002b149a642b78:   00002b14770ae164 00002b1476116e40
> 0x00002b149a642b88:   0000000000000148 00002b147cb12800
> 0x00002b149a642b98:   0000000000000002 00002b149a642c40
> 0x00002b149a642ba8:   00002b1475e08a40 00002b149a543000
>
> Instructions: (pc=0x00002b14765b7210)
> 0x00002b14765b71f0:   55 31 c0 48 89 e5 c9 c3 90 90 90 90 90 90 90 90
> 0x00002b14765b7200:   55 b8 04 00 00 00 48 89 e5 c9 c3 90 90 90 90 90
> 0x00002b14765b7210:   48 8b 4f 18 55 48 89 e5 48 85 c9 74 28 8b 51 08
> 0x00002b14765b7220:   89 d0 c1 e8 03 89 c6 d1 fe 85 c0 7e 09 85 f6 b8
>
> Register to memory mapping:
>
> RAX=0x0000000000000000 is an unknown value
> RBX=0x0000000000000001 is an unknown value
> RCX=0x00002b156839ca18 is an unknown value
> RDX=0x00002b156799fc68 is pointing into metadata
> RSP=0x00002b149a6429b8 is an unknown value
> RBP=0x00002b149a6429e0 is an unknown value
> RSI=0x00002b14765b7210: in /jre/lib/amd64/server/libjvm.so at 0x00002b147611b000
> RDI=0x0000001a00190005 is an unknown value
> R8 =0x0000000000000010 is an unknown value
> R9 =0x0000000000000001 is an unknown value
> R10=0x0000000000000000 is an unknown value
> R11=0x0000000000000001 is an unknown value
> R12=0x0000000000000007 is an unknown value
> R13=0x00000007c03c8428 is pointing into metadata
> R14=0x00002b14765b7210: in /jre/lib/amd64/server/libjvm.so at 0x00002b147611b000
> R15=0x0000000000000000 is an unknown value
>
> Stack: [0x00002b149a543000,0x00002b149a644000],  sp=0x00002b149a6429b8,  free space=1022k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> V  [libjvm.so+0x49c210]  CounterDecay::do_method(Method*)+0x0
> V  [libjvm.so+0x498cd1]  NonTieredCompPolicy::do_safepoint_work()+0x91
> V  [libjvm.so+0x9c4bd6]  SafepointSynchronize::do_cleanup_tasks()+0x76
> V  [libjvm.so+0x9c58a6]  SafepointSynchronize::begin()+0x406
> V  [libjvm.so+0xad72ef]  VMThread::loop()+0x1bf
> V  [libjvm.so+0xad7770]  VMThread::run()+0x70
> V  [libjvm.so+0x92d8d8]  java_start(Thread*)+0x108
>
> VM_Operation (0x00002b15140011a0): G1IncCollectionPause, mode: safepoint, requested by thread
> 0x00002b14f9bb8000
>
>
> This is on a Debian Wheezy linux machine running Xeon Broadwell cores.
> The reason I mention this
> part is a quick google did show https://bugs.openjdk.java.net/browse/JDK-8156721 but that JBS is for
> a different platform (with an overclocked CPU, apparently) and it's marked Incomplete.
>
> This crash was observed on about 17 separate JVMs (different hosts) at about the same time, all
> running the same application code after about 3 weeks of uptime.
>
> I can provide more details if you'd like but wanted to see if this is a known (but rarely witnessed)
> bug.
>
> Thanks
>

From vitalyd at gmail.com Fri Feb 2 00:00:56 2018
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Fri, 02 Feb 2018 00:00:56 +0000
Subject: 8u152 sigsegv in CounterDecay::do_method during VMThead safepoint processing
In-Reply-To: <79848ed2-9358-6af4-30c3-b9d8050216d2@oracle.com>
References: <79848ed2-9358-6af4-30c3-b9d8050216d2@oracle.com>
Message-ID:

On Thu, Feb 1, 2018 at 5:59 PM Vladimir Kozlov wrote:

> Hi Vitaly,
>
> I would suggest to file a bug. I looked through our bugs DB and did not
> find anything similar except
> 8156721 which you pointed.

Hi Vladimir,

Thanks for following up on this. Would you like me to file a bug or did
you mean someone on the hotspot team?

>
> Based on disassembler the problem happened on first instruction:
>
>    static void do_method(Method* m) {
>      MethodCounters* mcs = m->method_counters();
>
>    0x82000000: mov 0x18(%rdi),%rcx
>    0x82000004: push %rbp
>    0x82000005: mov %rsp,%rbp
>    0x82000008: test %rcx,%rcx
>    0x8200000b: je 0x82000035
>
> RDI=0x0000001a00190005
>
> Which means Method* m pointer is corrupted/incorrect (but not 0).

Indeed. Doesn't even look like a pointer at all with that 5 in there.

>
> CounterDecay::do_method() is called from InstanceKlass::methods_do() which
> has a fix in JDK 9 to
> process only loaded classes:
>
> https://bugs.openjdk.java.net/browse/JDK-8042727
>
> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/2c2aa6de8f60#l4.6
>
> That is the only related changes I found. May be it is a different
> problem. CCing to runtime group.

Thanks for the JDK9 pointer. We're slowly trying to get to 9 but it's
taking a long time and a lot of effort for a variety of reasons.

Do you think that 9 fix could be backported to 8 in case it's the same
issue?

>
> Regards,
> Vladimir
>
> On 1/22/18 7:36 AM, Vitaly Davidovich wrote:
> > Hi all,
> >
> > Are there any known issues with this method crashing the JVM? Here's a
> > (slightly redacted) snippet from the hs_err log:
> >
> > #
> > # A fatal error has been detected by the Java Runtime Environment:
> > #
> > #  SIGSEGV (0xb) at pc=0x00002b14765b7210, pid=140880, tid=0x00002b149a643700
> > #
> > # JRE version: Java(TM) SE Runtime Environment (8.0_152-b16) (build 1.8.0_152-b16)
> > # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.152-b16 mixed mode linux-amd64 compressed oops)
> > # Problematic frame:
> > # V  [libjvm.so+0x49c210]  CounterDecay::do_method(Method*)+0x0
> > #
> > # Core dump written.
> >
> > This crash was observed on about 17 separate JVMs (different hosts) at
> > about the same time, all
> > running the same application code after about 3 weeks of uptime.
> >
> > I can provide more details if you'd like but wanted to see if this is a
> > known (but rarely witnessed)
> > bug.
> >
> > Thanks
> >
>
--
Sent from my phone
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From vladimir.kozlov at oracle.com Fri Feb 2 01:01:33 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 1 Feb 2018 17:01:33 -0800
Subject: 8u152 sigsegv in CounterDecay::do_method during VMThead safepoint processing
In-Reply-To:
References: <79848ed2-9358-6af4-30c3-b9d8050216d2@oracle.com>
Message-ID: <5aab0de7-9252-cbdb-0464-f6a19b2a9aed@oracle.com>

I filed https://bugs.openjdk.java.net/browse/JDK-8196624

> Do you think that 9 fix could be backported to 8 in case it's the same
> issue?

It would greatly help if you could build jdk8u with 8042727 fix and verify it. It would be good
justification for backporting. Otherwise someone in sustaining group have to investigate this problem.

Could you build fastdebug version and run with it?
I assume you don't have a test which we can use to verify. Right? It would help if you do have it.

Thanks,
Vladimir

On 2/1/18 4:00 PM, Vitaly Davidovich wrote:
> On Thu, Feb 1, 2018 at 5:59 PM Vladimir Kozlov
> wrote:
>
>> Hi Vitaly,
>>
>> I would suggest to file a bug. I looked through our bugs DB and did not
>> find anything similar except
>> 8156721 which you pointed.
>
> Hi Vladimir,
>
> Thanks for following up on this. Would you like me to file a bug or did
> you mean someone on the hotspot team?
>>>
>>> This crash was observed on about 17 separate JVMs (different hosts) at
>>> about the same time, all
>>> running the same application code after about 3 weeks of uptime.
>>>
>>> I can provide more details if you'd like but wanted to see if this is a
>>> known (but rarely witnessed)
>>> bug.
>>>
>>> Thanks
>>>
>>

From vladimir.kozlov at oracle.com Fri Feb 2 01:21:48 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 1 Feb 2018 17:21:48 -0800
Subject: RFR(M): 8194490: [JVMCI] Move `iterateFrames` to C++
In-Reply-To: <6b9d58d4-9015-dfd7-a3bf-485c40516d24@oracle.com>
References: <6b9d58d4-9015-dfd7-a3bf-485c40516d24@oracle.com>
Message-ID:

Thank you, Gilles

Seems fine to me. Who reviewed it in Labs?
And thank you for running testing.

Vladimir

On 1/22/18 6:58 AM, Gilles Duboscq wrote:
> Hi,
>
> Please review the following fix for `HotSpotStackIntrospection.iterateFrames`.
> It moves the iteration code from Java to C++: this helps with an issue that would arise if the
> nmethod containing the `iterateFrames` hits an uncommon trap during iteration. It would change
> the layout of the top frames, which would confuse the stack walking logic. Having this loop in
> C++ ensures there can be no uncommon trap.
>
> Webrev: http://cr.openjdk.java.net/~gdub/webrev-8194490/
> Issue: https://bugs.openjdk.java.net/browse/JDK-8194490
> Testing: hs-tier1,hs-tier2,hs-precheckin-comp
>
> It's a bit unfortunate that we have tests for implementation details of JVMCI (i.e., tests for
> CompilerToVM) instead of tests for the actual API.
>
> Thanks,
> Gilles
>

From vitalyd at gmail.com Fri Feb 2 02:25:58 2018
From: vitalyd at gmail.com (Vitaly Davidovich)
Date: Fri, 02 Feb 2018 02:25:58 +0000
Subject: 8u152 sigsegv in CounterDecay::do_method during VMThead safepoint processing
In-Reply-To: <5aab0de7-9252-cbdb-0464-f6a19b2a9aed@oracle.com>
References: <79848ed2-9358-6af4-30c3-b9d8050216d2@oracle.com> <5aab0de7-9252-cbdb-0464-f6a19b2a9aed@oracle.com>
Message-ID:

On Thu, Feb 1, 2018 at 7:56 PM Vladimir Kozlov wrote:

> I filed https://bugs.openjdk.java.net/browse/JDK-8196624

Thanks!

>
> > Do you think that 9 fix could be backported to 8 in case it's the same
> > issue?
>
> It would greatly help if you could build jdk8u with 8042727 fix and verify
> it. It would be good
> justification for backporting. Otherwise someone in sustaining group have
> to investigate this problem.

I'll see what I can do. We don't have any experience building hotspot from
source but I'll see how much enthusiasm I can garner :).

>
> Could you build fastdebug version and run with it?
> I assume you don't have a test which we can use to verify. Right? It would
> help if you do have it.

Yeah, no repro unfortunately. In fact, it's not easily reproducible in the
real system either. That makes sense since this seems like a race or timing
bug where quite a few things have to come together to trigger it
(presumably). Gotta love these types of bugs ...

>
> Thanks,
> Vladimir
>
> On 2/1/18 4:00 PM, Vitaly Davidovich wrote:
> > On Thu, Feb 1, 2018 at 5:59 PM Vladimir Kozlov <vladimir.kozlov at oracle.com>
> > wrote:
> >
> >> Hi Vitaly,
> >>
> >> I would suggest to file a bug. I looked through our bugs DB and did not
> >> find anything similar except
> >> 8156721 which you pointed.
> >
> > Hi Vladimir,
> >
> > Thanks for following up on this. Would you like me to file a bug or did
> > you mean someone on the hotspot team?
> > > >> > >> > >> Based on disassembler the problem happened on first instruction: > >> > >> static void do_method(Method* m) { > >> MethodCounters* mcs = m->method_counters(); > >> > >> 0x82000000: mov 0x18(%rdi),%rcx > >> 0x82000004: push %rbp > >> 0x82000005: mov %rsp,%rbp > >> 0x82000008: test %rcx,%rcx > >> 0x8200000b: je 0x82000035 > >> > >> RDI=0x0000001a00190005 > >> > >> Which means Method* m pointer is corrupted/incorrect (but not 0). > > > > Indeed. Doesn't even look like a pointer at all with that 5 in there. > > > >> > >> > >> CounterDecay::do_method() is called from InstanceKlass::methods_do() > which > >> has a fix in JDK 9 to > >> process only loaded classes: > >> > >> https://bugs.openjdk.java.net/browse/JDK-8042727 > >> > >> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/2c2aa6de8f60#l4.6 > >> > >> That is the only related changes I found. May be it is a different > >> problem. CCing to runtime group. > > > > Thanks for the JDK9 pointer. We're slowly trying to get to 9 but it's > > taking a long time and a lot of effort for a variety of reasons. > > > > Do you think that 9 fix could be backported to 8 in case it's the same > > issue? > > > >> > >> > >> Regards, > >> Vladimir > >> > >> On 1/22/18 7:36 AM, Vitaly Davidovich wrote: > >>> Hi all, > >>> > >>> Are there any known issues with this method crashing the JVM?
Here's a > >> (slightly redacted) snippet > >>> from the hs_err log: > >>> > >>> # > >>> > >>> # A fatal error has been detected by the Java Runtime Environment: > >>> > >>> # > >>> > >>> # SIGSEGV (0xb) at pc=0x00002b14765b7210, pid=140880, > >> tid=0x00002b149a643700 > >>> > >>> # > >>> > >>> # JRE version: Java(TM) SE Runtime Environment (8.0_152-b16) (build > >> 1.8.0_152-b16) > >>> > >>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.152-b16 mixed mode > >> linux-amd64 compressed oops) > >>> > >>> # Problematic frame: > >>> > >>> # V [libjvm.so+0x49c210] CounterDecay::do_method(Method*)+0x0 > >>> > >>> # > >>> > >>> # Core dump written. Default location: or core.140880 > >>> > >>> # > >>> > >>> # If you would like to submit a bug report, please visit: > >>> > >>> # http://bugreport.java.com/bugreport/crash.jsp < > >> http://bugreport.java.com/bugreport/crash.jsp> > >>> > >>> # > >>> > >>> > >>> > >>> --------------- T H R E A D --------------- > >>> > >>> > >>> > >>> Current thread (0x00002b147cb12800): VMThread [stack: > >> 0x00002b149a543000,0x00002b149a644000] > >>> [id=140909] > >>> > >>> > >>> > >>> siginfo: si_signo: 11 (SIGSEGV), si_code: 1 (SEGV_MAPERR), si_addr: > >> 0x0000001a0019001d > >>> > >>> > >>> > >>> Registers: > >>> > >>> RAX=0x0000000000000000, RBX=0x0000000000000001, RCX=0x00002b156839ca18, > >> RDX=0x00002b156799fc68 > >>> > >>> RSP=0x00002b149a6429b8, RBP=0x00002b149a6429e0, RSI=0x00002b14765b7210, > >> RDI=0x0000001a00190005 > >>> > >>> R8 =0x0000000000000010, R9 =0x0000000000000001, R10=0x0000000000000000, > >> R11=0x0000000000000001 > >>> > >>> R12=0x0000000000000007, R13=0x00000007c03c8428, R14=0x00002b14765b7210, > >> R15=0x0000000000000000 > >>> > >>> RIP=0x00002b14765b7210, EFLAGS=0x0000000000010202, > >> CSGSFS=0x0000000000000033, ERR=0x0000000000000004 > >>> > >>> TRAPNO=0x000000000000000e > >>> > >>>
> >>> > >>> Top of Stack: (sp=0x00002b149a6429b8) > >>> > >>> 0x00002b149a6429b8: 00002b147675e83d 0000000000000060 > >>> > >>> 0x00002b149a6429c8: 00002b149a642a10 0000000000000000 > >>> > >>> 0x00002b149a6429d8: 0000000000000000 00002b149a642a00 > >>> > >>> 0x00002b149a6429e8: 00002b14765b3cd1 40590bbbbbbbbbbc > >>> > >>> 0x00002b149a6429f8: 00002b14770db352 00002b149a642a50 > >>> > >>> 0x00002b149a642a08: 00002b1476adfbd6 00000000018f0100 > >>> > >>> 0x00002b149a642a18: 0000000000000000 00002b149a642a40 > >>> > >>> 0x00002b149a642a28: 00002b1476ba2100 431bde82d7b634db > >>> > >>> 0x00002b149a642a38: 00002b14f9ce9800 431bde82d7b634db > >>> > >>> 0x00002b149a642a48: 00002b14f9ce9800 00002b149a642b00 > >>> > >>> 0x00002b149a642a58: 00002b1476ae08a6 00002b14770e6140 > >>> > >>> 0x00002b149a642a68: 00002b149a642aa0 00002b147709ef83 > >>> > >>> 0x00002b149a642a78: 0000000000ae12f0 000000307cb12800 > >>> > >>> 0x00002b149a642a88: 0000000000000000 00000040000003e8 > >>> > >>> 0x00002b149a642a98: 0000001a0000001a 00002b14ce44b580 > >>> > >>> 0x00002b149a642aa8: 00002b1478ccda09 00002b1478cbb5d0 > >>> > >>> 0x00002b149a642ab8: 00002b1400000000 00002b14ce44b5d0 > >>> > >>> 0x00002b149a642ac8: 00002b14ce44b580 00002b14770db3d8 > >>> > >>> 0x00002b149a642ad8: 0000000000000000 0000000000000000 > >>> > >>> 0x00002b149a642ae8: 00002b14770db3d8 00002b147cb12800 > >>> > >>> 0x00002b149a642af8: 00002b14770e5950 00002b149a642ca0 > >>> > >>> 0x00002b149a642b08: 00002b1476bf22ef 00002b149a642b20 > >>> > >>> 0x00002b149a642b18: 00002b149a642c30 00002b149a642b28 > >>> > >>> 0x00002b149a642b28: 6e69747563657845 65706f204d562067 > >>> > >>> 0x00002b149a642b38: 203a6e6f69746172 6c6f43636e493147 > >>> > >>> 0x00002b149a642b48: 506e6f697463656c 6e6f640065737561 > >>> > >>> 0x00002b149a642b58: 6e6f64206e6f0065 0000000000000065 > >>> > >>> 0x00002b149a642b68: 0000001577100ce0
0000000000000000 > >>> > >>> 0x00002b149a642b78: 00002b14770ae164 00002b1476116e40 > >>> > >>> 0x00002b149a642b88: 0000000000000148 00002b147cb12800 > >>> > >>> 0x00002b149a642b98: 0000000000000002 00002b149a642c40 > >>> > >>> 0x00002b149a642ba8: 00002b1475e08a40 00002b149a543000 > >>> > >>> > >>> > >>> Instructions: (pc=0x00002b14765b7210) > >>> > >>> 0x00002b14765b71f0: 55 31 c0 48 89 e5 c9 c3 90 90 90 90 90 90 90 > 90 > >>> > >>> 0x00002b14765b7200: 55 b8 04 00 00 00 48 89 e5 c9 c3 90 90 90 90 > 90 > >>> > >>> 0x00002b14765b7210: 48 8b 4f 18 55 48 89 e5 48 85 c9 74 28 8b 51 > 08 > >>> > >>> 0x00002b14765b7220: 89 d0 c1 e8 03 89 c6 d1 fe 85 c0 7e 09 85 f6 > b8 > >>> > >>> > >>> > >>> Register to memory mapping: > >>> > >>> > >>> > >>> RAX=0x0000000000000000 is an unknown value > >>> > >>> RBX=0x0000000000000001 is an unknown value > >>> > >>> RCX=0x00002b156839ca18 is an unknown value > >>> > >>> RDX=0x00002b156799fc68 is pointing into metadata > >>> > >>> RSP=0x00002b149a6429b8 is an unknown value > >>> > >>> RBP=0x00002b149a6429e0 is an unknown value > >>> > >>> RSI=0x00002b14765b7210: in > >> /jre/lib/amd64/server/libjvm.so at > >>> 0x00002b147611b000 > >>> > >>> RDI=0x0000001a00190005 is an unknown value > >>> > >>> R8 =0x0000000000000010 is an unknown value > >>> > >>> R9 =0x0000000000000001 is an unknown value > >>> > >>> R10=0x0000000000000000 is an unknown value > >>> > >>> R11=0x0000000000000001 is an unknown value > >>> > >>> R12=0x0000000000000007 is an unknown value > >>> > >>> R13=0x00000007c03c8428 is pointing into metadata > >>> > >>> R14=0x00002b14765b7210: in > >> /jre/lib/amd64/server/libjvm.so at > >>> 0x00002b147611b000 > >>> > >>> R15=0x0000000000000000 is an unknown value > >>> > >>> > >>> > >>> > >>> > >>> Stack: [0x00002b149a543000,0x00002b149a644000], sp=0x00002b149a6429b8, > >> free space=1022k >
>>> > >>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, > >> C=native code) > >>> > >>> V [libjvm.so+0x49c210] CounterDecay::do_method(Method*)+0x0 > >>> > >>> V [libjvm.so+0x498cd1] > >> NonTieredCompPolicy::do_safepoint_work()+0x91 > >>> > >>> V [libjvm.so+0x9c4bd6] > >> SafepointSynchronize::do_cleanup_tasks()+0x76 > >>> > >>> V [libjvm.so+0x9c58a6] SafepointSynchronize::begin()+0x406 > >>> > >>> V [libjvm.so+0xad72ef] VMThread::loop()+0x1bf > >>> > >>> V [libjvm.so+0xad7770] VMThread::run()+0x70 > >>> > >>> V [libjvm.so+0x92d8d8] java_start(Thread*)+0x108 > >>> > >>> > >>> > >>> VM_Operation (0x00002b15140011a0): G1IncCollectionPause, mode: > >> safepoint, requested by thread > >>> 0x00002b14f9bb8000 > >>> > >>> > >>> This is on a Debian Wheezy linux machine running Xeon Broadwell cores. > >> The reason I mention this > >>> part is a quick google did show > >> https://bugs.openjdk.java.net/browse/JDK-8156721 but that JBS is for > >>> a different platform (with an overclocked CPU, apparently) and it's > >> marked Incomplete. > >>> > >>> This crash was observed on about 17 separate JVMs (different hosts) at > >> about the same time, all > >>> running the same application code after about 3 weeks of uptime. > >>> > >>> I can provide more details if you'd like but wanted to see if this is a > >> known (but rarely witnessed) > >>> bug. > >>> > >>> Thanks > >>> > >> > -- Sent from my phone -------------- next part -------------- An HTML attachment was scrubbed...
URL: From tobias.hartmann at oracle.com Fri Feb 2 09:07:07 2018 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 2 Feb 2018 10:07:07 +0100 Subject: [11] RFR(XS): 8196608: [Graal] compiler/uncommontrap/TestUnstableIfTrap.java fails in Graal JIT mode Message-ID: <1a6f4697-e36c-f7d3-1221-750bf95c01c3@oracle.com> Hi, please review the following patch: https://bugs.openjdk.java.net/browse/JDK-8196608 http://cr.openjdk.java.net/~thartmann/8196608/webrev.00/ The test verifies C2 specific generation of uncommon traps and should not be executed with Graal. Thanks, Tobias From erik.osterlund at oracle.com Fri Feb 2 16:44:37 2018 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Fri, 2 Feb 2018 17:44:37 +0100 Subject: Low-Overhead Heap Profiling In-Reply-To: References:

<5ec70351-910a-96bb-eb03-43ca88bd6259@oracle.com> <1508935388.13554.11.camel@oracle.com> <1510146425.3155.11.camel@oracle.com> <1511450419.2477.24.camel@oracle.com> <787d7c33-43bf-328c-9b8f-c0708ae642e9@oracle.com> Message-ID: <5A749575.5050004@oracle.com> Hi JC, Hope I am reviewing the right version of your work. Here goes... src/hotspot/share/gc/shared/collectedHeap.inline.hpp: 159 AllocTracer::send_allocation_outside_tlab(klass, result, size * HeapWordSize, THREAD); 160 161 THREAD->tlab().handle_sample(THREAD, result, size); 162 return result; 163 } Should not call tlab()->X without checking if (UseTLAB) IMO. src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp: So first of all, there seem to be quite a few ends. There is an "end", a "hard end", a "slow path end", and an "actual end". Moreover, it seems like the "hard end" is actually further away than the "actual end". So the "hard end" seems like more of a "really definitely actual end" or something. I don't know about you, but I think it looks kind of messy. In particular, I don't feel like the name "actual end" reflects what it represents, especially when there is another end that is behind the "actual end". 413 HeapWord* ThreadLocalAllocBuffer::hard_end() { 414 // Did a fast TLAB refill occur? 415 if (_slow_path_end != _end) { 416 // Fix up the actual end to be now the end of this TLAB. 417 _slow_path_end = _end; 418 _actual_end = _end; 419 } 420 421 return _actual_end + alignment_reserve(); 422 } I really do not like making getters unexpectedly have this kind of side effect. It is not expected that when you ask for the "hard end", you implicitly update the "slow path end" and "actual end" to new values. src/hotspot/share/prims/jvmti.xml: 10357 10358 10359 Can sample the heap. 10360 If this capability is enabled then the heap sampling methods can be called. 10361 10362 Looks like this capability should not be "since 9" if it gets integrated now.
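A minimal sketch of one way to address the hard_end() comment above, assuming a simplified stand-in class rather than the real ThreadLocalAllocBuffer. The names TlabSketch, fast_refill and sync_ends_after_refill are hypothetical and do not appear in the webrev; only the field names mirror the quoted snippet. The idea is to move the fix-up into an explicit mutator so that the getter can be const:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for ThreadLocalAllocBuffer.
// This is not HotSpot code; it only illustrates the getter/mutator split.
class TlabSketch {
 public:
  TlabSketch(char* end, std::size_t reserve)
      : _end(end), _slow_path_end(end), _actual_end(end), _reserve(reserve) {
    assert(end != nullptr);
  }

  // Simulates a fast refill: only _end moves, the other ends go stale.
  void fast_refill(char* new_end) { _end = new_end; }

  // Explicit mutator: the fix-up that the quoted hard_end() did implicitly.
  void sync_ends_after_refill() {
    if (_slow_path_end != _end) {
      _slow_path_end = _end;
      _actual_end = _end;
    }
  }

  // Pure getter: const, so it cannot update any of the ends.
  char* hard_end() const { return _actual_end + _reserve; }

 private:
  char* _end;
  char* _slow_path_end;
  char* _actual_end;
  std::size_t _reserve;  // stands in for alignment_reserve()
};
```

With this split a caller has to invoke sync_ends_after_refill() before relying on hard_end() after a refill, which makes the state change visible at the call site. Whether that fits the actual TLAB refill paths is not something this sketch can show.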
src/hotspot/share/runtime/heapMonitoring.cpp: 448 if (is_alive->do_object_b(value)) { 449 // Update the oop to point to the new object if it is still alive. 450 f->do_oop(&(trace.obj)); 451 452 // Copy the old trace, if it is still live. 453 _allocated_traces->at_put(curr_pos++, trace); 454 455 // Store the live trace in a cache, to be served up on /heapz. 456 _traces_on_last_full_gc->append(trace); 457 458 count++; 459 } else { 460 // If the old trace is no longer live, add it to the list of 461 // recently collected garbage. 462 store_garbage_trace(trace); 463 } In the case where the oop was not live, I would like it to be explicitly cleared. Also I see a lot of concurrent-looking use of the following field: 267 volatile bool _initialized; Please note that the "volatile" qualifier does not help with reordering here. Reordering between volatile and non-volatile fields is completely free for both compiler and hardware, except for windows with MSVC, where volatile semantics is defined to use acquire/release semantics, and the hardware is TSO. But for the general case, I would expect this field to be stored with OrderAccess::release_store and loaded with OrderAccess::load_acquire. Otherwise it is not thread safe. As a kind of meta comment, I wonder if it would make sense to add sampling for non-TLAB allocations. Seems like if someone is rapidly allocating a whole bunch of 1 MB objects that never fit in a TLAB, I might still be interested in seeing that in my traces, and not get surprised that the allocation rate is very high yet not showing up in any profiles. Thanks, /Erik On 2018-01-26 06:45, JC Beyler wrote: > Thanks Robbin for the reviews :) > > The new full webrev is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/ > The incremental webrev is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/ > > I inlined my answers: > > On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn wrote: >> Hi JC, great to see another revision! 
>> >> #### >> heapMonitoring.cpp >> >> StackTraceData should not contain the oop for 'safety' reasons. >> When StackTraceData is moved from _allocated_traces: >> L452 store_garbage_trace(trace); >> it contains a dead oop. >> _allocated_traces could instead be a tuple of oop and StackTraceData thus >> dead oops are not kept. > Done. I used inheritance to make the copier work regardless but the > idea is the same. > >> You should use the new Access API for loading the oop, something like this: >> RootAccess::load(...) >> I don't think you need to use Access API for clearing the oop, but it would >> look nicer. And you probably shouldn't be using: >> Universe::heap()->is_in_reserved(value) > I am unfamiliar with this but I think I did do it like you wanted me > to (all tests pass so that's a start). I'm not sure how to clear the > oop exactly, is there somewhere that does that, which I can use to do > the same? > > I removed the is_in_reserved, this came from our internal version, I > don't know why it was there but my tests work without so I removed it > :) > > >> The lock: >> L424 MutexLocker mu(HeapMonitorStorage_lock); >> Is not needed as far as I can see. >> weak_oops_do is called in a safepoint, no TLAB allocation can happen and >> JVMTI thread can't access these data-structures. Is there something more to >> this lock that I'm missing? > Since a thread can call the JVMTI getLiveTraces (or any of the other > ones), it can get to the point of trying to copy the > _allocated_traces. I imagine it is possible that this is happening > during a GC or that it can be started and a GC happens afterwards. > Therefore, it seems to me that you want this protected, no? > > >> #### >> You have 6 files without any changes in them (any more): >> g1CollectedHeap.cpp >> psMarkSweep.cpp >> psParallelCompact.cpp >> genCollectedHeap.cpp >> referenceProcessor.cpp >> thread.hpp >> > Done.
> >> #### >> I have not looked closely, but is it possible to hide heap sampling in >> AllocTracer? (with some minor changes to the AllocTracer API) >> > I am imagining that you are saying to move the code that does the > sampling (change the tlab end, do the call to HeapMonitoring, > etc.) into the AllocTracer code itself? I think that is right and I'll > look if that is possible and prepare a webrev to show what would be > needed to make that happen. > >> #### >> Minor nit, when declaring pointers there is a little mix of having the >> pointer adjacent by type name and data name. (Most hotspot code is by type >> name) >> E.g. >> heapMonitoring.cpp:711 jvmtiStackTrace *trace = .... >> heapMonitoring.cpp:733 Method* m = vfst.method(); >> (not just this file) >> > Done! > >> #### >> HeapMonitorThreadOnOffTest.java:77 >> I would make g_tmp volatile, otherwise the assignment in the loop may >> theoretically be skipped. >> > Also done! > > Thanks again! > Jc From ekaterina.pavlova at oracle.com Fri Feb 2 18:08:38 2018 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Fri, 2 Feb 2018 10:08:38 -0800 Subject: [11] RFR(XS): 8196608: [Graal] compiler/uncommontrap/TestUnstableIfTrap.java fails in Graal JIT mode In-Reply-To: <1a6f4697-e36c-f7d3-1221-750bf95c01c3@oracle.com> References: <1a6f4697-e36c-f7d3-1221-750bf95c01c3@oracle.com> Message-ID: <01388b6d-8b79-1e80-7f46-e7e5fe5e7335@oracle.com> Looks good. Thanks for fixing it. -katya On 2/2/18 1:07 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8196608 > http://cr.openjdk.java.net/~thartmann/8196608/webrev.00/ > > The test verifies C2 specific generation of uncommon traps and should not be executed with Graal.
> > Thanks, > Tobias > From vladimir.kozlov at oracle.com Fri Feb 2 18:32:15 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 2 Feb 2018 10:32:15 -0800 Subject: [11] RFR(XS): 8196608: [Graal] compiler/uncommontrap/TestUnstableIfTrap.java fails in Graal JIT mode In-Reply-To: <1a6f4697-e36c-f7d3-1221-750bf95c01c3@oracle.com> References: <1a6f4697-e36c-f7d3-1221-750bf95c01c3@oracle.com> Message-ID: <1b410e05-4863-83ec-6483-f455689b5b3b@oracle.com> Good. Thanks, Vladimir On 2/2/18 1:07 AM, Tobias Hartmann wrote: > Hi, > > please review the following patch: > https://bugs.openjdk.java.net/browse/JDK-8196608 > http://cr.openjdk.java.net/~thartmann/8196608/webrev.00/ > > The test verifies C2 specific generation of uncommon traps and should not be executed with Graal. > > Thanks, > Tobias > From igor.veresov at oracle.com Fri Feb 2 22:40:39 2018 From: igor.veresov at oracle.com (Igor Veresov) Date: Fri, 2 Feb 2018 14:40:39 -0800 Subject: RFR(XL) 8194819: Update Graal Message-ID: <8BD20862-017B-4181-9E73-E7AEC0DF6E62@oracle.com> Please see the JBS issue (https://bugs.openjdk.java.net/browse/JDK-8194819) for the list of changes included. Webrev: http://cr.openjdk.java.net/~iveresov/8194819/webrev.00/ Thanks, igor From vladimir.kozlov at oracle.com Fri Feb 2 23:22:58 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 2 Feb 2018 15:22:58 -0800 Subject: RFR(XL) 8194819: Update Graal In-Reply-To: <8BD20862-017B-4181-9E73-E7AEC0DF6E62@oracle.com> References: <8BD20862-017B-4181-9E73-E7AEC0DF6E62@oracle.com> Message-ID: I don't see changes to make/gensrc/Gensrc-jdk.internal.vm.compiler.gmk which built annotation parser. Otherwise looks good. Wait until testing is finished. Thanks, Vladimir On 2/2/18 2:40 PM, Igor Veresov wrote: > Please see the JBS issue (https://bugs.openjdk.java.net/browse/JDK-8194819) for the list of changes included. 
> > Webrev: http://cr.openjdk.java.net/~iveresov/8194819/webrev.00/ > > Thanks, > igor > From vladimir.kozlov at oracle.com Sat Feb 3 02:24:23 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 2 Feb 2018 18:24:23 -0800 Subject: [11] RFR(XS) 8196726: compiler/inlining/InlineAccessors.java test failed Message-ID: <1f8ce2f3-736c-be5d-cf9f-59e004891ec9@oracle.com> https://bugs.openjdk.java.net/browse/JDK-8196726 http://hg.openjdk.java.net/jdk/hs/file/5bce1b7e7800/test/hotspot/jtreg/compiler/inlining/InlineAccessors.java Remove -Xcomp flag from test command. It can lead to failure due to inconsistent compilation result. diff -r 19ef3f64bc10 test/hotspot/jtreg/compiler/inlining/InlineAccessors.java --- a/test/hotspot/jtreg/compiler/inlining/InlineAccessors.java +++ b/test/hotspot/jtreg/compiler/inlining/InlineAccessors.java @@ -43,7 +43,7 @@ ProcessBuilder pb = ProcessTools.createJavaProcessBuilder( "-XX:+IgnoreUnrecognizedVMOptions", "-showversion", - "-server", "-XX:-TieredCompilation", "-Xbatch", "-Xcomp", + "-server", "-XX:-TieredCompilation", "-Xbatch", "-XX:+PrintCompilation", "-XX:+UnlockDiagnosticVMOptions", "-XX:+PrintInlining", Launcher.class.getName()); Even without -Xcomp, compilation will be triggered because the method is called >10000 times and the -Xbatch flag is used. Tested with JDK 10 and 11. Thanks, Vladimir From zhongwei.yao at linaro.org Sat Feb 3 02:44:50 2018 From: zhongwei.yao at linaro.org (Zhongwei Yao) Date: Sat, 3 Feb 2018 10:44:50 +0800 Subject: RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler In-Reply-To: <4cdcb619-754f-e6fc-bc18-67df9b892248@redhat.com> References: <4cdcb619-754f-e6fc-bc18-67df9b892248@redhat.com> Message-ID: Hi, Andrew, Thanks for your review and feedback! Patch is updated: http://cr.openjdk.java.net/~zyao/8196064/webrev.01/ It passes jtreg test.
On 1 February 2018 at 23:36, Andrew Haley wrote: > On 30/01/18 03:30, Zhongwei Yao wrote: >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8196064 >> >> Webrev: >> http://cr.openjdk.java.net/~zyao/8196064/webrev.00 >> >> This patch merges adjacent load/store into ldp/stp in macro-assembler >> as discussed in previous thread: >> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-December/027957.html >> >> My micro-benchmark case gives about 8% performance improvement with this patch. >> >> And this patch is based on commit be48be2 due to recent build failure >> on jdk/hs master, and it passes all jtreg tests. >> >> Please help to review it. > > Very nice. You've done a good job. This merging opportunity happens > a lot, and it reduces not only the size of compiled code but also the > size of the interpreter and the generated stubs. It's definitely > worth doing. > > before: ArrObj.testArrObjCopy avgt 20 3.075 ± 0.002 us/op > after: ArrObj.testArrObjCopy avgt 20 2.779 ± 0.004 us/op > > So, some improvement. One thing I noticed: > > Compiled method (c2) 6812 627 4 org.linaro.benchmarks.ArrObj::testArrObjCopy (32 bytes) > > ;; B44: # B45 <- B43 Freq: 40.1247 > > 0x000003ff94b0db50: ldr x17, [sp] > 0x000003ff94b0db54: ldp x18, x1, [sp,#8] > 0x000003ff94b0db58: ldp x10, x3, [sp,#24] > 0x000003ff94b0db5c: ldp x4, x11, [sp,#40] > 0x000003ff94b0db60: ldr x6, [sp,#56] > > It seems to me like there could be four ldp instructions here.
There > aren't (as far as I can see) because they're in an unfortunate order, > so the MacroAssembler doesn't see the opportunity: > > ;; B44: # B45 <- B43 Freq: 39.2717 > > 0x000003ffad17bbf8: ldr x17, [sp] > 0x000003ffad17bbfc: ldr x1, [sp,#16] > 0x000003ffad17bc00: ldr x18, [sp,#8] > 0x000003ffad17bc04: ldr x10, [sp,#24] > 0x000003ffad17bc08: ldr x3, [sp,#32] > 0x000003ffad17bc0c: ldr x4, [sp,#40] > 0x000003ffad17bc10: ldr x11, [sp,#48] > 0x000003ffad17bc14: ldr x6, [sp,#56] > > This is OK: I don't think we want to make MacroAssembler even more > complicated in order to handle occasional out-of-order spills. I agree. > > Also, last_membar and last_ldst seem to be doing the same thing. I > think that you could simply have a single last_insn field rather than > both last_membar and last_ldst, and then inspect the instruction to > see if it's a membar or a ld/st. Done > > Finally, please add BLOCK_COMMENTs in merge_ldst, like this: > > if (!is_store) { > BLOCK_COMMENT("merged ldr pair"); > if (sz == 8) { > ldp(rt_low, rt_high, adr_p); > } else { > ldpw(rt_low, rt_high, adr_p); > } > } else { > BLOCK_COMMENT("merged str pair"); > if (sz == 8) { > stp(rt_low, rt_high, adr_p); > } else { > stpw(rt_low, rt_high, adr_p); > } > } > Done > Thanks. > > -- > Andrew Haley > Java Platform Lead Engineer > Red Hat UK Ltd. > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 -- Best regards, Zhongwei From tobias.hartmann at oracle.com Sat Feb 3 09:13:15 2018 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Sat, 3 Feb 2018 10:13:15 +0100 Subject: [11] RFR(XS): 8196608: [Graal] compiler/uncommontrap/TestUnstableIfTrap.java fails in Graal JIT mode In-Reply-To: <1b410e05-4863-83ec-6483-f455689b5b3b@oracle.com> References: <1a6f4697-e36c-f7d3-1221-750bf95c01c3@oracle.com> <1b410e05-4863-83ec-6483-f455689b5b3b@oracle.com> Message-ID: <987f1ea8-bd75-b8e5-e06b-3cbde4f5d4bd@oracle.com> Thanks Vladimir! Best regards, Tobias On 02.02.2018 19:32, Vladimir Kozlov wrote: > Good. 
> > Thanks, > Vladimir > > On 2/2/18 1:07 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8196608 >> http://cr.openjdk.java.net/~thartmann/8196608/webrev.00/ >> >> The test verifies C2 specific generation of uncommon traps and should not be executed with Graal. >> >> Thanks, >> Tobias >> From tobias.hartmann at oracle.com Sat Feb 3 09:13:06 2018 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Sat, 3 Feb 2018 10:13:06 +0100 Subject: [11] RFR(XS): 8196608: [Graal] compiler/uncommontrap/TestUnstableIfTrap.java fails in Graal JIT mode In-Reply-To: <01388b6d-8b79-1e80-7f46-e7e5fe5e7335@oracle.com> References: <1a6f4697-e36c-f7d3-1221-750bf95c01c3@oracle.com> <01388b6d-8b79-1e80-7f46-e7e5fe5e7335@oracle.com> Message-ID: <56982152-002c-3da8-c4e2-47f110c1d687@oracle.com> Thanks Katya! Best regards, Tobias On 02.02.2018 19:08, Ekaterina Pavlova wrote: > Looks good. > Thanks for fixing it. > > -katya > > On 2/2/18 1:07 AM, Tobias Hartmann wrote: >> Hi, >> >> please review the following patch: >> https://bugs.openjdk.java.net/browse/JDK-8196608 >> http://cr.openjdk.java.net/~thartmann/8196608/webrev.00/ >> >> The test verifies C2 specific generation of uncommon traps and should not be executed with Graal. >> >> Thanks, >> Tobias >> > From tobias.hartmann at oracle.com Mon Feb 5 06:16:26 2018 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 5 Feb 2018 07:16:26 +0100 Subject: [11] RFR(XS) 8196726: compiler/inlining/InlineAccessors.java test failed In-Reply-To: <1f8ce2f3-736c-be5d-cf9f-59e004891ec9@oracle.com> References: <1f8ce2f3-736c-be5d-cf9f-59e004891ec9@oracle.com> Message-ID: <57e4187b-1360-71c2-768b-6986306734b1@oracle.com> Hi Vladimir, looks good to me. 
Best regards, Tobias On 03.02.2018 03:24, Vladimir Kozlov wrote: > https://bugs.openjdk.java.net/browse/JDK-8196726 > > http://hg.openjdk.java.net/jdk/hs/file/5bce1b7e7800/test/hotspot/jtreg/compiler/inlining/InlineAccessors.java > > Remove -Xcomp flag from test command. It can lead to failure due to inconsistent compilation result. > > diff -r 19ef3f64bc10 test/hotspot/jtreg/compiler/inlining/InlineAccessors.java > --- a/test/hotspot/jtreg/compiler/inlining/InlineAccessors.java > +++ b/test/hotspot/jtreg/compiler/inlining/InlineAccessors.java > @@ -43,7 +43,7 @@ > > ProcessBuilder pb = ProcessTools.createJavaProcessBuilder( > "-XX:+IgnoreUnrecognizedVMOptions", "-showversion", > - "-server", "-XX:-TieredCompilation", "-Xbatch", "-Xcomp", > + "-server", "-XX:-TieredCompilation", "-Xbatch", > "-XX:+PrintCompilation", "-XX:+UnlockDiagnosticVMOptions", "-XX:+PrintInlining", > Launcher.class.getName()); > > Even without -Xcomp, compilation will be triggered because the method is called >10000 times and the -Xbatch flag is used. > > Tested with JDK 10 and 11. > > Thanks, > Vladimir From muthusamy.chinnathambi at oracle.com Mon Feb 5 08:53:14 2018 From: muthusamy.chinnathambi at oracle.com (Muthusamy Chinnathambi) Date: Mon, 5 Feb 2018 00:53:14 -0800 (PST) Subject: [8u] RFR for backport of JDK-8148175: C1: G1 barriers don't preserve FP registers to jdk8u-dev In-Reply-To: References: <30dbb109-c259-4529-b846-e4afffc94bd0@default> <19c869b6-d595-1fde-1481-d2fa583eae3d@oracle.com>

<91ca5a56-8d20-4714-8b09-c767574af4ae@default> Message-ID: Hi, While going for backport RFA it was pointed out(thanks to Buck) that the test-case may not work as intended for JDK8 as G1 is not default there. Can someone please review updated webrev fixing the issue. Other than the Copyright year changes, below diff is the only new change. < * @run main/othervm/timeout=300 -XX:+UseG1GC -Xbatch -Xmx128m PreserveFPRegistersTest --- > * @run main/othervm/timeout=300 -Xbatch -Xmx128m PreserveFPRegistersTest Updated webrev: http://cr.openjdk.java.net/~mchinnathamb/8148175/webrev.01/ Regards, Muthusamy C -----Original Message----- From: Vladimir Kozlov Sent: Friday, October 13, 2017 11:23 PM To: Muthusamy Chinnathambi ; hotspot compiler ; hotspot-gc-dev at openjdk.java.net Subject: Re: [8u] RFR for backport of JDK-8148175: C1: G1 barriers don't preserve FP registers to jdk8u-dev Good. Thanks, Vladimir On 10/13/17 3:53 AM, Muthusamy Chinnathambi wrote: > Hi Vladimir, > >> Why do you need to add test explicitly to hotspot_compiler group? >> It should be included implicitly into compact1_minimal group as other compiler/ tests. And compact1_minimal should be used in all other >> testing. > You are right, it should get picked implicitly as part of compact1_minimal group. > >> Did you check that the test is executed without you modifying TEST.groups? > Now - yes. Without my TEST.groups modification the test gets executed. > > I will drop the change in TEST.groups file. > Please note, this request is only for 8u. > > Regards, > Muthusamy C > > -----Original Message----- > From: Vladimir Kozlov > Sent: Thursday, October 12, 2017 11:20 PM > To: Muthusamy Chinnathambi ; hotspot compiler ; hotspot-gc-dev at openjdk.java.net > Subject: Re: [8u] RFR for backport of JDK-8148175: C1: G1 barriers don't preserve FP registers to jdk8u-dev > > Why do you need to add test explicitly to hotspot_compiler group? 
> It should be included implicitly into compact1_minimal group as other compiler/ tests. And compact1_minimal should be > used in all other testing. Did you check that the test is executed without you modifying TEST.groups? > > Thanks, > Vladimir K > > On 10/12/17 3:46 AM, Muthusamy Chinnathambi wrote: >> May I please get a second review for the change. >> >> Regards, >> Muthusamy C >> >> -----Original Message----- >> From: Vladimir Ivanov >> Sent: Wednesday, October 11, 2017 5:29 PM >> To: Muthusamy Chinnathambi >> Cc: hotspot-gc-dev at openjdk.java.net; hotspot compiler >> Subject: Re: [8u] RFR for backport of JDK-8148175: C1: G1 barriers don't preserve FP registers to jdk8u-dev >> >> Looks good. >> >> Best regards, >> Vladimir Ivanov >> >> On 10/11/17 12:33 PM, Muthusamy Chinnathambi wrote: >>> Hi, >>> >>> Please review the backport of bug: "JDK-8148175: C1: G1 barriers don't preserve FP registers" to jdk8u-dev >>> >>> Please note that this is not a clean backport due to new entries in TEST.groups and copyright changes. >>> >>> Webrev: http://cr.openjdk.java.net/~mchinnathamb/8148175/webrev.00/ >>> jdk9 bug: https://bugs.openjdk.java.net/browse/JDK-8148175 >>> Original patch pushed to jdk9: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/a9334e785873 >>> >>> Test: Run jtreg and jprt hotspot testsets. >>> >>> Regards, >>> Muthusamy C >>> From vladimir.kozlov at oracle.com Mon Feb 5 18:34:27 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Mon, 5 Feb 2018 10:34:27 -0800 Subject: [8u] RFR for backport of JDK-8148175: C1: G1 barriers don't preserve FP registers to jdk8u-dev In-Reply-To: References: <30dbb109-c259-4529-b846-e4afffc94bd0@default> <19c869b6-d595-1fde-1481-d2fa583eae3d@oracle.com>

<91ca5a56-8d20-4714-8b09-c767574af4ae@default>

Message-ID: <8641de08-baa0-9870-1bb2-0a64d3458b27@oracle.com> Changes are correct (-XX:+UseG1GC was added) based on webrev.01. But it was confused in e-mail because usually '>' shows new code: < * @run main/othervm/timeout=300 -Xbatch -Xmx128m PreserveFPRegistersTest --- > * @run main/othervm/timeout=300 -XX:+UseG1GC -Xbatch -Xmx128m PreserveFPRegistersTest Thanks, Vladimir On 2/5/18 12:53 AM, Muthusamy Chinnathambi wrote: > Hi, > > While going for backport RFA it was pointed out(thanks to Buck) that the test-case may not work as intended for JDK8 as G1 is not default there. > Can someone please review updated webrev fixing the issue. > > Other than the Copyright year changes, below diff is the only new change. > > < * @run main/othervm/timeout=300 -XX:+UseG1GC -Xbatch -Xmx128m PreserveFPRegistersTest > --- >> * @run main/othervm/timeout=300 -Xbatch -Xmx128m PreserveFPRegistersTest > > Updated webrev: http://cr.openjdk.java.net/~mchinnathamb/8148175/webrev.01/ > > Regards, > Muthusamy C > > -----Original Message----- > From: Vladimir Kozlov > Sent: Friday, October 13, 2017 11:23 PM > To: Muthusamy Chinnathambi ; hotspot compiler ; hotspot-gc-dev at openjdk.java.net > Subject: Re: [8u] RFR for backport of JDK-8148175: C1: G1 barriers don't preserve FP registers to jdk8u-dev > > Good. > > Thanks, > Vladimir > > On 10/13/17 3:53 AM, Muthusamy Chinnathambi wrote: >> Hi Vladimir, >> >>> Why do you need to add test explicitly to hotspot_compiler group? >>> It should be included implicitly into compact1_minimal group as other compiler/ tests. And compact1_minimal should be used in all other >>> testing. >> You are right, it should get picked implicitly as part of compact1_minimal group. >> >>> Did you check that the test is executed without you modifying TEST.groups? >> Now - yes. Without my TEST.groups modification the test gets executed. >> >> I will drop the change in TEST.groups file. >> Please note, this request is only for 8u. 
>> >> Regards, >> Muthusamy C >> >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Thursday, October 12, 2017 11:20 PM >> To: Muthusamy Chinnathambi ; hotspot compiler ; hotspot-gc-dev at openjdk.java.net >> Subject: Re: [8u] RFR for backport of JDK-8148175: C1: G1 barriers don't preserve FP registers to jdk8u-dev >> >> Why do you need to add test explicitly to hotspot_compiler group? >> It should be included implicitly into compact1_minimal group as other compiler/ tests. And compact1_minimal should be >> used in all other testing. Did you check that the test is executed without you modifying TEST.groups? >> >> Thanks, >> Vladimir K >> >> On 10/12/17 3:46 AM, Muthusamy Chinnathambi wrote: >>> May I please get a second review for the change. >>> >>> Regards, >>> Muthusamy C >>> >>> -----Original Message----- >>> From: Vladimir Ivanov >>> Sent: Wednesday, October 11, 2017 5:29 PM >>> To: Muthusamy Chinnathambi >>> Cc: hotspot-gc-dev at openjdk.java.net; hotspot compiler >>> Subject: Re: [8u] RFR for backport of JDK-8148175: C1: G1 barriers don't preserve FP registers to jdk8u-dev >>> >>> Looks good. >>> >>> Best regards, >>> Vladimir Ivanov >>> >>> On 10/11/17 12:33 PM, Muthusamy Chinnathambi wrote: >>>> Hi, >>>> >>>> Please review the backport of bug: "JDK-8148175: C1: G1 barriers don't preserve FP registers" to jdk8u-dev >>>> >>>> Please note that this is not a clean backport due to new entries in TEST.groups and copyright changes. >>>> >>>> Webrev: http://cr.openjdk.java.net/~mchinnathamb/8148175/webrev.00/ >>>> jdk9 bug: https://bugs.openjdk.java.net/browse/JDK-8148175 >>>> Original patch pushed to jdk9: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/a9334e785873 >>>> >>>> Test: Run jtreg and jprt hotspot testsets. 
>>>>
>>>> Regards,
>>>> Muthusamy C
>>>>

From vladimir.kozlov at oracle.com Mon Feb 5 18:41:25 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 5 Feb 2018 10:41:25 -0800
Subject: [11] RFR(XS) 8196726: compiler/inlining/InlineAccessors.java test failed
In-Reply-To: <57e4187b-1360-71c2-768b-6986306734b1@oracle.com>
References: <1f8ce2f3-736c-be5d-cf9f-59e004891ec9@oracle.com>
 <57e4187b-1360-71c2-768b-6986306734b1@oracle.com>
Message-ID: <6a9f6e53-0683-c0a9-075e-2e71e1939fbc@oracle.com>

Thank you, Tobias

Vladimir

On 2/4/18 10:16 PM, Tobias Hartmann wrote:
> Hi Vladimir,
>
> looks good to me.
>
> Best regards,
> Tobias
>
> On 03.02.2018 03:24, Vladimir Kozlov wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8196726
>>
>> http://hg.openjdk.java.net/jdk/hs/file/5bce1b7e7800/test/hotspot/jtreg/compiler/inlining/InlineAccessors.java
>>
>> Remove -Xcomp flag from test command. It can lead to failure due to inconsistent compilation result.
>>
>> diff -r 19ef3f64bc10 test/hotspot/jtreg/compiler/inlining/InlineAccessors.java
>> --- a/test/hotspot/jtreg/compiler/inlining/InlineAccessors.java
>> +++ b/test/hotspot/jtreg/compiler/inlining/InlineAccessors.java
>> @@ -43,7 +43,7 @@
>>
>>          ProcessBuilder pb = ProcessTools.createJavaProcessBuilder(
>>                  "-XX:+IgnoreUnrecognizedVMOptions", "-showversion",
>> -                "-server", "-XX:-TieredCompilation", "-Xbatch", "-Xcomp",
>> +                "-server", "-XX:-TieredCompilation", "-Xbatch",
>>                  "-XX:+PrintCompilation", "-XX:+UnlockDiagnosticVMOptions", "-XX:+PrintInlining",
>>                      Launcher.class.getName());
>>
>> Even without -Xcomp compilation will be triggered because method is called >10000 times and -Xbatch flag is used.
>>
>> Tested with JDK 10 and 11.
>> >> Thanks, >> Vladimir From jcbeyler at google.com Mon Feb 5 23:04:41 2018 From: jcbeyler at google.com (JC Beyler) Date: Mon, 5 Feb 2018 15:04:41 -0800 Subject: JDK-8171119: Low-Overhead Heap Profiling Message-ID: Hi Erik, (Renaming this to be folded into the newly renamed thread :)) First off, thanks a lot for reviewing the webrev! I appreciate it! I updated the webrev to: http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/ And the incremental one is here: http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.04_05a/ It contains: - The change for since from 9 to 11 for the jvmti.xml - The use of the OrderAccess for initialized - Clearing the oop I also have inlined my answers to your comments. The biggest question will come from the multiple *_end variables. A bit of the logic there is due to handling the slow path refill vs fast path refill and checking that the rug was not pulled underneath the slowpath. I believe that a previous comment was that TlabFastRefill was going to be deprecated. If this is true, we could revert this code a bit and just do a : if TlabFastRefill is enabled, disable this. And then deprecate that when TlabFastRefill is deprecated. This might simplify this webrev and I can work on a follow-up that either: removes TlabFastRefill if Robbin does not have the time to do it or add the support to the assembly side to handle this correctly. What do you think? Now, below, inlined are my answers: On Fri, Feb 2, 2018 at 8:44 AM, Erik Österlund wrote: > Hi JC, > > Hope I am reviewing the right version of your work. Here goes... > > src/hotspot/share/gc/shared/collectedHeap.inline.hpp: > > 159 AllocTracer::send_allocation_outside_tlab(klass, result, size * > HeapWordSize, THREAD); > 160 > 161 THREAD->tlab().handle_sample(THREAD, result, size); > 162 return result; > 163 } > > Should not call tlab()->X without checking if (UseTLAB) IMO. > Done!
> src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp: > > So first of all, there seems to quite a few ends. There is an "end", a "hard > end", a "slow path end", and an "actual end". Moreover, it seems like the > "hard end" is actually further away than the "actual end". So the "hard end" > seems like more of a "really definitely actual end" or something. I don't > know about you, but I think it looks kind of messy. In particular, I don't > feel like the name "actual end" reflects what it represents, especially when > there is another end that is behind the "actual end". > > 413 HeapWord* ThreadLocalAllocBuffer::hard_end() { > 414 // Did a fast TLAB refill occur? > 415 if (_slow_path_end != _end) { > 416 // Fix up the actual end to be now the end of this TLAB. > 417 _slow_path_end = _end; > 418 _actual_end = _end; > 419 } > 420 > 421 return _actual_end + alignment_reserve(); > 422 } > > I really do not like making getters unexpectedly have these kind of side > effects. It is not expected that when you ask for the "hard end", you > implicitly update the "slow path end" and "actual end" to new values. > As I said, a lot of this is due to the FastTlabRefill. If I make this not supporting FastTlabRefill, this goes away. The reason the system needs to update itself at the get is that you only know at that get if things have shifted underneath the tlab slow path. 
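For illustration only, the concern about a getter with side effects can be sketched as follows. This is a minimal stand-alone model, not the webrev code: `TlabSketch`, `fast_refill` and `sync_after_fast_refill` are hypothetical names, and `HeapWord` is reduced to a plain `char`. The idea is that the fix-up after a fast refill becomes an explicit call, so that `hard_end()` can stay a pure, `const` accessor.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for ThreadLocalAllocBuffer; not HotSpot code.
using HeapWord = char;

class TlabSketch {
 public:
  TlabSketch(HeapWord* end, size_t reserve)
      : _end(end), _slow_path_end(end), _actual_end(end), _reserve(reserve) {}

  // Models a fast-path refill that moves _end without the slow path noticing.
  void fast_refill(HeapWord* new_end) { _end = new_end; }

  // Explicit mutator: callers decide when the bookkeeping is brought up to date.
  void sync_after_fast_refill() {
    if (_slow_path_end != _end) {
      _slow_path_end = _end;
      _actual_end = _end;
    }
  }

  // Pure getter: never changes any field.
  HeapWord* hard_end() const { return _actual_end + _reserve; }

 private:
  HeapWord* _end;            // current allocation limit (or sampling point)
  HeapWord* _slow_path_end;  // value of _end last seen by the slow path
  HeapWord* _actual_end;     // real end of the buffer, before the reserve
  size_t _reserve;           // stand-in for alignment_reserve()
};
```

With this split, a caller that needs an up-to-date value calls `sync_after_fast_refill()` first and then `hard_end()`; whether that is practical depends on how many call sites would need the extra call, which this sketch does not try to answer.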
I am not sure of really better names (naming is hard!), perhaps we could do these names: - current_tlab_end // Either the allocated tlab end or a sampling point - last_allocation_address // The end of the tlab allocation - last_slowpath_allocated_end // In case a fast refill occurred the end might have changed, this is to remember slow vs fast past refills the hard_end method can be renamed to something like: tlab_end_pointer() // The end of the lab including a bit of alignment reserved bytes Not sure it's better but before updating the webrev, I wanted to try to get input/consensus :) (Note hard_end was always further off than end). > src/hotspot/share/prims/jvmti.xml: > > 10357 > 10358 > 10359 Can sample the heap. > 10360 If this capability is enabled then the heap sampling methods > can be called. > 10361 > 10362 > > Looks like this capability should not be "since 9" if it gets integrated > now. Updated now to 11, crossing my fingers :) > > src/hotspot/share/runtime/heapMonitoring.cpp: > > 448 if (is_alive->do_object_b(value)) { > 449 // Update the oop to point to the new object if it is still > alive. > 450 f->do_oop(&(trace.obj)); > 451 > 452 // Copy the old trace, if it is still live. > 453 _allocated_traces->at_put(curr_pos++, trace); > 454 > 455 // Store the live trace in a cache, to be served up on /heapz. > 456 _traces_on_last_full_gc->append(trace); > 457 > 458 count++; > 459 } else { > 460 // If the old trace is no longer live, add it to the list of > 461 // recently collected garbage. > 462 store_garbage_trace(trace); > 463 } > > In the case where the oop was not live, I would like it to be explicitly > cleared. Done I think how you wanted it. Let me know because I'm not familiar with the RootAccess API. I'm unclear if I'm doing this right or not so reviews of these parts are highly appreciated. Robbin had talked of perhaps later pushing this all into a OopStorage, should I do this now do you think? Or can that wait a second webrev later down the road? 
+ Did I clear it the way you wanted me to or were you thinking of something else? + Final question here, seems like if I were to want to not do the f->do_oop directly on the trace.obj, I'd need to do something like: f->do_oop(&value); ... trace->store_oop(value); to update the oop internally. Is that right/is that one of the advantages of going to the Oopstorage sooner than later? > > Also I see a lot of concurrent-looking use of the following field: > 267 volatile bool _initialized; > > Please note that the "volatile" qualifier does not help with reordering > here. Reordering between volatile and non-volatile fields is completely free > for both compiler and hardware, except for windows with MSVC, where volatile > semantics is defined to use acquire/release semantics, and the hardware is > TSO. But for the general case, I would expect this field to be stored with > OrderAccess::release_store and loaded with OrderAccess::load_acquire. > Otherwise it is not thread safe. Because everything is behind a mutex, I wasn't really worried about this. I have a test that has multiple threads trying to hit this corner case and it passes. However, to be paranoid, I updated it to using the OrderAccess API now, thanks! Let me know what you think there too! > > As a kind of meta comment, I wonder if it would make sense to add sampling > for non-TLAB allocations. Seems like if someone is rapidly allocating a > whole bunch of 1 MB objects that never fit in a TLAB, I might still be > interested in seeing that in my traces, and not get surprised that the > allocation rate is very high yet not showing up in any profiles. > That is handled by the handle_sample where you wanted me to put a UseTlab because you hit that case if the allocation is too big. 
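On the `_initialized` point above, the release/acquire pairing can be sketched in portable C++. This is only an analogy: `std::atomic` stands in for HotSpot's `OrderAccess::release_store` and `OrderAccess::load_acquire`, and `MonitorSketch` with its `sampling_rate` field is a hypothetical type invented for the example.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for the heap monitoring state; not HotSpot code.
struct MonitorSketch {
  int sampling_rate = 0;                 // plain, non-atomic payload
  std::atomic<bool> initialized{false};  // publication flag

  // Writer: set up the payload first, then publish with release semantics,
  // so the payload write cannot be reordered after the flag store.
  void initialize(int rate) {
    sampling_rate = rate;
    initialized.store(true, std::memory_order_release);
  }

  // Reader: an acquire load that observes true also observes the payload.
  // Returns false, leaving *rate untouched, if initialization is not visible.
  bool try_read(int* rate) const {
    if (!initialized.load(std::memory_order_acquire)) {
      return false;
    }
    *rate = sampling_rate;
    return true;
  }
};
```

A plain `volatile bool` gives no such ordering guarantee in standard C++, which is the point of the review comment. If every access really is made under the same mutex, the lock already provides the ordering and the flag needs no extra fences.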
I double checked by changing the test http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatObjectCorrectnessTest.java to use a smaller Tlab (2048) and made the object bigger and it goes through that and passes. Thanks again for your review and I look forward to your pointers for the questions I now have raised! Jc > Thanks, > /Erik > > > On 2018-01-26 06:45, JC Beyler wrote: >> >> Thanks Robbin for the reviews :) >> >> The new full webrev is here: >> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/ >> The incremental webrev is here: >> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/ >> >> I inlined my answers: >> >> On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn wrote: >>> >>> Hi JC, great to see another revision! >>> >>> #### >>> heapMonitoring.cpp >>> >>> StackTraceData should not contain the oop for 'safety' reasons. >>> When StackTraceData is moved from _allocated_traces: >>> L452 store_garbage_trace(trace); >>> it contains a dead oop. >>> _allocated_traces could instead be a tupel of oop and StackTraceData thus >>> dead oops are not kept. >> >> Done I used inheritance to make the copier work regardless but the >> idea is the same. >> >>> You should use the new Access API for loading the oop, something like >>> this: >>> RootAccess::load(...) >>> I don't think you need to use Access API for clearing the oop, but it >>> would >>> look nicer. And you shouldn't probably be using: >>> Universe::heap()->is_in_reserved(value) >> >> I am unfamiliar with this but I think I did do it like you wanted me >> to (all tests pass so that's a start). I'm not sure how to clear the >> oop exactly, is there somewhere that does that, which I can use to do >> the same? 
>> >> I removed the is_in_reserved, this came from our internal version, I >> don't know why it was there but my tests work without so I removed it >> :) >> >> >>> The lock: >>> L424 MutexLocker mu(HeapMonitorStorage_lock); >>> Is not needed as far as I can see. >>> weak_oops_do is called in a safepoint, no TLAB allocation can happen and >>> JVMTI thread can't access these data-structures. Is there something more >>> to >>> this lock that I'm missing? >> >> Since a thread can call the JVMTI getLiveTraces (or any of the other >> ones), it can get to the point of trying to copying the >> _allocated_traces. I imagine it is possible that this is happening >> during a GC or that it can be started and a GC happens afterwards. >> Therefore, it seems to me that you want this protected, no? >> >> >>> #### >>> You have 6 files without any changes in them (any more): >>> g1CollectedHeap.cpp >>> psMarkSweep.cpp >>> psParallelCompact.cpp >>> genCollectedHeap.cpp >>> referenceProcessor.cpp >>> thread.hpp >>> >> Done. >> >>> #### >>> I have not looked closely, but is it possible to hide heap sampling in >>> AllocTracer ? (with some minor changes to the AllocTracer API) >>> >> I am imagining that you are saying to move the code that does the >> sampling code (change the tlab end, do the call to HeapMonitoring, >> etc.) into the AllocTracer code itself? I think that is right and I'll >> look if that is possible and prepare a webrev to show what would be >> needed to make that happen. >> >>> #### >>> Minor nit, when declaring pointer there is a little mix of having the >>> pointer adjacent by type name and data name. (Most hotspot code is by >>> type >>> name) >>> E.g. >>> heapMonitoring.cpp:711 jvmtiStackTrace *trace = .... >>> heapMonitoring.cpp:733 Method* m = vfst.method(); >>> (not just this file) >>> >> Done! >> >>> #### >>> HeapMonitorThreadOnOffTest.java:77 >>> I would make g_tmp volatile, otherwise the assignment in loop may >>> theoretical be skipped. >>> >> Also done! 
>> >> Thanks again! >> Jc > > From muthusamy.chinnathambi at oracle.com Tue Feb 6 04:44:29 2018 From: muthusamy.chinnathambi at oracle.com (Muthusamy Chinnathambi) Date: Mon, 5 Feb 2018 20:44:29 -0800 (PST) Subject: [8u] RFR for backport of JDK-8148175: C1: G1 barriers don't preserve FP registers to jdk8u-dev In-Reply-To: <8641de08-baa0-9870-1bb2-0a64d3458b27@oracle.com> References: <30dbb109-c259-4529-b846-e4afffc94bd0@default> <19c869b6-d595-1fde-1481-d2fa583eae3d@oracle.com>

<91ca5a56-8d20-4714-8b09-c767574af4ae@default>

<8641de08-baa0-9870-1bb2-0a64d3458b27@oracle.com> Message-ID: > Changes are correct (-XX:+UseG1GC was added) based on webrev.01. Thanks Vladimir. > But it was confused in e-mail because usually '>' shows new code: Oops, I had passed the arguments to "diff" in the opposite order. Apologies. Regards, Muthusamy C -----Original Message----- From: Vladimir Kozlov Sent: Tuesday, February 6, 2018 12:04 AM To: Muthusamy Chinnathambi ; hotspot compiler ; hotspot-gc-dev at openjdk.java.net; David Buck Subject: Re: [8u] RFR for backport of JDK-8148175: C1: G1 barriers don't preserve FP registers to jdk8u-dev Changes are correct (-XX:+UseG1GC was added) based on webrev.01. But it was confused in e-mail because usually '>' shows new code: < * @run main/othervm/timeout=300 -Xbatch -Xmx128m PreserveFPRegistersTest --- > * @run main/othervm/timeout=300 -XX:+UseG1GC -Xbatch -Xmx128m PreserveFPRegistersTest Thanks, Vladimir On 2/5/18 12:53 AM, Muthusamy Chinnathambi wrote: > Hi, > > While going for backport RFA it was pointed out(thanks to Buck) that the test-case may not work as intended for JDK8 as G1 is not default there. > Can someone please review updated webrev fixing the issue. > > Other than the Copyright year changes, below diff is the only new change. > > < * @run main/othervm/timeout=300 -XX:+UseG1GC -Xbatch -Xmx128m PreserveFPRegistersTest > --- >> * @run main/othervm/timeout=300 -Xbatch -Xmx128m PreserveFPRegistersTest > > Updated webrev: http://cr.openjdk.java.net/~mchinnathamb/8148175/webrev.01/ > > Regards, > Muthusamy C > > -----Original Message----- > From: Vladimir Kozlov > Sent: Friday, October 13, 2017 11:23 PM > To: Muthusamy Chinnathambi ; hotspot compiler ; hotspot-gc-dev at openjdk.java.net > Subject: Re: [8u] RFR for backport of JDK-8148175: C1: G1 barriers don't preserve FP registers to jdk8u-dev > > Good. 
> > Thanks, > Vladimir > > On 10/13/17 3:53 AM, Muthusamy Chinnathambi wrote: >> Hi Vladimir, >> >>> Why do you need to add test explicitly to hotspot_compiler group? >>> It should be included implicitly into compact1_minimal group as other compiler/ tests. And compact1_minimal should be used in all other >>> testing. >> You are right, it should get picked implicitly as part of compact1_minimal group. >> >>> Did you check that the test is executed without you modifying TEST.groups? >> Now - yes. Without my TEST.groups modification the test gets executed. >> >> I will drop the change in TEST.groups file. >> Please note, this request is only for 8u. >> >> Regards, >> Muthusamy C >> >> -----Original Message----- >> From: Vladimir Kozlov >> Sent: Thursday, October 12, 2017 11:20 PM >> To: Muthusamy Chinnathambi ; hotspot compiler ; hotspot-gc-dev at openjdk.java.net >> Subject: Re: [8u] RFR for backport of JDK-8148175: C1: G1 barriers don't preserve FP registers to jdk8u-dev >> >> Why do you need to add test explicitly to hotspot_compiler group? >> It should be included implicitly into compact1_minimal group as other compiler/ tests. And compact1_minimal should be >> used in all other testing. Did you check that the test is executed without you modifying TEST.groups? >> >> Thanks, >> Vladimir K >> >> On 10/12/17 3:46 AM, Muthusamy Chinnathambi wrote: >>> May I please get a second review for the change. >>> >>> Regards, >>> Muthusamy C >>> >>> -----Original Message----- >>> From: Vladimir Ivanov >>> Sent: Wednesday, October 11, 2017 5:29 PM >>> To: Muthusamy Chinnathambi >>> Cc: hotspot-gc-dev at openjdk.java.net; hotspot compiler >>> Subject: Re: [8u] RFR for backport of JDK-8148175: C1: G1 barriers don't preserve FP registers to jdk8u-dev >>> >>> Looks good. 
>>> >>> Best regards, >>> Vladimir Ivanov >>> >>> On 10/11/17 12:33 PM, Muthusamy Chinnathambi wrote: >>>> Hi, >>>> >>>> Please review the backport of bug: "JDK-8148175: C1: G1 barriers don't preserve FP registers" to jdk8u-dev >>>> >>>> Please note that this is not a clean backport due to new entries in TEST.groups and copyright changes. >>>> >>>> Webrev: http://cr.openjdk.java.net/~mchinnathamb/8148175/webrev.00/ >>>> jdk9 bug: https://bugs.openjdk.java.net/browse/JDK-8148175 >>>> Original patch pushed to jdk9: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/a9334e785873 >>>> >>>> Test: Run jtreg and jprt hotspot testsets. >>>> >>>> Regards, >>>> Muthusamy C >>>> From kim.barrett at oracle.com Tue Feb 6 05:15:40 2018 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 6 Feb 2018 00:15:40 -0500 Subject: RFR(M): 8195103: Refactor ReduceInitialCardMarks to not assume all GCs use card marks In-Reply-To: <5A707248.2050905@oracle.com> References: <5A5DC8EE.9050806@oracle.com> <1517309054.2832.14.camel@oracle.com> <5A707248.2050905@oracle.com> Message-ID: <82158F93-93A6-4BCC-B4BD-2B71FBB759B4@oracle.com> > On Jan 30, 2018, at 8:25 AM, Erik ?sterlund wrote: > > Hi Thomas, > > Thanks for the review. :) > > On 2018-01-30 11:44, Thomas Schatzl wrote: >> Hi, >> >> On Tue, 2018-01-16 at 10:42 +0100, Erik ?sterlund wrote: >> >> ^^ sorry for being a bit late... >> >>> Hi, >>> >>> The current interface between the compilers and GC regarding the >>> ReduceInitialCardMarks optimization lives in the CollectedHeap. >>> However, the optimization is relevant only for collectors with a card >>> mark barrier set (CardTableModRefBS). Therefore, this interface ought >>> to be moved into CardTableModRef so that code gets less messy when a >>> collector does not use card marking. In the process, the >>> CollectedHeap::pre_initialize member function was removed (as it was >>> only used for initializing ReduceInitialCardMarks). 
>>> >>> The optimization needs to check if an object is in young or not. >>> This question is now asked to the barrier set rather than the heap. >>> For all collectors except G1, this has been implemented by forwarding >>> the question to the corresponding heap (inlined member function), >>> which is what was done before. For G1, I chose to instead look at >>> the card value and see if it is a young card, which should give the >>> same answer. >> Marking the cards young is done concurrently to the application. So you >> could get false answers here. However it seems that this is benign, >> i.e. at most too many objects are pushed into the deferred card mark >> from what I can see. >> >> However the assert in >> CardTableModRefBs::flush_deferred_card_mark_barrier() may complain... >> i.e. at the time when the object is deferred, the result of is_young() >> may be false, but at the time the deferred card mark is flushed, >> is_young() will return true. >> >> Note that while this occurrence is not very common, it does happen. >> >> I think this needs to be fixed. Either the mentioned assert, or the >> is_young() check. The region type is always good btw. > > We discussed this off-list. There is in fact no such race. > The compiler slow-path first allocates new memory (TLAB or not). Then it writes young to all of the cards. Then it contemplates whether performing a card mark is necessary for non-young objects to comply with ReduceInitialCardMarks. > So by the time the is_young() question is asked, Thread::current() has written the young value, which is always observable to itself. It might be that a concurrent thread over-writes this value with a monotonic card transition to the very same young value, due to crossing the same card boundary with another allocation. In either case, the young value will always be observed by the thread that performed the allocation if and only if the object then resides in young. This threw me a bit too. 
I thought there was a post-pause fill cards for young regions phase. It's alluded to in a comment in G1RemSet::refine_card_concurrently: "The region could be young. ..." But I can't find any such code now. Maybe I'm misremembering? Or maybe it got refactored out of existence? Looks good. >>> Bug: >>> https://bugs.openjdk.java.net/browse/JDK-8195103 >>> >>> Webrev: >>> http://cr.openjdk.java.net/~eosterlund/8195103/webrev.00/ >>> >>> Testing: mach5 hs-tier1-5 >> looks good to me otherwise. > > Thanks Thomas! > > /Erik > >> Thanks, >> Thomas From erik.osterlund at oracle.com Tue Feb 6 10:05:56 2018 From: erik.osterlund at oracle.com (Erik Osterlund) Date: Tue, 6 Feb 2018 10:05:56 +0000 Subject: RFR(M): 8195103: Refactor ReduceInitialCardMarks to not assume all GCs use card marks In-Reply-To: <82158F93-93A6-4BCC-B4BD-2B71FBB759B4@oracle.com> References: <5A5DC8EE.9050806@oracle.com> <1517309054.2832.14.camel@oracle.com> <5A707248.2050905@oracle.com> <82158F93-93A6-4BCC-B4BD-2B71FBB759B4@oracle.com> Message-ID: <4D469C39-FA68-42D3-895C-5B347D718591@oracle.com> Hi Kim, On 6 Feb 2018, at 05:15, Kim Barrett wrote: >> On Jan 30, 2018, at 8:25 AM, Erik Österlund wrote: >> >> Hi Thomas, >> >> Thanks for the review. :) >> >>> On 2018-01-30 11:44, Thomas Schatzl wrote: >>> Hi, >>> >>> On Tue, 2018-01-16 at 10:42 +0100, Erik Österlund wrote: >>> >>> ^^ sorry for being a bit late... >>> >>>> Hi, >>>> >>>> The current interface between the compilers and GC regarding the >>>> ReduceInitialCardMarks optimization lives in the CollectedHeap. >>>> However, the optimization is relevant only for collectors with a card >>>> mark barrier set (CardTableModRefBS). Therefore, this interface ought >>>> to be moved into CardTableModRef so that code gets less messy when a >>>> collector does not use card marking. In the process, the >>>> CollectedHeap::pre_initialize member function was removed (as it was >>>> only used for initializing ReduceInitialCardMarks).
>>>> >>>> The optimization needs to check if an object is in young or not. >>>> This question is now asked to the barrier set rather than the heap. >>>> For all collectors except G1, this has been implemented by forwarding >>>> the question to the corresponding heap (inlined member function), >>>> which is what was done before. For G1, I chose to instead look at >>>> the card value and see if it is a young card, which should give the >>>> same answer. >>> Marking the cards young is done concurrently to the application. So you >>> could get false answers here. However it seems that this is benign, >>> i.e. at most too many objects are pushed into the deferred card mark >>> from what I can see. >>> >>> However the assert in >>> CardTableModRefBs::flush_deferred_card_mark_barrier() may complain... >>> i.e. at the time when the object is deferred, the result of is_young() >>> may be false, but at the time the deferred card mark is flushed, >>> is_young() will return true. >>> >>> Note that while this occurrence is not very common, it does happen. >>> >>> I think this needs to be fixed. Either the mentioned assert, or the >>> is_young() check. The region type is always good btw. >> >> We discussed this off-list. There is in fact no such race. >> The compiler slow-path first allocates new memory (TLAB or not). Then it writes young to all of the cards. Then it contemplates whether performing a card mark is necessary for non-young objects to comply with ReduceInitialCardMarks. >> So by the time the is_young() question is asked, Thread::current() has written the young value, which is always observable to itself. It might be that a concurrent thread over-writes this value with a monotonic card transition to the very same young value, due to crossing the same card boundary with another allocation. In either case, the young value will always be observed by the thread that performed the allocation if and only if the object then resides in young. > > This threw me a bit too. 
> > I thought there was a post-pause fill cards for young regions phase. > It's alluded to in a comment in G1RemSet::refine_card_concurrently: > "The region could be young. ..." But I can't find any such code now. > Maybe I'm misremembering? Or maybe it got refactored out of existance? I had a vague memory about that too but after inspecting the code concluded this is currently not the case. I think we used to shade the cards young on region level after dropping a mutex but changed it at some time as it was a bad idea. > Looks good. Thanks for the review. /Erik > >>>> Bug: >>>> https://bugs.openjdk.java.net/browse/JDK-8195103 >>>> >>>> Webrev: >>>> http://cr.openjdk.java.net/~eosterlund/8195103/webrev.00/ >>>> >>>> Testing: mach5 hs-tier1-5 >>> looks good to me otherwise. >> >> Thanks Thomas! >> >> /Erik >> >>> Thanks, >>> Thomas > > From thomas.schatzl at oracle.com Tue Feb 6 10:55:03 2018 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 06 Feb 2018 11:55:03 +0100 Subject: RFR(M): 8195103: Refactor ReduceInitialCardMarks to not assume all GCs use card marks In-Reply-To: <4D469C39-FA68-42D3-895C-5B347D718591@oracle.com> References: <5A5DC8EE.9050806@oracle.com> <1517309054.2832.14.camel@oracle.com> <5A707248.2050905@oracle.com> <82158F93-93A6-4BCC-B4BD-2B71FBB759B4@oracle.com> <4D469C39-FA68-42D3-895C-5B347D718591@oracle.com> Message-ID: <1517914503.4278.7.camel@oracle.com> Hi, On Tue, 2018-02-06 at 10:05 +0000, Erik Osterlund wrote: > Hi Kim, > > On 6 Feb 2018, at 05:15, Kim Barrett wrote: > > > > > > > > > [...] > > > > > The optimization needs to check if an object is in young or > > > > > not. This question is now asked to the barrier set rather > > > > > than the heap. > > > > > For all collectors except G1, this has been implemented by > > > > > forwarding the question to the corresponding heap (inlined > > > > > member function), which is what was done before. 
For G1, I > > > > > chose to instead look at the card value and see if it is a > > > > > young card, which should give the same answer. > > > > > > > > Marking the cards young is done concurrently to the > > > > application. So you could get false answers here. However it > > > > seems that this is benign, i.e. at most too many objects are > > > > pushed into the deferred card mark from what I can see. > > > > > > > > However the assert in > > > > CardTableModRefBs::flush_deferred_card_mark_barrier() may > > > > complain... > > > > i.e. at the time when the object is deferred, the result of > > > > is_young() may be false, but at the time the deferred card mark > > > > is flushed, is_young() will return true. > > > > > > > > Note that while this occurrence is not very common, it does > > > > happen. > > > > > > > > I think this needs to be fixed. Either the mentioned assert, or > > > > the is_young() check. The region type is always good btw. > > > > > > We discussed this off-list. There is in fact no such race. > > > The compiler slow-path first allocates new memory (TLAB or not). > > > Then it writes young to all of the cards. Then it contemplates > > > whether performing a card mark is necessary for non-young objects > > > to comply with ReduceInitialCardMarks. > > > So by the time the is_young() question is asked, > > > Thread::current() has written the young value, which is always > > > observable to itself. It might be that a concurrent thread over- > > > writes this value with a monotonic card transition to the very > > > same young value, due to crossing the same card boundary with > > > another allocation. In either case, the young value will always > > > be observed by the thread that performed the allocation if and > > > only if the object then resides in young. > > > > This threw me a bit too. > > > > I thought there was a post-pause fill cards for young regions > > phase. 
It's alluded to in a comment in
> > G1RemSet::refine_card_concurrently:
> > "The region could be young. ..." But I can't find any such code
> > now.
> > Maybe I'm misremembering? Or maybe it got refactored out of
> > existence?
>
> I had a vague memory about that too but after inspecting the code
> concluded this is currently not the case. I think we used to shade
> the cards young on region level after dropping a mutex but changed it
> at some time as it was a bad idea.

Just to add to the confusion :) G1 afair never shaded a young region's
cards while holding a mutex, but always after dropping it (in
G1CollectedHeap::attempt_allocation(), which is outside holding any
lock).

The situation for concurrent refinement is simply different: during
concurrent refinement there is a thread *different* from the one which
sets the young card marks, and that one can observe the non-young cards
in the young region.

This is not the case here: the same thread accesses the young region's
card to determine whether it is in young. Obviously it always observes
its own writes.

Thanks,
  Thomas

From dmitrij.pochepko at bell-sw.com Wed Feb 7 17:40:44 2018
From: dmitrij.pochepko at bell-sw.com (Dmitrij Pochepko)
Date: Wed, 7 Feb 2018 20:40:44 +0300
Subject: RFR(XS): 8196868 - AARCH64: ld/st instructions hit guarantee assert while using sp
Message-ID: <21b53516-a6c9-06d7-a58c-84c8d3df5401@bell-sw.com>

Hi all,

please review a small patch for JDK-8196868 - AARCH64: ld/st instructions
hit guarantee assert while using sp

I found a guarantee assert crash in hotspot while trying to write an
intrinsic with the st4 (SIMD store) instruction, which writes values to
the stack. This is allowed by the specification: SP or a general-purpose
register can be used for addressing.

The reason was the register-encoding call rf(...), which was used
originally. This call doesn't expect SP to be passed for encoding, while
srf(...) expects SP or a general-purpose register.
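The difference between the two encoders can be illustrated with a small stand-alone model. It rests on one architectural fact: in a load/store base-address field, register number 31 denotes SP. Everything else here is an assumption made for the sketch: `kSP`, `field5`, `encode_gpr` and `encode_gpr_or_sp` are hypothetical names, and the id chosen for SP is arbitrary; this is not the HotSpot `Assembler` code.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical register ids for this sketch only: 0..30 stand for x0..x30,
// and SP gets an id outside the 5-bit range so that misuse is detectable.
constexpr int kSP = 33;

// Places a 5-bit value at bit position lsb; rejects values that do not fit.
uint32_t field5(int value, int lsb) {
  if (value < 0 || value > 31) {
    throw std::out_of_range("register does not fit a 5-bit field");
  }
  return static_cast<uint32_t>(value) << lsb;
}

// rf-like encoder: accepts general-purpose registers only, so SP is rejected.
uint32_t encode_gpr(int reg, int lsb) {
  return field5(reg, lsb);
}

// srf-like encoder: accepts general-purpose registers or SP; SP becomes 31.
uint32_t encode_gpr_or_sp(int reg, int lsb) {
  return field5(reg == kSP ? 31 : reg, lsb);
}
```

In this model, passing `kSP` to `encode_gpr` throws, which plays the role of the guarantee failure described above, while `encode_gpr_or_sp` maps it to 31.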
webrev: http://cr.openjdk.java.net/~dpochepk/8196868/webrev.01/ CR: https://bugs.openjdk.java.net/browse/JDK-8196868 I've tested this patch with my intrinsic and crash is not reproducible anymore. I also run part of jtreg hotspot tests as sanity to ensure nothing got broken and it looks good. Thanks, Dmitrij From aph at redhat.com Wed Feb 7 18:37:25 2018 From: aph at redhat.com (Andrew Haley) Date: Wed, 7 Feb 2018 18:37:25 +0000 Subject: RFR(XS): 8196868 - AARCH64: ld/st instructions hit guarantee assert while using sp In-Reply-To: <21b53516-a6c9-06d7-a58c-84c8d3df5401@bell-sw.com> References: <21b53516-a6c9-06d7-a58c-84c8d3df5401@bell-sw.com> Message-ID: <9c42769e-581a-4b49-d3cb-cca5c446436f@redhat.com> On 07/02/18 17:40, Dmitrij Pochepko wrote: > Hi all, > > please review small patch for JDK-8196868 - AARCH64: ld/st instructions > hit guarantee assert while using sp > > I found a guarantee assert crash in hotspot while trying to write > intrinsic with st4 (SIMD load) instruction which write values into > stack. It is allowed by specification: SP or general-purpose register > can be used for addressing. > > > A reason for this was register encoding call rf(...), which was used > originally. This call doesn't expect SP to be passed for encoding, while > srf(...) expects SP or general-purpose register. > > > webrev: http://cr.openjdk.java.net/~dpochepk/8196868/webrev.01/ > > CR: https://bugs.openjdk.java.net/browse/JDK-8196868 > > > I've tested this patch with my intrinsic and crash is not reproducible > anymore. I also run part of jtreg hotspot tests as sanity to ensure > nothing got broken and it looks good. Yes, this is OK. Thanks. -- Andrew Haley Java Platform Lead Engineer Red Hat UK Ltd. 
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From dmitrij.pochepko at bell-sw.com Wed Feb 7 18:43:29 2018 From: dmitrij.pochepko at bell-sw.com (Dmitrij Pochepko) Date: Wed, 7 Feb 2018 21:43:29 +0300 Subject: RFR(XS): 8196868 - AARCH64: ld/st instructions hit guarantee assert while using sp In-Reply-To: <9c42769e-581a-4b49-d3cb-cca5c446436f@redhat.com> References: <21b53516-a6c9-06d7-a58c-84c8d3df5401@bell-sw.com> <9c42769e-581a-4b49-d3cb-cca5c446436f@redhat.com> Message-ID: Thank you for review. Dmitrij On 07.02.2018 21:37, Andrew Haley wrote: > On 07/02/18 17:40, Dmitrij Pochepko wrote: >> Hi all, >> >> please review small patch for JDK-8196868 - AARCH64: ld/st instructions >> hit guarantee assert while using sp >> >> I found a guarantee assert crash in hotspot while trying to write >> intrinsic with st4 (SIMD load) instruction which write values into >> stack. It is allowed by specification: SP or general-purpose register >> can be used for addressing. >> >> >> A reason for this was register encoding call rf(...), which was used >> originally. This call doesn't expect SP to be passed for encoding, while >> srf(...) expects SP or general-purpose register. >> >> >> webrev: http://cr.openjdk.java.net/~dpochepk/8196868/webrev.01/ >> >> CR: https://bugs.openjdk.java.net/browse/JDK-8196868 >> >> >> I've tested this patch with my intrinsic and crash is not reproducible >> anymore. I also run part of jtreg hotspot tests as sanity to ensure >> nothing got broken and it looks good. > Yes, this is OK. Thanks. 
>

From dmitrij.pochepko at bell-sw.com Wed Feb 7 19:39:58 2018
From: dmitrij.pochepko at bell-sw.com (Dmitrij Pochepko)
Date: Wed, 7 Feb 2018 22:39:58 +0300
Subject: RFR: 8187472 - AARCH64: array_equals intrinsic doesn't use prefetch for large arrays
Message-ID: <02e62700-efaf-918e-f04e-031dd4788aa0@bell-sw.com>

Hi all,

please review patch for JDK-8187472 - AARCH64: array_equals intrinsic doesn't use prefetch for large arrays

This patch adds logic that uses the prefetch instruction, which helps on most CPUs. It also introduces 2 ways of loading data: via general-purpose registers and via SIMD registers (controlled by a newly introduced VM option: UseSIMDForArrayEquals). I've measured performance on a few CPUs (ThunderX, ThunderX2 (not yet released), Cortex A53, Cortex A73) and arranged the defaults based on the measurement results (so SIMD is the default for everything except ThunderX and A53; also, the prefetch instruction slows down A73, so I've disabled it there).

In general, this patch changes the handling of very short arrays (performing an 8-byte read instead of a few smaller reads, relying on 8-byte alignment) and jumps into a stub with a large 64-byte read loop for larger arrays.

Measurements (improvement in %, by array length; an 80% improvement means the new version takes 80% less time, i.e. it is 5 times faster):

Array length                 7      64     128     256     512    1024  100000
ThunderX                    2%     -4%      0%      2%     32%     55%     80%
ThunderX2                   0%     -3%     17%     19%     29%     31%     47%
Cortex A53 at 533MHz        8%     -1%     -2%      4%      6%      5%      3%
Cortex A73 at 903MHz        8%     -3%      0%      7%      8%      9%      8%

Note: medium sizes are a bit slower because of the additional branch (which checks the size and jumps to the stub).

Note2: the Cortex CPUs were downclocked to get a stable frequency because the device I was using overheated (otherwise the frequency jumped randomly and produced unstable results), so results might be a bit different on devices with a higher frequency.
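For reading the percentages above, here is a small conversion sketch. It assumes that "improvement" means the share of the old run time that was saved, which is the reading under which 80% corresponds to 5 times; the class and method names are made up for this illustration.

```java
/** Illustrative helper; assumes improvement% = (oldTime - newTime) / oldTime * 100. */
final class SpeedupSketch {
    /** Converts a time reduction in percent into a speedup factor oldTime / newTime. */
    static double speedupFactor(double improvementPercent) {
        if (improvementPercent >= 100.0) {
            throw new IllegalArgumentException("a reduction of 100% or more has no finite speedup");
        }
        return 100.0 / (100.0 - improvementPercent);
    }

    public static void main(String[] args) {
        // A few values taken from the ThunderX row, including a regression (-4%).
        for (double p : new double[] {80, 55, 32, -4}) {
            System.out.printf("%6.1f%% -> %.2fx%n", p, speedupFactor(p));
        }
    }
}
```

Under that assumption a negative entry such as -4% is a factor slightly below 1, i.e. a small slowdown.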
webrev: http://cr.openjdk.java.net/~dpochepk/8187472/webrev.07/ CR: https://bugs.openjdk.java.net/browse/JDK-8187472 Testing: I've run all hotspot jtreg tests with and without patch on aarch64 and found no new failures Benchmark: http://cr.openjdk.java.net/~dpochepk/8187472/webrev.07/ArrayEqualsBench.java Thanks, Dmitrij -------------- next part -------------- An HTML attachment was scrubbed... URL: From Derek.White at cavium.com Wed Feb 7 23:09:19 2018 From: Derek.White at cavium.com (White, Derek) Date: Wed, 7 Feb 2018 23:09:19 +0000 Subject: [aarch64-port-dev ] RFR(XS): 8196868 - AARCH64: ld/st instructions hit guarantee assert while using sp In-Reply-To: <9c42769e-581a-4b49-d3cb-cca5c446436f@redhat.com> References: <21b53516-a6c9-06d7-a58c-84c8d3df5401@bell-sw.com> <9c42769e-581a-4b49-d3cb-cca5c446436f@redhat.com> Message-ID: Looks good to me. - Derek > -----Original Message----- > From: aarch64-port-dev [mailto:aarch64-port-dev- > bounces at openjdk.java.net] On Behalf Of Andrew Haley > Sent: Wednesday, February 07, 2018 1:37 PM > To: Dmitrij Pochepko ; aarch64-port- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net > Subject: Re: [aarch64-port-dev ] RFR(XS): 8196868 - AARCH64: ld/st > instructions hit guarantee assert while using sp > > On 07/02/18 17:40, Dmitrij Pochepko wrote: > > Hi all, > > > > please review small patch for JDK-8196868 - AARCH64: ld/st > > instructions hit guarantee assert while using sp > > > > I found a guarantee assert crash in hotspot while trying to write > > intrinsic with st4 (SIMD load) instruction which write values into > > stack. It is allowed by specification: SP or general-purpose register > > can be used for addressing. > > > > > > A reason for this was register encoding call rf(...), which was used > > originally. This call doesn't expect SP to be passed for encoding, > > while > > srf(...) expects SP or general-purpose register. 
> >
> >
> > webrev: http://cr.openjdk.java.net/~dpochepk/8196868/webrev.01/
> >
> > CR: https://bugs.openjdk.java.net/browse/JDK-8196868
> >
> >
> > I've tested this patch with my intrinsic and crash is not reproducible
> > anymore. I also run part of jtreg hotspot tests as sanity to ensure
> > nothing got broken and it looks good.
>
> Yes, this is OK. Thanks.
>
> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd.
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From aph at redhat.com Thu Feb 8 09:54:40 2018
From: aph at redhat.com (Andrew Haley)
Date: Thu, 8 Feb 2018 09:54:40 +0000
Subject: RFR: 8187472 - AARCH64: array_equals intrinsic doesn't use prefetch for large arrays
In-Reply-To: <02e62700-efaf-918e-f04e-031dd4788aa0@bell-sw.com>
References: <02e62700-efaf-918e-f04e-031dd4788aa0@bell-sw.com>
Message-ID: <6fb5de69-394c-5c11-5c0b-f8be46f05cd7@redhat.com>

On 07/02/18 19:39, Dmitrij Pochepko wrote:
> Note: medium sizes are a bit slower because of additional branch
> added(which checks size and jumps to stub).

The indentation is messed up here:

@@ -5201,40 +5217,23 @@
     // length == 4.
     if (log_elem_size > 0)
       lsl(cnt1, cnt1, log_elem_size);
-    ldr(tmp1, Address(a1, cnt1));
-    ldr(tmp2, Address(a2, cnt1));
+    ldr(tmp1, Address(a1, cnt1));
+    ldr(tmp2, Address(a2, cnt1));
     eor(tmp1, tmp1, tmp2);
     cbnz(tmp1, DONE);
     b(SAME);

This pessimizes a very common case of string comparisons, those of about 32 bytes. Please think again.

I'm not convinced that reading beyond the end of the memory you're passed is always safe. What if you're passed an address which is the last three bytes in a region? It's not good enough to say that HotSpot doesn't do that today: maybe not, but one day it might.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From aph at redhat.com Thu Feb 8 10:11:00 2018
From: aph at redhat.com (Andrew Haley)
Date: Thu, 8 Feb 2018 10:11:00 +0000
Subject: [aarch64-port-dev ] RFR: 8187472 - AARCH64: array_equals intrinsic doesn't use prefetch for large arrays
In-Reply-To: <02e62700-efaf-918e-f04e-031dd4788aa0@bell-sw.com>
References: <02e62700-efaf-918e-f04e-031dd4788aa0@bell-sw.com>
Message-ID:

On 07/02/18 19:39, Dmitrij Pochepko wrote:
> In general, this patch changes very short arrays handling(performing
> 8-byte read instead of few smaller reads, using the fact of 8-byte
> alignment) and jumping into stub with large 64-byte read loop for larger
> arrays).
>
> Measurements(measured array length 7,64,128,256,512,1024,100000.
> Improvement in %. 80% improvement means that new version is 80% faster,
> i.e. 5 times.):
>
>
> ThunderX: 2%, -4%, 0%, 2%, 32%, 55%, 80%
>
> ThunderX2: 0%, -3%, 17%, 19%, 29%, 31%, 47%
>
> Cortex A53 at 533MHz: 8%, -1%, -2%, 4%, 6%, 5%, 3%
>
> Cortex A73 at 903MHz: 8%, -3%, 0%, 7%, 8%, 9%, 8%
>
> Note: medium sizes are a bit slower because of additional branch
> added(which checks size and jumps to stub).

This indentation is messed up:

@@ -5201,40 +5217,23 @@
     // length == 4.
     if (log_elem_size > 0)
       lsl(cnt1, cnt1, log_elem_size);
-    ldr(tmp1, Address(a1, cnt1));
-    ldr(tmp2, Address(a2, cnt1));
+    ldr(tmp1, Address(a1, cnt1));
+    ldr(tmp2, Address(a2, cnt1));

I'm not convinced that this works correctly if passed the address of a pair of arrays at the end of a page. Maybe it isn't used on sub-arrays today in HotSpot, but one day it might be.

It pessimizes a very common case of strings, those of about 32 characters. Please think again.

Please also think about strings that are long enough for the SIMD loop but differ in their early substrings.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From ekaterina.pavlova at oracle.com Thu Feb 8 23:08:44 2018 From: ekaterina.pavlova at oracle.com (Ekaterina Pavlova) Date: Thu, 8 Feb 2018 15:08:44 -0800 Subject: RFR(S) 8197453 : Add support of extra problem list Message-ID: Hi all, ProblemList.txt files used by makefiles for jtreg testing allow to specify list of tests to be excluded from execution on all or specific platforms. However to test such features like Graal we want to be able to specify list of failed tests which fail in particular JVM mode only. Please review this change which adds support of extra problem list and introduces 2 Graal specific problem list files. - test/hotspot/jtreg/ProblemList-graal.txt - test/jdk/ProblemList-graal.txt JBS: https://bugs.openjdk.java.net/browse/JDK-8197453 webrev: http://cr.openjdk.java.net/~epavlova//8197453/webrev.00/ testing: precheckin, tier1 and tier2 with empty EXTRA_PROBLEM_LISTS. testing in Graal mode with EXTRA_PROBLEM_LISTS=ProblemList-graal.txt thanks, -katya p.s. Igor Ignatyev volunteered to sponsor this change. From zhongwei.yao at linaro.org Fri Feb 9 02:59:25 2018 From: zhongwei.yao at linaro.org (Zhongwei Yao) Date: Fri, 9 Feb 2018 10:59:25 +0800 Subject: RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler In-Reply-To: References: <4cdcb619-754f-e6fc-bc18-67df9b892248@redhat.com> Message-ID: Ping... Is this patch OK? On 3 February 2018 at 10:44, Zhongwei Yao wrote: > Hi, Andrew, > > Thanks for your review and feedback! > > Patch is updated: http://cr.openjdk.java.net/~zyao/8196064/webrev.01/ > > It passes jtreg test. 
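The kind of merging under review in this thread can be sketched as a peephole over consecutive loads. The sketch below is not the code from the webrev: the class, the `Load` type and the rules in `canMerge` are assumptions made for illustration (same base register, 64-bit loads whose offsets differ by exactly 8, distinct destination registers, the first load not overwriting the base, and the lower offset within the ldp immediate range of -512..504). It only looks at the immediately preceding instruction, in either offset order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative peephole sketch; not the code from the webrev. */
final class LdpMergeSketch {
    /** One 64-bit load: ldr rt, [base, #offset]. */
    static final class Load {
        final String rt;
        final String base;
        final int offset;

        Load(String rt, String base, int offset) {
            this.rt = rt;
            this.base = base;
            this.offset = offset;
        }
    }

    /** Assumed merge rules for two back-to-back 64-bit loads. */
    static boolean canMerge(Load first, Load second) {
        int low = Math.min(first.offset, second.offset);
        return first.base.equals(second.base)
                && Math.abs(first.offset - second.offset) == 8 // adjacent 8-byte slots
                && !first.rt.equals(second.rt)                 // ldp needs two distinct targets
                && !first.rt.equals(first.base)                // first load must not clobber the base
                && low % 8 == 0 && low >= -512 && low <= 504;  // ldp immediate range
    }

    /** Formats the address the way the disassembly in this thread does. */
    static String address(Load l) {
        return l.offset == 0 ? "[" + l.base + "]" : "[" + l.base + ",#" + l.offset + "]";
    }

    /** Greedily pairs each load with the one emitted immediately before it. */
    static List<String> merge(List<Load> loads) {
        List<String> out = new ArrayList<>();
        Load pending = null; // last load not yet written out
        for (Load cur : loads) {
            if (pending != null && canMerge(pending, cur)) {
                Load lo = pending.offset < cur.offset ? pending : cur;
                Load hi = (lo == pending) ? cur : pending;
                out.add("ldp " + lo.rt + ", " + hi.rt + ", " + address(lo));
                pending = null;
            } else {
                if (pending != null) {
                    out.add("ldr " + pending.rt + ", " + address(pending));
                }
                pending = cur;
            }
        }
        if (pending != null) {
            out.add("ldr " + pending.rt + ", " + address(pending));
        }
        return out;
    }

    public static void main(String[] args) {
        // The out-of-order spill reloads quoted in the review.
        List<Load> spills = Arrays.asList(
                new Load("x17", "sp", 0), new Load("x1", "sp", 16),
                new Load("x18", "sp", 8), new Load("x10", "sp", 24),
                new Load("x3", "sp", 32), new Load("x4", "sp", 40),
                new Load("x11", "sp", 48), new Load("x6", "sp", 56));
        for (String insn : merge(spills)) {
            System.out.println(insn);
        }
    }
}
```

Fed the out-of-order reload sequence from the review, this sketch yields three pairs and two single loads, the same shape as the disassembly quoted in the thread; the same eight loads in ascending offset order would pair up into four ldp instructions.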
>
> On 1 February 2018 at 23:36, Andrew Haley wrote:
>> On 30/01/18 03:30, Zhongwei Yao wrote:
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8196064
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~zyao/8196064/webrev.00
>>>
>>> This patch merges adjacent load/store into ldp/stp in macro-assembler
>>> as discussed in previous thread:
>>> http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-December/027957.html
>>>
>>> My micro-benchmark case gives about 8% performance improvement with this patch.
>>>
>>> And this patch is based on commit be48be2 due to recent build failure
>>> on jdk/hs master, and it passes all jtreg tests.
>>>
>>> Please help to review it.
>>
>> Very nice. You've done a good job. This merging opportunity happens
>> a lot, and it reduces not only the size of compiled code but also the
>> size of the interpreter and the generated stubs. It's definitely
>> worth doing.
>>
>> before: ArrObj.testArrObjCopy avgt 20 3.075 ± 0.002 us/op
>> after: ArrObj.testArrObjCopy avgt 20 2.779 ± 0.004 us/op
>>
>> So, some improvement. One thing I noticed:
>>
>> Compiled method (c2) 6812 627 4 org.linaro.benchmarks.ArrObj::testArrObjCopy (32 bytes)
>>
>> ;; B44: # B45 <- B43 Freq: 40.1247
>>
>> 0x000003ff94b0db50: ldr x17, [sp]
>> 0x000003ff94b0db54: ldp x18, x1, [sp,#8]
>> 0x000003ff94b0db58: ldp x10, x3, [sp,#24]
>> 0x000003ff94b0db5c: ldp x4, x11, [sp,#40]
>> 0x000003ff94b0db60: ldr x6, [sp,#56]
>>
>> It seems to me like there could be four ldp instructions here. There
>> aren't (as far as I can see) because they're in an unfortunate order,
>> so the MacroAssembler doesn't see the opportunity:
>>
>> ;; B44: # B45 <- B43 Freq: 39.2717
>>
>> 0x000003ffad17bbf8: ldr x17, [sp]
>> 0x000003ffad17bbfc: ldr x1, [sp,#16]
>> 0x000003ffad17bc00: ldr x18, [sp,#8]
>> 0x000003ffad17bc04: ldr x10, [sp,#24]
>> 0x000003ffad17bc08: ldr x3, [sp,#32]
>> 0x000003ffad17bc0c: ldr x4, [sp,#40]
>> 0x000003ffad17bc10: ldr x11, [sp,#48]
>> 0x000003ffad17bc14: ldr x6, [sp,#56]
>>
>> This is OK: I don't think we want to make MacroAssembler even more
>> complicated in order to handle occasional out-of-order spills.
> I agree.
>
>>
>> Also, last_membar and last_ldst seem to be doing the same thing. I
>> think that you could simply have a single last_insn field rather than
>> both last_membar and last_ldst, and then inspect the instruction to
>> see if it's a membar or a ld/st.
> Done
>
>>
>> Finally, please add BLOCK_COMMENTs in merge_ldst, like this:
>>
>>   if (!is_store) {
>>     BLOCK_COMMENT("merged ldr pair");
>>     if (sz == 8) {
>>       ldp(rt_low, rt_high, adr_p);
>>     } else {
>>       ldpw(rt_low, rt_high, adr_p);
>>     }
>>   } else {
>>     BLOCK_COMMENT("merged str pair");
>>     if (sz == 8) {
>>       stp(rt_low, rt_high, adr_p);
>>     } else {
>>       stpw(rt_low, rt_high, adr_p);
>>     }
>>   }
>>
> Done
>
>> Thanks.
>>
>> --
>> Andrew Haley
>> Java Platform Lead Engineer
>> Red Hat UK Ltd.
>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>
>
>
> --
> Best regards,
> Zhongwei

--
Best regards,
Zhongwei

From aph at redhat.com Fri Feb 9 09:19:16 2018
From: aph at redhat.com (Andrew Haley)
Date: Fri, 9 Feb 2018 09:19:16 +0000
Subject: [aarch64-port-dev ] RFR: 8187472 - AARCH64: array_equals intrinsic doesn't use prefetch for large arrays
In-Reply-To: <6fb5de69-394c-5c11-5c0b-f8be46f05cd7@redhat.com>
References: <02e62700-efaf-918e-f04e-031dd4788aa0@bell-sw.com> <6fb5de69-394c-5c11-5c0b-f8be46f05cd7@redhat.com>
Message-ID:

On 08/02/18 09:54, Andrew Haley wrote:
> b(SAME);
>
> This pessimizes a very common case of string comparisons, those of
> about 32 bytes. Please think again.

I have a suggestion:

You could run an instrumented version of OpenJDK on a couple of apps which use a lot of strings. (I suggest the javac invocation in the OpenJDK build which builds java.base and SPECjbb.) This instrumented version should record the lengths of the strings which are compared and the first character where they differ. Then use the result to determine what to optimize for.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From aph at redhat.com Fri Feb 9 09:36:38 2018
From: aph at redhat.com (Andrew Haley)
Date: Fri, 9 Feb 2018 09:36:38 +0000
Subject: RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler
In-Reply-To:
References: <4cdcb619-754f-e6fc-bc18-67df9b892248@redhat.com>
Message-ID: <348d01b1-f15e-bd50-3236-f0b3a1c64a6c@redhat.com>

On 09/02/18 02:59, Zhongwei Yao wrote:
> Ping... Is this patch OK?

Sorry, FOSDEM conference.

Yes, this looks great.

I believe that AArch64 is the only user of last_insn. Please check.

This patch should be a good test of the submission forest.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From shade at redhat.com Fri Feb 9 10:14:42 2018
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 9 Feb 2018 11:14:42 +0100
Subject: [aarch64-port-dev ] Cannot get_source
In-Reply-To: <1306924248.915873.1518149991960@mail.yahoo.com>
References: <02e62700-efaf-918e-f04e-031dd4788aa0@bell-sw.com> <6fb5de69-394c-5c11-5c0b-f8be46f05cd7@redhat.com> <1306924248.915873.1518149991960@mail.yahoo.com>
Message-ID: <6197d7fe-8b73-cf97-10d0-3b3006b48f27@redhat.com>

On 02/09/2018 05:19 AM, Cao Hoang Thu wrote:
> I got latest source: http://hg.openjdk.java.net/aarch64-port/jdk8u/
>
> But I cannot clone by get_source

hg.openjdk.java.net sometimes gets funky, and the chances of experiencing that funkiness get higher if you do a fresh clone that takes a long time. As a workaround, you can try pulling the workspace tarball from here:
https://builds.shipilev.net/workspaces/aarch64-port-jdk8u.tar.xz
https://builds.shipilev.net/workspaces/README.txt

-Aleksey

From erik.joelsson at oracle.com Fri Feb 9 16:37:50 2018
From: erik.joelsson at oracle.com (Erik Joelsson)
Date: Fri, 9 Feb 2018 08:37:50 -0800
Subject: RFR: JDK-8187676: Disable uninitialized warnings for two files until proper fix available
Message-ID:

Hello,

In preparation for upgrading the toolchains used at Oracle, we need the build to be clean with the new compiler versions. On Linux, we are currently aiming at GCC 7.3.
In hotspot, two files generate warnings when using this compiler. I found this issue already filed with a patch for disabling the warnings and the related issue JDK-8160404 which will deal with actually fixing the problem. I would like to push the disabling of the warnings ASAP so that the compiler upgrade project can continue unhindered. Bug: https://bugs.openjdk.java.net/browse/JDK-8187676 Webrev: http://cr.openjdk.java.net/~ehelin/8187676/00/ /Erik From tim.bell at oracle.com Fri Feb 9 17:13:43 2018 From: tim.bell at oracle.com (Tim Bell) Date: Fri, 09 Feb 2018 09:13:43 -0800 Subject: RFR: JDK-8187676: Disable uninitialized warnings for two files until proper fix available In-Reply-To: References: Message-ID: <5A7DD6C7.2000103@oracle.com> Erik: > In preparation for upgrading the toolchains used at Oracle, we need the > build to be clean with the new compiler versions. On Linux, we are > currently aiming at GCC 7.3. In hotspot, two files generate warnings > when using this compiler. I found this issue already filed with a patch > for disabling the warnings and the related issue JDK-8160404 which will > deal with actually fixing the problem. I would like to push the > disabling of the warnings ASAP so that the compiler upgrade project can > continue unhindered. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8187676 > > Webrev: http://cr.openjdk.java.net/~ehelin/8187676/00/ Looks good. /Tim From vladimir.kozlov at oracle.com Fri Feb 9 17:31:36 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 9 Feb 2018 09:31:36 -0800 Subject: RFR: JDK-8187676: Disable uninitialized warnings for two files until proper fix available In-Reply-To: References: Message-ID: <3f07db43-90be-dac5-466e-8a035c85c37d@oracle.com> Good. Thanks, Vladimir K On 2/9/18 8:37 AM, Erik Joelsson wrote: > Hello, > > In preparation for upgrading the toolchains used at Oracle, we need the build to be clean with the new compiler > versions. 
On Linux, we are currently aiming at GCC 7.3. In hotspot, two files generate warnings when using this > compiler. I found this issue already filed with a patch for disabling the warnings and the related issue JDK-8160404 > which will deal with actually fixing the problem. I would like to push the disabling of the warnings ASAP so that the > compiler upgrade project can continue unhindered. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8187676 > > Webrev: http://cr.openjdk.java.net/~ehelin/8187676/00/ > > /Erik > From magnus.ihse.bursie at oracle.com Fri Feb 9 20:19:18 2018 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Fri, 9 Feb 2018 21:19:18 +0100 Subject: RFR: JDK-8187676: Disable uninitialized warnings for two files until proper fix available In-Reply-To: References: Message-ID: Looks good to me. (But ~ehelin..?) /Magnus > 9 feb. 2018 kl. 17:37 skrev Erik Joelsson : > > Hello, > > In preparation for upgrading the toolchains used at Oracle, we need the build to be clean with the new compiler versions. On Linux, we are currently aiming at GCC 7.3. In hotspot, two files generate warnings when using this compiler. I found this issue already filed with a patch for disabling the warnings and the related issue JDK-8160404 which will deal with actually fixing the problem. I would like to push the disabling of the warnings ASAP so that the compiler upgrade project can continue unhindered. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8187676 > > Webrev: http://cr.openjdk.java.net/~ehelin/8187676/00/ > > /Erik > From erik.joelsson at oracle.com Fri Feb 9 21:02:07 2018 From: erik.joelsson at oracle.com (Erik Joelsson) Date: Fri, 9 Feb 2018 13:02:07 -0800 Subject: RFR: JDK-8187676: Disable uninitialized warnings for two files until proper fix available In-Reply-To: References:

Message-ID: On 2018-02-09 12:19, Magnus Ihse Bursie wrote: > (But ~ehelin..?) > I found the issue filed and review posted already last year, but it was dropped in favor of fixing the underlying problem instead. I just reused his patch. See hotspot-compiler-dev archive if you are curious: http://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2017-November/027719.html /Erik From eric.caspole at oracle.com Fri Feb 9 21:48:24 2018 From: eric.caspole at oracle.com (Eric Caspole) Date: Fri, 9 Feb 2018 16:48:24 -0500 Subject: RFR(XS) (11) 8197447 : LogCompilation throws "couldn't find bytecode" Message-ID: Hi everybody, Please review this small change to the LogCompilation tool, where it did not support processing the unstable_fused_if uncommon trap which would take the place of the first unstable_if in the original code. I would like to add a test case for this but I am not sure what code sequence would be sure to get this optimization, and also not result in a giant log (where I originally found this problem while debugging an unrelated problem). Open to suggestions. Thanks, Eric JBS: https://bugs.openjdk.java.net/browse/JDK-8197447 webrev: http://cr.openjdk.java.net/~ecaspole/JDK-8197447/webrev/ From zhongwei.yao at linaro.org Sun Feb 11 02:15:56 2018 From: zhongwei.yao at linaro.org (Zhongwei Yao) Date: Sun, 11 Feb 2018 10:15:56 +0800 Subject: RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler In-Reply-To: <348d01b1-f15e-bd50-3236-f0b3a1c64a6c@redhat.com> References: <4cdcb619-754f-e6fc-bc18-67df9b892248@redhat.com>

<348d01b1-f15e-bd50-3236-f0b3a1c64a6c@redhat.com> Message-ID: Hi, Andrew, On 9 February 2018 at 17:36, Andrew Haley wrote: > On 09/02/18 02:59, Zhongwei Yao wrote: >> Ping... Is this patch OK? > > Sorry, FOSDEM conference. I see. It's all right. > Yes, this looks great. > > I believe that AArch64 is the only user of last_insn. Please check. Yes, I double checked that only AArch64 uses last_insn. > > This patch should be a good test of the submission forest. Yeah, but I'm not a JDK Committer yet. Could you help sponsor it? > > -- > Andrew Haley > Java Platform Lead Engineer > Red Hat UK Ltd. > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 -- Best regards, Zhongwei From zhongwei.yao at linaro.org Sun Feb 11 02:56:20 2018 From: zhongwei.yao at linaro.org (Zhongwei Yao) Date: Sun, 11 Feb 2018 10:56:20 +0800 Subject: RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler In-Reply-To: References: <4cdcb619-754f-e6fc-bc18-67df9b892248@redhat.com>

<348d01b1-f15e-bd50-3236-f0b3a1c64a6c@redhat.com> Message-ID: And I forgot to mention that since this patch touches some shared code at hotspot/share/asm/codeBuffer.hpp, do I need a sponsor from Oracle? On 11 February 2018 at 10:15, Zhongwei Yao wrote: > Hi, Andrew, > > On 9 February 2018 at 17:36, Andrew Haley wrote: >> On 09/02/18 02:59, Zhongwei Yao wrote: >>> Ping... Is this patch OK? >> >> Sorry, FOSDEM conference. > I see. It's all right. > >> Yes, this looks great. >> >> I believe that AArch64 is the only user of last_insn. Please check. > Yes, I double checked that only AArch64 uses last_insn. > >> >> This patch should be a good test of the submission forest. > Yeah, but I'm not a JDK Committer yet. Could you help sponsor it? > >> >> -- >> Andrew Haley >> Java Platform Lead Engineer >> Red Hat UK Ltd. >> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > > > > -- > Best regards, > Zhongwei -- Best regards, Zhongwei From tianxiao.gu at gmail.com Sun Feb 11 07:29:57 2018 From: tianxiao.gu at gmail.com (Tianxiao Gu) Date: Sat, 10 Feb 2018 23:29:57 -0800 Subject: C2 crashes with error message assert(is_Loop()) failed: invalid node class Message-ID: C2 crashes with error message assert(is_Loop()) failed: invalid node class The JDK image is built from the following repo. ~~~ $ hg tip changeset: 48878:c8a33db795b6 tag: tip parent: 48837:67aa88701d46 parent: 48877:5e2d2067da48 user: jwilhelm date: Sat Feb 10 09:25:35 2018 +0100 summary: Merge $ hg paths default = http://hg.openjdk.java.net/jdk/hs/ ~~~ This crash may be a duplicate to the following issues but has a different stack trace. Can anyone help me to resolve it? 
https://bugs.openjdk.java.net/browse/JDK-8196296
https://bugs.openjdk.java.net/browse/JDK-8196653
https://bugs.openjdk.java.net/browse/JDK-8195668

Stack trace:

~~~
--------------- S U M M A R Y ------------

Command Line: -Xcomp -XX:-TieredCompilation -XX:CompileCommand=exclude,java*::* C0

Host: wish, Intel(R) Core(TM) i7 CPU 860 @ 2.80GHz, 8 cores, 7G, Ubuntu 16.04.3 LTS
Time: Sat Feb 10 23:17:29 2018 PST elapsed time: 3 seconds (0d 0h 0m 3s)

--------------- T H R E A D ---------------

Current thread (0x00007f7c3c257800): JavaThread "C2 CompilerThread1" daemon [_thread_in_native, id=5769, stack(0x00007f7bdb023000,0x00007f7bdb124000)]

Current CompileTask:
C2: 3321 365 b C0::_main (126 bytes)

Stack: [0x00007f7bdb023000,0x00007f7bdb124000], sp=0x00007f7bdb11e1a0, free space=1004k
Native frames: (J=compiled Java code, A=aot compiled Java code, j=interpreted, Vv=VM code, C=native code)
V [libjvm.so+0x16c4d9e] VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x4ce
V [libjvm.so+0x16c590f] VMError::report_and_die(Thread*, char const*, int, char const*, char const*, __va_list_tag*)+0x2f
V [libjvm.so+0xab51f2] report_vm_error(char const*, int, char const*, char const*, ...)+0x112
V [libjvm.so+0x112f07e] Node::as_Loop() const [clone .part.74]+0x2e
V [libjvm.so+0x1133c22] PhaseIdealLoop::try_move_store_before_loop(Node*, Node*)+0x652
V [libjvm.so+0x11340de] PhaseIdealLoop::split_if_with_blocks_pre(Node*)+0xbe
V [libjvm.so+0x11359f3] PhaseIdealLoop::split_if_with_blocks(VectorSet&, Node_Stack&)+0x133
V [libjvm.so+0x112dff2] PhaseIdealLoop::build_and_optimize(bool, bool)+0x10c2
V [libjvm.so+0xa142c1] Compile::Optimize()+0x321
V [libjvm.so+0xa16052] Compile::Compile(ciEnv*, C2Compiler*, ciMethod*, int, bool, bool, bool, DirectiveSet*)+0x1172
V [libjvm.so+0x81f03b] C2Compiler::compile_method(ciEnv*, ciMethod*, int, DirectiveSet*)+0x2db
V [libjvm.so+0xa22f5e] CompileBroker::invoke_compiler_on_method(CompileTask*)+0x55e
V [libjvm.so+0xa244c8] CompileBroker::compiler_thread_loop()+0x408
V [libjvm.so+0x1610ce2] JavaThread::thread_main_inner()+0x1e2
V [libjvm.so+0x1349462] thread_native_entry(Thread*)+0x112
~~~

Output:

~~~
$ ../../openjdk/hs/build/linux-x86_64-normal-server-fastdebug/images/jdk/bin/java -Xcomp -XX:-TieredCompilation -XX:CompileCommand=exclude,java*::* C0
CompileCommand: exclude java*.*
# To suppress the following error report, specify this argument
# after -XX: or in .hotspotrc: SuppressErrorAt=/node.hpp:821
#
# A fatal error has been detected by the Java Runtime Environment:
#
# Internal Error (/home/wish/Projects/JVMTesting/openjdk/hs/src/hotspot/share/opto/node.hpp:821), pid=5741, tid=5769
# assert(is_Loop()) failed: invalid node class
#
# JRE version: OpenJDK Runtime Environment (11.0) (fastdebug build 11-internal+0-adhoc.wish.hs)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 11-internal+0-adhoc.wish.hs, compiled mode, compressed oops, g1 gc, linux-amd64)
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P" (or dumping to /home/wish/Projects/JVMTesting/bugs/error-20180210020349/core.5741)
#
# An error report file with more information is saved as:
# /home/wish/Projects/JVMTesting/bugs/error-20180210020349/hs_err_pid5741.log
#
# Compiler replay data is saved as:
# /home/wish/Projects/JVMTesting/bugs/error-20180210020349/replay_pid5741.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.java.com/bugreport/crash.jsp
#
Current thread is 5769
Dumping core ...
Aborted (core dumped)
~~~

Reproduce:

Operating System:

~~~
$ uname -a
Linux wish 4.10.0-40-generic #44~16.04.1-Ubuntu SMP Thu Nov 9 15:37:44 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
~~~

The class file can be obtained by Jasmin.

java -jar jasmin-2.4/jasmin.jar C0-full.j

The content of C0-full.j is listed below.
~~~
.class public C0
.super java/lang/Object

.field public static F3 [Ljava/lang/Object;
.field public static F6 I
.field public static F10 [C
.field protected static F15 [C

.method public <init>()V
    aload 0
    invokenonvirtual java/lang/Object/<init>()V
    return
    .limit stack 1
    .limit locals 1
.end method

.method public static main([Ljava/lang/String;)V
    aload 0
    invokestatic C0/_main([Ljava/lang/String;)I
    invokestatic java/lang/System/exit(I)V
    return
    .limit stack 1
    .limit locals 1
.end method

.method public static _main([Ljava/lang/String;)I
L0:
    aload 0
    getstatic C0/F6 I
    ifgt L1
    bipush -109
    ireturn
L1:
    iconst_4
    putstatic C0/F6 I
    bipush 65
    ifge L1
    ifnull L0
    aconst_null
    iconst_m1
    iflt L2
    new java/lang/Object
    dup
    invokenonvirtual java/lang/Object/<init>()V
    astore 0
L2:
    getstatic C0/F15 [C
    dconst_0
    aload 0
    aload 0
    iconst_3
    putstatic C0/F6 I
    lconst_1
    l2i
    newarray int
    astore 1
L5:
    aload 1
    ifnonnull L5
    fconst_0
    f2i
    putstatic C0/F6 I
    dup
    astore 1
    lconst_0
    l2i
    putstatic C0/F6 I
    astore 0
    lconst_0
    l2f
    f2i
    putstatic C0/F6 I
    astore 0
L7:
    iconst_2
    ifle L7
    ldc2_w -1281607442625018332
    lstore 0
    getstatic C0/F10 [C
    astore 0
    d2i
    dconst_0
    d2i
    isub
    i2f
    getstatic C0/F3 [Ljava/lang/Object;
    astore 1
    fconst_2
    frem
    fstore 0
    invokevirtual java/lang/Object/equals(Ljava/lang/Object;)Z
    aconst_null
    astore 1
    dconst_0
    dstore 0
    getstatic C0/F10 [C
    astore 1
    ireturn
    .limit stack 26
    .limit locals 2
.end method
~~~

Options

-Xcomp -XX:-TieredCompilation -XX:CompileCommand=exclude,java*::* C0

From rwestrel at redhat.com Mon Feb 12 08:52:04 2018
From: rwestrel at redhat.com (Roland Westrelin)
Date: Mon, 12 Feb 2018 09:52:04 +0100
Subject: C2 crashes with error message assert(is_Loop()) failed: invalid node class
In-Reply-To:
References:
Message-ID:

Thanks for the bug report. It's new, I'll work on getting it fixed.

Roland.

From aph at redhat.com Mon Feb 12 09:51:02 2018
From: aph at redhat.com (Andrew Haley)
Date: Mon, 12 Feb 2018 09:51:02 +0000
Subject: RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler
In-Reply-To:
References: <4cdcb619-754f-e6fc-bc18-67df9b892248@redhat.com>

<348d01b1-f15e-bd50-3236-f0b3a1c64a6c@redhat.com>
Message-ID: <997eac31-e16c-1bbb-59c8-3515604a9b45@redhat.com>

On 11/02/18 02:56, Zhongwei Yao wrote:
> And I forgot to mention that since this patch touches some shared code
> at hotspot/share/asm/codeBuffer.hpp, do I need a sponsor from Oracle?

In theory, no. The submission forest should do that job. I'll have a try.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From zhongwei.yao at linaro.org Mon Feb 12 10:03:08 2018
From: zhongwei.yao at linaro.org (Zhongwei Yao)
Date: Mon, 12 Feb 2018 18:03:08 +0800
Subject: RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler
In-Reply-To: <997eac31-e16c-1bbb-59c8-3515604a9b45@redhat.com>
References: <4cdcb619-754f-e6fc-bc18-67df9b892248@redhat.com>

<348d01b1-f15e-bd50-3236-f0b3a1c64a6c@redhat.com>
<997eac31-e16c-1bbb-59c8-3515604a9b45@redhat.com>
Message-ID:

I see. Thanks.

On 12 February 2018 at 17:51, Andrew Haley wrote:
> On 11/02/18 02:56, Zhongwei Yao wrote:
>> And I forgot to mention that since this patch touches some shared code
>> at hotspot/share/asm/codeBuffer.hpp, do I need a sponsor from Oracle?
>
> In theory, no. The submission forest should do that job. I'll have a try.
>
> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd.
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

--
Best regards,
Zhongwei

From tobias.hartmann at oracle.com Mon Feb 12 10:55:06 2018
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Mon, 12 Feb 2018 11:55:06 +0100
Subject: C2 crashes with error message assert(is_Loop()) failed: invalid node class
In-Reply-To:
References:
Message-ID:

Hi,

thanks for reporting this issue! Roland, have you filed a bug yet?

Best regards,
Tobias

From rwestrel at redhat.com Mon Feb 12 11:02:58 2018
From: rwestrel at redhat.com (Roland Westrelin)
Date: Mon, 12 Feb 2018 12:02:58 +0100
Subject: C2 crashes with error message assert(is_Loop()) failed: invalid node class
In-Reply-To:
References:

Message-ID:

> thanks for reporting this issue! Roland, have you filed a bug yet?

Not yet.

Roland.

From rwestrel at redhat.com Mon Feb 12 13:37:17 2018
From: rwestrel at redhat.com (Roland Westrelin)
Date: Mon, 12 Feb 2018 14:37:17 +0100
Subject: [11] RFR(XS): 8197563: assert(is_Loop()) crash in PhaseIdealLoop::try_move_store_before_loop()
Message-ID:

http://cr.openjdk.java.net/~roland/8197563/webrev.00/

The method being compiled has an infinite loop and the head of the loop
is not converted to a LoopNode. A store is moved out of the infinite
loop. Some code I added with 8186027 (C2: loop strip mining) assumes the
head is a loop and we hit the assert failure. The fix I propose is to
never try to move a store out of a loop if the head is not a LoopNode.

Roland.

From erik.osterlund at oracle.com Mon Feb 12 14:05:04 2018
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Mon, 12 Feb 2018 15:05:04 +0100
Subject: JDK-8171119: Low-Overhead Heap Profiling
In-Reply-To:
References:
Message-ID: <5A819F10.8040201@oracle.com>

Hi JC,

Sorry for the delayed reply.

Inlined answers:

On 2018-02-06 00:04, JC Beyler wrote:
> Hi Erik,
>
> (Renaming this to be folded into the newly renamed thread :))
>
> First off, thanks a lot for reviewing the webrev! I appreciate it!
>
> I updated the webrev to:
> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/
>
> And the incremental one is here:
> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.04_05a/
>
> It contains:
> - The change for since from 9 to 11 for the jvmti.xml
> - The use of the OrderAccess for initialized
> - Clearing the oop
>
> I also have inlined my answers to your comments. The biggest question
> will come from the multiple *_end variables. A bit of the logic there
> is due to handling the slow path refill vs fast path refill and
> checking that the rug was not pulled underneath the slowpath. I
> believe that a previous comment was that TlabFastRefill was going to
> be deprecated.
>
> If this is true, we could revert this code a bit and just do a : if
> TlabFastRefill is enabled, disable this. And then deprecate that when
> TlabFastRefill is deprecated.
>
> This might simplify this webrev and I can work on a follow-up that
> either: removes TlabFastRefill if Robbin does not have the time to do
> it or add the support to the assembly side to handle this correctly.
> What do you think?

I support removing TlabFastRefill, but I think it is good to not depend on that happening first.

> Now, below, inlined are my answers:
>
> On Fri, Feb 2, 2018 at 8:44 AM, Erik Österlund
> wrote:
>> Hi JC,
>>
>> Hope I am reviewing the right version of your work. Here goes...
>>
>> src/hotspot/share/gc/shared/collectedHeap.inline.hpp:
>>
>> 159 AllocTracer::send_allocation_outside_tlab(klass, result, size *
>> HeapWordSize, THREAD);
>> 160
>> 161 THREAD->tlab().handle_sample(THREAD, result, size);
>> 162 return result;
>> 163 }
>>
>> Should not call tlab()->X without checking if (UseTLAB) IMO.
>>
> Done!

More about this later.

>
>> src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp:
>>
>> So first of all, there seems to quite a few ends. There is an "end", a "hard
>> end", a "slow path end", and an "actual end". Moreover, it seems like the
>> "hard end" is actually further away than the "actual end". So the "hard end"
>> seems like more of a "really definitely actual end" or something. I don't
>> know about you, but I think it looks kind of messy. In particular, I don't
>> feel like the name "actual end" reflects what it represents, especially when
>> there is another end that is behind the "actual end".
>>
>> 413 HeapWord* ThreadLocalAllocBuffer::hard_end() {
>> 414 // Did a fast TLAB refill occur?
>> 415 if (_slow_path_end != _end) {
>> 416 // Fix up the actual end to be now the end of this TLAB.
>> 417 _slow_path_end = _end;
>> 418 _actual_end = _end;
>> 419 }
>> 420
>> 421 return _actual_end + alignment_reserve();
>> 422 }
>>
>> I really do not like making getters unexpectedly have these kind of side
>> effects. It is not expected that when you ask for the "hard end", you
>> implicitly update the "slow path end" and "actual end" to new values.
>>
> As I said, a lot of this is due to the FastTlabRefill. If I make this
> not supporting FastTlabRefill, this goes away. The reason the system
> needs to update itself at the get is that you only know at that get if
> things have shifted underneath the tlab slow path. I am not sure of
> really better names (naming is hard!), perhaps we could do these
> names:
>
> - current_tlab_end // Either the allocated tlab end or a sampling point
> - last_allocation_address // The end of the tlab allocation
> - last_slowpath_allocated_end // In case a fast refill occurred the
> end might have changed, this is to remember slow vs fast past refills
>
> the hard_end method can be renamed to something like:
> tlab_end_pointer() // The end of the lab including a bit of
> alignment reserved bytes

Those names sound better to me. Could you please provide a mapping from the old names to the new names so I understand which one is which please?

This is my current guess of what you are proposing:

end -> current_tlab_end
actual_end -> last_allocation_address
slow_path_end -> last_slowpath_allocated_end
hard_end -> tlab_end_pointer

I would prefer this naming:

end -> slow_path_end // the end for taking a slow path; either due to sampling or refilling
actual_end -> allocation_end // the end for allocations
slow_path_end -> last_slow_path_end // last address for slow_path_end (as opposed to allocation_end)
hard_end -> reserved_end // the end of the reserved space of the TLAB

About setting things in the getter... that still seems like a very unpleasant thing to me.
It would be better to inspect the call hierarchy and explicitly update the ends where they need updating, and assert in the getter that they are in sync, rather than implicitly setting various ends as a surprising side effect in a getter. It looks like the call hierarchy is very small. With my new naming convention, reserved_end() would presumably return _allocation_end + alignment_reserve(), and have an assert checking that _allocation_end == _last_slow_path_allocation_end, complaining that this invariant must hold, and that a caller to this function, such as make_parsable(), must first explicitly synchronize the ends as required, to honor that invariant. > Not sure it's better but before updating the webrev, I wanted to try > to get input/consensus :) > > (Note hard_end was always further off than end). > >> src/hotspot/share/prims/jvmti.xml: >> >> 10357 >> 10358 >> 10359 Can sample the heap. >> 10360 If this capability is enabled then the heap sampling methods >> can be called. >> 10361 >> 10362 >> >> Looks like this capability should not be "since 9" if it gets integrated >> now. > Updated now to 11, crossing my fingers :) > > >> src/hotspot/share/runtime/heapMonitoring.cpp: >> >> 448 if (is_alive->do_object_b(value)) { >> 449 // Update the oop to point to the new object if it is still >> alive. >> 450 f->do_oop(&(trace.obj)); >> 451 >> 452 // Copy the old trace, if it is still live. >> 453 _allocated_traces->at_put(curr_pos++, trace); >> 454 >> 455 // Store the live trace in a cache, to be served up on /heapz. >> 456 _traces_on_last_full_gc->append(trace); >> 457 >> 458 count++; >> 459 } else { >> 460 // If the old trace is no longer live, add it to the list of >> 461 // recently collected garbage. >> 462 store_garbage_trace(trace); >> 463 } >> >> In the case where the oop was not live, I would like it to be explicitly >> cleared. > Done I think how you wanted it. Let me know because I'm not familiar > with the RootAccess API. 
I'm unclear if I'm doing this right or not so > reviews of these parts are highly appreciated. Robbin had talked of > perhaps later pushing this all into a OopStorage, should I do this now > do you think? Or can that wait a second webrev later down the road? I think using handles can and should be done later. You can use the Access API now. I noticed that you are missing an #include "oops/access.inline.hpp" in your heapMonitoring.cpp file. > + Did I clear it the way you wanted me to or were you thinking of > something else? That is precisely how I wanted it to be cleared. Thanks. > + Final question here, seems like if I were to want to not do the > f->do_oop directly on the trace.obj, I'd need to do something like: > > f->do_oop(&value); > ... > trace->store_oop(value); > > to update the oop internally. Is that right/is that one of the > advantages of going to the Oopstorage sooner than later? I think you really want to do the do_oop on the root directly. Is there a particular reason why you would not want to do that? Otherwise, yes - the benefit with using the handle approach is that you do not need to call do_oop explicitly in your code. > >> Also I see a lot of concurrent-looking use of the following field: >> 267 volatile bool _initialized; >> >> Please note that the "volatile" qualifier does not help with reordering >> here. Reordering between volatile and non-volatile fields is completely free >> for both compiler and hardware, except for windows with MSVC, where volatile >> semantics is defined to use acquire/release semantics, and the hardware is >> TSO. But for the general case, I would expect this field to be stored with >> OrderAccess::release_store and loaded with OrderAccess::load_acquire. >> Otherwise it is not thread safe. > Because everything is behind a mutex, I wasn't really worried about > this. I have a test that has multiple threads trying to hit this > corner case and it passes. 
> > However, to be paranoid, I updated it to using the OrderAccess API > now, thanks! Let me know what you think there too! If it is indeed always supposed to be read and written under a mutex, then I would strongly prefer to have it accessed as a normal non-volatile member, and have an assertion that given lock is held or we are in a safepoint, as we do in many other places. Something like this: assert(HeapMonitorStorage_lock->owned_by_self() || (SafepointSynchronize::is_at_safepoint() && Thread::current()->is_VM_thread()), "this should not be accessed concurrently"); It would be confusing to people reading the code if there are uses of OrderAccess that are actually always protected under a mutex. >> As a kind of meta comment, I wonder if it would make sense to add sampling >> for non-TLAB allocations. Seems like if someone is rapidly allocating a >> whole bunch of 1 MB objects that never fit in a TLAB, I might still be >> interested in seeing that in my traces, and not get surprised that the >> allocation rate is very high yet not showing up in any profiles. >> > That is handled by the handle_sample where you wanted me to put a > UseTlab because you hit that case if the allocation is too big. I see. It was not obvious to me that non-TLAB sampling is done in the TLAB class. That seems like an abstraction crime. What I wanted in my previous comment was that we do not call into the TLAB when we are not using TLABs. If there is sampling logic in the TLAB that is used for something else than TLABs, then it seems like that logic simply does not belong inside of the TLAB. It should be moved out of the TLAB, and instead have the TLAB call this common abstraction that makes sense. Hope I have answered your questions and that my feedback makes sense to you. 
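
The earlier point about keeping the getter free of side effects can be illustrated with a small standalone sketch. It is not HotSpot code: `TlabSketch`, `fast_refill`, `sync_ends_after_fast_refill`, `ends_in_sync` and the 16-byte reserve are hypothetical names and values chosen for illustration, the fields follow the naming preferred above, and the invariant checked is one reading of the suggestion rather than the actual patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a TLAB. Not the real ThreadLocalAllocBuffer.
class TlabSketch {
  char* _slow_path_end;       // where the fast path gives up (sample point or real end)
  char* _allocation_end;      // end of the space usable for allocations
  char* _last_slow_path_end;  // _slow_path_end as last seen by the slow path
  static constexpr std::size_t kAlignmentReserve = 16;  // assumed reserve size

 public:
  explicit TlabSketch(char* end)
      : _slow_path_end(end), _allocation_end(end), _last_slow_path_end(end) {}

  // Simulates a fast refill changing the end behind the slow path's back.
  void fast_refill(char* new_end) { _slow_path_end = new_end; }

  bool ends_in_sync() const { return _last_slow_path_end == _slow_path_end; }

  // Explicit synchronization step: callers invoke this before asking for
  // reserved_end(), instead of the getter doing it silently.
  void sync_ends_after_fast_refill() {
    if (_slow_path_end != _last_slow_path_end) {
      _last_slow_path_end = _slow_path_end;
      _allocation_end = _slow_path_end;
    }
  }

  // Pure getter: it only checks the invariant and never repairs it.
  char* reserved_end() const {
    assert(ends_in_sync() && "caller must synchronize the ends first");
    return _allocation_end + kAlignmentReserve;
  }
};
```

With this shape, a caller such as make_parsable() would call the synchronization step first, and a forgotten call shows up as an assertion failure in debug builds instead of a silent update.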
Thanks, /Erik > I double checked by changing the test > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatObjectCorrectnessTest.java > > to use a smaller Tlab (2048) and made the object bigger and it goes > through that and passes. > > Thanks again for your review and I look forward to your pointers for > the questions I now have raised! > Jc > > > > > > > > > >> Thanks, >> /Erik >> >> >> On 2018-01-26 06:45, JC Beyler wrote: >>> Thanks Robbin for the reviews :) >>> >>> The new full webrev is here: >>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/ >>> The incremental webrev is here: >>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/ >>> >>> I inlined my answers: >>> >>> On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn wrote: >>>> Hi JC, great to see another revision! >>>> >>>> #### >>>> heapMonitoring.cpp >>>> >>>> StackTraceData should not contain the oop for 'safety' reasons. >>>> When StackTraceData is moved from _allocated_traces: >>>> L452 store_garbage_trace(trace); >>>> it contains a dead oop. >>>> _allocated_traces could instead be a tupel of oop and StackTraceData thus >>>> dead oops are not kept. >>> Done I used inheritance to make the copier work regardless but the >>> idea is the same. >>> >>>> You should use the new Access API for loading the oop, something like >>>> this: >>>> RootAccess::load(...) >>>> I don't think you need to use Access API for clearing the oop, but it >>>> would >>>> look nicer. And you shouldn't probably be using: >>>> Universe::heap()->is_in_reserved(value) >>> I am unfamiliar with this but I think I did do it like you wanted me >>> to (all tests pass so that's a start). I'm not sure how to clear the >>> oop exactly, is there somewhere that does that, which I can use to do >>> the same? 
>>> >>> I removed the is_in_reserved, this came from our internal version, I >>> don't know why it was there but my tests work without so I removed it >>> :) >>> >>> >>>> The lock: >>>> L424 MutexLocker mu(HeapMonitorStorage_lock); >>>> Is not needed as far as I can see. >>>> weak_oops_do is called in a safepoint, no TLAB allocation can happen and >>>> JVMTI thread can't access these data-structures. Is there something more >>>> to >>>> this lock that I'm missing? >>> Since a thread can call the JVMTI getLiveTraces (or any of the other >>> ones), it can get to the point of trying to copying the >>> _allocated_traces. I imagine it is possible that this is happening >>> during a GC or that it can be started and a GC happens afterwards. >>> Therefore, it seems to me that you want this protected, no? >>> >>> >>>> #### >>>> You have 6 files without any changes in them (any more): >>>> g1CollectedHeap.cpp >>>> psMarkSweep.cpp >>>> psParallelCompact.cpp >>>> genCollectedHeap.cpp >>>> referenceProcessor.cpp >>>> thread.hpp >>>> >>> Done. >>> >>>> #### >>>> I have not looked closely, but is it possible to hide heap sampling in >>>> AllocTracer ? (with some minor changes to the AllocTracer API) >>>> >>> I am imagining that you are saying to move the code that does the >>> sampling code (change the tlab end, do the call to HeapMonitoring, >>> etc.) into the AllocTracer code itself? I think that is right and I'll >>> look if that is possible and prepare a webrev to show what would be >>> needed to make that happen. >>> >>>> #### >>>> Minor nit, when declaring pointer there is a little mix of having the >>>> pointer adjacent by type name and data name. (Most hotspot code is by >>>> type >>>> name) >>>> E.g. >>>> heapMonitoring.cpp:711 jvmtiStackTrace *trace = .... >>>> heapMonitoring.cpp:733 Method* m = vfst.method(); >>>> (not just this file) >>>> >>> Done! 
>>>
>>>> ####
>>>> HeapMonitorThreadOnOffTest.java:77
>>>> I would make g_tmp volatile, otherwise the assignment in loop may
>>>> theoretical be skipped.
>>>>
>>> Also done!
>>>
>>> Thanks again!
>>> Jc
>>

From vladimir.kozlov at oracle.com Mon Feb 12 17:07:46 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Mon, 12 Feb 2018 09:07:46 -0800
Subject: [11] RFR(XS): 8197563: assert(is_Loop()) crash in PhaseIdealLoop::try_move_store_before_loop()
In-Reply-To:
References:
Message-ID: <106aa4af-5a27-e12c-de72-5ea1f329643d@oracle.com>

Good. Thank you for fixing it fast.

Vladimir

On 2/12/18 5:37 AM, Roland Westrelin wrote:
>
> http://cr.openjdk.java.net/~roland/8197563/webrev.00/
>
> The method being compiled has an infinite loop and the head of the loop
> is not converted to a LoopNode. A store is moved out of the infinite
> loop. Some code I added with 8186027 (C2: loop strip mining) assumes the
> head is a loop and we hit the assert failure. The fix I propose is to
> never try to move a store out of a loop if the head is not a LoopNode.
>
> Roland.
>

From jcbeyler at google.com Tue Feb 13 05:18:28 2018
From: jcbeyler at google.com (JC Beyler)
Date: Mon, 12 Feb 2018 21:18:28 -0800
Subject: JDK-8171119: Low-Overhead Heap Profiling
In-Reply-To: <5A819F10.8040201@oracle.com>
References: <5A819F10.8040201@oracle.com>
Message-ID:

Hi Erik,

Thanks for your answers, I've now inlined my own answers/comments.

I've done a new webrev here:
http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.08/

The incremental is here:
http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/

Note to all:
- I've been integrating changes from Erin/Serguei/David comments so this webrev incremental is a bit of an answer to all comments in one. I apologize for that :)

On Mon, Feb 12, 2018 at 6:05 AM, Erik Österlund wrote:
> Hi JC,
>
> Sorry for the delayed reply.
>
> Inlined answers:
>
>
> On 2018-02-06 00:04, JC Beyler wrote:
>
>> Hi Erik,
>>
>> (Renaming this to be folded into the newly renamed thread :))
>>
>> First off, thanks a lot for reviewing the webrev! I appreciate it!
>>
>> I updated the webrev to:
>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/
>>
>> And the incremental one is here:
>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.04_05a/
>>
>> It contains:
>> - The change for since from 9 to 11 for the jvmti.xml
>> - The use of the OrderAccess for initialized
>> - Clearing the oop
>>
>> I also have inlined my answers to your comments. The biggest question
>> will come from the multiple *_end variables. A bit of the logic there
>> is due to handling the slow path refill vs fast path refill and
>> checking that the rug was not pulled underneath the slowpath. I
>> believe that a previous comment was that TlabFastRefill was going to
>> be deprecated.
>>
>> If this is true, we could revert this code a bit and just do a : if
>> TlabFastRefill is enabled, disable this. And then deprecate that when
>> TlabFastRefill is deprecated.
>>
>> This might simplify this webrev and I can work on a follow-up that
>> either: removes TlabFastRefill if Robbin does not have the time to do
>> it or add the support to the assembly side to handle this correctly.
>> What do you think?
>>
>
> I support removing TlabFastRefill, but I think it is good to not depend on
> that happening first.
>
>

I'm slowly pushing on the FastTLABRefill (https://bugs.openjdk.java.net/browse/JDK-8194084), I agree on keeping both separate for now though so that we can think of both differently.

> Now, below, inlined are my answers:
>>
>> On Fri, Feb 2, 2018 at 8:44 AM, Erik Österlund
>> wrote:
>>
>>> Hi JC,
>>>
>>> Hope I am reviewing the right version of your work. Here goes...
>>> >>> src/hotspot/share/gc/shared/collectedHeap.inline.hpp: >>> >>> 159 AllocTracer::send_allocation_outside_tlab(klass, result, size >>> * >>> HeapWordSize, THREAD); >>> 160 >>> 161 THREAD->tlab().handle_sample(THREAD, result, size); >>> 162 return result; >>> 163 } >>> >>> Should not call tlab()->X without checking if (UseTLAB) IMO. >>> >>> Done! >> > > More about this later. > > > >> src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp: >>> >>> So first of all, there seems to quite a few ends. There is an "end", a >>> "hard >>> end", a "slow path end", and an "actual end". Moreover, it seems like the >>> "hard end" is actually further away than the "actual end". So the "hard >>> end" >>> seems like more of a "really definitely actual end" or something. I don't >>> know about you, but I think it looks kind of messy. In particular, I >>> don't >>> feel like the name "actual end" reflects what it represents, especially >>> when >>> there is another end that is behind the "actual end". >>> >>> 413 HeapWord* ThreadLocalAllocBuffer::hard_end() { >>> 414 // Did a fast TLAB refill occur? >>> 415 if (_slow_path_end != _end) { >>> 416 // Fix up the actual end to be now the end of this TLAB. >>> 417 _slow_path_end = _end; >>> 418 _actual_end = _end; >>> 419 } >>> 420 >>> 421 return _actual_end + alignment_reserve(); >>> 422 } >>> >>> I really do not like making getters unexpectedly have these kind of side >>> effects. It is not expected that when you ask for the "hard end", you >>> implicitly update the "slow path end" and "actual end" to new values. >>> >>> As I said, a lot of this is due to the FastTlabRefill. If I make this >> not supporting FastTlabRefill, this goes away. The reason the system >> needs to update itself at the get is that you only know at that get if >> things have shifted underneath the tlab slow path. 
I am not sure of >> really better names (naming is hard!), perhaps we could do these >> names: >> >> - current_tlab_end // Either the allocated tlab end or a sampling >> point >> - last_allocation_address // The end of the tlab allocation >> - last_slowpath_allocated_end // In case a fast refill occurred the >> end might have changed, this is to remember slow vs fast past refills >> >> the hard_end method can be renamed to something like: >> tlab_end_pointer() // The end of the lab including a bit of >> alignment reserved bytes >> > > Those names sound better to me. Could you please provide a mapping from > the old names to the new names so I understand which one is which please? > > This is my current guess of what you are proposing: > > end -> current_tlab_end > actual_end -> last_allocation_address > slow_path_end -> last_slowpath_allocated_end > hard_end -> tlab_end_pointer > > Yes that is correct, that was what I was proposing. > I would prefer this naming: > > end -> slow_path_end // the end for taking a slow path; either due to > sampling or refilling > actual_end -> allocation_end // the end for allocations > slow_path_end -> last_slow_path_end // last address for slow_path_end (as > opposed to allocation_end) > hard_end -> reserved_end // the end of the reserved space of the TLAB > > About setting things in the getter... that still seems like a very > unpleasant thing to me. It would be better to inspect the call hierarchy > and explicitly update the ends where they need updating, and assert in the > getter that they are in sync, rather than implicitly setting various ends > as a surprising side effect in a getter. It looks like the call hierarchy > is very small. 
With my new naming convention, reserved_end() would > presumably return _allocation_end + alignment_reserve(), and have an assert > checking that _allocation_end == _last_slow_path_allocation_end, > complaining that this invariant must hold, and that a caller to this > function, such as make_parsable(), must first explicitly synchronize the > ends as required, to honor that invariant. > > I've renamed the variables to how you preferred it except for the _end one. I did: current_end last_allocation_address tlab_end_ptr The reason is that the architecture dependent code use the thread.hpp API and it already has tlab included into the name so it becomes tlab_current_end (which is better that tlab_current_tlab_end in my opinion). I also moved the update into a separate method with a TODO that says to remove it when FastTLABRefill is deprecated > > Not sure it's better but before updating the webrev, I wanted to try >> to get input/consensus :) >> >> (Note hard_end was always further off than end). >> >> src/hotspot/share/prims/jvmti.xml: >>> >>> 10357 >>> 10358 >>> 10359 Can sample the heap. >>> 10360 If this capability is enabled then the heap sampling >>> methods >>> can be called. >>> 10361 >>> 10362 >>> >>> Looks like this capability should not be "since 9" if it gets integrated >>> now. >>> >> Updated now to 11, crossing my fingers :) >> >> >> src/hotspot/share/runtime/heapMonitoring.cpp: >>> >>> 448 if (is_alive->do_object_b(value)) { >>> 449 // Update the oop to point to the new object if it is still >>> alive. >>> 450 f->do_oop(&(trace.obj)); >>> 451 >>> 452 // Copy the old trace, if it is still live. >>> 453 _allocated_traces->at_put(curr_pos++, trace); >>> 454 >>> 455 // Store the live trace in a cache, to be served up on >>> /heapz. >>> 456 _traces_on_last_full_gc->append(trace); >>> 457 >>> 458 count++; >>> 459 } else { >>> 460 // If the old trace is no longer live, add it to the list >>> of >>> 461 // recently collected garbage. 
>>> 462 store_garbage_trace(trace); >>> 463 } >>> >>> In the case where the oop was not live, I would like it to be explicitly >>> cleared. >>> >> Done I think how you wanted it. Let me know because I'm not familiar >> with the RootAccess API. I'm unclear if I'm doing this right or not so >> reviews of these parts are highly appreciated. Robbin had talked of >> perhaps later pushing this all into a OopStorage, should I do this now >> do you think? Or can that wait a second webrev later down the road? >> > > I think using handles can and should be done later. You can use the Access > API now. > I noticed that you are missing an #include "oops/access.inline.hpp" in > your heapMonitoring.cpp file. > > The missing header is there for me so I don't know, I made sure it is present in the latest webrev. Sorry about that. > + Did I clear it the way you wanted me to or were you thinking of >> something else? >> > > That is precisely how I wanted it to be cleared. Thanks. > > + Final question here, seems like if I were to want to not do the >> f->do_oop directly on the trace.obj, I'd need to do something like: >> >> f->do_oop(&value); >> ... >> trace->store_oop(value); >> >> to update the oop internally. Is that right/is that one of the >> advantages of going to the Oopstorage sooner than later? >> > > I think you really want to do the do_oop on the root directly. Is there a > particular reason why you would not want to do that? > Otherwise, yes - the benefit with using the handle approach is that you do > not need to call do_oop explicitly in your code. > > There is no reason except that now we have a load_oop and a get_oop_addr, I was not sure what you would think of that. > >> Also I see a lot of concurrent-looking use of the following field: >>> 267 volatile bool _initialized; >>> >>> Please note that the "volatile" qualifier does not help with reordering >>> here. 
Reordering between volatile and non-volatile fields is completely >>> free >>> for both compiler and hardware, except for windows with MSVC, where >>> volatile >>> semantics is defined to use acquire/release semantics, and the hardware >>> is >>> TSO. But for the general case, I would expect this field to be stored >>> with >>> OrderAccess::release_store and loaded with OrderAccess::load_acquire. >>> Otherwise it is not thread safe. >>> >> Because everything is behind a mutex, I wasn't really worried about >> this. I have a test that has multiple threads trying to hit this >> corner case and it passes. >> >> However, to be paranoid, I updated it to using the OrderAccess API >> now, thanks! Let me know what you think there too! >> > > If it is indeed always supposed to be read and written under a mutex, then > I would strongly prefer to have it accessed as a normal non-volatile > member, and have an assertion that given lock is held or we are in a > safepoint, as we do in many other places. Something like this: > > assert(HeapMonitorStorage_lock->owned_by_self() || > (SafepointSynchronize::is_at_safepoint() && Thread::current()->is_VM_thread()), > "this should not be accessed concurrently"); > > It would be confusing to people reading the code if there are uses of > OrderAccess that are actually always protected under a mutex. > > Thank you for the exact example to be put in the code! I put it around each access/assignment of the _initialized method and found one case where yes you can touch it and not have the lock. It actually is "ok" because you don't act on the storage until later and only when you really want to modify the storage (see the object_alloc_do_sample method which calls the add_trace method). But, because of this, I'm going to put the OrderAccess here, I'll do some performance numbers later and if there are issues, I might add a "unsafe" read and a "safe" one to make it explicit to the reader. But I don't think it will come to that. 
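
The release/acquire pairing discussed here can be shown with standard C++ atomics, which express the same ordering idea as OrderAccess::release_store and OrderAccess::load_acquire. This is only an analogy under stated assumptions: `SampleStorageSketch`, `max_traces_if_ready` and the field names are hypothetical and do not come from the webrev.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical storage object. It illustrates publishing an "initialized"
// flag with release/acquire ordering, using std::atomic rather than
// HotSpot's OrderAccess.
class SampleStorageSketch {
  int _max_traces = 0;                    // plain data written before publication
  std::atomic<bool> _initialized{false};  // publication flag

 public:
  void initialize(int max_traces) {
    _max_traces = max_traces;  // ordinary store
    // Release store: the write to _max_traces above cannot be reordered
    // after this store becomes visible.
    _initialized.store(true, std::memory_order_release);
  }

  bool is_initialized() const {
    // Acquire load: a reader that sees 'true' also sees _max_traces.
    return _initialized.load(std::memory_order_acquire);
  }

  // Returns -1 while the storage has not been published yet.
  int max_traces_if_ready() const {
    return is_initialized() ? _max_traces : -1;
  }
};
```

A plain volatile flag gives no such guarantee for the non-volatile `_max_traces`, which is the concern raised in the review. When every access really is made under the lock, the assertion-based approach quoted above is the simpler choice.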
> As a kind of meta comment, I wonder if it would make sense to add sampling >>> for non-TLAB allocations. Seems like if someone is rapidly allocating a >>> whole bunch of 1 MB objects that never fit in a TLAB, I might still be >>> interested in seeing that in my traces, and not get surprised that the >>> allocation rate is very high yet not showing up in any profiles. >>> >>> That is handled by the handle_sample where you wanted me to put a >> UseTlab because you hit that case if the allocation is too big. >> > > I see. It was not obvious to me that non-TLAB sampling is done in the TLAB > class. That seems like an abstraction crime. > What I wanted in my previous comment was that we do not call into the TLAB > when we are not using TLABs. If there is sampling logic in the TLAB that is > used for something else than TLABs, then it seems like that logic simply > does not belong inside of the TLAB. It should be moved out of the TLAB, and > instead have the TLAB call this common abstraction that makes sense. > > So in the incremental version: http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/, this is still a "crime". The reason is that the system has to have the bytes_until_sample on a per-thread level and it made "sense" to have it with the TLAB implementation. Also, I was not sure how people felt about adding something to the thread instance instead. Do you think it fits better at the Thread level? I can see how difficult it is to make it happen there and add some logic there. Let me know what you think. > Hope I have answered your questions and that my feedback makes sense to > you. > > You have and thank you for them, I think we are getting to a cleaner implementation and things are getting better and more readable :) Thanks for your help! 
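
One possible shape for per-thread sampling state that lives outside the TLAB is sketched below. Everything in it is an assumption made for illustration: `AllocationSamplerSketch`, `record_allocation` and the fixed sampling interval are hypothetical, the interval is assumed to be greater than zero, and the actual change may pick sample points differently.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-thread sampling state kept outside the TLAB class, so
// that both the TLAB path and the outside-TLAB path can share it.
class AllocationSamplerSketch {
  std::size_t _interval;            // bytes between samples (assumed fixed and > 0)
  std::size_t _bytes_until_sample;  // countdown to the next sample
  std::size_t _samples_taken = 0;

 public:
  explicit AllocationSamplerSketch(std::size_t interval)
      : _interval(interval), _bytes_until_sample(interval) {}

  // Called by any allocation path, TLAB or not.
  // Returns true when this allocation should be sampled.
  bool record_allocation(std::size_t bytes) {
    if (bytes < _bytes_until_sample) {
      _bytes_until_sample -= bytes;
      return false;
    }
    _bytes_until_sample = _interval;  // restart the countdown
    ++_samples_taken;
    return true;
  }

  std::size_t samples_taken() const { return _samples_taken; }
  std::size_t bytes_until_sample() const { return _bytes_until_sample; }
};
```

In this arrangement the TLAB would only ask the sampler where its next sample point falls, and a large allocation that bypasses the TLAB would call `record_allocation` directly without going through the TLAB at all.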
Jc > Thanks, > /Erik > > > I double checked by changing the test >> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/raw_ >> files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonito >> r/MyPackage/HeapMonitorStatObjectCorrectnessTest.java >> >> to use a smaller Tlab (2048) and made the object bigger and it goes >> through that and passes. >> >> Thanks again for your review and I look forward to your pointers for >> the questions I now have raised! >> Jc >> >> >> >> >> >> >> >> >> >> Thanks, >>> /Erik >>> >>> >>> On 2018-01-26 06:45, JC Beyler wrote: >>> >>>> Thanks Robbin for the reviews :) >>>> >>>> The new full webrev is here: >>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/ >>>> The incremental webrev is here: >>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/ >>>> >>>> I inlined my answers: >>>> >>>> On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn >>>> wrote: >>>> >>>>> Hi JC, great to see another revision! >>>>> >>>>> #### >>>>> heapMonitoring.cpp >>>>> >>>>> StackTraceData should not contain the oop for 'safety' reasons. >>>>> When StackTraceData is moved from _allocated_traces: >>>>> L452 store_garbage_trace(trace); >>>>> it contains a dead oop. >>>>> _allocated_traces could instead be a tupel of oop and StackTraceData >>>>> thus >>>>> dead oops are not kept. >>>>> >>>> Done I used inheritance to make the copier work regardless but the >>>> idea is the same. >>>> >>>> You should use the new Access API for loading the oop, something like >>>>> this: >>>>> RootAccess::load(...) >>>>> I don't think you need to use Access API for clearing the oop, but it >>>>> would >>>>> look nicer. And you shouldn't probably be using: >>>>> Universe::heap()->is_in_reserved(value) >>>>> >>>> I am unfamiliar with this but I think I did do it like you wanted me >>>> to (all tests pass so that's a start). I'm not sure how to clear the >>>> oop exactly, is there somewhere that does that, which I can use to do >>>> the same? 
>>>> >>>> I removed the is_in_reserved, this came from our internal version, I >>>> don't know why it was there but my tests work without so I removed it >>>> :) >>>> >>>> >>>> The lock: >>>>> L424 MutexLocker mu(HeapMonitorStorage_lock); >>>>> Is not needed as far as I can see. >>>>> weak_oops_do is called in a safepoint, no TLAB allocation can happen >>>>> and >>>>> JVMTI thread can't access these data-structures. Is there something >>>>> more >>>>> to >>>>> this lock that I'm missing? >>>>> >>>> Since a thread can call the JVMTI getLiveTraces (or any of the other >>>> ones), it can get to the point of trying to copying the >>>> _allocated_traces. I imagine it is possible that this is happening >>>> during a GC or that it can be started and a GC happens afterwards. >>>> Therefore, it seems to me that you want this protected, no? >>>> >>>> >>>> #### >>>>> You have 6 files without any changes in them (any more): >>>>> g1CollectedHeap.cpp >>>>> psMarkSweep.cpp >>>>> psParallelCompact.cpp >>>>> genCollectedHeap.cpp >>>>> referenceProcessor.cpp >>>>> thread.hpp >>>>> >>>>> Done. >>>> >>>> #### >>>>> I have not looked closely, but is it possible to hide heap sampling in >>>>> AllocTracer ? (with some minor changes to the AllocTracer API) >>>>> >>>>> I am imagining that you are saying to move the code that does the >>>> sampling code (change the tlab end, do the call to HeapMonitoring, >>>> etc.) into the AllocTracer code itself? I think that is right and I'll >>>> look if that is possible and prepare a webrev to show what would be >>>> needed to make that happen. >>>> >>>> #### >>>>> Minor nit, when declaring pointer there is a little mix of having the >>>>> pointer adjacent by type name and data name. (Most hotspot code is by >>>>> type >>>>> name) >>>>> E.g. >>>>> heapMonitoring.cpp:711 jvmtiStackTrace *trace = .... >>>>> heapMonitoring.cpp:733 Method* m = vfst.method(); >>>>> (not just this file) >>>>> >>>>> Done! 
>>>> >>>> #### >>>>> HeapMonitorThreadOnOffTest.java:77 >>>>> I would make g_tmp volatile, otherwise the assignment in loop may >>>>> theoretical be skipped. >>>>> >>>>> Also done! >>>> >>>> Thanks again! >>>> Jc >>>> >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vaibhav.x.choudhary at oracle.com Tue Feb 13 08:49:46 2018 From: vaibhav.x.choudhary at oracle.com (Vaibhav Choudhary) Date: Tue, 13 Feb 2018 14:19:46 +0530 Subject: RFR: JDK-8197534: [TESTBUG] compiler/gcbarriers/PreserveFPRegistersTest.java fails with Conflicting collector combinations Message-ID: <300849D0-499A-4397-93EB-F162A3ED5A46@oracle.com> Hi All, Please review the fix for bug: https://bugs.openjdk.java.net/browse/JDK-8197534 Webrev: http://cr.openjdk.java.net/~rpatil/8197534/webrev.00/ Thanks, Vaibhav C -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.hartmann at oracle.com Tue Feb 13 09:20:09 2018 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Tue, 13 Feb 2018 10:20:09 +0100 Subject: RFR: JDK-8197534: [TESTBUG] compiler/gcbarriers/PreserveFPRegistersTest.java fails with Conflicting collector combinations In-Reply-To: <300849D0-499A-4397-93EB-F162A3ED5A46@oracle.com> References: <300849D0-499A-4397-93EB-F162A3ED5A46@oracle.com> Message-ID: Hi Vaibhav, You linked the wrong webrev. Here's the correct one: http://cr.openjdk.java.net/~rpatil/8197534/webrev.00 Wouldn't it make more sense to backport the full fix for JDK-8153194 [1]? 
Thanks,
Tobias

[1] https://bugs.openjdk.java.net/browse/JDK-8153194

On 13.02.2018 09:49, Vaibhav Choudhary wrote:
> Hi All,
> Please review the fix for bug: https://bugs.openjdk.java.net/browse/JDK-8197534
> Webrev: http://cr.openjdk.java.net/~rpatil/8197534/webrev.00/
>
> Thanks,
> Vaibhav C
>

From tobias.hartmann at oracle.com Tue Feb 13 09:27:03 2018
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 13 Feb 2018 10:27:03 +0100
Subject: [11] RFR(XS): 8197563: assert(is_Loop()) crash in PhaseIdealLoop::try_move_store_before_loop()
In-Reply-To:
References:
Message-ID:

Hi Roland,

looks good to me. I'll sponsor.

Best regards,
Tobias

On 12.02.2018 14:37, Roland Westrelin wrote:
>
> http://cr.openjdk.java.net/~roland/8197563/webrev.00/
>
> The method being compiled has an infinite loop and the head of the loop
> is not converted to a LoopNode. A store is moved out of the infinite
> loop. Some code I added with 8186027 (C2: loop strip mining) assumes the
> head is a loop and we hit the assert failure. The fix I propose is to
> never try to move a store out of a loop if the head is not a LoopNode.
>
> Roland.
>

From tobias.hartmann at oracle.com Tue Feb 13 10:12:05 2018
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 13 Feb 2018 11:12:05 +0100
Subject: [11] RFR(XS): 8197608: MacroAssembler::unimplemented calls global operator new[]
Message-ID: <2f678321-208e-1892-ef88-57af3280bfde@oracle.com>

Hi,

please review the following fix that removes C heap allocation of string buffers:
https://bugs.openjdk.java.net/browse/JDK-8197608
http://cr.openjdk.java.net/~thartmann/8197608/webrev.00/

This is consistent with the implementation on ARM:
http://hg.openjdk.java.net/jdk/hs/file/040293c73621/src/hotspot/cpu/arm/macroAssembler_arm.cpp#l1710

Thanks,
Tobias

From tobias.hartmann at oracle.com Tue Feb 13 10:38:15 2018
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 13 Feb 2018 11:38:15 +0100
Subject: [11] RFR(XS): 8197828: Null pointer dereference in MultiNode::proj_out_or_null
Message-ID:

Hi,

please review the following patch that adds a null check for the return value of AllocateNode::initialization():
https://bugs.openjdk.java.net/browse/JDK-8197828
http://cr.openjdk.java.net/~thartmann/8197828/webrev.00/

Thanks,
Tobias

From shade at redhat.com Tue Feb 13 10:42:49 2018
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 13 Feb 2018 11:42:49 +0100
Subject: [11] RFR(XS): 8197608: MacroAssembler::unimplemented calls global operator new[]
In-Reply-To: <2f678321-208e-1892-ef88-57af3280bfde@oracle.com>
References: <2f678321-208e-1892-ef88-57af3280bfde@oracle.com>
Message-ID: <2d6a9f2c-4073-e8f6-87bf-9a3c4a9574e4@redhat.com>

On 02/13/2018 11:12 AM, Tobias Hartmann wrote:
> please review the following fix that removes C heap allocation of string buffers:
> https://bugs.openjdk.java.net/browse/JDK-8197608
> http://cr.openjdk.java.net/~thartmann/8197608/webrev.00/

I was looking at the same recently, the patch looks good.

Thanks,
-Aleksey

From shade at redhat.com Tue Feb 13 10:43:23 2018
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 13 Feb 2018 11:43:23 +0100
Subject: [11] RFR(XS): 8197828: Null pointer dereference in MultiNode::proj_out_or_null
In-Reply-To:
References:
Message-ID: <4faf5ee0-85e1-25e7-7bf6-0305f3450293@redhat.com>

On 02/13/2018 11:38 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch that adds a null check for the return value of AllocateNode::initialization():
> https://bugs.openjdk.java.net/browse/JDK-8197828
> http://cr.openjdk.java.net/~thartmann/8197828/webrev.00/

Looks trivial and good.

-Aleksey

From tobias.hartmann at oracle.com Tue Feb 13 10:47:01 2018
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 13 Feb 2018 11:47:01 +0100
Subject: [11] RFR(XS): 8197828: Null pointer dereference in MultiNode::proj_out_or_null
In-Reply-To: <4faf5ee0-85e1-25e7-7bf6-0305f3450293@redhat.com>
References: <4faf5ee0-85e1-25e7-7bf6-0305f3450293@redhat.com>
Message-ID: <6853c2cf-d4c8-6f6e-6b90-ef6a3f3d88a0@oracle.com>

Hi Aleksey,

thanks for the review!

Best regards,
Tobias

On 13.02.2018 11:43, Aleksey Shipilev wrote:
> On 02/13/2018 11:38 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following patch that adds a null check for the return value of AllocateNode::initialization():
>> https://bugs.openjdk.java.net/browse/JDK-8197828
>> http://cr.openjdk.java.net/~thartmann/8197828/webrev.00/
>
> Looks trivial and good.
>
> -Aleksey
>

From tobias.hartmann at oracle.com Tue Feb 13 10:47:26 2018
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 13 Feb 2018 11:47:26 +0100
Subject: [11] RFR(XS): 8197608: MacroAssembler::unimplemented calls global operator new[]
In-Reply-To: <2d6a9f2c-4073-e8f6-87bf-9a3c4a9574e4@redhat.com>
References: <2f678321-208e-1892-ef88-57af3280bfde@oracle.com> <2d6a9f2c-4073-e8f6-87bf-9a3c4a9574e4@redhat.com>
Message-ID: <237af8e0-af10-be05-c656-d13758040887@oracle.com>

Hi Aleksey,

thanks for the review!

Best regards,
Tobias

On 13.02.2018 11:42, Aleksey Shipilev wrote:
> On 02/13/2018 11:12 AM, Tobias Hartmann wrote:
>> please review the following fix that removes C heap allocation of string buffers:
>> https://bugs.openjdk.java.net/browse/JDK-8197608
>> http://cr.openjdk.java.net/~thartmann/8197608/webrev.00/
>
> I was looking at the same recently, the patch looks good.
>
> Thanks,
> -Aleksey
>

From rwestrel at redhat.com Tue Feb 13 15:47:57 2018
From: rwestrel at redhat.com (Roland Westrelin)
Date: Tue, 13 Feb 2018 16:47:57 +0100
Subject: [11] RFR(XS): 8197563: assert(is_Loop()) crash in PhaseIdealLoop::try_move_store_before_loop()
In-Reply-To: <106aa4af-5a27-e12c-de72-5ea1f329643d@oracle.com>
References: <106aa4af-5a27-e12c-de72-5ea1f329643d@oracle.com>
Message-ID:

> Good. Thank you for fixing it fast.

Thanks for the review.

Roland.

From rwestrel at redhat.com Tue Feb 13 15:48:10 2018
From: rwestrel at redhat.com (Roland Westrelin)
Date: Tue, 13 Feb 2018 16:48:10 +0100
Subject: [11] RFR(XS): 8197563: assert(is_Loop()) crash in PhaseIdealLoop::try_move_store_before_loop()
In-Reply-To:
References:
Message-ID:

> looks good to me. I'll sponsor.

Thanks for the review & sponsoring.

Roland.

From vladimir.kozlov at oracle.com Tue Feb 13 17:47:35 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 13 Feb 2018 09:47:35 -0800
Subject: [11] RFR(XS): 8197828: Null pointer dereference in MultiNode::proj_out_or_null
In-Reply-To:
References:
Message-ID: <008b533a-febe-98a2-f6b5-7344dc8ebdb1@oracle.com>

I think we should do realtime check for NULL and use this->in(0) (allocation's control edge) if Initialize node is missing which could happen.

Thanks,
Vladimir

On 2/13/18 2:38 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following patch that adds a null check for the return value of AllocateNode::initialization():
> https://bugs.openjdk.java.net/browse/JDK-8197828
> http://cr.openjdk.java.net/~thartmann/8197828/webrev.00/
>
> Thanks,
> Tobias
>

From vladimir.kozlov at oracle.com Tue Feb 13 17:56:08 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 13 Feb 2018 09:56:08 -0800
Subject: [11] RFR(XS): 8197608: MacroAssembler::unimplemented calls global operator new[]
In-Reply-To: <2f678321-208e-1892-ef88-57af3280bfde@oracle.com>
References: <2f678321-208e-1892-ef88-57af3280bfde@oracle.com>
Message-ID: <6032d289-5182-eb1d-3448-35d1f7400665@oracle.com>

Okay.
Thanks,
Vladimir

On 2/13/18 2:12 AM, Tobias Hartmann wrote:
> Hi,
>
> please review the following fix that removes C heap allocation of string buffers:
> https://bugs.openjdk.java.net/browse/JDK-8197608
> http://cr.openjdk.java.net/~thartmann/8197608/webrev.00/
>
> This is consistent with the implementation on ARM:
> http://hg.openjdk.java.net/jdk/hs/file/040293c73621/src/hotspot/cpu/arm/macroAssembler_arm.cpp#l1710
>
> Thanks,
> Tobias
>

From tobias.hartmann at oracle.com Tue Feb 13 17:57:32 2018
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 13 Feb 2018 18:57:32 +0100
Subject: [11] RFR(XS): 8197608: MacroAssembler::unimplemented calls global operator new[]
In-Reply-To: <6032d289-5182-eb1d-3448-35d1f7400665@oracle.com>
References: <2f678321-208e-1892-ef88-57af3280bfde@oracle.com> <6032d289-5182-eb1d-3448-35d1f7400665@oracle.com>
Message-ID:

Thanks Vladimir!

Best regards,
Tobias

On 13.02.2018 18:56, Vladimir Kozlov wrote:
> Okay.
>
> Thanks,
> Vladimir
>
> On 2/13/18 2:12 AM, Tobias Hartmann wrote:
>> Hi,
>>
>> please review the following fix that removes C heap allocation of string buffers:
>> https://bugs.openjdk.java.net/browse/JDK-8197608
>> http://cr.openjdk.java.net/~thartmann/8197608/webrev.00/
>>
>> This is consistent with the implementation on ARM:
>> http://hg.openjdk.java.net/jdk/hs/file/040293c73621/src/hotspot/cpu/arm/macroAssembler_arm.cpp#l1710
>>
>> Thanks,
>> Tobias
>>

From tobias.hartmann at oracle.com Tue Feb 13 18:37:08 2018
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Tue, 13 Feb 2018 19:37:08 +0100
Subject: [11] RFR(XS): 8197828: Null pointer dereference in MultiNode::proj_out_or_null
In-Reply-To: <008b533a-febe-98a2-f6b5-7344dc8ebdb1@oracle.com>
References: <008b533a-febe-98a2-f6b5-7344dc8ebdb1@oracle.com>
Message-ID: <8b26f2f7-003b-2a23-b0d7-a2ffaa8d1c57@oracle.com>

Hi Vladimir,

On 13.02.2018 18:47, Vladimir Kozlov wrote:
> I think we should do realtime check for NULL and use this->in(0) (allocation's control edge) if Initialize node is
> missing which could happen.

Right, we should do that:
http://cr.openjdk.java.net/~thartmann/8197828/webrev.01/

Thanks,
Tobias

From dean.long at oracle.com Tue Feb 13 19:22:10 2018
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Tue, 13 Feb 2018 11:22:10 -0800
Subject: [11] RFR(XS): 8197828: Null pointer dereference in MultiNode::proj_out_or_null
In-Reply-To: <8b26f2f7-003b-2a23-b0d7-a2ffaa8d1c57@oracle.com>
References: <008b533a-febe-98a2-f6b5-7344dc8ebdb1@oracle.com> <8b26f2f7-003b-2a23-b0d7-a2ffaa8d1c57@oracle.com>
Message-ID: <498810c4-aea4-cf3e-c3db-66b9b5b0b8b8@oracle.com>

On 2/13/18 10:37 AM, Tobias Hartmann wrote:
> Hi Vladimir,
>
> On 13.02.2018 18:47, Vladimir Kozlov wrote:
>> I think we should do realtime check for NULL and use this->in(0) (allocation's control edge) if Initialize node is
>> missing which could happen.
> Right, we should do that:
> http://cr.openjdk.java.net/~thartmann/8197828/webrev.01/

How would we test that it works correctly? I don't think we ever see a missing Initialize node here, otherwise we would have seen crashes.

dl

> Thanks,
> Tobias

From vladimir.kozlov at oracle.com Tue Feb 13 20:16:17 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 13 Feb 2018 12:16:17 -0800
Subject: [11] RFR(XS): 8197828: Null pointer dereference in MultiNode::proj_out_or_null
In-Reply-To: <8b26f2f7-003b-2a23-b0d7-a2ffaa8d1c57@oracle.com>
References: <008b533a-febe-98a2-f6b5-7344dc8ebdb1@oracle.com> <8b26f2f7-003b-2a23-b0d7-a2ffaa8d1c57@oracle.com>
Message-ID: <89a81b51-5d74-b0d0-de1b-91c4917399e7@oracle.com>

Good.

Thanks,
Vladimir

On 2/13/18 10:37 AM, Tobias Hartmann wrote:
> Hi Vladimir,
>
> On 13.02.2018 18:47, Vladimir Kozlov wrote:
>> I think we should do realtime check for NULL and use this->in(0) (allocation's control edge) if Initialize node is
>> missing which could happen.
>
> Right, we should do that:
> http://cr.openjdk.java.net/~thartmann/8197828/webrev.01/
>
> Thanks,
> Tobias
>

From vladimir.kozlov at oracle.com Tue Feb 13 20:32:11 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Tue, 13 Feb 2018 12:32:11 -0800
Subject: [11] RFR(XS): 8197828: Null pointer dereference in MultiNode::proj_out_or_null
In-Reply-To: <498810c4-aea4-cf3e-c3db-66b9b5b0b8b8@oracle.com>
References: <008b533a-febe-98a2-f6b5-7344dc8ebdb1@oracle.com> <8b26f2f7-003b-2a23-b0d7-a2ffaa8d1c57@oracle.com> <498810c4-aea4-cf3e-c3db-66b9b5b0b8b8@oracle.com>
Message-ID: <00f46bee-d8fa-12f5-8ff0-898d12db4684@oracle.com>

On 2/13/18 11:22 AM, dean.long at oracle.com wrote:
> On 2/13/18 10:37 AM, Tobias Hartmann wrote:
>
>> Hi Vladimir,
>>
>> On 13.02.2018 18:47, Vladimir Kozlov wrote:
>>> I think we should do realtime check for NULL and use this->in(0) (allocation's control edge) if Initialize node is
>>> missing which could happen.
>> Right, we should do that:
>> http://cr.openjdk.java.net/~thartmann/8197828/webrev.01/
>
> How would we test that it works correctly? I don't think we ever see a missing Initialize node here, otherwise we would
> have seen crashes.

I did see missing Initialize node when I worked on EA. Zero size array. We may not hit it in this code path because it is guarded by several conditions.
For example ClearArrayNode::step_through() has such check:

http://hg.openjdk.java.net/jdk/hs/file/be873d8c0114/src/hotspot/share/opto/memnode.cpp#l2813

and PhaseMacroExpand::expand_allocate_array()

http://hg.openjdk.java.net/jdk/hs/file/be873d8c0114/src/hotspot/share/opto/macro.cpp#l2022

Vladimir

>
> dl
>
>> Thanks,
>> Tobias
>

From dean.long at oracle.com Tue Feb 13 22:23:22 2018
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Tue, 13 Feb 2018 14:23:22 -0800
Subject: [11] RFR(XS): 8197828: Null pointer dereference in MultiNode::proj_out_or_null
In-Reply-To: <00f46bee-d8fa-12f5-8ff0-898d12db4684@oracle.com>
References: <008b533a-febe-98a2-f6b5-7344dc8ebdb1@oracle.com> <8b26f2f7-003b-2a23-b0d7-a2ffaa8d1c57@oracle.com> <498810c4-aea4-cf3e-c3db-66b9b5b0b8b8@oracle.com> <00f46bee-d8fa-12f5-8ff0-898d12db4684@oracle.com>
Message-ID: <04bb252c-aa4e-654d-0a6c-972bdb2bb8a2@oracle.com>

On 2/13/18 12:32 PM, Vladimir Kozlov wrote:
> On 2/13/18 11:22 AM, dean.long at oracle.com wrote:
>> On 2/13/18 10:37 AM, Tobias Hartmann wrote:
>>
>>> Hi Vladimir,
>>>
>>> On 13.02.2018 18:47, Vladimir Kozlov wrote:
>>>> I think we should do realtime check for NULL and use this->in(0)
>>>> (allocation's control edge) if Initialize node is
>>>> missing which could happen.
>>> Right, we should do that:
>>> http://cr.openjdk.java.net/~thartmann/8197828/webrev.01/
>>
>> How would we test that it works correctly? I don't think we ever see
>> a missing Initialize node here, otherwise we would have seen crashes.
>
> I did see missing Initialize node when I worked on EA. Zero size
> array. We may not hit it in this code path because it is guarded by
> several conditions.
>

If we can't hit it, then I vote for leaving it as an assert. If we can hit it, then we need a test to trigger that code path.
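Editor's sketch, not part of the original thread: the two options being weighed here (keep an assert, or check at run time and fall back to the allocation's control edge) can be contrasted with a toy model. The `Node` and `AllocNode` types below are hypothetical and heavily simplified; they are not the real C2 classes, and the member names only mimic `in(0)` and `initialization()`.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-ins for C2 IR nodes.
struct Node {
  Node* control = nullptr;  // plays the role of in(0), the control edge
};

struct AllocNode : Node {
  Node* init = nullptr;  // plays the role of initialization(); may be null

  // Option A: treat a missing Initialize node as a bug. This is only
  // checked in builds where assertions are enabled.
  Node* anchor_asserting() const {
    assert(init != nullptr && "Initialize node expected");
    return init;
  }

  // Option B: tolerate a missing Initialize node at run time and fall back
  // to the allocation's own control edge.
  Node* anchor_with_fallback() const {
    return init != nullptr ? init : control;
  }
};
```

Option B stays safe in product builds, but as noted in the thread, its fallback branch is only trustworthy if some test can actually reach it.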
dl

> For example ClearArrayNode::step_through() has such check:
>
> http://hg.openjdk.java.net/jdk/hs/file/be873d8c0114/src/hotspot/share/opto/memnode.cpp#l2813
>
>
> and PhaseMacroExpand::expand_allocate_array()
>
> http://hg.openjdk.java.net/jdk/hs/file/be873d8c0114/src/hotspot/share/opto/macro.cpp#l2022
>
>
> Vladimir
>
>>
>> dl
>>
>>> Thanks,
>>> Tobias
>>

From jcbeyler at google.com Wed Feb 14 00:11:35 2018
From: jcbeyler at google.com (JC Beyler)
Date: Tue, 13 Feb 2018 16:11:35 -0800
Subject: JDK-8171119: Low-Overhead Heap Profiling
In-Reply-To:
References: <5A819F10.8040201@oracle.com>
Message-ID:

Hi all,

Just to show a bit how to solve the one issue Erik was referring to, consider the following webrev:
http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.08a/

and incremental is here:
http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.08_08a/

This puts the sampling bytes left in the Thread class () and then the code goes to Thread to sample or not. The advantage of this is that it is probably simpler to understand and follow what is going on, there is less of internal tlab magic going on and the outside of tlab allocations go through the thread instance the same way the TLAB allocations do.

I think it's cleaner but what do you think?

Thanks!
Jc

On Mon, Feb 12, 2018 at 9:18 PM, JC Beyler wrote:

> Hi Erik,
>
> Thanks for your answers, I've now inlined my own answers/comments.
>
> I've done a new webrev here:
> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.08/
>
> The incremental is here:
> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/
>
> Note to all:
> - I've been integrating changes from Erin/Serguei/David comments so this
> webrev incremental is a bit an answer to all comments in one. I apologize
> for that :)
>
>
> On Mon, Feb 12, 2018 at 6:05 AM, Erik Österlund
> wrote:
>
>> Hi JC,
>>
>> Sorry for the delayed reply.
>>
>> Inlined answers:
>>
>>
>> On 2018-02-06 00:04, JC Beyler wrote:
>>
>>> Hi Erik,
>>>
>>> (Renaming this to be folded into the newly renamed thread :))
>>>
>>> First off, thanks a lot for reviewing the webrev! I appreciate it!
>>>
>>> I updated the webrev to:
>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/
>>>
>>> And the incremental one is here:
>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.04_05a/
>>>
>>> It contains:
>>> - The change for since from 9 to 11 for the jvmti.xml
>>> - The use of the OrderAccess for initialized
>>> - Clearing the oop
>>>
>>> I also have inlined my answers to your comments. The biggest question
>>> will come from the multiple *_end variables. A bit of the logic there
>>> is due to handling the slow path refill vs fast path refill and
>>> checking that the rug was not pulled underneath the slowpath. I
>>> believe that a previous comment was that TlabFastRefill was going to
>>> be deprecated.
>>>
>>> If this is true, we could revert this code a bit and just do a : if
>>> TlabFastRefill is enabled, disable this. And then deprecate that when
>>> TlabFastRefill is deprecated.
>>>
>>> This might simplify this webrev and I can work on a follow-up that
>>> either: removes TlabFastRefill if Robbin does not have the time to do
>>> it or add the support to the assembly side to handle this correctly.
>>> What do you think?
>>>
>>
>> I support removing TlabFastRefill, but I think it is good to not depend
>> on that happening first.
>>
>>
>
> I'm slowly pushing on the FastTLABRefill (https://bugs.openjdk.java.net
> /browse/JDK-8194084), I agree on keeping both separate for now though so
> that we can think of both differently
>
>
>
>> Now, below, inlined are my answers:
>>>
>>> On Fri, Feb 2, 2018 at 8:44 AM, Erik Österlund
>>> wrote:
>>>
>>>> Hi JC,
>>>>
>>>> Hope I am reviewing the right version of your work. Here goes...
>>>> >>>> src/hotspot/share/gc/shared/collectedHeap.inline.hpp: >>>> >>>> 159 AllocTracer::send_allocation_outside_tlab(klass, result, >>>> size * >>>> HeapWordSize, THREAD); >>>> 160 >>>> 161 THREAD->tlab().handle_sample(THREAD, result, size); >>>> 162 return result; >>>> 163 } >>>> >>>> Should not call tlab()->X without checking if (UseTLAB) IMO. >>>> >>>> Done! >>> >> >> More about this later. >> >> >> >>> src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp: >>>> >>>> So first of all, there seems to quite a few ends. There is an "end", a >>>> "hard >>>> end", a "slow path end", and an "actual end". Moreover, it seems like >>>> the >>>> "hard end" is actually further away than the "actual end". So the "hard >>>> end" >>>> seems like more of a "really definitely actual end" or something. I >>>> don't >>>> know about you, but I think it looks kind of messy. In particular, I >>>> don't >>>> feel like the name "actual end" reflects what it represents, especially >>>> when >>>> there is another end that is behind the "actual end". >>>> >>>> 413 HeapWord* ThreadLocalAllocBuffer::hard_end() { >>>> 414 // Did a fast TLAB refill occur? >>>> 415 if (_slow_path_end != _end) { >>>> 416 // Fix up the actual end to be now the end of this TLAB. >>>> 417 _slow_path_end = _end; >>>> 418 _actual_end = _end; >>>> 419 } >>>> 420 >>>> 421 return _actual_end + alignment_reserve(); >>>> 422 } >>>> >>>> I really do not like making getters unexpectedly have these kind of side >>>> effects. It is not expected that when you ask for the "hard end", you >>>> implicitly update the "slow path end" and "actual end" to new values. >>>> >>>> As I said, a lot of this is due to the FastTlabRefill. If I make this >>> not supporting FastTlabRefill, this goes away. The reason the system >>> needs to update itself at the get is that you only know at that get if >>> things have shifted underneath the tlab slow path. 
I am not sure of >>> really better names (naming is hard!), perhaps we could do these >>> names: >>> >>> - current_tlab_end // Either the allocated tlab end or a sampling >>> point >>> - last_allocation_address // The end of the tlab allocation >>> - last_slowpath_allocated_end // In case a fast refill occurred the >>> end might have changed, this is to remember slow vs fast past refills >>> >>> the hard_end method can be renamed to something like: >>> tlab_end_pointer() // The end of the lab including a bit of >>> alignment reserved bytes >>> >> >> Those names sound better to me. Could you please provide a mapping from >> the old names to the new names so I understand which one is which please? >> >> This is my current guess of what you are proposing: >> >> end -> current_tlab_end >> actual_end -> last_allocation_address >> slow_path_end -> last_slowpath_allocated_end >> hard_end -> tlab_end_pointer >> >> > Yes that is correct, that was what I was proposing. > > >> I would prefer this naming: >> >> end -> slow_path_end // the end for taking a slow path; either due to >> sampling or refilling >> actual_end -> allocation_end // the end for allocations >> slow_path_end -> last_slow_path_end // last address for slow_path_end (as >> opposed to allocation_end) >> hard_end -> reserved_end // the end of the reserved space of the TLAB >> >> About setting things in the getter... that still seems like a very >> unpleasant thing to me. It would be better to inspect the call hierarchy >> and explicitly update the ends where they need updating, and assert in the >> getter that they are in sync, rather than implicitly setting various ends >> as a surprising side effect in a getter. It looks like the call hierarchy >> is very small. 
With my new naming convention, reserved_end() would >> presumably return _allocation_end + alignment_reserve(), and have an assert >> checking that _allocation_end == _last_slow_path_allocation_end, >> complaining that this invariant must hold, and that a caller to this >> function, such as make_parsable(), must first explicitly synchronize the >> ends as required, to honor that invariant. >> >> > > I've renamed the variables to how you preferred it except for the _end > one. I did: > current_end > last_allocation_address > tlab_end_ptr > > The reason is that the architecture dependent code use the thread.hpp API > and it already has tlab included into the name so it becomes > tlab_current_end (which is better that tlab_current_tlab_end in my opinion). > > I also moved the update into a separate method with a TODO that says to > remove it when FastTLABRefill is deprecated > > > >> >> Not sure it's better but before updating the webrev, I wanted to try >>> to get input/consensus :) >>> >>> (Note hard_end was always further off than end). >>> >>> src/hotspot/share/prims/jvmti.xml: >>>> >>>> 10357 >>>> 10358 >>>> 10359 Can sample the heap. >>>> 10360 If this capability is enabled then the heap sampling >>>> methods >>>> can be called. >>>> 10361 >>>> 10362 >>>> >>>> Looks like this capability should not be "since 9" if it gets integrated >>>> now. >>>> >>> Updated now to 11, crossing my fingers :) >>> >>> >>> src/hotspot/share/runtime/heapMonitoring.cpp: >>>> >>>> 448 if (is_alive->do_object_b(value)) { >>>> 449 // Update the oop to point to the new object if it is >>>> still >>>> alive. >>>> 450 f->do_oop(&(trace.obj)); >>>> 451 >>>> 452 // Copy the old trace, if it is still live. >>>> 453 _allocated_traces->at_put(curr_pos++, trace); >>>> 454 >>>> 455 // Store the live trace in a cache, to be served up on >>>> /heapz. 
>>>> 456 _traces_on_last_full_gc->append(trace); >>>> 457 >>>> 458 count++; >>>> 459 } else { >>>> 460 // If the old trace is no longer live, add it to the list >>>> of >>>> 461 // recently collected garbage. >>>> 462 store_garbage_trace(trace); >>>> 463 } >>>> >>>> In the case where the oop was not live, I would like it to be explicitly >>>> cleared. >>>> >>> Done I think how you wanted it. Let me know because I'm not familiar >>> with the RootAccess API. I'm unclear if I'm doing this right or not so >>> reviews of these parts are highly appreciated. Robbin had talked of >>> perhaps later pushing this all into a OopStorage, should I do this now >>> do you think? Or can that wait a second webrev later down the road? >>> >> >> I think using handles can and should be done later. You can use the >> Access API now. >> I noticed that you are missing an #include "oops/access.inline.hpp" in >> your heapMonitoring.cpp file. >> >> > The missing header is there for me so I don't know, I made sure it is > present in the latest webrev. Sorry about that. > > > >> + Did I clear it the way you wanted me to or were you thinking of >>> something else? >>> >> >> That is precisely how I wanted it to be cleared. Thanks. >> >> + Final question here, seems like if I were to want to not do the >>> f->do_oop directly on the trace.obj, I'd need to do something like: >>> >>> f->do_oop(&value); >>> ... >>> trace->store_oop(value); >>> >>> to update the oop internally. Is that right/is that one of the >>> advantages of going to the Oopstorage sooner than later? >>> >> >> I think you really want to do the do_oop on the root directly. Is there a >> particular reason why you would not want to do that? >> Otherwise, yes - the benefit with using the handle approach is that you >> do not need to call do_oop explicitly in your code. >> >> > There is no reason except that now we have a load_oop and a get_oop_addr, > I was not sure what you would think of that. 
> > > > >> >>> Also I see a lot of concurrent-looking use of the following field: >>>> 267 volatile bool _initialized; >>>> >>>> Please note that the "volatile" qualifier does not help with reordering >>>> here. Reordering between volatile and non-volatile fields is completely >>>> free >>>> for both compiler and hardware, except for windows with MSVC, where >>>> volatile >>>> semantics is defined to use acquire/release semantics, and the hardware >>>> is >>>> TSO. But for the general case, I would expect this field to be stored >>>> with >>>> OrderAccess::release_store and loaded with OrderAccess::load_acquire. >>>> Otherwise it is not thread safe. >>>> >>> Because everything is behind a mutex, I wasn't really worried about >>> this. I have a test that has multiple threads trying to hit this >>> corner case and it passes. >>> >>> However, to be paranoid, I updated it to using the OrderAccess API >>> now, thanks! Let me know what you think there too! >>> >> >> If it is indeed always supposed to be read and written under a mutex, >> then I would strongly prefer to have it accessed as a normal non-volatile >> member, and have an assertion that given lock is held or we are in a >> safepoint, as we do in many other places. Something like this: >> >> assert(HeapMonitorStorage_lock->owned_by_self() || >> (SafepointSynchronize::is_at_safepoint() && >> Thread::current()->is_VM_thread()), "this should not be accessed >> concurrently"); >> >> It would be confusing to people reading the code if there are uses of >> OrderAccess that are actually always protected under a mutex. >> >> > Thank you for the exact example to be put in the code! I put it around > each access/assignment of the _initialized method and found one case where > yes you can touch it and not have the lock. It actually is "ok" because you > don't act on the storage until later and only when you really want to > modify the storage (see the object_alloc_do_sample method which calls the > add_trace method). 
> > But, because of this, I'm going to put the OrderAccess here, I'll do some > performance numbers later and if there are issues, I might add a "unsafe" > read and a "safe" one to make it explicit to the reader. But I don't think > it will come to that. > > > >> As a kind of meta comment, I wonder if it would make sense to add sampling >>>> for non-TLAB allocations. Seems like if someone is rapidly allocating a >>>> whole bunch of 1 MB objects that never fit in a TLAB, I might still be >>>> interested in seeing that in my traces, and not get surprised that the >>>> allocation rate is very high yet not showing up in any profiles. >>>> >>>> That is handled by the handle_sample where you wanted me to put a >>> UseTlab because you hit that case if the allocation is too big. >>> >> >> I see. It was not obvious to me that non-TLAB sampling is done in the >> TLAB class. That seems like an abstraction crime. >> What I wanted in my previous comment was that we do not call into the >> TLAB when we are not using TLABs. If there is sampling logic in the TLAB >> that is used for something else than TLABs, then it seems like that logic >> simply does not belong inside of the TLAB. It should be moved out of the >> TLAB, and instead have the TLAB call this common abstraction that makes >> sense. >> >> > So in the incremental version: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/, this is still > a "crime". The reason is that the system has to have the bytes_until_sample > on a per-thread level and it made "sense" to have it with the TLAB > implementation. Also, I was not sure how people felt about adding something > to the thread instance instead. > > Do you think it fits better at the Thread level? I can see how difficult > it is to make it happen there and add some logic there. Let me know what > you think. > > > >> Hope I have answered your questions and that my feedback makes sense to >> you. 
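The question of where `bytes_until_sample` should live can be made concrete with a small sketch. The class below is hypothetical (the name `AllocationSampler`, its methods, and the fixed sampling interval are all assumptions for illustration, not code from the webrev). It only shows the shape of a per-thread countdown that both the TLAB path and the outside-of-TLAB path could call, so that the sampling logic does not have to sit inside the TLAB class.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-thread sampling state kept outside the TLAB class.
// A fixed interval is assumed here purely to keep the sketch short.
class AllocationSampler {
 public:
  explicit AllocationSampler(size_t interval_bytes)
      : _interval(interval_bytes), _bytes_until_sample(interval_bytes) {}

  // Called for every allocation the thread makes, inside a TLAB or not.
  // Returns true when this allocation crosses the sampling threshold.
  bool record_allocation(size_t size_in_bytes) {
    if (size_in_bytes < _bytes_until_sample) {
      _bytes_until_sample -= size_in_bytes;
      return false;
    }
    // Threshold reached: sample this allocation and start a new interval.
    _bytes_until_sample = _interval;
    return true;
  }

  size_t bytes_until_sample() const { return _bytes_until_sample; }

 private:
  size_t _interval;            // bytes between two samples
  size_t _bytes_until_sample;  // countdown to the next sample
};
```

With this shape, a large allocation that never fits in a TLAB still decrements (and usually trips) the same counter, which is the behaviour asked for in the non-TLAB sampling comment.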
>> >> > You have and thank you for them, I think we are getting to a cleaner > implementation and things are getting better and more readable :) > > Thanks for your help! > Jc > > > >> Thanks, >> /Erik >> >> >> I double checked by changing the test >>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/raw_ >>> files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonito >>> r/MyPackage/HeapMonitorStatObjectCorrectnessTest.java >>> >>> to use a smaller Tlab (2048) and made the object bigger and it goes >>> through that and passes. >>> >>> Thanks again for your review and I look forward to your pointers for >>> the questions I now have raised! >>> Jc >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> Thanks, >>>> /Erik >>>> >>>> >>>> On 2018-01-26 06:45, JC Beyler wrote: >>>> >>>>> Thanks Robbin for the reviews :) >>>>> >>>>> The new full webrev is here: >>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/ >>>>> The incremental webrev is here: >>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/ >>>>> >>>>> I inlined my answers: >>>>> >>>>> On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn >>>>> wrote: >>>>> >>>>>> Hi JC, great to see another revision! >>>>>> >>>>>> #### >>>>>> heapMonitoring.cpp >>>>>> >>>>>> StackTraceData should not contain the oop for 'safety' reasons. >>>>>> When StackTraceData is moved from _allocated_traces: >>>>>> L452 store_garbage_trace(trace); >>>>>> it contains a dead oop. >>>>>> _allocated_traces could instead be a tupel of oop and StackTraceData >>>>>> thus >>>>>> dead oops are not kept. >>>>>> >>>>> Done I used inheritance to make the copier work regardless but the >>>>> idea is the same. >>>>> >>>>> You should use the new Access API for loading the oop, something like >>>>>> this: >>>>>> RootAccess::load(...) >>>>>> I don't think you need to use Access API for clearing the oop, but it >>>>>> would >>>>>> look nicer. 
And you shouldn't probably be using: >>>>>> Universe::heap()->is_in_reserved(value) >>>>>> >>>>> I am unfamiliar with this but I think I did do it like you wanted me >>>>> to (all tests pass so that's a start). I'm not sure how to clear the >>>>> oop exactly, is there somewhere that does that, which I can use to do >>>>> the same? >>>>> >>>>> I removed the is_in_reserved, this came from our internal version, I >>>>> don't know why it was there but my tests work without so I removed it >>>>> :) >>>>> >>>>> >>>>> The lock: >>>>>> L424 MutexLocker mu(HeapMonitorStorage_lock); >>>>>> Is not needed as far as I can see. >>>>>> weak_oops_do is called in a safepoint, no TLAB allocation can happen >>>>>> and >>>>>> JVMTI thread can't access these data-structures. Is there something >>>>>> more >>>>>> to >>>>>> this lock that I'm missing? >>>>>> >>>>> Since a thread can call the JVMTI getLiveTraces (or any of the other >>>>> ones), it can get to the point of trying to copying the >>>>> _allocated_traces. I imagine it is possible that this is happening >>>>> during a GC or that it can be started and a GC happens afterwards. >>>>> Therefore, it seems to me that you want this protected, no? >>>>> >>>>> >>>>> #### >>>>>> You have 6 files without any changes in them (any more): >>>>>> g1CollectedHeap.cpp >>>>>> psMarkSweep.cpp >>>>>> psParallelCompact.cpp >>>>>> genCollectedHeap.cpp >>>>>> referenceProcessor.cpp >>>>>> thread.hpp >>>>>> >>>>>> Done. >>>>> >>>>> #### >>>>>> I have not looked closely, but is it possible to hide heap sampling in >>>>>> AllocTracer ? (with some minor changes to the AllocTracer API) >>>>>> >>>>>> I am imagining that you are saying to move the code that does the >>>>> sampling code (change the tlab end, do the call to HeapMonitoring, >>>>> etc.) into the AllocTracer code itself? I think that is right and I'll >>>>> look if that is possible and prepare a webrev to show what would be >>>>> needed to make that happen. 
>>>>> >>>>> #### >>>>>> Minor nit, when declaring pointer there is a little mix of having the >>>>>> pointer adjacent by type name and data name. (Most hotspot code is by >>>>>> type >>>>>> name) >>>>>> E.g. >>>>>> heapMonitoring.cpp:711 jvmtiStackTrace *trace = .... >>>>>> heapMonitoring.cpp:733 Method* m = vfst.method(); >>>>>> (not just this file) >>>>>> >>>>>> Done! >>>>> >>>>> #### >>>>>> HeapMonitorThreadOnOffTest.java:77 >>>>>> I would make g_tmp volatile, otherwise the assignment in loop may >>>>>> theoretical be skipped. >>>>>> >>>>>> Also done! >>>>> >>>>> Thanks again! >>>>> Jc >>>>> >>>> >>>> >> > From robbin.ehn at oracle.com Wed Feb 14 08:13:02 2018 From: robbin.ehn at oracle.com (Robbin Ehn) Date: Wed, 14 Feb 2018 09:13:02 +0100 Subject: JDK-8171119: Low-Overhead Heap Profiling In-Reply-To: References: <5A819F10.8040201@oracle.com>

Message-ID: <88a73c1c-0c11-fe23-b07e-eec5aa799450@oracle.com> Hi JC, Adding Markus Grönlund. On 2018-02-14 01:11, JC Beyler wrote: > Hi all, > > Just to show a bit how to solve the one issue Erik was referring to, consider the following webrev: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.08a/ > > and incremental is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.08_08a/ We have a long-standing issue of adding stuff directly to the thread. Please put tracing stuff into a separate data structure and file. Hoping that this new structure can contain this and thread specific parts of upcoming tracing framework. (Markus?) Maybe follow threadSMR, which is contained in: open/src/hotspot/share/runtime/threadSMR.cpp open/src/hotspot/share/runtime/threadSMR.inline.hpp open/src/hotspot/share/runtime/threadSMR.hpp And create threadTracing.XXX or similar? Thanks, Robbin > > This puts the sampling bytes left in the Thread class () and then the code goes to Thread to sample or not. The advantage of this is that it is probably simpler to understand and follow what is going on, there is less of internal tlab magic going on and the outside of tlab allocations go through the thread instance the same way the TLAB allocations do. > > I think it's cleaner but what do you think? > > Thanks! > Jc > > On Mon, Feb 12, 2018 at 9:18 PM, JC Beyler > wrote: > > Hi Erik, > > Thanks for your answers, I've now inlined my own answers/comments. > > I've done a new webrev here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.08/ > > The incremental is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/ > > Note to all: > - I've been integrating changes from Erin/Serguei/David comments so this webrev incremental is a bit an answer to all comments in one. I apologize for that :) > > > On Mon, Feb 12, 2018 at 6:05 AM, Erik Österlund > wrote: > > Hi JC, > > Sorry for the delayed reply.
> > Inlined answers: > > > On 2018-02-06 00:04, JC Beyler wrote: > > Hi Erik, > > (Renaming this to be folded into the newly renamed thread :)) > > First off, thanks a lot for reviewing the webrev! I appreciate it! > > I updated the webrev to: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/ > > And the incremental one is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.04_05a/ > > It contains: > - The change for since from 9 to 11 for the jvmti.xml > - The use of the OrderAccess for initialized > - Clearing the oop > > I also have inlined my answers to your comments. The biggest question > will come from the multiple *_end variables. A bit of the logic there > is due to handling the slow path refill vs fast path refill and > checking that the rug was not pulled underneath the slowpath. I > believe that a previous comment was that TlabFastRefill was going to > be deprecated. > > If this is true, we could revert this code a bit and just do a : if > TlabFastRefill is enabled, disable this. And then deprecate that when > TlabFastRefill is deprecated. > > This might simplify this webrev and I can work on a follow-up that > either: removes TlabFastRefill if Robbin does not have the time to do > it or add the support to the assembly side to handle this correctly. > What do you think? > > > I support removing TlabFastRefill, but I think it is good to not depend on that happening first. > > > > I'm slowly pushing on the FastTLABRefill (https://bugs.openjdk.java.net/browse/JDK-8194084), I agree on keeping both separate for now though so that we can think of both differently > > Now, below, inlined are my answers: > > On Fri, Feb 2, 2018 at 8:44 AM, Erik Österlund > > wrote: > > Hi JC, > > Hope I am reviewing the right version of your work. Here goes... > > src/hotspot/share/gc/shared/collectedHeap.inline.hpp: > > 159 AllocTracer::send_allocation_outside_tlab(klass, result, size * > HeapWordSize, THREAD); > 160 > 161
THREAD->tlab().handle_sample(THREAD, result, size); > 162 return result; > 163 } > > Should not call tlab()->X without checking if (UseTLAB) IMO. > > Done! > > > More about this later. > > > > src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp: > > So first of all, there seem to be quite a few ends. There is an "end", a "hard > end", a "slow path end", and an "actual end". Moreover, it seems like the > "hard end" is actually further away than the "actual end". So the "hard end" > seems like more of a "really definitely actual end" or something. I don't > know about you, but I think it looks kind of messy. In particular, I don't > feel like the name "actual end" reflects what it represents, especially when > there is another end that is behind the "actual end". > > 413 HeapWord* ThreadLocalAllocBuffer::hard_end() { > 414 // Did a fast TLAB refill occur? > 415 if (_slow_path_end != _end) { > 416 // Fix up the actual end to be now the end of this TLAB. > 417 _slow_path_end = _end; > 418 _actual_end = _end; > 419 } > 420 > 421 return _actual_end + alignment_reserve(); > 422 } > > I really do not like making getters unexpectedly have these kinds of side > effects. It is not expected that when you ask for the "hard end", you > implicitly update the "slow path end" and "actual end" to new values. > > As I said, a lot of this is due to the FastTlabRefill. If I make this > not supporting FastTlabRefill, this goes away. The reason the system > needs to update itself at the get is that you only know at that get if > things have shifted underneath the tlab slow path. I am not sure of > really better names (naming is hard!), perhaps we could do these > names: > > - current_tlab_end // Either the allocated tlab end or a sampling point > - last_allocation_address // The end of the tlab allocation > - last_slowpath_allocated_end
// In case a fast refill occurred the > end might have changed, this is to remember slow vs fast path refills > > the hard_end method can be renamed to something like: > tlab_end_pointer() // The end of the tlab including a bit of > alignment reserved bytes > > > Those names sound better to me. Could you please provide a mapping from the old names to the new names so I understand which one is which please? > > This is my current guess of what you are proposing: > > end -> current_tlab_end > actual_end -> last_allocation_address > slow_path_end -> last_slowpath_allocated_end > hard_end -> tlab_end_pointer > > > Yes that is correct, that was what I was proposing. > > I would prefer this naming: > > end -> slow_path_end // the end for taking a slow path; either due to sampling or refilling > actual_end -> allocation_end // the end for allocations > slow_path_end -> last_slow_path_end // last address for slow_path_end (as opposed to allocation_end) > hard_end -> reserved_end // the end of the reserved space of the TLAB > > About setting things in the getter... that still seems like a very unpleasant thing to me. It would be better to inspect the call hierarchy and explicitly update the ends where they need updating, and assert in the getter that they are in sync, rather than implicitly setting various ends as a surprising side effect in a getter. It looks like the call hierarchy is very small. With my new naming convention, reserved_end() would presumably return _allocation_end + alignment_reserve(), and have an assert checking that _allocation_end == _last_slow_path_allocation_end, complaining that this > invariant must hold, and that a caller to this function, such as make_parsable(), must first explicitly synchronize the ends as required, to honor that invariant. > > > > I've renamed the variables to how you preferred it except for the _end one.
I did: > current_end > last_allocation_address > tlab_end_ptr > > The reason is that the architecture dependent code uses the thread.hpp API and it already has tlab included into the name so it becomes tlab_current_end (which is better than tlab_current_tlab_end in my opinion). > > I also moved the update into a separate method with a TODO that says to remove it when FastTLABRefill is deprecated > > > Not sure it's better but before updating the webrev, I wanted to try > to get input/consensus :) > > (Note hard_end was always further off than end). > > src/hotspot/share/prims/jvmti.xml: > > 10357 > 10358 > 10359 Can sample the heap. > 10360 If this capability is enabled then the heap sampling methods > can be called. > 10361 > 10362 > > Looks like this capability should not be "since 9" if it gets integrated > now. > > Updated now to 11, crossing my fingers :) > > > src/hotspot/share/runtime/heapMonitoring.cpp: > > 448 if (is_alive->do_object_b(value)) { > 449 // Update the oop to point to the new object if it is still > alive. > 450 f->do_oop(&(trace.obj)); > 451 > 452 // Copy the old trace, if it is still live. > 453 _allocated_traces->at_put(curr_pos++, trace); > 454 > 455 // Store the live trace in a cache, to be served up on /heapz. > 456 _traces_on_last_full_gc->append(trace); > 457 > 458 count++; > 459 } else { > 460 // If the old trace is no longer live, add it to the list of > 461 // recently collected garbage. > 462 store_garbage_trace(trace); > 463 } > > In the case where the oop was not live, I would like it to be explicitly > cleared. > > Done I think how you wanted it. Let me know because I'm not familiar > with the RootAccess API. I'm unclear if I'm doing this right or not so > reviews of these parts are highly appreciated.
Robbin had talked of > perhaps later pushing this all into an OopStorage, should I do this now > do you think? Or can that wait a second webrev later down the road? > > > I think using handles can and should be done later. You can use the Access API now. > I noticed that you are missing an #include "oops/access.inline.hpp" in your heapMonitoring.cpp file. > > > The missing header is there for me so I don't know, I made sure it is present in the latest webrev. Sorry about that. > > > + Did I clear it the way you wanted me to or were you thinking of > something else? > > > That is precisely how I wanted it to be cleared. Thanks. > > + Final question here, seems like if I were to want to not do the > f->do_oop directly on the trace.obj, I'd need to do something like: > > f->do_oop(&value); > ... > trace->store_oop(value); > > to update the oop internally. Is that right/is that one of the > advantages of going to the OopStorage sooner than later? > > > I think you really want to do the do_oop on the root directly. Is there a particular reason why you would not want to do that? > Otherwise, yes - the benefit with using the handle approach is that you do not need to call do_oop explicitly in your code. > > > There is no reason except that now we have a load_oop and a get_oop_addr, I was not sure what you would think of that. > > > > Also I see a lot of concurrent-looking use of the following field: > 267 volatile bool _initialized; > > Please note that the "volatile" qualifier does not help with reordering > here. Reordering between volatile and non-volatile fields is completely free > for both compiler and hardware, except for Windows with MSVC, where volatile > semantics is defined to use acquire/release semantics, and the hardware is > TSO. But for the general case, I would expect this field to be stored with > OrderAccess::release_store and loaded with OrderAccess::load_acquire. > Otherwise it is not thread safe.
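The release/acquire pairing described above can be illustrated outside HotSpot with standard C++ atomics. This is only an analogy under stated assumptions: `std::atomic` with `memory_order_release` / `memory_order_acquire` stands in for `OrderAccess::release_store` / `OrderAccess::load_acquire`, and the `Storage` struct with its `max_traces` field is a hypothetical stand-in, not the heap monitoring storage from the webrev.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for lazily initialized profiler storage.
struct Storage {
  int max_traces = 0;                    // plain, non-atomic payload
  std::atomic<bool> initialized{false};  // publication flag

  void initialize(int n) {
    max_traces = n;                                      // 1. write the payload
    initialized.store(true, std::memory_order_release);  // 2. publish it
  }

  // Returns the payload if it has been published, or -1 otherwise.
  int read_if_initialized() const {
    // The acquire load pairs with the release store above: a reader that
    // observes 'true' also observes the earlier write to max_traces.
    if (initialized.load(std::memory_order_acquire)) {
      return max_traces;
    }
    return -1;
  }
};
```

A plain `volatile bool` gives no such ordering guarantee between the flag and the payload in the general case, which is the point of the review comment. If every access really happens under a lock, neither form is needed and a lock-held assertion documents the intent better.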
> > Because everything is behind a mutex, I wasn't really worried about > this. I have a test that has multiple threads trying to hit this > corner case and it passes. > > However, to be paranoid, I updated it to using the OrderAccess API > now, thanks! Let me know what you think there too! > > > If it is indeed always supposed to be read and written under a mutex, then I would strongly prefer to have it accessed as a normal non-volatile member, and have an assertion that given lock is held or we are in a safepoint, as we do in many other places. Something like this: > > assert(HeapMonitorStorage_lock->owned_by_self() || (SafepointSynchronize::is_at_safepoint() && Thread::current()->is_VM_thread()), "this should not be accessed concurrently"); > > It would be confusing to people reading the code if there are uses of OrderAccess that are actually always protected under a mutex. > > > Thank you for the exact example to be put in the code! I put it around each access/assignment of the _initialized method and found one case where yes you can touch it and not have the lock. It actually is "ok" because you don't act on the storage until later and only when you really want to modify the storage (see the object_alloc_do_sample method which calls the add_trace method). > > But, because of this, I'm going to put the OrderAccess here, I'll do some performance numbers later and if there are issues, I might add a "unsafe" read and a "safe" one to make it explicit to the reader. But I don't think it will come to that. > > As a kind of meta comment, I wonder if it would make sense to add sampling > for non-TLAB allocations. Seems like if someone is rapidly allocating a > whole bunch of 1 MB objects that never fit in a TLAB, I might still be > interested in seeing that in my traces, and not get surprised that the > allocation rate is very high yet not showing up in any profiles. 
> > That is handled by the handle_sample where you wanted me to put a > UseTlab because you hit that case if the allocation is too big. > > > I see. It was not obvious to me that non-TLAB sampling is done in the TLAB class. That seems like an abstraction crime. > What I wanted in my previous comment was that we do not call into the TLAB when we are not using TLABs. If there is sampling logic in the TLAB that is used for something else than TLABs, then it seems like that logic simply does not belong inside of the TLAB. It should be moved out of the TLAB, and instead have the TLAB call this common abstraction that makes sense. > > > So in the incremental version: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/ , this is still a "crime". The reason is that the system has to have the bytes_until_sample on a per-thread level and it made "sense" to have it with the TLAB implementation. Also, I was not sure how people felt about adding something to the thread instance instead. > > Do you think it fits better at the Thread level? I can see how difficult it is to make it happen there and add some logic there. Let me know what you think. > > Hope I have answered your questions and that my feedback makes sense to you. > > > You have and thank you for them, I think we are getting to a cleaner implementation and things are getting better and more readable :) > > Thanks for your help! > Jc > > Thanks, > /Erik > > > I double checked by changing the test > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatObjectCorrectnessTest.java > > to use a smaller Tlab (2048) and made the object bigger and it goes > through that and passes. > > Thanks again for your review and I look forward to your pointers for > the questions I now have raised! 
> Jc > > > > > > > > > > Thanks, > /Erik > > > On 2018-01-26 06:45, JC Beyler wrote: > > Thanks Robbin for the reviews :) > > The new full webrev is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/ > The incremental webrev is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/ > > I inlined my answers: > > On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn > wrote: > > Hi JC, great to see another revision! > > #### > heapMonitoring.cpp > > StackTraceData should not contain the oop for 'safety' reasons. > When StackTraceData is moved from _allocated_traces: > L452 store_garbage_trace(trace); > it contains a dead oop. > _allocated_traces could instead be a tuple of oop and StackTraceData thus > dead oops are not kept. > > Done I used inheritance to make the copier work regardless but the > idea is the same. > > You should use the new Access API for loading the oop, something like > this: > RootAccess::load(...) > I don't think you need to use Access API for clearing the oop, but it > would > look nicer. And you probably shouldn't be using: > Universe::heap()->is_in_reserved(value) > > I am unfamiliar with this but I think I did do it like you wanted me > to (all tests pass so that's a start). I'm not sure how to clear the > oop exactly, is there somewhere that does that, which I can use to do > the same? > > I removed the is_in_reserved, this came from our internal version, I > don't know why it was there but my tests work without so I removed it > :) > > > The lock: > L424 MutexLocker mu(HeapMonitorStorage_lock); > Is not needed as far as I can see. > weak_oops_do is called in a safepoint, no TLAB allocation can happen and > JVMTI thread can't access these data-structures. Is there something more > to > this lock that I'm missing? > > Since a thread can call the JVMTI getLiveTraces (or any of the other > ones), it can get to the point of trying to copy the > _allocated_traces.
I imagine it is possible that this is happening > during a GC or that it can be started and a GC happens afterwards. > Therefore, it seems to me that you want this protected, no? > > > #### > You have 6 files without any changes in them (any more): > g1CollectedHeap.cpp > psMarkSweep.cpp > psParallelCompact.cpp > genCollectedHeap.cpp > referenceProcessor.cpp > thread.hpp > > Done. > > #### > I have not looked closely, but is it possible to hide heap sampling in > AllocTracer? (with some minor changes to the AllocTracer API) > > I am imagining that you are saying to move the code that does the > sampling code (change the tlab end, do the call to HeapMonitoring, > etc.) into the AllocTracer code itself? I think that is right and I'll > look if that is possible and prepare a webrev to show what would be > needed to make that happen. > > #### > Minor nit, when declaring pointer there is a little mix of having the > pointer adjacent by type name and data name. (Most hotspot code is by > type > name) > E.g. > heapMonitoring.cpp:711 jvmtiStackTrace *trace = .... > heapMonitoring.cpp:733 Method* m = vfst.method(); > (not just this file) > > Done! > > #### > HeapMonitorThreadOnOffTest.java:77 > I would make g_tmp volatile, otherwise the assignment in loop may > theoretically be skipped. > > Also done! > > Thanks again!
> Jc > > > > > From tobias.hartmann at oracle.com Wed Feb 14 08:25:35 2018 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 14 Feb 2018 09:25:35 +0100 Subject: [11] RFR(XS): 8197828: Null pointer dereference in MultiNode::proj_out_or_null In-Reply-To: <04bb252c-aa4e-654d-0a6c-972bdb2bb8a2@oracle.com> References: <008b533a-febe-98a2-f6b5-7344dc8ebdb1@oracle.com> <8b26f2f7-003b-2a23-b0d7-a2ffaa8d1c57@oracle.com> <498810c4-aea4-cf3e-c3db-66b9b5b0b8b8@oracle.com> <00f46bee-d8fa-12f5-8ff0-898d12db4684@oracle.com> <04bb252c-aa4e-654d-0a6c-972bdb2bb8a2@oracle.com> Message-ID: <3867b86f-88f9-b98c-4a85-77d385418369@oracle.com> Hi Dean, On 13.02.2018 23:23, dean.long at oracle.com wrote: > If we can't hit it, then I vote for leaving it as an assert. If we can hit it, then we need a test to trigger that code > path. I don't think we can easily trigger this with a test (after all it never showed up). Vladimir, are you fine with leaving the assert for now? Thanks, Tobias From erik.osterlund at oracle.com Wed Feb 14 10:51:24 2018 From: erik.osterlund at oracle.com (Erik Österlund) Date: Wed, 14 Feb 2018 11:51:24 +0100 Subject: JDK-8171119: Low-Overhead Heap Profiling In-Reply-To: References: <5A819F10.8040201@oracle.com> Message-ID: <5A8414AC.3020209@oracle.com> Hi JC, Comments are inlined below. On 2018-02-13 06:18, JC Beyler wrote: > Hi Erik, > > Thanks for your answers, I've now inlined my own answers/comments. > > I've done a new webrev here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.08/ > > > The incremental is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/ > > > Note to all: > - I've been integrating changes from Erin/Serguei/David comments so > this webrev incremental is a bit an answer to all comments in one. I > apologize for that :) > > > On Mon, Feb 12, 2018 at 6:05 AM, Erik Österlund > > wrote: > > Hi JC, > > Sorry for the delayed reply.
> > Inlined answers: > > > On 2018-02-06 00:04, JC Beyler wrote: > > Hi Erik, > > (Renaming this to be folded into the newly renamed thread :)) > > First off, thanks a lot for reviewing the webrev! I appreciate it! > > I updated the webrev to: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/ > > > And the incremental one is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.04_05a/ > > > It contains: > - The change for since from 9 to 11 for the jvmti.xml > - The use of the OrderAccess for initialized > - Clearing the oop > > I also have inlined my answers to your comments. The biggest > question > will come from the multiple *_end variables. A bit of the > logic there > is due to handling the slow path refill vs fast path refill and > checking that the rug was not pulled underneath the slowpath. I > believe that a previous comment was that TlabFastRefill was > going to > be deprecated. > > If this is true, we could revert this code a bit and just do a > : if > TlabFastRefill is enabled, disable this. And then deprecate > that when > TlabFastRefill is deprecated. > > This might simplify this webrev and I can work on a follow-up that > either: removes TlabFastRefill if Robbin does not have the > time to do > it or add the support to the assembly side to handle this > correctly. > What do you think? > > > I support removing TlabFastRefill, but I think it is good to not > depend on that happening first. > > > > I'm slowly pushing on the FastTLABRefill > (https://bugs.openjdk.java.net/browse/JDK-8194084 > ), I agree on > keeping both separate for now though so that we can think of both > differently > > Now, below, inlined are my answers: > > On Fri, Feb 2, 2018 at 8:44 AM, Erik Österlund > > > wrote: > > Hi JC, > > Hope I am reviewing the right version of your work. Here > goes...
> > src/hotspot/share/gc/shared/collectedHeap.inline.hpp: > > 159 AllocTracer::send_allocation_outside_tlab(klass, > result, size * > HeapWordSize, THREAD); > 160 > 161 THREAD->tlab().handle_sample(THREAD, result, size); > 162 return result; > 163 } > > Should not call tlab()->X without checking if (UseTLAB) IMO. > > Done! > > > More about this later. > > > > src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp: > > So first of all, there seems to quite a few ends. There is > an "end", a "hard > end", a "slow path end", and an "actual end". Moreover, it > seems like the > "hard end" is actually further away than the "actual end". > So the "hard end" > seems like more of a "really definitely actual end" or > something. I don't > know about you, but I think it looks kind of messy. In > particular, I don't > feel like the name "actual end" reflects what it > represents, especially when > there is another end that is behind the "actual end". > > 413 HeapWord* ThreadLocalAllocBuffer::hard_end() { > 414 // Did a fast TLAB refill occur? > 415 if (_slow_path_end != _end) { > 416 // Fix up the actual end to be now the end of > this TLAB. > 417 _slow_path_end = _end; > 418 _actual_end = _end; > 419 } > 420 > 421 return _actual_end + alignment_reserve(); > 422 } > > I really do not like making getters unexpectedly have > these kind of side > effects. It is not expected that when you ask for the > "hard end", you > implicitly update the "slow path end" and "actual end" to > new values. > > As I said, a lot of this is due to the FastTlabRefill. If I > make this > not supporting FastTlabRefill, this goes away. The reason the > system > needs to update itself at the get is that you only know at > that get if > things have shifted underneath the tlab slow path. 
I am not > sure of > really better names (naming is hard!), perhaps we could do these > names: > > - current_tlab_end // Either the allocated tlab end or a > sampling point > - last_allocation_address // The end of the tlab allocation > - last_slowpath_allocated_end // In case a fast refill > occurred the > end might have changed, this is to remember slow vs fast past > refills > > the hard_end method can be renamed to something like: > tlab_end_pointer() // The end of the lab including a bit of > alignment reserved bytes > > > Those names sound better to me. Could you please provide a mapping > from the old names to the new names so I understand which one is > which please? > > This is my current guess of what you are proposing: > > end -> current_tlab_end > actual_end -> last_allocation_address > slow_path_end -> last_slowpath_allocated_end > hard_end -> tlab_end_pointer > > > Yes that is correct, that was what I was proposing. > > I would prefer this naming: > > end -> slow_path_end // the end for taking a slow path; either due > to sampling or refilling > actual_end -> allocation_end // the end for allocations > slow_path_end -> last_slow_path_end // last address for > slow_path_end (as opposed to allocation_end) > hard_end -> reserved_end // the end of the reserved space of the TLAB > > About setting things in the getter... that still seems like a very > unpleasant thing to me. It would be better to inspect the call > hierarchy and explicitly update the ends where they need updating, > and assert in the getter that they are in sync, rather than > implicitly setting various ends as a surprising side effect in a > getter. It looks like the call hierarchy is very small. 
With my > new naming convention, reserved_end() would presumably return > _allocation_end + alignment_reserve(), and have an assert checking > that _allocation_end == _last_slow_path_allocation_end, > complaining that this invariant must hold, and that a caller to > this function, such as make_parsable(), must first explicitly > synchronize the ends as required, to honor that invariant. > > > > I've renamed the variables to how you preferred it except for the _end > one. I did: > current_end > last_allocation_address > tlab_end_ptr > > The reason is that the architecture dependent code use the thread.hpp > API and it already has tlab included into the name so it becomes > tlab_current_end (which is better that tlab_current_tlab_end in my > opinion). > > I also moved the update into a separate method with a TODO that says > to remove it when FastTLABRefill is deprecated This looks a lot better now. Thanks. Note that the following comment now needs updating accordingly in threadLocalAllocBuffer.hpp: 41 // Heap sampling is performed via the end/actual_end fields. 42 // actual_end contains the real end of the tlab allocation, 43 // whereas end can be set to an arbitrary spot in the tlab to 44 // trip the return and sample the allocation. 45 // slow_path_end is used to track if a fast tlab refill occured 46 // between slowpath calls. There might be other comments too, I have not looked in detail. > > > Not sure it's better but before updating the webrev, I wanted > to try > to get input/consensus :) > > (Note hard_end was always further off than end). > > src/hotspot/share/prims/jvmti.xml: > > 10357 > 10358 > 10359 Can sample the heap. > 10360 If this capability is enabled then the > heap sampling methods > can be called. > 10361 > 10362 > > Looks like this capability should not be "since 9" if it > gets integrated > now. 
> > Updated now to 11, crossing my fingers :) > > > src/hotspot/share/runtime/heapMonitoring.cpp: > > 448 if (is_alive->do_object_b(value)) { > 449 // Update the oop to point to the new object > if it is still > alive. > 450 f->do_oop(&(trace.obj)); > 451 > 452 // Copy the old trace, if it is still live. > 453 _allocated_traces->at_put(curr_pos++, trace); > 454 > 455 // Store the live trace in a cache, to be > served up on /heapz. > 456 _traces_on_last_full_gc->append(trace); > 457 > 458 count++; > 459 } else { > 460 // If the old trace is no longer live, add > it to the list of > 461 // recently collected garbage. > 462 store_garbage_trace(trace); > 463 } > > In the case where the oop was not live, I would like it to > be explicitly > cleared. > > Done I think how you wanted it. Let me know because I'm not > familiar > with the RootAccess API. I'm unclear if I'm doing this right > or not so > reviews of these parts are highly appreciated. Robbin had > talked of > perhaps later pushing this all into a OopStorage, should I do > this now > do you think? Or can that wait a second webrev later down the > road? > > > I think using handles can and should be done later. You can use > the Access API now. > I noticed that you are missing an #include > "oops/access.inline.hpp" in your heapMonitoring.cpp file. > > > The missing header is there for me so I don't know, I made sure it is > present in the latest webrev. Sorry about that. > > > + Did I clear it the way you wanted me to or were you thinking of > something else? > > > That is precisely how I wanted it to be cleared. Thanks. > > + Final question here, seems like if I were to want to not do the > f->do_oop directly on the trace.obj, I'd need to do something > like: > > f->do_oop(&value); > ... > trace->store_oop(value); > > to update the oop internally. Is that right/is that one of the > advantages of going to the Oopstorage sooner than later? > > > I think you really want to do the do_oop on the root directly. 
Is > there a particular reason why you would not want to do that? > Otherwise, yes - the benefit with using the handle approach is > that you do not need to call do_oop explicitly in your code. > > > There is no reason except that now we have a load_oop and a > get_oop_addr, I was not sure what you would think of that. > That's fine. > > Also I see a lot of concurrent-looking use of the > following field: > 267 volatile bool _initialized; > > Please note that the "volatile" qualifier does not help > with reordering > here. Reordering between volatile and non-volatile fields > is completely free > for both compiler and hardware, except for windows with > MSVC, where volatile > semantics is defined to use acquire/release semantics, and > the hardware is > TSO. But for the general case, I would expect this field > to be stored with > OrderAccess::release_store and loaded with > OrderAccess::load_acquire. > Otherwise it is not thread safe. > > Because everything is behind a mutex, I wasn't really worried > about > this. I have a test that has multiple threads trying to hit this > corner case and it passes. > > However, to be paranoid, I updated it to using the OrderAccess API > now, thanks! Let me know what you think there too! > > > If it is indeed always supposed to be read and written under a > mutex, then I would strongly prefer to have it accessed as a > normal non-volatile member, and have an assertion that given lock > is held or we are in a safepoint, as we do in many other places. > Something like this: > > assert(HeapMonitorStorage_lock->owned_by_self() || > (SafepointSynchronize::is_at_safepoint() && > Thread::current()->is_VM_thread()), "this should not be accessed > concurrently"); > > It would be confusing to people reading the code if there are uses > of OrderAccess that are actually always protected under a mutex. > > > Thank you for the exact example to be put in the code! 
I put it around > each access/assignment of the _initialized method and found one case > where yes you can touch it and not have the lock. It actually is "ok" > because you don't act on the storage until later and only when you > really want to modify the storage (see the object_alloc_do_sample > method which calls the add_trace method). > > But, because of this, I'm going to put the OrderAccess here, I'll do > some performance numbers later and if there are issues, I might add a > "unsafe" read and a "safe" one to make it explicit to the reader. But > I don't think it will come to that. Okay. This double return in heapMonitoring.cpp looks wrong: 283 bool initialized() { 284 return OrderAccess::load_acquire(&_initialized) != 0; 285 return _initialized; 286 } Since you said object_alloc_do_sample() is the only place where you do not hold the mutex while reading initialized(), I had a closer look at that. It looks like in its current shape, the lack of a mutex may lead to a memory leak. In particular, it first checks if (initialized()). Let's assume this is now true. It then allocates a bunch of stuff, and checks if the number of frames were over 0. If they were, it calls StackTraceStorage::storage()->add_trace() seemingly hoping that after grabbing the lock in there, initialized() will still return true. But it could now return false and skip doing anything, in which case the allocated stuff will never be freed. So the analysis seems to be that _initialized is only used outside of the mutex in once instance, where it is used to perform double-checked locking, that actually causes a memory leak. I am not proposing how to fix that, just raising the issue. If you still want to perform this double-checked locking somehow, then the use of acquire/release still seems odd. Because the memory ordering restrictions of it never comes into play in this particular case. 
If it ever did, then the use of destroy_stuff(); release_store(_initialized, 0) would be broken anyway as that would imply that whatever concurrent reader there ever was would after reading _initialized with load_acquire() could *never* read the data that is concurrently destroyed anyway. I would be biased to think that RawAccess::load/store looks like a more appropriate solution, given that the memory leak issue is resolved. I do not know how painful it would be to not perform this double-checked locking. > > As a kind of meta comment, I wonder if it would make sense > to add sampling > for non-TLAB allocations. Seems like if someone is rapidly > allocating a > whole bunch of 1 MB objects that never fit in a TLAB, I > might still be > interested in seeing that in my traces, and not get > surprised that the > allocation rate is very high yet not showing up in any > profiles. > > That is handled by the handle_sample where you wanted me to put a > UseTlab because you hit that case if the allocation is too big. > > > I see. It was not obvious to me that non-TLAB sampling is done in > the TLAB class. That seems like an abstraction crime. > What I wanted in my previous comment was that we do not call into > the TLAB when we are not using TLABs. If there is sampling logic > in the TLAB that is used for something else than TLABs, then it > seems like that logic simply does not belong inside of the TLAB. > It should be moved out of the TLAB, and instead have the TLAB call > this common abstraction that makes sense. > > > So in the incremental version: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/ > , this > is still a "crime". The reason is that the system has to have the > bytes_until_sample on a per-thread level and it made "sense" to have > it with the TLAB implementation. Also, I was not sure how people felt > about adding something to the thread instance instead. > > Do you think it fits better at the Thread level? 
I can see how > difficult it is to make it happen there and add some logic there. Let > me know what you think. We have an unfortunate situation where everyone that has some fields that are thread local tend to dump them right into Thread, making the size and complexity of Thread grow as it becomes tightly coupled with various unrelated subsystems. It would be desirable to have a separate class for this instead that encapsulates the sampling logic. That class could possibly reside in Thread though as a value object of Thread. > > Hope I have answered your questions and that my feedback makes > sense to you. > > > You have and thank you for them, I think we are getting to a cleaner > implementation and things are getting better and more readable :) Yes it is getting better. Thanks, /Erik > Thanks for your help! > Jc > > Thanks, > /Erik > > > I double checked by changing the test > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/raw_files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonitor/MyPackage/HeapMonitorStatObjectCorrectnessTest.java > > > to use a smaller Tlab (2048) and made the object bigger and it > goes > through that and passes. > > Thanks again for your review and I look forward to your > pointers for > the questions I now have raised! > Jc > > > > > > > > > > Thanks, > /Erik > > > On 2018-01-26 06:45, JC Beyler wrote: > > Thanks Robbin for the reviews :) > > The new full webrev is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/ > The incremental webrev is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/ > > > I inlined my answers: > > On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn > > > wrote: > > Hi JC, great to see another revision! > > #### > heapMonitoring.cpp > > StackTraceData should not contain the oop for > 'safety' reasons. > When StackTraceData is moved from _allocated_traces: > L452 store_garbage_trace(trace); > it contains a dead oop. 
> _allocated_traces could instead be a tupel of oop > and StackTraceData thus > dead oops are not kept. > > Done I used inheritance to make the copier work > regardless but the > idea is the same. > > You should use the new Access API for loading the > oop, something like > this: > RootAccess AS_NO_KEEPALIVE>::load(...) > I don't think you need to use Access API for > clearing the oop, but it > would > look nicer. And you shouldn't probably be using: > Universe::heap()->is_in_reserved(value) > > I am unfamiliar with this but I think I did do it like > you wanted me > to (all tests pass so that's a start). I'm not sure > how to clear the > oop exactly, is there somewhere that does that, which > I can use to do > the same? > > I removed the is_in_reserved, this came from our > internal version, I > don't know why it was there but my tests work without > so I removed it > :) > > > The lock: > L424 MutexLocker mu(HeapMonitorStorage_lock); > Is not needed as far as I can see. > weak_oops_do is called in a safepoint, no TLAB > allocation can happen and > JVMTI thread can't access these data-structures. > Is there something more > to > this lock that I'm missing? > > Since a thread can call the JVMTI getLiveTraces (or > any of the other > ones), it can get to the point of trying to copying the > _allocated_traces. I imagine it is possible that this > is happening > during a GC or that it can be started and a GC happens > afterwards. > Therefore, it seems to me that you want this > protected, no? > > > #### > You have 6 files without any changes in them (any > more): > g1CollectedHeap.cpp > psMarkSweep.cpp > psParallelCompact.cpp > genCollectedHeap.cpp > referenceProcessor.cpp > thread.hpp > > Done. > > #### > I have not looked closely, but is it possible to > hide heap sampling in > AllocTracer ? 
(with some minor changes to the > AllocTracer API) > > I am imagining that you are saying to move the code > that does the > sampling code (change the tlab end, do the call to > HeapMonitoring, > etc.) into the AllocTracer code itself? I think that > is right and I'll > look if that is possible and prepare a webrev to show > what would be > needed to make that happen. > > #### > Minor nit, when declaring pointer there is a > little mix of having the > pointer adjacent by type name and data name. (Most > hotspot code is by > type > name) > E.g. > heapMonitoring.cpp:711 jvmtiStackTrace *trace > = .... > heapMonitoring.cpp:733 Method* m = > vfst.method(); > (not just this file) > > Done! > > #### > HeapMonitorThreadOnOffTest.java:77 > I would make g_tmp volatile, otherwise the > assignment in loop may > theoretical be skipped. > > Also done! > > Thanks again! > Jc > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vaibhav.x.choudhary at oracle.com Wed Feb 14 14:15:09 2018 From: vaibhav.x.choudhary at oracle.com (Vaibhav Choudhary) Date: Wed, 14 Feb 2018 19:45:09 +0530 Subject: RFR: JDK-8197534: [TESTBUG] compiler/gcbarriers/PreserveFPRegistersTest.java fails with Conflicting collector combinations In-Reply-To: References: <300849D0-499A-4397-93EB-F162A3ED5A46@oracle.com> Message-ID: Thanks Tobias. In that case, I need to change the scope of the bug. Here is the new webrev :- http://cr.openjdk.java.net/~rpatil/8197534/webrev.01/ Thanks, Vaibhav Choudhary vaibhav.x.choudhary at oracle.com https://blogs.oracle.com/vaibhav > On 13-Feb-2018, at 2:50 PM, Tobias Hartmann wrote: > > Hi Vaibhav, > > You linked the wrong webrev. Here's the correct one: > http://cr.openjdk.java.net/~rpatil/8197534/webrev.00 > > Wouldn't it make more sense to backport the full fix for JDK-8153194 [1]? 
> > Thanks, > Tobias > > [1] https://bugs.openjdk.java.net/browse/JDK-8153194 > > > On 13.02.2018 09:49, Vaibhav Choudhary wrote: >> Hi All, >> Please review the fix for bug: https://bugs.openjdk.java.net/browse/JDK-8197534 >> Webrev: http://cr.openjdk.java.net/~rpatil/8197534/webrev.00/ >> >> Thanks, >> Vaibhav C >> From tobias.hartmann at oracle.com Wed Feb 14 14:25:10 2018 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 14 Feb 2018 15:25:10 +0100 Subject: RFR: JDK-8197534: [TESTBUG] compiler/gcbarriers/PreserveFPRegistersTest.java fails with Conflicting collector combinations In-Reply-To: References: <300849D0-499A-4397-93EB-F162A3ED5A46@oracle.com>

Message-ID: <46003f96-fb27-17de-e877-0a838f14f3a5@oracle.com> Hi Vaibhav, If the backport applies cleanly, you don't need a new review but request approval on jdk8u-dev according to: http://openjdk.java.net/projects/jdk8u/approval-template.html You also don't need a new bug but the backport will be automatically created once you push with 8153194 as bug id. In this case, you can close 8197534 as duplicate of 8153194. Thanks, Tobias On 14.02.2018 15:15, Vaibhav Choudhary wrote: > Thanks Tobias. > > In that case, I need to change the scope of the bug. Here is the new webrev :- > > http://cr.openjdk.java.net/~rpatil/8197534/webrev.01/ > > Thanks, > Vaibhav Choudhary > vaibhav.x.choudhary at oracle.com > https://blogs.oracle.com/vaibhav > > > >> On 13-Feb-2018, at 2:50 PM, Tobias Hartmann > wrote: >> >> Hi Vaibhav, >> >> You linked the wrong webrev. Here's the correct one: >> http://cr.openjdk.java.net/~rpatil/8197534/webrev.00 >> >> Wouldn't it make more sense to backport the full fix for JDK-8153194 [1]? >> >> Thanks, >> Tobias >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8153194 >> >> >> On 13.02.2018 09:49, Vaibhav Choudhary wrote: >>> Hi All, >>> Please review the fix for bug: https://bugs.openjdk.java.net/browse/JDK-8197534 >>> Webrev: http://cr.openjdk.java.net/~rpatil/8197534/webrev.00/ >>> >>> Thanks, >>> Vaibhav C >>> > From vaibhav.x.choudhary at oracle.com Wed Feb 14 15:22:32 2018 From: vaibhav.x.choudhary at oracle.com (Vaibhav Choudhary) Date: Wed, 14 Feb 2018 20:52:32 +0530 Subject: RFR: JDK-8197534: [TESTBUG] compiler/gcbarriers/PreserveFPRegistersTest.java fails with Conflicting collector combinations In-Reply-To: <46003f96-fb27-17de-e877-0a838f14f3a5@oracle.com> References: <300849D0-499A-4397-93EB-F162A3ED5A46@oracle.com>

<46003f96-fb27-17de-e877-0a838f14f3a5@oracle.com> Message-ID: Thank you again. This is not a clean backport (that is, it does not apply "as is"). There is one trivial difference between the 9 and 8u code: In 8u, * @library /testlibrary /testlibrary/whitebox In 9, * @library /testlibrary /test/lib The rest of the code is the same. Thanks, Vaibhav Choudhary vaibhav.x.choudhary at oracle.com https://blogs.oracle.com/vaibhav > On 14-Feb-2018, at 7:55 PM, Tobias Hartmann wrote: > > Hi Vaibhav, > > If the backport applies cleanly, you don't need a new review but request approval on jdk8u-dev according to: > http://openjdk.java.net/projects/jdk8u/approval-template.html > > You also don't need a new bug but the backport will be automatically created once you push with 8153194 as bug id. > In this case, you can close 8197534 as duplicate of 8153194. > > Thanks, > Tobias > > > On 14.02.2018 15:15, Vaibhav Choudhary wrote: >> Thanks Tobias. >> >> In that case, I need to change the scope of the bug. Here is the new webrev :- >> >> http://cr.openjdk.java.net/~rpatil/8197534/webrev.01/ >> >> Thanks, >> Vaibhav Choudhary >> vaibhav.x.choudhary at oracle.com >> https://blogs.oracle.com/vaibhav >> >> >> >>> On 13-Feb-2018, at 2:50 PM, Tobias Hartmann > wrote: >>> >>> Hi Vaibhav, >>> >>> You linked the wrong webrev. Here's the correct one: >>> http://cr.openjdk.java.net/~rpatil/8197534/webrev.00 >>> >>> Wouldn't it make more sense to backport the full fix for JDK-8153194 [1]?
>>> >>> Thanks, >>> Tobias >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8153194 >>> >>> >>> On 13.02.2018 09:49, Vaibhav Choudhary wrote: >>>> Hi All, >>>> Please review the fix for bug: https://bugs.openjdk.java.net/browse/JDK-8197534 >>>> Webrev: http://cr.openjdk.java.net/~rpatil/8197534/webrev.00/ >>>> >>>> Thanks, >>>> Vaibhav C >>>> >> From vladimir.kozlov at oracle.com Wed Feb 14 19:39:06 2018 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 14 Feb 2018 11:39:06 -0800 Subject: [11] RFR(XS): 8197828: Null pointer dereference in MultiNode::proj_out_or_null In-Reply-To: <3867b86f-88f9-b98c-4a85-77d385418369@oracle.com> References: <008b533a-febe-98a2-f6b5-7344dc8ebdb1@oracle.com> <8b26f2f7-003b-2a23-b0d7-a2ffaa8d1c57@oracle.com> <498810c4-aea4-cf3e-c3db-66b9b5b0b8b8@oracle.com> <00f46bee-d8fa-12f5-8ff0-898d12db4684@oracle.com> <04bb252c-aa4e-654d-0a6c-972bdb2bb8a2@oracle.com> <3867b86f-88f9-b98c-4a85-77d385418369@oracle.com> Message-ID: <88bf6cc3-5ce0-71ea-db28-45eb6f5c7651@oracle.com> On 2/14/18 12:25 AM, Tobias Hartmann wrote: > Hi Dean, > > On 13.02.2018 23:23, dean.long at oracle.com wrote: >> If we can't hit it, then I vote for leaving it as an assert. If we can hit it, then we need a test to trigger that code >> path. I only assumed that we did not hit it because of conditions. In reality it may be a very rare corner case. > > I don't think we can easily trigger this with a test (after all it never showed up). Vladimir, are you fine with leaving > the assert for now? Okay, let's keep the assert for now.
Vladimir > > Thanks, > Tobias > From jcbeyler at google.com Thu Feb 15 06:34:48 2018 From: jcbeyler at google.com (JC Beyler) Date: Wed, 14 Feb 2018 22:34:48 -0800 Subject: JDK-8171119: Low-Overhead Heap Profiling In-Reply-To: <5A8414AC.3020209@oracle.com> References: <5A819F10.8040201@oracle.com> <5A8414AC.3020209@oracle.com> Message-ID: Hi Erik, I inlined my answers; the last one seems to also answer Robbin's concerns about the same thing (adding things to Thread). On Wed, Feb 14, 2018 at 2:51 AM, Erik Österlund wrote: > Hi JC, > > Comments are inlined below. > > > On 2018-02-13 06:18, JC Beyler wrote: > > Hi Erik, > > Thanks for your answers, I've now inlined my own answers/comments. > > I've done a new webrev here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.08/ > > The incremental is here: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/ > > Note to all: > - I've been integrating changes from Erin/Serguei/David comments so this > webrev incremental is a bit an answer to all comments in one. I apologize > for that :) > > > On Mon, Feb 12, 2018 at 6:05 AM, Erik Österlund > wrote: > >> Hi JC, >> >> Sorry for the delayed reply. >> >> Inlined answers: >> >> >> On 2018-02-06 00:04, JC Beyler wrote: >> >>> Hi Erik, >>> >>> (Renaming this to be folded into the newly renamed thread :)) >>> >>> First off, thanks a lot for reviewing the webrev! I appreciate it! >>> >>> I updated the webrev to: >>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/ >>> >>> And the incremental one is here: >>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.04_05a/ >>> >>> It contains: >>> - The change for since from 9 to 11 for the jvmti.xml >>> - The use of the OrderAccess for initialized >>> - Clearing the oop >>> >>> I also have inlined my answers to your comments. The biggest question >>> will come from the multiple *_end variables.
A bit of the logic there >>> is due to handling the slow path refill vs fast path refill and >>> checking that the rug was not pulled underneath the slowpath. I >>> believe that a previous comment was that TlabFastRefill was going to >>> be deprecated. >>> >>> If this is true, we could revert this code a bit and just do a : if >>> TlabFastRefill is enabled, disable this. And then deprecate that when >>> TlabFastRefill is deprecated. >>> >>> This might simplify this webrev and I can work on a follow-up that >>> either: removes TlabFastRefill if Robbin does not have the time to do >>> it or add the support to the assembly side to handle this correctly. >>> What do you think? >>> >> >> I support removing TlabFastRefill, but I think it is good to not depend >> on that happening first. >> >> > > I'm slowly pushing on the FastTLABRefill ( > > https://bugs.openjdk.java.net/browse/JDK-8194084), I agree on keeping > both separate for now though so that we can think of both differently > > > >> Now, below, inlined are my answers: >>> >>> On Fri, Feb 2, 2018 at 8:44 AM, Erik Österlund >>> wrote: >>> >>>> Hi JC, >>>> >>>> Hope I am reviewing the right version of your work. Here goes... >>>> >>>> src/hotspot/share/gc/shared/collectedHeap.inline.hpp: >>>> >>>> 159 AllocTracer::send_allocation_outside_tlab(klass, result, >>>> size * >>>> HeapWordSize, THREAD); >>>> 160 >>>> 161 THREAD->tlab().handle_sample(THREAD, result, size); >>>> 162 return result; >>>> 163 } >>>> >>>> Should not call tlab()->X without checking if (UseTLAB) IMO. >>>> >>>> Done! >>> >> >> More about this later. >> >> >> >>> src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp: >>>> >>>> So first of all, there seems to quite a few ends. There is an "end", a >>>> "hard >>>> end", a "slow path end", and an "actual end". Moreover, it seems like >>>> the >>>> "hard end" is actually further away than the "actual end". So the "hard >>>> end" >>>> seems like more of a "really definitely actual end" or something.
I >>>> don't >>>> know about you, but I think it looks kind of messy. In particular, I >>>> don't >>>> feel like the name "actual end" reflects what it represents, especially >>>> when >>>> there is another end that is behind the "actual end". >>>> >>>> 413 HeapWord* ThreadLocalAllocBuffer::hard_end() { >>>> 414 // Did a fast TLAB refill occur? >>>> 415 if (_slow_path_end != _end) { >>>> 416 // Fix up the actual end to be now the end of this TLAB. >>>> 417 _slow_path_end = _end; >>>> 418 _actual_end = _end; >>>> 419 } >>>> 420 >>>> 421 return _actual_end + alignment_reserve(); >>>> 422 } >>>> >>>> I really do not like making getters unexpectedly have these kind of side >>>> effects. It is not expected that when you ask for the "hard end", you >>>> implicitly update the "slow path end" and "actual end" to new values. >>>> >>>> As I said, a lot of this is due to the FastTlabRefill. If I make this >>> not supporting FastTlabRefill, this goes away. The reason the system >>> needs to update itself at the get is that you only know at that get if >>> things have shifted underneath the tlab slow path. I am not sure of >>> really better names (naming is hard!), perhaps we could do these >>> names: >>> >>> - current_tlab_end // Either the allocated tlab end or a sampling >>> point >>> - last_allocation_address // The end of the tlab allocation >>> - last_slowpath_allocated_end // In case a fast refill occurred the >>> end might have changed, this is to remember slow vs fast past refills >>> >>> the hard_end method can be renamed to something like: >>> tlab_end_pointer() // The end of the lab including a bit of >>> alignment reserved bytes >>> >> >> Those names sound better to me. Could you please provide a mapping from >> the old names to the new names so I understand which one is which please? 
>> >> This is my current guess of what you are proposing: >> >> end -> current_tlab_end >> actual_end -> last_allocation_address >> slow_path_end -> last_slowpath_allocated_end >> hard_end -> tlab_end_pointer >> >> > Yes that is correct, that was what I was proposing. > > >> I would prefer this naming: >> >> end -> slow_path_end // the end for taking a slow path; either due to >> sampling or refilling >> actual_end -> allocation_end // the end for allocations >> slow_path_end -> last_slow_path_end // last address for slow_path_end (as >> opposed to allocation_end) >> hard_end -> reserved_end // the end of the reserved space of the TLAB >> >> About setting things in the getter... that still seems like a very >> unpleasant thing to me. It would be better to inspect the call hierarchy >> and explicitly update the ends where they need updating, and assert in the >> getter that they are in sync, rather than implicitly setting various ends >> as a surprising side effect in a getter. It looks like the call hierarchy >> is very small. With my new naming convention, reserved_end() would >> presumably return _allocation_end + alignment_reserve(), and have an assert >> checking that _allocation_end == _last_slow_path_allocation_end, >> complaining that this invariant must hold, and that a caller to this >> function, such as make_parsable(), must first explicitly synchronize the >> ends as required, to honor that invariant. >> >> > > I've renamed the variables to how you preferred it except for the _end > one. I did: > current_end > last_allocation_address > tlab_end_ptr > > The reason is that the architecture dependent code use the thread.hpp API > and it already has tlab included into the name so it becomes > tlab_current_end (which is better that tlab_current_tlab_end in my opinion). > > I also moved the update into a separate method with a TODO that says to > remove it when FastTLABRefill is deprecated > > > This looks a lot better now. Thanks. 
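[Editorial sketch] The getter discussion above (an explicit synchronization step plus an asserting, side-effect-free getter) could look roughly like the following. This is a minimal standalone C++ sketch, not HotSpot code: the class TlabEnds, the char* stand-in for HeapWord*, the fixed reserve size, and the method names sync_ends_after_fast_refill() and ends_in_sync() are all hypothetical, and the invariant checked is one reading of the one described in the thread.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for ThreadLocalAllocBuffer (not HotSpot code).
class TlabEnds {
 public:
  TlabEnds(char* end, std::size_t reserve)
      : _current_end(end),
        _allocation_end(end),
        _last_slow_path_end(end),
        _reserve(reserve) {}

  // Simulates a fast-path refill: the end moves without the slow path noticing.
  void fast_refill(char* new_end) { _current_end = new_end; }

  // True when no fast refill has happened since the last synchronization.
  bool ends_in_sync() const { return _last_slow_path_end == _current_end; }

  // Explicit update, to be called by users of reserved_end() (for example a
  // make_parsable()-like caller) instead of hiding the update in the getter.
  void sync_ends_after_fast_refill() {
    if (!ends_in_sync()) {
      _last_slow_path_end = _current_end;
      _allocation_end = _current_end;
    }
  }

  // Side-effect-free getter: it only checks the invariant.
  char* reserved_end() const {
    assert(ends_in_sync() && "caller must synchronize the ends first");
    return _allocation_end + _reserve;
  }

 private:
  char* _current_end;         // old name: end
  char* _allocation_end;      // old name: actual_end
  char* _last_slow_path_end;  // old name: slow_path_end
  std::size_t _reserve;       // stands in for alignment_reserve()
};
```

With this shape, a caller that forgets to synchronize after a fast refill trips the assert in debug builds instead of silently getting updated fields.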
> > Note that the following comment now needs updating accordingly in > threadLocalAllocBuffer.hpp: > > 41 // Heap sampling is performed via the end/actual_end fields. 42 // actual_end contains the real end of the tlab allocation, 43 // whereas end can be set to an arbitrary spot in the tlab to 44 // trip the return and sample the allocation. 45 // slow_path_end is used to track if a fast tlab refill occured 46 // between slowpath calls. > > There might be other comments too, I have not looked in detail. > This was the only spot that still had an actual_end, I fixed it now. I'll do a sweep to double check other comments. > > > > > >> >> Not sure it's better but before updating the webrev, I wanted to try >>> to get input/consensus :) >>> >>> (Note hard_end was always further off than end). >>> >>> src/hotspot/share/prims/jvmti.xml: >>>> >>>> 10357 >>>> 10358 >>>> 10359 Can sample the heap. >>>> 10360 If this capability is enabled then the heap sampling >>>> methods >>>> can be called. >>>> 10361 >>>> 10362 >>>> >>>> Looks like this capability should not be "since 9" if it gets integrated >>>> now. >>>> >>> Updated now to 11, crossing my fingers :) >>> >>> >>> src/hotspot/share/runtime/heapMonitoring.cpp: >>>> >>>> 448 if (is_alive->do_object_b(value)) { >>>> 449 // Update the oop to point to the new object if it is >>>> still >>>> alive. >>>> 450 f->do_oop(&(trace.obj)); >>>> 451 >>>> 452 // Copy the old trace, if it is still live. >>>> 453 _allocated_traces->at_put(curr_pos++, trace); >>>> 454 >>>> 455 // Store the live trace in a cache, to be served up on >>>> /heapz. >>>> 456 _traces_on_last_full_gc->append(trace); >>>> 457 >>>> 458 count++; >>>> 459 } else { >>>> 460 // If the old trace is no longer live, add it to the list >>>> of >>>> 461 // recently collected garbage. >>>> 462 store_garbage_trace(trace); >>>> 463 } >>>> >>>> In the case where the oop was not live, I would like it to be explicitly >>>> cleared. >>>> >>> Done I think how you wanted it. 
Let me know because I'm not familiar >>> with the RootAccess API. I'm unclear if I'm doing this right or not so >>> reviews of these parts are highly appreciated. Robbin had talked of >>> perhaps later pushing this all into a OopStorage, should I do this now >>> do you think? Or can that wait a second webrev later down the road? >>> >> >> I think using handles can and should be done later. You can use the >> Access API now. >> I noticed that you are missing an #include "oops/access.inline.hpp" in >> your heapMonitoring.cpp file. >> >> > The missing header is there for me so I don't know, I made sure it is > present in the latest webrev. Sorry about that. > > > >> + Did I clear it the way you wanted me to or were you thinking of >>> something else? >>> >> >> That is precisely how I wanted it to be cleared. Thanks. >> >> + Final question here, seems like if I were to want to not do the >>> f->do_oop directly on the trace.obj, I'd need to do something like: >>> >>> f->do_oop(&value); >>> ... >>> trace->store_oop(value); >>> >>> to update the oop internally. Is that right/is that one of the >>> advantages of going to the Oopstorage sooner than later? >>> >> >> I think you really want to do the do_oop on the root directly. Is there a >> particular reason why you would not want to do that? >> Otherwise, yes - the benefit with using the handle approach is that you >> do not need to call do_oop explicitly in your code. >> >> > There is no reason except that now we have a load_oop and a get_oop_addr, > I was not sure what you would think of that. > > > That's fine. > > > >> >>> Also I see a lot of concurrent-looking use of the following field: >>>> 267 volatile bool _initialized; >>>> >>>> Please note that the "volatile" qualifier does not help with reordering >>>> here. 
Reordering between volatile and non-volatile fields is completely >>>> free >>>> for both compiler and hardware, except for windows with MSVC, where >>>> volatile >>>> semantics is defined to use acquire/release semantics, and the hardware >>>> is >>>> TSO. But for the general case, I would expect this field to be stored >>>> with >>>> OrderAccess::release_store and loaded with OrderAccess::load_acquire. >>>> Otherwise it is not thread safe. >>>> >>> Because everything is behind a mutex, I wasn't really worried about >>> this. I have a test that has multiple threads trying to hit this >>> corner case and it passes. >>> >>> However, to be paranoid, I updated it to using the OrderAccess API >>> now, thanks! Let me know what you think there too! >>> >> >> If it is indeed always supposed to be read and written under a mutex, >> then I would strongly prefer to have it accessed as a normal non-volatile >> member, and have an assertion that given lock is held or we are in a >> safepoint, as we do in many other places. Something like this: >> >> assert(HeapMonitorStorage_lock->owned_by_self() || >> (SafepointSynchronize::is_at_safepoint() && >> Thread::current()->is_VM_thread()), "this should not be accessed >> concurrently"); >> >> It would be confusing to people reading the code if there are uses of >> OrderAccess that are actually always protected under a mutex. >> >> > Thank you for the exact example to be put in the code! I put it around > each access/assignment of the _initialized method and found one case where > yes you can touch it and not have the lock. It actually is "ok" because you > don't act on the storage until later and only when you really want to > modify the storage (see the object_alloc_do_sample method which calls the > add_trace method). > > But, because of this, I'm going to put the OrderAccess here, I'll do some > performance numbers later and if there are issues, I might add a "unsafe" > read and a "safe" one to make it explicit to the reader. 
But I don't think
> it will come to that.
>
>
> Okay. This double return in heapMonitoring.cpp looks wrong:
>
> 283 bool initialized() {
> 284 return OrderAccess::load_acquire(&_initialized) != 0;
> 285 return _initialized;
> 286 }
>
> Since you said object_alloc_do_sample() is the only place where you do not
> hold the mutex while reading initialized(), I had a closer look at that. It
> looks like in its current shape, the lack of a mutex may lead to a memory
> leak. In particular, it first checks if (initialized()). Let's assume this
> is now true. It then allocates a bunch of stuff, and checks if the number
> of frames was over 0. If it was, it calls StackTraceStorage::storage()->add_trace()
> seemingly hoping that after grabbing the lock in there, initialized() will
> still return true. But it could now return false and skip doing anything,
> in which case the allocated stuff will never be freed.
>

I fixed this now by making add_trace return a boolean and checking for that. It will be in the next webrev. Thanks! The truth is that in our implementation the system is always on or off, so this never really occurs :). In this version though, that is not true and it's important to handle, so thanks again!

>
> So the analysis seems to be that _initialized is only used outside of the
> mutex in one instance, where it is used to perform double-checked locking,
> that actually causes a memory leak.
>
> I am not proposing how to fix that, just raising the issue. If you still
> want to perform this double-checked locking somehow, then the use of
> acquire/release still seems odd, because its memory ordering restrictions
> never come into play in this particular case. If they ever did, then
> the use of destroy_stuff(); release_store(_initialized, 0) would be broken
> anyway as that would imply that whatever concurrent reader there ever was,
> after reading _initialized with load_acquire(), could *never* read the
> data that is concurrently destroyed anyway.
I would be biased to think that
> RawAccess::load/store looks like a more appropriate solution,
> given that the memory leak issue is resolved. I do not know how painful it
> would be to not perform this double-checked locking.
>

So I agree with this entirely. I also looked a bit more, and the difference in the code really stems from our internal version. In this version, however, there are actually a lot of things going on that I had not gone through entirely in my head, but this comment made me ponder it a bit more.

Since every object_alloc_do_sample is protected by a check to HeapMonitoring::enabled(), there is only a small chance that the call happens when things have been disabled. So there is no real need to do a first check on initialized: it is a rare occurrence that a call to object_alloc_do_sample happens while the storage's initialized returns false.

(By the way, even if you did call object_alloc_do_sample without looking at HeapMonitoring::enabled(), that would be ok too. You would gather the stack trace and get nowhere at the add_trace call, which would return false; so though not optimal performance-wise, nothing would break.)

Furthermore, add_trace is really the point of no return, and there we take the mutex lock and then do the initialized check. So, in the end, I did two things: I removed that first check and then I removed the OrderAccess for the storage initialized. I think I now have a better grasp of why it was done in our code and why it is not needed here. Thanks for pointing it out :). This still passes my JTREG tests, especially the threaded one.

>
>
>
>
>
>> As a kind of meta comment, I wonder if it would make sense to add sampling
>>>> for non-TLAB allocations.
Seems like if someone is rapidly allocating a >>>> whole bunch of 1 MB objects that never fit in a TLAB, I might still be >>>> interested in seeing that in my traces, and not get surprised that the >>>> allocation rate is very high yet not showing up in any profiles. >>>> >>>> That is handled by the handle_sample where you wanted me to put a >>> UseTlab because you hit that case if the allocation is too big. >>> >> >> I see. It was not obvious to me that non-TLAB sampling is done in the >> TLAB class. That seems like an abstraction crime. >> What I wanted in my previous comment was that we do not call into the >> TLAB when we are not using TLABs. If there is sampling logic in the TLAB >> that is used for something else than TLABs, then it seems like that logic >> simply does not belong inside of the TLAB. It should be moved out of the >> TLAB, and instead have the TLAB call this common abstraction that makes >> sense. >> >> > So in the incremental version: > http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.07_08/, this is still > a "crime". The reason is that the system has to have the bytes_until_sample > on a per-thread level and it made "sense" to have it with the TLAB > implementation. Also, I was not sure how people felt about adding something > to the thread instance instead. > > Do you think it fits better at the Thread level? I can see how difficult > it is to make it happen there and add some logic there. Let me know what > you think. > > > We have an unfortunate situation where everyone that has some fields that > are thread local tend to dump them right into Thread, making the size and > complexity of Thread grow as it becomes tightly coupled with various > unrelated subsystems. It would be desirable to have a separate class for > this instead that encapsulates the sampling logic. That class could > possibly reside in Thread though as a value object of Thread. > I imagined that would be the case but was not sure. 
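To make the arrangement Erik describes above concrete, here is a minimal sketch of sampling state kept in its own class and held by the thread as a value object. It is not HotSpot code: `AllocationSampler`, `FakeThread`, `record_allocation` and the fixed sampling interval are all hypothetical names and simplifications chosen for illustration, and only `bytes_until_sample` echoes a name used in this thread.

```cpp
#include <cstddef>

// Hypothetical sampling state; the names are illustrative, not HotSpot's.
class AllocationSampler {
 public:
  explicit AllocationSampler(std::size_t interval)
      : _interval(interval), _bytes_until_sample(interval) {}

  // Account for one allocation; returns true if it should be sampled.
  bool record_allocation(std::size_t bytes) {
    if (bytes < _bytes_until_sample) {
      _bytes_until_sample -= bytes;
      return false;
    }
    _bytes_until_sample = _interval;  // re-arm for the next sample
    return true;
  }

  std::size_t bytes_until_sample() const { return _bytes_until_sample; }

 private:
  std::size_t _interval;
  std::size_t _bytes_until_sample;
};

// Stand-in for a thread class: it owns the sampler as a value object
// and exposes it through a single accessor, nothing more.
class FakeThread {
 public:
  explicit FakeThread(std::size_t interval) : _sampler(interval) {}
  AllocationSampler& sampler() { return _sampler; }

 private:
  AllocationSampler _sampler;
};
```

With this shape, both the TLAB path and the non-TLAB path could call `thread.sampler().record_allocation(size)`, so a large allocation that bypasses the TLAB would still trip the sampler without the thread class learning anything about sampling beyond the one member.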
I will look at the example that Robbin is talking about (ThreadSMR) and will see how to refactor my code to use that. Thanks again for your help, Jc > > > > > >> Hope I have answered your questions and that my feedback makes sense to >> you. >> >> > You have and thank you for them, I think we are getting to a cleaner > implementation and things are getting better and more readable :) > > > Yes it is getting better. > > Thanks, > /Erik > > > Thanks for your help! > Jc > > > >> Thanks, >> /Erik >> >> >> I double checked by changing the test >>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.05a/raw_ >>> files/new/test/hotspot/jtreg/serviceability/jvmti/HeapMonito >>> r/MyPackage/HeapMonitorStatObjectCorrectnessTest.java >>> >>> to use a smaller Tlab (2048) and made the object bigger and it goes >>> through that and passes. >>> >>> Thanks again for your review and I look forward to your pointers for >>> the questions I now have raised! >>> Jc >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> Thanks, >>>> /Erik >>>> >>>> >>>> On 2018-01-26 06:45, JC Beyler wrote: >>>> >>>>> Thanks Robbin for the reviews :) >>>>> >>>>> The new full webrev is here: >>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.03/ >>>>> The incremental webrev is here: >>>>> http://cr.openjdk.java.net/~jcbeyler/8171119/webrev.02_03/ >>>>> >>>>> I inlined my answers: >>>>> >>>>> On Thu, Jan 25, 2018 at 1:15 AM, Robbin Ehn < >>>>> robbin.ehn at oracle.com> wrote: >>>>> >>>>>> Hi JC, great to see another revision! >>>>>> >>>>>> #### >>>>>> heapMonitoring.cpp >>>>>> >>>>>> StackTraceData should not contain the oop for 'safety' reasons. >>>>>> When StackTraceData is moved from _allocated_traces: >>>>>> L452 store_garbage_trace(trace); >>>>>> it contains a dead oop. >>>>>> _allocated_traces could instead be a tupel of oop and StackTraceData >>>>>> thus >>>>>> dead oops are not kept. >>>>>> >>>>> Done I used inheritance to make the copier work regardless but the >>>>> idea is the same. 
>>>>> >>>>> You should use the new Access API for loading the oop, something like >>>>>> this: >>>>>> RootAccess::load(...) >>>>>> I don't think you need to use Access API for clearing the oop, but it >>>>>> would >>>>>> look nicer. And you shouldn't probably be using: >>>>>> Universe::heap()->is_in_reserved(value) >>>>>> >>>>> I am unfamiliar with this but I think I did do it like you wanted me >>>>> to (all tests pass so that's a start). I'm not sure how to clear the >>>>> oop exactly, is there somewhere that does that, which I can use to do >>>>> the same? >>>>> >>>>> I removed the is_in_reserved, this came from our internal version, I >>>>> don't know why it was there but my tests work without so I removed it >>>>> :) >>>>> >>>>> >>>>> The lock: >>>>>> L424 MutexLocker mu(HeapMonitorStorage_lock); >>>>>> Is not needed as far as I can see. >>>>>> weak_oops_do is called in a safepoint, no TLAB allocation can happen >>>>>> and >>>>>> JVMTI thread can't access these data-structures. Is there something >>>>>> more >>>>>> to >>>>>> this lock that I'm missing? >>>>>> >>>>> Since a thread can call the JVMTI getLiveTraces (or any of the other >>>>> ones), it can get to the point of trying to copying the >>>>> _allocated_traces. I imagine it is possible that this is happening >>>>> during a GC or that it can be started and a GC happens afterwards. >>>>> Therefore, it seems to me that you want this protected, no? >>>>> >>>>> >>>>> #### >>>>>> You have 6 files without any changes in them (any more): >>>>>> g1CollectedHeap.cpp >>>>>> psMarkSweep.cpp >>>>>> psParallelCompact.cpp >>>>>> genCollectedHeap.cpp >>>>>> referenceProcessor.cpp >>>>>> thread.hpp >>>>>> >>>>>> Done. >>>>> >>>>> #### >>>>>> I have not looked closely, but is it possible to hide heap sampling in >>>>>> AllocTracer ? 
(with some minor changes to the AllocTracer API) >>>>>> >>>>>> I am imagining that you are saying to move the code that does the >>>>> sampling code (change the tlab end, do the call to HeapMonitoring, >>>>> etc.) into the AllocTracer code itself? I think that is right and I'll >>>>> look if that is possible and prepare a webrev to show what would be >>>>> needed to make that happen. >>>>> >>>>> #### >>>>>> Minor nit, when declaring pointer there is a little mix of having the >>>>>> pointer adjacent by type name and data name. (Most hotspot code is by >>>>>> type >>>>>> name) >>>>>> E.g. >>>>>> heapMonitoring.cpp:711 jvmtiStackTrace *trace = .... >>>>>> heapMonitoring.cpp:733 Method* m = vfst.method(); >>>>>> (not just this file) >>>>>> >>>>>> Done! >>>>> >>>>> #### >>>>>> HeapMonitorThreadOnOffTest.java:77 >>>>>> I would make g_tmp volatile, otherwise the assignment in loop may >>>>>> theoretical be skipped. >>>>>> >>>>>> Also done! >>>>> >>>>> Thanks again! >>>>> Jc >>>>> >>>> >>>> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tobias.hartmann at oracle.com Thu Feb 15 08:26:37 2018 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Thu, 15 Feb 2018 09:26:37 +0100 Subject: [11] RFR(XS): 8197828: Null pointer dereference in MultiNode::proj_out_or_null In-Reply-To: <88bf6cc3-5ce0-71ea-db28-45eb6f5c7651@oracle.com> References: <008b533a-febe-98a2-f6b5-7344dc8ebdb1@oracle.com> <8b26f2f7-003b-2a23-b0d7-a2ffaa8d1c57@oracle.com> <498810c4-aea4-cf3e-c3db-66b9b5b0b8b8@oracle.com> <00f46bee-d8fa-12f5-8ff0-898d12db4684@oracle.com> <04bb252c-aa4e-654d-0a6c-972bdb2bb8a2@oracle.com> <3867b86f-88f9-b98c-4a85-77d385418369@oracle.com> <88bf6cc3-5ce0-71ea-db28-45eb6f5c7651@oracle.com> Message-ID: <461aba07-57a2-ab33-0cdf-e01016a6a28b@oracle.com> Hi, On 14.02.2018 20:39, Vladimir Kozlov wrote: > Okay, let have assert for now. Okay, thanks. I've pushed it. 
Best regards,
Tobias

From tobias.hartmann at oracle.com Thu Feb 15 08:33:23 2018
From: tobias.hartmann at oracle.com (Tobias Hartmann)
Date: Thu, 15 Feb 2018 09:33:23 +0100
Subject: RFR: JDK-8197534: [TESTBUG] compiler/gcbarriers/PreserveFPRegistersTest.java fails with Conflicting collector combinations
In-Reply-To:
References: <300849D0-499A-4397-93EB-F162A3ED5A46@oracle.com>
 <46003f96-fb27-17de-e877-0a838f14f3a5@oracle.com>
Message-ID: <80246da8-b30b-6c11-0418-4b9bedd27d0c@oracle.com>

Hi Vaibhav,

looks good to me. Unfortunately, I'm not a JDK 8u reviewer.

Thanks,
Tobias

On 14.02.2018 16:22, Vaibhav Choudhary wrote:
> Thank you again.
>
> This issue is not a clean backport (that is, not "as it is"). There is a trivial change from 9 to 8u code, which is:
>
> In 8u, * @library /testlibrary /testlibrary/whitebox
> In 9, * @library /testlibrary /test/lib
>
> The rest of the code is the same.
>
> Thanks,
> Vaibhav Choudhary
> vaibhav.x.choudhary at oracle.com
> https://blogs.oracle.com/vaibhav
>
>
>
>> On 14-Feb-2018, at 7:55 PM, Tobias Hartmann wrote:
>>
>> Hi Vaibhav,
>>
>> If the backport applies cleanly, you don't need a new review but request approval on jdk8u-dev according to:
>> http://openjdk.java.net/projects/jdk8u/approval-template.html
>>
>> You also don't need a new bug but the backport will be automatically created once you push with 8153194 as bug id.
>> In this case, you can close 8197534 as duplicate of 8153194.
>>
>> Thanks,
>> Tobias
>>
>>
>> On 14.02.2018 15:15, Vaibhav Choudhary wrote:
>>> Thanks Tobias.
>>>
>>> In that case, I need to change the scope of the bug. Here is the new webrev:
>>>
>>> http://cr.openjdk.java.net/~rpatil/8197534/webrev.01/
>>>
>>> Thanks,
>>> Vaibhav Choudhary
>>> vaibhav.x.choudhary at oracle.com
>>> https://blogs.oracle.com/vaibhav
>>>
>>>
>>>
>>>> On 13-Feb-2018, at 2:50 PM, Tobias Hartmann wrote:
>>>>
>>>> Hi Vaibhav,
>>>>
>>>> You linked the wrong webrev. Here's the correct one:
>>>> http://cr.openjdk.java.net/~rpatil/8197534/webrev.00
>>>>
>>>> Wouldn't it make more sense to backport the full fix for JDK-8153194 [1]?
>>>>
>>>> Thanks,
>>>> Tobias
>>>>
>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8153194
>>>>
>>>>
>>>> On 13.02.2018 09:49, Vaibhav Choudhary wrote:
>>>>> Hi All,
>>>>> Please review the fix for bug: https://bugs.openjdk.java.net/browse/JDK-8197534
>>>>> Webrev: http://cr.openjdk.java.net/~rpatil/8197534/webrev.00/
>>>>>
>>>>> Thanks,
>>>>> Vaibhav C
>>>>>
>>>
>

From vladimir.kozlov at oracle.com Thu Feb 15 19:01:54 2018
From: vladimir.kozlov at oracle.com (Vladimir Kozlov)
Date: Thu, 15 Feb 2018 11:01:54 -0800
Subject: RFR: JDK-8197534: [TESTBUG] compiler/gcbarriers/PreserveFPRegistersTest.java fails with Conflicting collector combinations
In-Reply-To: <80246da8-b30b-6c11-0418-4b9bedd27d0c@oracle.com>
References: <300849D0-499A-4397-93EB-F162A3ED5A46@oracle.com>

 <46003f96-fb27-17de-e877-0a838f14f3a5@oracle.com>
 <80246da8-b30b-6c11-0418-4b9bedd27d0c@oracle.com>
Message-ID:

Reviewed.

Vaibhav, please don't forget to use the original 8153194 bug id in your changeset for 8u, as Tobias suggested.

Thanks,
Vladimir

On 2/15/18 12:33 AM, Tobias Hartmann wrote:
> Hi Vaibhav,
>
> looks good to me. Unfortunately, I'm not a JDK 8u reviewer.
>
> Thanks,
> Tobias
>
> On 14.02.2018 16:22, Vaibhav Choudhary wrote:
>> Thank you again.
>>
>> This issue is not a clean backport (that is, not "as it is"). There is a trivial change from 9 to 8u code, which is:
>>
>> In 8u, * @library /testlibrary /testlibrary/whitebox
>> In 9, * @library /testlibrary /test/lib
>>
>> The rest of the code is the same.
>>
>> Thanks,
>> Vaibhav Choudhary
>> vaibhav.x.choudhary at oracle.com
>> https://blogs.oracle.com/vaibhav
>>
>>
>>
>>> On 14-Feb-2018, at 7:55 PM, Tobias Hartmann wrote:
>>>
>>> Hi Vaibhav,
>>>
>>> If the backport applies cleanly, you don't need a new review but request approval on jdk8u-dev according to:
>>> http://openjdk.java.net/projects/jdk8u/approval-template.html
>>>
>>> You also don't need a new bug but the backport will be automatically created once you push with 8153194 as bug id.
>>> In this case, you can close 8197534 as duplicate of 8153194.
>>>
>>> Thanks,
>>> Tobias
>>>
>>>
>>> On 14.02.2018 15:15, Vaibhav Choudhary wrote:
>>>> Thanks Tobias.
>>>>
>>>> In that case, I need to change the scope of the bug. Here is the new webrev:
>>>>
>>>> http://cr.openjdk.java.net/~rpatil/8197534/webrev.01/
>>>>
>>>> Thanks,
>>>> Vaibhav Choudhary
>>>> vaibhav.x.choudhary at oracle.com
>>>> https://blogs.oracle.com/vaibhav
>>>>
>>>>
>>>>
>>>>> On 13-Feb-2018, at 2:50 PM, Tobias Hartmann wrote:
>>>>>
>>>>> Hi Vaibhav,
>>>>>
>>>>> You linked the wrong webrev. Here's the correct one:
>>>>> http://cr.openjdk.java.net/~rpatil/8197534/webrev.00
>>>>>
>>>>> Wouldn't it make more sense to backport the full fix for JDK-8153194 [1]?
>>>>>
>>>>> Thanks,
>>>>> Tobias
>>>>>
>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8153194
>>>>>
>>>>>
>>>>> On 13.02.2018 09:49, Vaibhav Choudhary wrote:
>>>>>> Hi All,
>>>>>> Please review the fix for bug: https://bugs.openjdk.java.net/browse/JDK-8197534
>>>>>> Webrev: http://cr.openjdk.java.net/~rpatil/8197534/webrev.00/
>>>>>>
>>>>>> Thanks,
>>>>>> Vaibhav C
>>>>>>
>>>>
>>

From dmitry.chuyko at bell-sw.com Fri Feb 16 16:35:10 2018
From: dmitry.chuyko at bell-sw.com (Dmitry Chuyko)
Date: Fri, 16 Feb 2018 19:35:10 +0300
Subject: [11] RFR (XS): JDK-8198293: AARCH64 - Add CPU detection code for Cavium Thunder X2
Message-ID:

Hello,

Please review a small change in vendor specific features for Cavium CPUs.

T88 should be determined more accurately with its model number.
T99 may have 2 vendor-model combinations.

Flag values for T99 are a raw first approximation; I've created a separate sub-task to check them more precisely.

rfe: https://bugs.openjdk.java.net/browse/JDK-8198293
patch: below

-Dmitry

diff -r 82d9d5744e5f src/hotspot/cpu/aarch64/vm_version_aarch64.cpp
--- a/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp    Fri Feb 16 12:18:09 2018 +0100
+++ b/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp    Fri Feb 16 19:15:44 2018 +0300
@@ -193,7 +193,9 @@
   }

   // Enable vendor specific features
-  if (_cpu == CPU_CAVIUM) {
+
+  // ThunderX
+  if (_cpu == CPU_CAVIUM && (_model == 0xA1)) {
     if (_variant == 0) _features |= CPU_DMB_ATOMICS;
     if (FLAG_IS_DEFAULT(AvoidUnalignedAccesses)) {
       FLAG_SET_DEFAULT(AvoidUnalignedAccesses, true);
@@ -202,6 +204,20 @@
       FLAG_SET_DEFAULT(UseSIMDForMemoryOps, (_variant > 0));
     }
   }
+  // ThunderX2
+  if ((_cpu == CPU_CAVIUM && (_model == 0xAF)) ||
+      (_cpu == CPU_BROADCOM && (_model == 0x516))) {
+    if (FLAG_IS_DEFAULT(AvoidUnalignedAccesses)) {
+      FLAG_SET_DEFAULT(AvoidUnalignedAccesses, true);
+    }
+    if (FLAG_IS_DEFAULT(UseSIMDForMemoryOps)) {
+      FLAG_SET_DEFAULT(UseSIMDForMemoryOps, true);
+    }
+    if (FLAG_IS_DEFAULT(UseFPUForSpilling)) {
+      FLAG_SET_DEFAULT(UseFPUForSpilling, true);
+    }
+  }
+
   if (_cpu == CPU_ARM && (_model == 0xd03 || _model2 == 0xd03)) _features |= CPU_A53MAC;
   if (_cpu == CPU_ARM && (_model == 0xd07 || _model2 == 0xd07)) _features |= CPU_STXR_PREFETCH;
   // If an olde style /proc/cpuinfo (cpu_lines == 1) then if _model is an A57 (0xd07)

From dmitry.chuyko at bell-sw.com Fri Feb 16 16:37:35 2018
From: dmitry.chuyko at bell-sw.com (Dmitry Chuyko)
Date: Fri, 16 Feb 2018 19:37:35 +0300
Subject: [11] RFR (XS): JDK-8198293: AARCH64 - Add CPU detection code for Cavium Thunder X2
In-Reply-To:
References:
Message-ID:

rfe link corrected: https://bugs.openjdk.java.net/browse/JDK-8198293

On 02/16/2018 07:35 PM, Dmitry Chuyko wrote:
>
> Hello,
>
> Please review a small change in vendor specific features for Cavium CPUs.
>
> T88 should be determined more accurately with its model number.
> T99 may have 2 vendor-model combinations.
>
> Flag values for T99 are a raw first approximation; I've created a
> separate sub-task to check them more precisely.
>
> rfe: https://bugs.openjdk.java.net/browse/JDK-8198293
> patch: below
>
> -Dmitry
>
> diff -r 82d9d5744e5f src/hotspot/cpu/aarch64/vm_version_aarch64.cpp
> --- a/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp    Fri Feb 16 12:18:09 2018 +0100
> +++ b/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp    Fri Feb 16 19:15:44 2018 +0300
> @@ -193,7 +193,9 @@
>    }
>
>    // Enable vendor specific features
> -  if (_cpu == CPU_CAVIUM) {
> +
> +  // ThunderX
> +  if (_cpu == CPU_CAVIUM && (_model == 0xA1)) {
>      if (_variant == 0) _features |= CPU_DMB_ATOMICS;
>      if (FLAG_IS_DEFAULT(AvoidUnalignedAccesses)) {
>        FLAG_SET_DEFAULT(AvoidUnalignedAccesses, true);
> @@ -202,6 +204,20 @@
>        FLAG_SET_DEFAULT(UseSIMDForMemoryOps, (_variant > 0));
>      }
>    }
> +  // ThunderX2
> +  if ((_cpu == CPU_CAVIUM && (_model == 0xAF)) ||
> +      (_cpu == CPU_BROADCOM && (_model == 0x516))) {
> +    if (FLAG_IS_DEFAULT(AvoidUnalignedAccesses)) {
> +      FLAG_SET_DEFAULT(AvoidUnalignedAccesses, true);
> +    }
> +    if (FLAG_IS_DEFAULT(UseSIMDForMemoryOps)) {
> +      FLAG_SET_DEFAULT(UseSIMDForMemoryOps, true);
> +    }
> +    if (FLAG_IS_DEFAULT(UseFPUForSpilling)) {
> +      FLAG_SET_DEFAULT(UseFPUForSpilling, true);
> +    }
> +  }
> +
>    if (_cpu == CPU_ARM && (_model == 0xd03 || _model2 == 0xd03)) _features |= CPU_A53MAC;
>    if (_cpu == CPU_ARM && (_model == 0xd07 || _model2 == 0xd07)) _features |= CPU_STXR_PREFETCH;
>    // If an olde style /proc/cpuinfo (cpu_lines == 1) then if _model is an A57 (0xd07)
>

From felix.yang at linaro.org Sat Feb 17 01:15:40 2018
From: felix.yang at linaro.org (Felix Yang)
Date: Sat, 17 Feb 2018 09:15:40 +0800
Subject: [aarch64-port-dev ] RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler
In-Reply-To:
References: <4cdcb619-754f-e6fc-bc18-67df9b892248@redhat.com>

 <348d01b1-f15e-bd50-3236-f0b3a1c64a6c@redhat.com>
 <997eac31-e16c-1bbb-59c8-3515604a9b45@redhat.com>
Message-ID:

Hi,

Overall, the patch looks good to me.
One question: is the change in MacroAssembler::zero_memory necessary?

Thanks,
Felix

On 12 February 2018 at 18:03, Zhongwei Yao wrote:
> I see. Thanks.
>
> On 12 February 2018 at 17:51, Andrew Haley wrote:
> > On 11/02/18 02:56, Zhongwei Yao wrote:
> >> And I forgot to mention that since this patch touches some shared code
> >> at hotspot/share/asm/codeBuffer.hpp, do I need a sponsor from Oracle?
> >
> > In theory, no. The submission forest should do that job. I'll have a try.
> >
> > --
> > Andrew Haley
> > Java Platform Lead Engineer
> > Red Hat UK Ltd.
> > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
> >
>
> --
> Best regards,
> Zhongwei
>

From aph at redhat.com Mon Feb 19 09:15:08 2018
From: aph at redhat.com (Andrew Haley)
Date: Mon, 19 Feb 2018 09:15:08 +0000
Subject: [aarch64-port-dev ] RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler
In-Reply-To:
References: <4cdcb619-754f-e6fc-bc18-67df9b892248@redhat.com>
 <348d01b1-f15e-bd50-3236-f0b3a1c64a6c@redhat.com>
 <997eac31-e16c-1bbb-59c8-3515604a9b45@redhat.com>
Message-ID:

On 17/02/18 01:15, Felix Yang wrote:
> Overall, the patch looks good to me.
> One question: is the change in MacroAssembler::zero_memory necessary?

I think it is. Try and see what happens without that change.

--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd.
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From dmitry.chuyko at bell-sw.com Mon Feb 19 14:22:07 2018
From: dmitry.chuyko at bell-sw.com (Dmitry Chuyko)
Date: Mon, 19 Feb 2018 17:22:07 +0300
Subject: [11] RFR (XS): JDK-8198293: AARCH64 - Add CPU detection code for Cavium Thunder X2
In-Reply-To:
References:

Message-ID:

I also made webrev for convenience.

webrev: http://cr.openjdk.java.net/~dchuyko/8198293/webrev.00/
rfe: https://bugs.openjdk.java.net/browse/JDK-8198293

-Dmitry

On 02/16/2018 07:37 PM, Dmitry Chuyko wrote:
>
> rfe link corrected: https://bugs.openjdk.java.net/browse/JDK-8198293
>
> On 02/16/2018 07:35 PM, Dmitry Chuyko wrote:
>>
>> Hello,
>>
>> Please review a small change in vendor specific features for Cavium CPUs.
>>
>> T88 should be determined more accurately with its model number.
>> T99 may have 2 vendor-model combinations.
>>
>> Flag values for T99 are a raw first approximation; I've created a
>> separate sub-task to check them more precisely.
>>
>> rfe: https://bugs.openjdk.java.net/browse/JDK-8198293
>> patch: below
>>
>> -Dmitry
>>
>> diff -r 82d9d5744e5f src/hotspot/cpu/aarch64/vm_version_aarch64.cpp
>> --- a/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp    Fri Feb 16 12:18:09 2018 +0100
>> +++ b/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp    Fri Feb 16 19:15:44 2018 +0300
>> @@ -193,7 +193,9 @@
>>    }
>>
>>    // Enable vendor specific features
>> -  if (_cpu == CPU_CAVIUM) {
>> +
>> +  // ThunderX
>> +  if (_cpu == CPU_CAVIUM && (_model == 0xA1)) {
>>      if (_variant == 0) _features |= CPU_DMB_ATOMICS;
>>      if (FLAG_IS_DEFAULT(AvoidUnalignedAccesses)) {
>>        FLAG_SET_DEFAULT(AvoidUnalignedAccesses, true);
>> @@ -202,6 +204,20 @@
>>        FLAG_SET_DEFAULT(UseSIMDForMemoryOps, (_variant > 0));
>>      }
>>    }
>> +  // ThunderX2
>> +  if ((_cpu == CPU_CAVIUM && (_model == 0xAF)) ||
>> +      (_cpu == CPU_BROADCOM && (_model == 0x516))) {
>> +    if (FLAG_IS_DEFAULT(AvoidUnalignedAccesses)) {
>> +      FLAG_SET_DEFAULT(AvoidUnalignedAccesses, true);
>> +    }
>> +    if (FLAG_IS_DEFAULT(UseSIMDForMemoryOps)) {
>> +      FLAG_SET_DEFAULT(UseSIMDForMemoryOps, true);
>> +    }
>> +    if (FLAG_IS_DEFAULT(UseFPUForSpilling)) {
>> +      FLAG_SET_DEFAULT(UseFPUForSpilling, true);
>> +    }
>> +  }
>> +
>>    if (_cpu == CPU_ARM && (_model == 0xd03 || _model2 == 0xd03)) _features |= CPU_A53MAC;
>>    if (_cpu == CPU_ARM && (_model == 0xd07 || _model2 == 0xd07)) _features |= CPU_STXR_PREFETCH;
>>    // If an olde style /proc/cpuinfo (cpu_lines == 1) then if _model is an A57 (0xd07)
>>
>

From dmitry.samersoff at bell-sw.com Tue Feb 20 07:54:18 2018
From: dmitry.samersoff at bell-sw.com (Dmitry Samersoff)
Date: Tue, 20 Feb 2018 10:54:18 +0300
Subject: [11] RFR (XS): JDK-8198293: AARCH64 - Add CPU detection code for Cavium Thunder X2
In-Reply-To:
References:

Message-ID: <7164aa60-944f-87de-02dd-4ab238735620@bell-sw.com>

Dmitry,

I would prefer this code to be more readable/maintainable, even at the cost of some extra lines, i.e.:

if (_cpu == CPU_CAVIUM) {
  if (_model == CAVIUM_THUNDER_X) { // 0xA1
    ...
  }
  if (_model == CAVIUM_THUNDER_X2) { // 0xAF
    ...
  }
} else if (_cpu == CPU_BROADCOM) {
  if (_model == 0x516) {
    ...
  }
}

-Dmitry

On 19.02.2018 17:22, Dmitry Chuyko wrote:
> I also made webrev for convenience.
>
> webrev: http://cr.openjdk.java.net/~dchuyko/8198293/webrev.00/
> rfe: https://bugs.openjdk.java.net/browse/JDK-8198293
>
> -Dmitry
>
> On 02/16/2018 07:37 PM, Dmitry Chuyko wrote:
>>
>> rfe link corrected: https://bugs.openjdk.java.net/browse/JDK-8198293
>>
>> On 02/16/2018 07:35 PM, Dmitry Chuyko wrote:
>>>
>>> Hello,
>>>
>>> Please review a small change in vendor specific features for Cavium CPUs.
>>>
>>> T88 should be determined more accurately with its model number.
>>> T99 may have 2 vendor-model combinations.
>>>
>>> Flag values for T99 are a raw first approximation; I've created a
>>> separate sub-task to check them more precisely.
>>>
>>> rfe: https://bugs.openjdk.java.net/browse/JDK-8198293
>>> patch: below
>>>
>>> -Dmitry
>>>
>>> diff -r 82d9d5744e5f src/hotspot/cpu/aarch64/vm_version_aarch64.cpp
>>> --- a/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp    Fri Feb 16 12:18:09 2018 +0100
>>> +++ b/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp    Fri Feb 16 19:15:44 2018 +0300
>>> @@ -193,7 +193,9 @@
>>>    }
>>>
>>>    // Enable vendor specific features
>>> -  if (_cpu == CPU_CAVIUM) {
>>> +
>>> +  // ThunderX
>>> +  if (_cpu == CPU_CAVIUM && (_model == 0xA1)) {
>>>      if (_variant == 0) _features |= CPU_DMB_ATOMICS;
>>>      if (FLAG_IS_DEFAULT(AvoidUnalignedAccesses)) {
>>>        FLAG_SET_DEFAULT(AvoidUnalignedAccesses, true);
>>> @@ -202,6 +204,20 @@
>>>        FLAG_SET_DEFAULT(UseSIMDForMemoryOps, (_variant > 0));
>>>      }
>>>    }
>>> +  // ThunderX2
>>> +  if ((_cpu == CPU_CAVIUM && (_model == 0xAF)) ||
>>> +      (_cpu == CPU_BROADCOM && (_model == 0x516))) {
>>> +    if (FLAG_IS_DEFAULT(AvoidUnalignedAccesses)) {
>>> +      FLAG_SET_DEFAULT(AvoidUnalignedAccesses, true);
>>> +    }
>>> +    if (FLAG_IS_DEFAULT(UseSIMDForMemoryOps)) {
>>> +      FLAG_SET_DEFAULT(UseSIMDForMemoryOps, true);
>>> +    }
>>> +    if (FLAG_IS_DEFAULT(UseFPUForSpilling)) {
>>> +      FLAG_SET_DEFAULT(UseFPUForSpilling, true);
>>> +    }
>>> +  }
>>> +
>>>    if (_cpu == CPU_ARM && (_model == 0xd03 || _model2 == 0xd03)) _features |= CPU_A53MAC;
>>>    if (_cpu == CPU_ARM && (_model == 0xd07 || _model2 == 0xd07)) _features |= CPU_STXR_PREFETCH;
>>>    // If an olde style /proc/cpuinfo (cpu_lines == 1) then if _model is an A57 (0xd07)
>>>
>>
>

--
Dmitry Samersoff
http://devnull.samersoff.net
* There will come soft rains ...

From dmitry.chuyko at bell-sw.com Tue Feb 20 11:27:35 2018
From: dmitry.chuyko at bell-sw.com (Dmitry Chuyko)
Date: Tue, 20 Feb 2018 14:27:35 +0300
Subject: [11] RFR (XS): JDK-8198293: AARCH64 - Add CPU detection code for Cavium Thunder X2
In-Reply-To: <7164aa60-944f-87de-02dd-4ab238735620@bell-sw.com>
References:

 <7164aa60-944f-87de-02dd-4ab238735620@bell-sw.com>
Message-ID: <25567088-57ac-510e-063a-2d070bbc03ea@bell-sw.com>

Dmitry,

Depending on firmware version, T99 may be seen as Broadcom 0x516 or as Cavium 0xAF. So I'd prefer to leave per-model detection code blocks. Nothing common is expected here for T88 and T99.

-Dmitry C

On 02/20/2018 10:54 AM, Dmitry Samersoff wrote:
> Dmitry,
>
> I would prefer this code to be more readable/maintainable, even at
> the cost of some extra lines, i.e.:
>
> if (_cpu == CPU_CAVIUM) {
>   if (_model == CAVIUM_THUNDER_X) { // 0xA1
>     ...
>   }
>   if (_model == CAVIUM_THUNDER_X2) { // 0xAF
>     ...
>   }
> } else if (_cpu == CPU_BROADCOM) {
>   if (_model == 0x516) {
>     ...
>   }
> }
>
> -Dmitry
>
>
> On 19.02.2018 17:22, Dmitry Chuyko wrote:
>> I also made webrev for convenience.
>>
>> webrev: http://cr.openjdk.java.net/~dchuyko/8198293/webrev.00/
>> rfe: https://bugs.openjdk.java.net/browse/JDK-8198293
>>
>> -Dmitry
>>
>> On 02/16/2018 07:37 PM, Dmitry Chuyko wrote:
>>> rfe link corrected: https://bugs.openjdk.java.net/browse/JDK-8198293
>>>
>>> On 02/16/2018 07:35 PM, Dmitry Chuyko wrote:
>>>> Hello,
>>>>
>>>> Please review a small change in vendor specific features for Cavium CPUs.
>>>>
>>>> T88 should be determined more accurately with its model number.
>>>> T99 may have 2 vendor-model combinations.
>>>>
>>>> Flag values for T99 are a raw first approximation; I've created a
>>>> separate sub-task to check them more precisely.
>>>>
>>>> rfe: https://bugs.openjdk.java.net/browse/JDK-8198293
>>>> patch: below
>>>>
>>>> -Dmitry
>>>>
>>>> diff -r 82d9d5744e5f src/hotspot/cpu/aarch64/vm_version_aarch64.cpp
>>>> --- a/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp    Fri Feb 16 12:18:09 2018 +0100
>>>> +++ b/src/hotspot/cpu/aarch64/vm_version_aarch64.cpp    Fri Feb 16 19:15:44 2018 +0300
>>>> @@ -193,7 +193,9 @@
>>>>    }
>>>>
>>>>    // Enable vendor specific features
>>>> -  if (_cpu == CPU_CAVIUM) {
>>>> +
>>>> +  // ThunderX
>>>> +  if (_cpu == CPU_CAVIUM && (_model == 0xA1)) {
>>>>      if (_variant == 0) _features |= CPU_DMB_ATOMICS;
>>>>      if (FLAG_IS_DEFAULT(AvoidUnalignedAccesses)) {
>>>>        FLAG_SET_DEFAULT(AvoidUnalignedAccesses, true);
>>>> @@ -202,6 +204,20 @@
>>>>        FLAG_SET_DEFAULT(UseSIMDForMemoryOps, (_variant > 0));
>>>>      }
>>>>    }
>>>> +  // ThunderX2
>>>> +  if ((_cpu == CPU_CAVIUM && (_model == 0xAF)) ||
>>>> +      (_cpu == CPU_BROADCOM && (_model == 0x516))) {
>>>> +    if (FLAG_IS_DEFAULT(AvoidUnalignedAccesses)) {
>>>> +      FLAG_SET_DEFAULT(AvoidUnalignedAccesses, true);
>>>> +    }
>>>> +    if (FLAG_IS_DEFAULT(UseSIMDForMemoryOps)) {
>>>> +      FLAG_SET_DEFAULT(UseSIMDForMemoryOps, true);
>>>> +    }
>>>> +    if (FLAG_IS_DEFAULT(UseFPUForSpilling)) {
>>>> +      FLAG_SET_DEFAULT(UseFPUForSpilling, true);
>>>> +    }
>>>> +  }
>>>> +
>>>>    if (_cpu == CPU_ARM && (_model == 0xd03 || _model2 == 0xd03)) _features |= CPU_A53MAC;
>>>>    if (_cpu == CPU_ARM && (_model == 0xd07 || _model2 == 0xd07)) _features |= CPU_STXR_PREFETCH;
>>>>    // If an olde style /proc/cpuinfo (cpu_lines == 1) then if _model is an A57 (0xd07)
>>>>
>

From eric.caspole at oracle.com Tue Feb 20 14:58:31 2018
From: eric.caspole at oracle.com (Eric Caspole)
Date: Tue, 20 Feb 2018 09:58:31 -0500
Subject: Fwd: RFR(XS) (11) 8197447 : LogCompilation throws "couldn't find bytecode"
In-Reply-To:
References:
Message-ID: <5b290b49-cad4-d689-ea88-52607a4572c9@oracle.com>

Hi - could anyone review this patch? This fix is handy for me, as I have been using the tool a lot lately.
Thanks,
Eric


-------- Forwarded Message --------
Subject: RFR(XS) (11) 8197447 : LogCompilation throws "couldn't find bytecode"
Date: Fri, 9 Feb 2018 16:48:24 -0500
From: Eric Caspole
To: hotspot-compiler-dev at openjdk.java.net

Hi everybody,

Please review this small change to the LogCompilation tool. It did not support processing the unstable_fused_if uncommon trap, which would take the place of the first unstable_if in the original code.

I would like to add a test case for this, but I am not sure what code sequence would reliably trigger this optimization without also producing a giant log (which is where I originally found this problem while debugging an unrelated one). Open to suggestions.

Thanks,
Eric

JBS: https://bugs.openjdk.java.net/browse/JDK-8197447
webrev: http://cr.openjdk.java.net/~ecaspole/JDK-8197447/webrev/

From zhongwei.yao at linaro.org Wed Feb 21 02:37:10 2018
From: zhongwei.yao at linaro.org (Zhongwei Yao)
Date: Wed, 21 Feb 2018 10:37:10 +0800
Subject: [aarch64-port-dev ] RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler
In-Reply-To:
References: <4cdcb619-754f-e6fc-bc18-67df9b892248@redhat.com>

<348d01b1-f15e-bd50-3236-f0b3a1c64a6c@redhat.com> <997eac31-e16c-1bbb-59c8-3515604a9b45@redhat.com>

Message-ID:

Hi, Felix,

Thanks for your review!

The change in zero_memory is necessary. Otherwise, the multiple str instructions it generates would be merged into stp, which breaks the assumption behind zero_memory's algorithm: each store instruction stores one word. Checking the generated assembly of the zero_memory function may help to understand it better.

On 17 February 2018 at 09:15, Felix Yang wrote:
> Hi,
>
> Overall, the patch looks good to me.
> One question: is the change in MacroAssembler::zero_memory necessary?
>
> Thanks,
> Felix
>
>
> On 12 February 2018 at 18:03, Zhongwei Yao wrote:
>>
>> I see. Thanks.
>>
>> On 12 February 2018 at 17:51, Andrew Haley wrote:
>> > On 11/02/18 02:56, Zhongwei Yao wrote:
>> >> And I forgot to mention that since this patch touches some shared code
>> >> at hotspot/share/asm/codeBuffer.hpp, do I need a sponsor from Oracle?
>> >
>> > In theory, no. The submission forest should do that job. I'll have a
>> > try.
>> >
>> > --
>> > Andrew Haley
>> > Java Platform Lead Engineer
>> > Red Hat UK Ltd.
>> > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>>
>>
>>
>> --
>> Best regards,
>> Zhongwei
>
>

--
Best regards,
Zhongwei

From zhongwei.yao at linaro.org Wed Feb 21 02:44:08 2018
From: zhongwei.yao at linaro.org (Zhongwei Yao)
Date: Wed, 21 Feb 2018 10:44:08 +0800
Subject: RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler
In-Reply-To: <997eac31-e16c-1bbb-59c8-3515604a9b45@redhat.com>
References: <4cdcb619-754f-e6fc-bc18-67df9b892248@redhat.com>

<348d01b1-f15e-bd50-3236-f0b3a1c64a6c@redhat.com>
<997eac31-e16c-1bbb-59c8-3515604a9b45@redhat.com>
Message-ID:

Hi, Andrew,

Yesterday I got a mail from do-not-reply at oracle.com with the following content:

"
Mach5 mach5-one-zyao-JDK-8196064-20180219-1915-11753: Builds PASSED. Testing SUCCESSFUL.

Mach5 Tasks Results Summary

    EXECUTED_WITH_FAILURE: 0
    PASSED: 81
    KILLED: 0
    NA: 0
    FAILED: 0
    UNABLE_TO_RUN: 0
"

Is this the submission forest's result?

On 12 February 2018 at 17:51, Andrew Haley wrote:
> On 11/02/18 02:56, Zhongwei Yao wrote:
>> And I forgot to mention that since this patch touches some shared code
>> at hotspot/share/asm/codeBuffer.hpp, do I need a sponsor from Oracle?
>
> In theory, no. The submission forest should do that job. I'll have a try.
>
> --
> Andrew Haley
> Java Platform Lead Engineer
> Red Hat UK Ltd.
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

--
Best regards,
Zhongwei

From aph at redhat.com Wed Feb 21 09:45:01 2018
From: aph at redhat.com (Andrew Haley)
Date: Wed, 21 Feb 2018 09:45:01 +0000
Subject: RFR: JDK-8196064: AArch64: Merging ld/st into ldp/stp in macro-assembler
In-Reply-To: