From suenaga at oss.nttdata.com  Sun Feb  2 16:37:10 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sun, 2 Feb 2020 17:37:10 +0100
Subject: PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
Message-ID: <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>

PING: Could you review this change?

> JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/

I believe this change helps troubleshooters with postmortem analysis.


Thanks,

Yasumasa


On 2020/01/19 3:16, Yasumasa Suenaga wrote:
> PING: Could you review it?
>
>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>
> I updated the webrev. I discussed it with Serguei off-list, and I refactored webrev.02.
> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>> Hi Serguei,
>>
>> Thanks for your comment!
>> I refactored LinuxCDebugger and LinuxAMD64CFrame in the new webrev.
>> I also fixed libproc_impl.c to free lib->eh_frame.data, as Dmitry said.
>>
>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>
>> This change has passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>> Hi Yasumasa,
>>>
>>> This is a nice move in general.
>>> Thank you for working on this!
>>>
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>
>>>   96       long libptr = dbg.findLibPtrByAddress(pc);
>>>   97       if (libptr == 0L) { // Java frame
>>>   98         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>   99         if (rbp == null) {
>>>  100           return null;
>>>  101         }
>>>  102         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>  103       } else { // Native frame
>>>  104         DwarfParser dwarf;
>>>  105         try {
>>>  106           dwarf = new DwarfParser(libptr);
>>>  107         } catch (DebuggerException e) {
>>>  108           Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>  109           if (rbp == null) {
>>>  110             return null;
>>>  111           }
>>>  112           return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>  113         }
>>>  114         dwarf.processDwarf(pc);
>>>  115         Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>  116                        !dwarf.isBPOffsetAvailable())
>>>  117                           ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>  118                           : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>  119                                    .addOffsetTo(dwarf.getCFAOffset());
>>>  120         if (cfa == null) {
>>>  121           return null;
>>>  122         }
>>>  123         return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>  124       }
>>>
>>>
>>> I'd suggest simplifying the logic by refactoring to something like below:
>>>
>>>            long libptr = dbg.findLibPtrByAddress(pc);
>>>            Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>            DwarfParser dwarf = null;
>>>
>>>            if (libptr != 0L) { // Native frame
>>>              try {
>>>                dwarf = new DwarfParser(libptr);
>>>                dwarf.processDwarf(pc);
>>>                cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>                       !dwarf.isBPOffsetAvailable())
>>>                          ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>                          : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>                                   .addOffsetTo(dwarf.getCFAOffset());
>>>
>>>              } catch (DebuggerException e) { // bail out to Java frame case
>>>              }
>>>            }
>>>            if (cfa == null) {
>>>              return null;
>>>            }
>>>            return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>
>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>
>>>   58     long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>
>>>    Better to rename 'ofs' => 'offs'.
>>>
>>>   77       nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>
>>>    Extra space after '-' sign.
>>>
>>>   71   private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>
>>>    It feels like the logic has to be somehow refactored/simplified as
>>>    several typical fragments appear in slightly different contexts.
>>>    But it is not easy to understand what it is.
>>>    Could you please add some comments to key places explaining this logic?
>>>    Then I'll check if it is possible to make it a little bit simpler.
>>>
>>>  109   private CFrame javaSender(ThreadContext context) {
>>>  110     Address nextCFA;
>>>  111     Address nextPC;
>>>  112
>>>  113     nextPC = getNextPC(false);
>>>  114     if (nextPC == null) {
>>>  115       return null;
>>>  116     }
>>>  117
>>>  118     DwarfParser nextDwarf = null;
>>>  119     long libptr = dbg.findLibPtrByAddress(nextPC);
>>>  120     if (libptr != 0L) { // Native frame
>>>  121       try {
>>>  122         nextDwarf = new DwarfParser(libptr);
>>>  123       } catch (DebuggerException e) {
>>>  124         nextCFA = getNextCFA(null, context);
>>>  125         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>  126       }
>>>  127       nextDwarf.processDwarf(nextPC);
>>>  128     }
>>>  129
>>>  130     nextCFA = getNextCFA(nextDwarf, context);
>>>  131     return (nextCFA == null) ? null
>>>  132                              : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>  133   }
>>>
>>>   The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC):
>>>
>>>      private CFrame javaSender(ThreadContext context) {
>>>        Address nextPC = getNextPC(false);
>>>        if (nextPC == null) {
>>>          return null;
>>>        }
>>>        long libptr = dbg.findLibPtrByAddress(nextPC);
>>>        DwarfParser nextDwarf = null;
>>>
>>>        if (libptr != 0L) { // Native frame
>>>          try {
>>>            nextDwarf = new DwarfParser(libptr);
>>>            nextDwarf.processDwarf(nextPC);
>>>          } catch (DebuggerException e) { // Bail out to Java frame
>>>          }
>>>        }
>>>        Address nextCFA = getNextCFA(nextDwarf, context);
>>>        return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>      }
>>>
>>>  135   public CFrame sender(ThreadProxy thread) {
>>>  136     ThreadContext context = thread.getContext();
>>>  137
>>>  138     if (dwarf == null) { // Java frame
>>>  139       return javaSender(context);
>>>  140     }
>>>  141
>>>  142     Address nextPC = getNextPC(true);
>>>  143     if (nextPC == null) {
>>>  144       return null;
>>>  145     }
>>>  146
>>>  147     Address nextCFA;
>>>  148     DwarfParser nextDwarf = dwarf;
>>>  149     if (!dwarf.isIn(nextPC)) {
>>>  150       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>  151       if (libptr == 0L) {
>>>  152         // Next frame might be Java frame
>>>  153         nextCFA = getNextCFA(null, context);
>>>  154         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>  155       }
>>>  156       try {
>>>  157         nextDwarf = new DwarfParser(libptr);
>>>  158       } catch (DebuggerException e) {
>>>  159         nextCFA = getNextCFA(null, context);
>>>  160         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>  161       }
>>>  162     }
>>>  163
>>>  164     nextDwarf.processDwarf(nextPC);
>>>  165     nextCFA = getNextCFA(nextDwarf, context);
>>>  166     return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>  167   }
>>>
>>>   This one can be also simplified a little:
>>>
>>>      public CFrame sender(ThreadProxy thread) {
>>>        ThreadContext context = thread.getContext();
>>>
>>>        if (dwarf == null) { // Java frame
>>>          return javaSender(context);
>>>        }
>>>        Address nextPC = getNextPC(true);
>>>        if (nextPC == null) {
>>>          return null;
>>>        }
>>>        DwarfParser nextDwarf = null;
>>>        if (!dwarf.isIn(nextPC)) {
>>>          long libptr = dbg.findLibPtrByAddress(nextPC);
>>>          if (libptr != 0L) {
>>>            try {
>>>              nextDwarf = new DwarfParser(libptr);
>>>              nextDwarf.processDwarf(nextPC);
>>>            } catch (DebuggerException e) { // Bail out to Java frame
>>>            }
>>>          }
>>>        }
>>>        Address nextCFA = getNextCFA(nextDwarf, context);
>>>        return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>      }
>>>
>>> Finally, it looks like just one method could replace both
>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>
>>>      private CFrame commonSender(ThreadProxy thread) {
>>>        ThreadContext context = thread.getContext();
>>>        Address nextPC = getNextPC(false);
>>>        if (nextPC == null) {
>>>          return null;
>>>        }
>>>        DwarfParser nextDwarf = null;
>>>
>>>        long libptr = dbg.findLibPtrByAddress(nextPC);
>>>        if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>          if (libptr != 0L) {
>>>            try {
>>>              nextDwarf = new DwarfParser(libptr);
>>>              nextDwarf.processDwarf(nextPC);
>>>            } catch (DebuggerException e) { // Bail out to Java frame
>>>            }
>>>          }
>>>        }
>>>        Address nextCFA = getNextCFA(nextDwarf, context);
>>>        return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>      }
>>>
>>> I'm still reviewing the dwarf parser files.
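
The parser-selection step that the suggested refactorings share can be tried out in isolation. What follows is a minimal sketch, not code from the webrev: `Parser`, `ParserFactory` and `UnwindException` are hypothetical stand-ins for DwarfParser, the findLibPtrByAddress/new DwarfParser pair, and DebuggerException. It assumes that the current parser should be reused while the next PC is still inside the same library (as line 148 of the original sender() does with `nextDwarf = dwarf`), and that any parser failure falls back to the RBP-based Java frame path, represented here by returning null.

```java
final class SenderSketch {
    /** Hypothetical stand-in for DebuggerException. */
    static class UnwindException extends RuntimeException {
        UnwindException(String msg) { super(msg); }
    }

    /** Hypothetical stand-in for DwarfParser: covers the PC range [lo, hi). */
    static class Parser {
        final long lo;
        final long hi;
        Parser(long lo, long hi) { this.lo = lo; this.hi = hi; }
        boolean isIn(long pc) { return pc >= lo && pc < hi; }
        void process(long pc) {
            if (!isIn(pc)) {
                throw new UnwindException("no unwind info for pc");
            }
        }
    }

    /** Hypothetical lookup: returns null when the PC is not in a native library. */
    interface ParserFactory {
        Parser forPc(long pc);
    }

    /** Picks the parser for the next frame; null means "treat it as a Java frame". */
    static Parser nextParser(Parser current, long nextPc, ParserFactory factory) {
        Parser next = null;
        try {
            if (current != null && current.isIn(nextPc)) {
                next = current;               // still inside the same library
            } else {
                next = factory.forPc(nextPc); // may be null for Java code
            }
            if (next != null) {
                next.process(nextPc);
            }
        } catch (UnwindException e) {
            next = null;                      // bail out to the Java frame case
        }
        return next;
    }
}
```

With a single helper of this shape, both sender variants reduce to "compute next PC, pick parser, compute next CFA"; whether that matches the real getNextPC/getNextCFA contracts is something only the webrev can confirm.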
>>>
>>> Thanks,
>>> Serguei
>>>
>>>
>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>> Hi,
>>>>
>>>> I refactored LinuxAMD64CFrame.java. It works fine in serviceability/sa tests and
>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>> Could you review the new webrev?
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>
>>>> The diff from the previous webrev is here:
>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>> Hi all,
>>>>>
>>>>> Please review this change:
>>>>>
>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>
>>>>>
>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>> for stack unwinding.
>>>>>
>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
>>>>> libraries (e.g. libc) might be compiled with this feature.
>>>>>
>>>>> However `jhsdb jstack --mixed` does not do so; it uses the base pointer register (RBP).
>>>>> So stack frames might be missing from its output.
>>>>>
>>>>> I guess JDK-8219201 is caused by the same issue.
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>

From chris.plummer at oracle.com  Mon Feb  3 18:20:19 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Mon, 3 Feb 2020 10:20:19 -0800
Subject: RFR: 8196729: Add jstatd option to specify RMI connector port
In-Reply-To: <4B822AF4-BDA2-42B8-B3E8-F4232ECA4784@oracle.com>
References: <34e9a3e4-1635-19df-bcbb-b239c6feee64@oracle.com>
 <4B822AF4-BDA2-42B8-B3E8-F4232ECA4784@oracle.com>
Message-ID: <1a7fcead-bba4-bb26-ed18-a039a9230935@oracle.com>

Hi Daniil,

Ok.
Changes look good.

thanks,

Chris

On 1/31/20 2:14 PM, Daniil Titov wrote:
> Hi Chris,
>
> Thank you for describing this in such detail! Your description is correct.
>
> In addition, apart from jstat there is the jps tool, which can also communicate
> with jstatd and currently faces the same problems if jstatd is deployed behind
> a firewall or in a container: after a successful connection to the RMI registry,
> further communication fails since jstatd chooses a random RMI port that is
> not published by the container or might be blocked by a firewall configuration.
>
> Best regards,
> Daniil
>
> On 1/31/20, 1:45 PM, "Chris Plummer" wrote:
>
>     Hi Daniil,
>
>     Just want to make sure I understand what communications are going on
>     here. Your concern is when the jstat and jstatd processes are on
>     different sides of the firewall. When you launch jstatd, you specify the
>     socket port it will receive requests on, and when you launch jstat, you
>     must specify this same socket port, so no firewall problem there
>     assuming the firewall is configured to allow communication over that
>     port. However, once the request is received by jstatd, data can be
>     communicated via RMI rather than over the specified socket port. By
>     default jstatd was choosing a random RMI port, and I assume this RMI
>     port was communicated to the jstat process via the initial socket port.
>     This presents a problem for firewall configuration, since the firewall
>     configuration cannot know the RMI port that will be used. So now you're
>     allowing the rmi port to also be specified.
>
>     Am I close? :)
>
>     Chris
>
>     On 1/31/20 1:08 PM, Daniil Titov wrote:
>     > Please review change [1] that adds a new command line option to the jstatd tool to specify an RMI connector port.
>     >
>     > Currently a random port is used, which prevents this tool from being used behind a firewall or in a container.
>     >
>     > New CSR [3] was created for this change and it needs to be reviewed as well.
>     >
>     > Man pages for jstatd will be updated in a separate issue.
>     >
>     > Testing: Mach5 tier1-tier3 and sun/tools/jstatd/* tests succeeded. Mach5 tier5 tests are in progress.
>     >
>     > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196729/webrev.01/
>     > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196729
>     > [3] CSR : https://bugs.openjdk.java.net/browse/JDK-8238357
>     >
>     > Thank you,
>     > Daniil
>     >
>     >
>

From richard.reingruber at sap.com  Tue Feb  4 08:59:17 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Tue, 4 Feb 2020 08:59:17 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To: <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com>
 <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com>
 <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
Message-ID:

Hi,

I have prepared webrev.4 that incorporates feedback from webrev.3 (thanks!)

Full:        http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/
Incremental: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/

I was not able to eliminate the additional suspend flag now. I'll take care of this as soon as the
existing suspend-resume-mechanism is reworked.

Testing:

  Nightly tests @SAP:

    JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests
    with fastdebug and release builds on all platforms

  Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel for 24h

Thanks, Richard.


More details on the changes:

* Hide DeoptimizeObjectsALotThread from external view.

* Changed EscapeBarrier_lock to be a _safepoint_check_never lock.
  It used to be _safepoint_check_sometimes, which will be eliminated sooner or later.
  I added explicit thread state changes with ThreadBlockInVM to code paths where we can wait()
  on EscapeBarrier_lock to become safepoint safe.

* Use handshake EscapeBarrierSuspendHandshake to suspend target threads instead of vm operation
  VM_ThreadSuspendAllForObjDeopt.

* Removed uses of Threads_lock. When adding a new thread we suspend it iff EA optimizations are
  being reverted. In the previous version we were waiting on Threads_lock while EA optimizations
  were reverted. See EscapeBarrier::thread_added().

* Made tests require Xmixed compilation mode.

* Made tests agnostic regarding tiered compilation.
  I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or disabled.

* Exercising EATests.java as well with stress test options DeoptimizeObjectsALot*
  Due to the non-deterministic deoptimizations some tests need to be skipped.
  We do this to prevent bit-rot of the stress test code.

* Executing EATests.java as well with graal if available. Driver for this is
  EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide all the new debug info
  (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp).
  And graal does not yet support the JVMTI operations force early return and pop frame.

* Removed tracing from new jdi tests in EATests.java. Too much trace output before the debugging
  connection is established can cause deadlock because output buffers fill up.
  (See https://bugs.openjdk.java.net/browse/JDK-8173304)

* Many copyright year changes and smaller clean-up changes of testing code (trailing white-space and
  the like).

-----Original Message-----
From: David Holmes
Sent: Thursday, 19 December 2019 03:12
To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com)
Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents

Hi Richard,

I think my issue is with the way EliminateNestedLocks works so I'm going
to look into that more deeply.

Thanks for the explanations.

David

On 18/12/2019 12:47 am, Reingruber, Richard wrote:
> Hi David,
>
> > > > Some further queries/concerns:
> > > >
> > > > src/hotspot/share/runtime/objectMonitor.cpp
> > > >
> > > > Can you please explain the changes to ObjectMonitor::wait:
> > > >
> > > > !   _recursions = save      // restore the old recursion count
> > > > !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
> > > >
> > > > what is the "deferred relock count"? I gather it relates to
> > > >
> > > > "The code was extended to be able to deoptimize objects of a frame that
> > > > is not the top frame and to let another thread than the owning thread do
> > > > it."
> > >
> > > Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is
> > > replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated
> > > locking. New with the enhancement is that we do this also just before object references are
> > > acquired through JVMTI. In this case we deoptimize also the owning compiled frame C and we
> > > register deoptimized objects as deferred updates. When control returns to C it gets deoptimized,
> > > we notice that objects are already deoptimized (reallocated and relocked), so we don't do it again
> > > (relocking twice would be incorrect of course). Deferred updates are copied into the new
> > > interpreter frames.
> > >
> > > Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to
> > > be relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking
> > > is deferred until T owns the monitor again. This is what the piece of code above does.
> >
> > Sorry I need some more detail here. How can you wait() on an object
> > monitor if the object allocation and/or locking was optimised away? And
> > what is a "non-local object" in this context? Isn't EA restricted to
> > thread-confined objects?
>
> "Non-local object" is an object that escapes its thread. The issue I'm addressing with the changes
> in ObjectMonitor::wait is almost unrelated to EA. It is caused by EliminateNestedLocks, where C2
> eliminates recursive locking of an already owned lock. The lock owning object exists on the heap, it
> is locked and you can call wait() on it.
>
> EliminateLocks is the C2 option that controls lock elimination based on EA. Both optimizations have
> in common that objects with eliminated locking need to be relocked when deoptimizing a frame,
> i.e. when replacing a compiled frame with equivalent interpreter
> frames. Deoptimization::relock_objects does that job for /all/ eliminated locks in scope. /All/ can
> be a mix of eliminated nested locks and locks of not-escaping objects.
>
> New with the enhancement: I call relock_objects earlier, just before objects potentially
> escape. But then later when the owning compiled frame gets deoptimized, I must not do it again:
>
> See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp:
>
>  373   if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks))
>  374       && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) {
>  375     bool unused;
>  376     eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused);
>  377   }
>
> Now when calling relock_objects early it is quite possible that I have to relock an object the
> target thread currently waits for. Obviously I cannot relock in this case, instead I chose to
> introduce relock_count_after_wait to JavaThread.
>
> > Is it just that some of the locking gets optimized away e.g.
> >
> > synchronized(obj) {
> >   synchronized(obj) {
> >     synchronized(obj) {
> >       obj.wait();
> >     }
> >   }
> > }
> >
> > If this is reduced to a form as-if it were a single lock of the monitor
> > (due to EA) and the wait() triggers a JVM TI event which leads to the
> > escape of "obj" then we need to reconstruct the true lock state, and so
> > when the wait() internally unblocks and reacquires the monitor it has to
> > set the true recursion count to 3, not the 1 that it appeared to be when
> > wait() was initially called. Is that the scenario?
>
> Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event
> triggered by wait.
>
> Add
>
>   LocalObject l1 = new LocalObject();
>
> in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in
> question.
>
> Note that relocking/reallocating is transactional. If it is done, then it is done for /all/ objects
> in scope and at most once. It wouldn't be quite so easy to split this into relocking of nested and
> of EA-based eliminated locks.
>
> > If so I find this truly awful. Anyone using wait() in a realistic form
> > requires a notification and so the object cannot be thread confined. In
>
> It is not thread confined.
>
> > which case I would strongly argue that upon hitting the wait() the deopt
> > should occur unconditionally and so the lock state is correct before we
> > wait and so we don't need to mess with the recursion count internally
> > when we reacquire the monitor.
> >
> > >
> > > > which I don't like the sound of at all when it comes to ObjectMonitor
> > > > state. So I'd like to understand in detail exactly what is going on here
> > > > and why. This is a very intrusive change that seems to badly break
> > > > encapsulation and impacts future changes to ObjectMonitor that are under
> > > > investigation.
> > >
> > > I would not regard this as breaking encapsulation. Certainly not badly.
> > >
> > > I've added a property relock_count_after_wait to JavaThread. The property is well
> > > encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free
> > > in choosing a way to do that as long as that property is taken into account. This is hardly a
> > > limitation.
> >
> > I do think this badly breaks encapsulation as you have to add a callout
> > from the guts of the ObjectMonitor code to reach into the thread to get
> > this lock count adjustment. I understand why you have had to do this but
> > I would much rather see a change to the EA optimisation strategy so that
> > this is not needed.
> >
> > > Note also that the property is a straightforward extension of the existing concept of deferred
> > > local updates. It is embedded into the structure holding them. So not even the footprint of a
> > > JavaThread is enlarged if no deferred updates are generated.
> >
> > [...]
> >
> > >
> > > I'm actually duplicating the existing external suspend mechanism, because a thread can be
> > > suspended at most once. And hey, I don't like that either! But it seems not unlikely that the
> > > duplicate can be removed together with the original and the new type of handshakes that will be
> > > used for thread suspend can be used for object deoptimization too. See today's discussion in
> > > JDK-8227745 [2].
> >
> > I hope that discussion bears some fruit, at the moment it seems not to
> > be possible to use handshakes here. :(
> >
> > The external suspend mechanism is a royal pain in the proverbial that we
> > have to carefully live with. The idea that we're duplicating that for
> > use in another fringe area of functionality does not thrill me at all.
> >
> > To be clear, I understand the problem that exists and that you wish to
> > solve, but for the runtime parts I balk at the complexity cost of
> > solving it.
>
> I know it's complex, but by far no rocket science.
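
The recursion-count adjustment debated above (`_recursions = save + jt->get_and_reset_relock_count_after_wait()`) can be pictured with a toy model. This is a minimal sketch in plain Java, not HotSpot code: `ToyThread`, `ToyMonitor`, `deferRelock` and `waitAndReacquire` are hypothetical names, and the model assumes, as the quoted snippet suggests, that relocks requested while the owner is waiting are simply counted on the thread and added when the monitor is reacquired.

```java
final class RelockSketch {
    /** Hypothetical stand-in for the per-thread deferred relock counter. */
    static final class ToyThread {
        private int relockCountAfterWait;

        /** Records one relock that could not be performed because the thread is waiting. */
        void deferRelock() { relockCountAfterWait++; }

        int getAndResetRelockCountAfterWait() {
            int n = relockCountAfterWait;
            relockCountAfterWait = 0;
            return n;
        }
    }

    /** Hypothetical stand-in for a monitor's recursion bookkeeping. */
    static final class ToyMonitor {
        int recursions;   // in this toy model: enters beyond the first one
        ToyThread owner;

        /** Models wait(): give up the monitor, then restore the count on reacquire. */
        int waitAndReacquire(ToyThread self, Runnable whileWaiting) {
            int save = recursions;
            recursions = 0;
            owner = null;
            whileWaiting.run();   // e.g. an agent triggers relocking of eliminated nested locks
            owner = self;
            // restore the old count, increased by the deferred relock count
            recursions = save + self.getAndResetRelockCountAfterWait();
            return recursions;
        }
    }
}
```

In the nested-locking example from the thread, two of the three enters were eliminated by the compiler, so the monitor sees a count of zero extra enters before the wait; if both get relocked while the thread waits, the model restores a count of two afterwards.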
>
> Also I find it hard to imagine another fix for JDK-8233915 besides changing the JVM TI specification.
>
> Thanks, Richard.
>
> -----Original Message-----
> From: David Holmes
> Sent: Tuesday, 17 December 2019 08:03
> To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com)
> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
>
>
>
> David
>
> On 17/12/2019 4:57 pm, David Holmes wrote:
>> Hi Richard,
>>
>> On 14/12/2019 5:01 am, Reingruber, Richard wrote:
>>> Hi David,
>>>
>>>    > Some further queries/concerns:
>>>    >
>>>    > src/hotspot/share/runtime/objectMonitor.cpp
>>>    >
>>>    > Can you please explain the changes to ObjectMonitor::wait:
>>>    >
>>>    > !   _recursions = save      // restore the old recursion count
>>>    > !                 + jt->get_and_reset_relock_count_after_wait(); // increased by the deferred relock count
>>>    >
>>>    > what is the "deferred relock count"? I gather it relates to
>>>    >
>>>    > "The code was extended to be able to deoptimize objects of a frame that
>>>    > is not the top frame and to let another thread than the owning thread do
>>>    > it."
>>>
>>> Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is replaced
>>> with corresponding interpreter frames. Part of this is relocking objects with eliminated
>>> locking. New with the enhancement is that we do this also just before object references are acquired
>>> through JVMTI. In this case we deoptimize also the owning compiled frame C and we register
>>> deoptimized objects as deferred updates.
>>> When control returns to C it gets deoptimized, we notice
>>> that objects are already deoptimized (reallocated and relocked), so we don't do it again (relocking
>>> twice would be incorrect of course). Deferred updates are copied into the new interpreter frames.
>>>
>>> Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to be
>>> relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking is
>>> deferred until T owns the monitor again. This is what the piece of code above does.
>>
>> Sorry I need some more detail here. How can you wait() on an object
>> monitor if the object allocation and/or locking was optimised away? And
>> what is a "non-local object" in this context? Isn't EA restricted to
>> thread-confined objects?
>>
>> Is it just that some of the locking gets optimized away e.g.
>>
>> synchronized(obj) {
>>   synchronized(obj) {
>>     synchronized(obj) {
>>       obj.wait();
>>     }
>>   }
>> }
>>
>> If this is reduced to a form as-if it were a single lock of the monitor
>> (due to EA) and the wait() triggers a JVM TI event which leads to the
>> escape of "obj" then we need to reconstruct the true lock state, and so
>> when the wait() internally unblocks and reacquires the monitor it has to
>> set the true recursion count to 3, not the 1 that it appeared to be when
>> wait() was initially called. Is that the scenario?
>>
>> If so I find this truly awful. Anyone using wait() in a realistic form
>> requires a notification and so the object cannot be thread confined. In
>> which case I would strongly argue that upon hitting the wait() the deopt
>> should occur unconditionally and so the lock state is correct before we
>> wait and so we don't need to mess with the recursion count internally
>> when we reacquire the monitor.
>>
>>>
>>>    > which I don't like the sound of at all when it comes to ObjectMonitor
>>>    > state.
>>>    > So I'd like to understand in detail exactly what is going on here
>>>    > and why.  This is a very intrusive change that seems to badly break
>>>    > encapsulation and impacts future changes to ObjectMonitor that are under
>>>    > investigation.
>>>
>>> I would not regard this as breaking encapsulation. Certainly not badly.
>>>
>>> I've added a property relock_count_after_wait to JavaThread. The property is well
>>> encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free in
>>> choosing a way to do that as long as that property is taken into account. This is hardly a
>>> limitation.
>>
>> I do think this badly breaks encapsulation as you have to add a callout
>> from the guts of the ObjectMonitor code to reach into the thread to get
>> this lock count adjustment. I understand why you have had to do this but
>> I would much rather see a change to the EA optimisation strategy so that
>> this is not needed.
>>
>>> Note also that the property is a straight forward extension of the existing concept of deferred
>>> local updates. It is embedded into the structure holding them. So not even the footprint of a
>>> JavaThread is enlarged if no deferred updates are generated.
>>>
>>>    > ---
>>>    >
>>>    > src/hotspot/share/runtime/thread.cpp
>>>    >
>>>    > Can you please explain why JavaThread::wait_for_object_deoptimization
>>>    > has to be handcrafted in this way rather than using proper transitions.
>>>    >
>>>
>>> I wrote wait_for_object_deoptimization taking JavaThread::java_suspend_self_with_safepoint_check
>>> as template. So in short: for the same reasons :)
>>>
>>> Threads reach both methods as part of thread state transitions, therefore special handling is
>>> required to change thread state on top of ongoing transitions.
>>>
>>>    > We got rid of "deopt suspend" some time ago and it is disturbing to see
>>>    > it being added back (effectively).
>>>    > This seems like it may be something
>>>    > that handshakes could be used for.
>>>
>>> Deopt suspend used to be something rather different with a similar name[1]. It is not being added back.
>>
>> I stand corrected. Despite comments in the code to the contrary
>> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of
>> cleanup in this area 13 years ago :)
>>
>>>
>>> I'm actually duplicating the existing external suspend mechanism, because a thread can be suspended
>>> at most once. And hey, I don't like that either! But it seems not unlikely that the duplicate can
>>> be removed together with the original and the new type of handshakes that will be used for
>>> thread suspend can be used for object deoptimization too. See today's discussion in JDK-8227745 [2].
>>
>> I hope that discussion bears some fruit, at the moment it seems not to
>> be possible to use handshakes here. :(
>>
>> The external suspend mechanism is a royal pain in the proverbial that we
>> have to carefully live with. The idea that we're duplicating that for
>> use in another fringe area of functionality does not thrill me at all.
>>
>> To be clear, I understand the problem that exists and that you wish to
>> solve, but for the runtime parts I balk at the complexity cost of
>> solving it.
>>
>> Thanks,
>> David
>> -----
>>
>>> Thanks, Richard.
>>>
>>> [1] Deopt suspend was something like an async. handshake for architectures with register windows,
>>>      where patching the return pc for deoptimization of a compiled frame was racy if the owner thread
>>>      was in native code. Instead a "deopt" suspend flag was set on which the thread patched its own
>>>      frame upon return from native. So no thread was suspended. It got its name only from the name of
>>>      the flags.
>>>
>>> [2] Discussion about using handshakes to sync.
with the target thread: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727 >>> >>> >>> -----Original Message----- >>> From: David Holmes >>> Sent: Freitag, 13. Dezember 2019 00:56 >>> To: Reingruber, Richard ; >>> serviceability-dev at openjdk.java.net; >>> hotspot-compiler-dev at openjdk.java.net; >>> hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>> Performance in the Presence of JVMTI Agents >>> >>> Hi Richard, >>> >>> Some further queries/concerns: >>> >>> src/hotspot/share/runtime/objectMonitor.cpp >>> >>> Can you please explain the changes to ObjectMonitor::wait: >>> >>> !?? _recursions = save????? // restore the old recursion count >>> !???????????????? + jt->get_and_reset_relock_count_after_wait(); // >>> increased by the deferred relock count >>> >>> what is the "deferred relock count"? I gather it relates to >>> >>> "The code was extended to be able to deoptimize objects of a frame that >>> is not the top frame and to let another thread than the owning thread do >>> it." >>> >>> which I don't like the sound of at all when it comes to ObjectMonitor >>> state. So I'd like to understand in detail exactly what is going on here >>> and why.? This is a very intrusive change that seems to badly break >>> encapsulation and impacts future changes to ObjectMonitor that are under >>> investigation. >>> >>> --- >>> >>> src/hotspot/share/runtime/thread.cpp >>> >>> Can you please explain why JavaThread::wait_for_object_deoptimization >>> has to be handcrafted in this way rather than using proper transitions. >>> >>> We got rid of "deopt suspend" some time ago and it is disturbing to see >>> it being added back (effectively). This seems like it may be something >>> that handshakes could be used for. 
>>> >>> Thanks, >>> David >>> ----- >>> >>> On 12/12/2019 7:02 am, David Holmes wrote: >>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote: >>>>> Hi David, >>>>> >>>>> ??? > Most of the details here are in areas I can comment on in detail, >>>>> but I >>>>> ??? > did take an initial general look at things. >>>>> >>>>> Thanks for taking the time! >>>> >>>> Apologies the above should read: >>>> >>>> "Most of the details here are in areas I *can't* comment on in detail >>>> ..." >>>> >>>> David >>>> >>>>> ??? > The only thing that jumped out at me is that I think the >>>>> ??? > DeoptimizeObjectsALotThread should be a hidden thread. >>>>> ??? > >>>>> ??? > +? bool is_hidden_from_external_view() const { return true; } >>>>> >>>>> Yes, it should. Will add the method like above. >>>>> >>>>> ??? > Also I don't see any testing of the DeoptimizeObjectsALotThread. >>>>> Without >>>>> ??? > active testing this will just bit-rot. >>>>> >>>>> DeoptimizeObjectsALot is meant for stress testing with a larger >>>>> workload. I will add a minimal test >>>>> to keep it fresh. >>>>> >>>>> ??? > Also on the tests I don't understand your @requires clause: >>>>> ??? > >>>>> ??? >?? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>>> ??? > (vm.opt.TieredCompilation != true)) >>>>> ??? > >>>>> ??? > This seems to require that TieredCompilation is disabled, but >>>>> tiered is >>>>> ??? > our normal mode of operation. ?? >>>>> ??? > >>>>> >>>>> I removed the clause. I guess I wanted to target the tests towards the >>>>> code they are supposed to >>>>> test, and it's easier to analyze failures w/o tiered compilation and >>>>> with just one compiler thread. >>>>> >>>>> Additionally I will make use of >>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. >>>>> >>>>> Thanks, >>>>> Richard. >>>>> >>>>> -----Original Message----- >>>>> From: David Holmes >>>>> Sent: Mittwoch, 11. 
Dezember 2019 08:03 >>>>> To: Reingruber, Richard ; >>>>> serviceability-dev at openjdk.java.net; >>>>> hotspot-compiler-dev at openjdk.java.net; >>>>> hotspot-runtime-dev at openjdk.java.net >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>>>> Performance in the Presence of JVMTI Agents >>>>> >>>>> Hi Richard, >>>>> >>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: >>>>>> Hi, >>>>>> >>>>>> I would like to get reviews please for >>>>>> >>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >>>>>> >>>>>> Corresponding RFE: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>>>>> >>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] >>>>>> >>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without >>>>>> issues (thanks!). In addition the >>>>>> change is being tested at SAP since I posted the first RFR some >>>>>> months ago. >>>>>> >>>>>> The intention of this enhancement is to benefit performance wise from >>>>>> escape analysis even if JVMTI >>>>>> agents request capabilities that allow them to access local variable >>>>>> values. E.g. if you start-up >>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then >>>>>> escape analysis is disabled right >>>>>> from the beginning, well before a debugger attaches -- if ever one >>>>>> should do so. With the >>>>>> enhancement, escape analysis will remain enabled until and after a >>>>>> debugger attaches. EA based >>>>>> optimizations are reverted just before an agent acquires the >>>>>> reference to an object. In the JBS item >>>>>> you'll find more details. >>>>> >>>>> Most of the details here are in areas I can comment on in detail, but I >>>>> did take an initial general look at things. >>>>> >>>>> The only thing that jumped out at me is that I think the >>>>> DeoptimizeObjectsALotThread should be a hidden thread. >>>>> >>>>> +? 
bool is_hidden_from_external_view() const { return true; } >>>>> >>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. >>>>> Without >>>>> active testing this will just bit-rot. >>>>> >>>>> Also on the tests I don't understand your @requires clause: >>>>> >>>>> ??? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>>> (vm.opt.TieredCompilation != true)) >>>>> >>>>> This seems to require that TieredCompilation is disabled, but tiered is >>>>> our normal mode of operation. ?? >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Thanks, >>>>>> Richard. >>>>>> >>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745 >>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch >>>>>> >>>>>> >>>>>> From chris.plummer at oracle.com Wed Feb 5 01:04:47 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 4 Feb 2020 17:04:47 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: <3edb4c50-588d-d111-2a39-c9859ec64931@oracle.com> References: <39200476-D017-4BFD-ABF2-8ABC59ADA8C0@oracle.com> <3edb4c50-588d-d111-2a39-c9859ec64931@oracle.com> Message-ID: <6609ac07-de22-dcf5-6bea-420fdbcdddb8@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Wed Feb 5 03:51:15 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 4 Feb 2020 19:51:15 -0800 Subject: RFR: 8196729: Add jstatd option to specify RMI connector port In-Reply-To: References: Message-ID: <9bc20e75-afc9-2910-9d4d-be07a9dae731@oracle.com> An HTML attachment was scrubbed... 
From chris.plummer at oracle.com Wed Feb 5 04:33:27 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 4 Feb 2020 20:33:27 -0800
Subject: RFR: 8196729: Add jstatd option to specify RMI connector port
In-Reply-To: <9bc20e75-afc9-2910-9d4d-be07a9dae731@oracle.com>
References: <9bc20e75-afc9-2910-9d4d-be07a9dae731@oracle.com>
Message-ID: <0f3c210a-0798-d654-6dd9-fb2ac44ea19a@oracle.com>

Good catch. It's a copy-n-paste bug from the block of code just above
this block. You can use "-r <port>" or "-r<port>". The buggy code is
handling the second form. The test case uses the first form, so it didn't
catch this error.

Chris

On 2/4/20 7:51 PM, serguei.spitsyn at oracle.com wrote:
> Hi Daniil,
>
> It looks okay to me in general.
> But I'm puzzled with this part:
>
> http://cr.openjdk.java.net/~dtitov/8196729/webrev.01/src/jdk.jstatd/share/classes/sun/tools/jstatd/Jstatd.java.udiff.html
> +            } else if (arg.startsWith("-r")) {
> +                if (arg.compareTo("-r") != 0) {
> +                    port = Integer.parseInt(arg.substring(2));
> +                } else {
> +                    argc++;
> +                    if (argc >= args.length) {
> +                        printUsage();
> +                        System.exit(1);
> +                    }
> +                    rmiPort = Integer.parseInt(args[argc]);
> +                }
>
> The option -r is for rmi connection port number.
> Why does this code set the RMI registry port? :
> +                if (arg.compareTo("-r") != 0) {
> +                    port = Integer.parseInt(arg.substring(2));
>
> Thanks,
> Serguei
>
>
> On 1/31/20 13:08, Daniil Titov wrote:
>> Please review change [1] that adds a new command line option to the
>> jstatd tool to specify an RMI connector port.
>>
>> Currently a random port is used that prevents this tool from being used
>> behind a firewall or in a container.
>>
>> New CSR [3] was created for this change and it needs to be reviewed as well.
>>
>> Man pages for jstatd will be updated in a separate issue.
>>
>> Testing: Mach5 tier1-tier3 and sun/tools/jstatd/* tests succeeded.
>> Mach5 tier5 tests are in progress.
>>
>> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196729/webrev.01/
>> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196729
>> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8238357
>>
>> Thank you,
>> Daniil
>>
>>
>>
>

From daniil.x.titov at oracle.com Wed Feb 5 06:00:38 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Tue, 04 Feb 2020 22:00:38 -0800
Subject: RFR: 8196729: Add jstatd option to specify RMI connector port
In-Reply-To: <9bc20e75-afc9-2910-9d4d-be07a9dae731@oracle.com>
References: <9bc20e75-afc9-2910-9d4d-be07a9dae731@oracle.com>
Message-ID: <4D95BBC2-5989-401F-8F18-6356C0D0CD05@oracle.com>

Hi Serguei,

Thank you for finding this! Please review the new version of webrev [1]
that has it corrected. The new webrev also includes changes in the test to
make sure that all jstatd tests run for both styles of command line options.

Testing: Mach5 jobs for sun/tools/jstatd succeeded. Tier1, tier2, tier3,
and tier5 jobs are in progress.

[1] Webrev: http://cr.openjdk.java.net/~dtitov/8196729/webrev.02/
[2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196729
[3] CSR: https://bugs.openjdk.java.net/browse/JDK-8238357

Thanks,
Daniil

From: "serguei.spitsyn at oracle.com"
Date: Tuesday, February 4, 2020 at 7:51 PM
To: Daniil Titov, "serviceability-dev at openjdk.java.net"
Subject: Re: RFR: 8196729: Add jstatd option to specify RMI connector port

Hi Daniil,

It looks okay to me in general.
But I'm puzzled with this part:

http://cr.openjdk.java.net/~dtitov/8196729/webrev.01/src/jdk.jstatd/share/classes/sun/tools/jstatd/Jstatd.java.udiff.html
+            } else if (arg.startsWith("-r")) {
+                if (arg.compareTo("-r") != 0) {
+                    port = Integer.parseInt(arg.substring(2));
+                } else {
+                    argc++;
+                    if (argc >= args.length) {
+                        printUsage();
+                        System.exit(1);
+                    }
+                    rmiPort = Integer.parseInt(args[argc]);
+                }

The option -r is for rmi connection port number.
Why does this code set the RMI registry port? :
+                if (arg.compareTo("-r") != 0) {
+                    port = Integer.parseInt(arg.substring(2));

Thanks,
Serguei

On 1/31/20 13:08, Daniil Titov wrote:
Please review change [1] that adds a new command line option to the jstatd
tool to specify an RMI connector port.

Currently a random port is used that prevents this tool from being used
behind a firewall or in a container.

New CSR [3] was created for this change and it needs to be reviewed as well.

Man pages for jstatd will be updated in a separate issue.

Testing: Mach5 tier1-tier3 and sun/tools/jstatd/* tests succeeded.
Mach5 tier5 tests are in progress.

[1] Webrev: http://cr.openjdk.java.net/~dtitov/8196729/webrev.01/
[2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196729
[3] CSR: https://bugs.openjdk.java.net/browse/JDK-8238357

Thank you,
Daniil

From serguei.spitsyn at oracle.com Wed Feb 5 17:36:40 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 5 Feb 2020 09:36:40 -0800
Subject: RFR: 8196729: Add jstatd option to specify RMI connector port
In-Reply-To: <4D95BBC2-5989-401F-8F18-6356C0D0CD05@oracle.com>
References: <9bc20e75-afc9-2910-9d4d-be07a9dae731@oracle.com>
 <4D95BBC2-5989-401F-8F18-6356C0D0CD05@oracle.com>
Message-ID:

Hi Daniil,

Looks good.
Thank you for the update!

Thanks,
Serguei

On 2/4/20 22:00, Daniil Titov wrote:
> Hi Serguei,
>
> Thank you for finding this! Please review the new version of webrev [1]
> that has it corrected. The new webrev also includes changes in the test to
> make sure that all jstatd tests run for both styles of command line options.
>
> Testing: Mach5 jobs for sun/tools/jstatd succeeded. Tier1, tier2, tier3,
> and tier5 jobs are in progress.
> > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196729/webrev.02/ > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196729 > [3] CSR : https://bugs.openjdk.java.net/browse/JDK-8238357 > > Thanks, > Daniil > > From: "serguei.spitsyn at oracle.com" > Date: Tuesday, February 4, 2020 at 7:51 PM > To: Daniil Titov , "serviceability-dev at openjdk.java.net" > Subject: Re: RFR: 8196729: Add jstatd option to specify RMI connector port > > Hi Daniil, > > It looks okay to me in general. > But I'm puzzled with this part: > > http://cr.openjdk.java.net/~dtitov/8196729/webrev.01/src/jdk.jstatd/share/classes/sun/tools/jstatd/Jstatd.java.udiff.html > + } else if (arg.startsWith("-r")) { > + if (arg.compareTo("-r") != 0) { > + port = Integer.parseInt(arg.substring(2)); > + } else { > + argc++; > + if (argc >= args.length) { > + printUsage(); > + System.exit(1); > + } > + rmiPort = Integer.parseInt(args[argc]); > + } > > The option -r is for rmi connection port number. > Why does this code set the RMI registry port? : > + if (arg.compareTo("-r") != 0) { > + port = Integer.parseInt(arg.substring(2)); > > Thanks, > Serguei > > > On 1/31/20 13:08, Daniil Titov wrote: > Please review change [1] that adds a new command line option to jstatd tool to specify a RMI connector port. > > Currently a random port is used that prevents this tool from being used behind a firewall or in a container. > > New CSR [3] was created for this change and it needs to be reviewed as well. > > Man pages for jstatd will be updated in a separate issue. > > Testing: Mach5 tier1-tier3 and sun/tools/jstatd/* tests succeeded. Mach5 tier5 tests are in progress. 
>
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196729/webrev.01/
> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196729
> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8238357
>
> Thank you,
> Daniil
>

From serguei.spitsyn at oracle.com Wed Feb 5 18:24:31 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 5 Feb 2020 10:24:31 -0800
Subject: PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
Message-ID: <6c2164f3-5be6-d12a-5f78-eabb645db4af@oracle.com>

Hi Yasumasa,

The fix looks good to me.
Thank you for your updates and patience!

Thanks,
Serguei


On 2/2/20 08:37, Yasumasa Suenaga wrote:
> PING: Could you review this change?
>
>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>
> I believe this change helps troubleshooters with postmortem analysis.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>> PING: Could you review it?
>>
>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>
>> I updated the webrev. I discussed it with Serguei off-list and
>> refactored webrev.02.
>> It has passed tests on submit repo
>> (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>> Hi Serguei,
>>>
>>> Thanks for your comment!
>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in the new webrev.
>>> I also fixed libproc_impl.c to free lib->eh_frame.data, as Dmitry said.
>>>
>>>    http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>
>>> This change has passed all tests on submit repo
>>> (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>> Hi Yasumasa,
>>>>
>>>> This is a nice move in general.
>>>> Thank you for working on this!
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>>
>>>>  96     long libptr = dbg.findLibPtrByAddress(pc);
>>>>  97     if (libptr == 0L) { // Java frame
>>>>  98       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>  99       if (rbp == null) {
>>>> 100         return null;
>>>> 101       }
>>>> 102       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>> 103     } else { // Native frame
>>>> 104       DwarfParser dwarf;
>>>> 105       try {
>>>> 106         dwarf = new DwarfParser(libptr);
>>>> 107       } catch (DebuggerException e) {
>>>> 108         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>> 109         if (rbp == null) {
>>>> 110           return null;
>>>> 111         }
>>>> 112         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>> 113       }
>>>> 114       dwarf.processDwarf(pc);
>>>> 115       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>> 116                      !dwarf.isBPOffsetAvailable())
>>>> 117                         ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>> 118                         : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>> 119                                  .addOffsetTo(dwarf.getCFAOffset());
>>>> 120       if (cfa == null) {
>>>> 121         return null;
>>>> 122       }
>>>> 123       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>> 124     }
>>>>
>>>>
>>>> I'd suggest simplifying the logic by refactoring to something like
>>>> the below:
>>>>
>>>>     long libptr = dbg.findLibPtrByAddress(pc);
>>>>     Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>     DwarfParser dwarf = null;
>>>>
>>>>     if (libptr != 0L) { // Native frame
>>>>       try {
>>>>         dwarf = new DwarfParser(libptr);
>>>>         dwarf.processDwarf(pc);
>>>>         cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>                !dwarf.isBPOffsetAvailable())
>>>>                   ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>                   : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>                            .addOffsetTo(dwarf.getCFAOffset());
>>>>
>>>>       } catch (DebuggerException e) { // bail out to Java frame case
>>>>       }
>>>>     }
>>>>     if (cfa == null) {
>>>>       return null;
>>>>     }
>>>>     return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>
>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>
>>>>
>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>
>>>>    Better to rename 'ofs' => 'offs'.
>>>>
>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>
>>>>    Extra space after '-' sign.
>>>>
>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>
>>>>    It feels like the logic has to be somehow refactored/simplified as
>>>>    several typical fragments appear in slightly different contexts.
>>>>    But it is not easy to understand what it is.
>>>>    Could you please add some comments to key places explaining this logic?
>>>>    Then I'll check if it is possible to make it a little bit simpler.
>>>>
>>>> 109   private CFrame javaSender(ThreadContext context) {
>>>> 110     Address nextCFA;
>>>> 111     Address nextPC;
>>>> 112
>>>> 113     nextPC = getNextPC(false);
>>>> 114     if (nextPC == null) {
>>>> 115       return null;
>>>> 116     }
>>>> 117
>>>> 118     DwarfParser nextDwarf = null;
>>>> 119     long libptr = dbg.findLibPtrByAddress(nextPC);
>>>> 120     if (libptr != 0L) { // Native frame
>>>> 121       try {
>>>> 122         nextDwarf = new DwarfParser(libptr);
>>>> 123       } catch (DebuggerException e) {
>>>> 124         nextCFA = getNextCFA(null, context);
>>>> 125         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>> 126       }
>>>> 127       nextDwarf.processDwarf(nextPC);
>>>> 128     }
>>>> 129
>>>> 130     nextCFA = getNextCFA(nextDwarf, context);
>>>> 131     return (nextCFA == null) ? null
>>>> 132                              : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>> 133   }
>>>>
>>>>   The above can be simplified if a DebuggerException cannot be
>>>>   thrown from processDwarf(nextPC):
>>>>
>>>>       private CFrame javaSender(ThreadContext context) {
>>>>         Address nextPC = getNextPC(false);
>>>>         if (nextPC == null) {
>>>>           return null;
>>>>         }
>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>         DwarfParser nextDwarf = null;
>>>>
>>>>         if (libptr != 0L) { // Native frame
>>>>           try {
>>>>             nextDwarf = new DwarfParser(libptr);
>>>>             nextDwarf.processDwarf(nextPC);
>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>           }
>>>>         }
>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>       }
>>>>
>>>> 135   public CFrame sender(ThreadProxy thread) {
>>>> 136     ThreadContext context = thread.getContext();
>>>> 137
>>>> 138     if (dwarf == null) { // Java frame
>>>> 139       return javaSender(context);
>>>> 140     }
>>>> 141
>>>> 142     Address nextPC = getNextPC(true);
>>>> 143     if (nextPC == null) {
>>>> 144       return null;
>>>> 145     }
>>>> 146
>>>> 147     Address nextCFA;
>>>> 148     DwarfParser nextDwarf = dwarf;
>>>> 149     if (!dwarf.isIn(nextPC)) {
>>>> 150       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>> 151       if (libptr == 0L) {
>>>> 152         // Next frame might be Java frame
>>>> 153         nextCFA = getNextCFA(null, context);
>>>> 154         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>> 155       }
>>>> 156       try {
>>>> 157         nextDwarf = new DwarfParser(libptr);
>>>> 158       } catch (DebuggerException e) {
>>>> 159         nextCFA = getNextCFA(null, context);
>>>> 160         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>> 161       }
>>>> 162     }
>>>> 163
>>>> 164     nextDwarf.processDwarf(nextPC);
>>>> 165     nextCFA = getNextCFA(nextDwarf, context);
>>>> 166     return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>> 167   }
>>>>
>>>>   This one can also be simplified a little:
>>>>
>>>>       public CFrame sender(ThreadProxy thread) {
>>>>         ThreadContext context = thread.getContext();
>>>>
>>>>         if (dwarf == null) { // Java frame
>>>>           return javaSender(context);
>>>>         }
>>>>         Address nextPC = getNextPC(true);
>>>>         if (nextPC == null) {
>>>>           return null;
>>>>         }
>>>>         DwarfParser nextDwarf = null;
>>>>         if (!dwarf.isIn(nextPC)) {
>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>           if (libptr != 0L) {
>>>>             try {
>>>>               nextDwarf = new DwarfParser(libptr);
>>>>               nextDwarf.processDwarf(nextPC);
>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>             }
>>>>           }
>>>>         }
>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>       }
>>>>
>>>> Finally, it looks like just one method could replace both
>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>
>>>>       private CFrame commonSender(ThreadProxy thread) {
>>>>         ThreadContext context = thread.getContext();
>>>>         Address nextPC = getNextPC(false);
>>>>         if (nextPC == null) {
>>>>           return null;
>>>>         }
>>>>         DwarfParser nextDwarf = null;
>>>>
>>>>         if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>           if (libptr != 0L) {
>>>>             try {
>>>>               nextDwarf = new DwarfParser(libptr);
>>>>               nextDwarf.processDwarf(nextPC);
>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>             }
>>>>           }
>>>>         }
>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>       }
>>>>
>>>> I'm still reviewing the dwarf parser files.
>>>>
>>>> Thanks,
>>>> Serguei
>>>>
>>>>
>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>> Hi,
>>>>>
>>>>> I refactored LinuxAMD64CFrame.java. It works fine in
>>>>> serviceability/sa tests and all tests on submit repo
>>>>> (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>> Could you review the new webrev?
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>
>>>>> The diff from the previous webrev is here:
>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Yasumasa
>>>>>
>>>>>
>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>> Hi all,
>>>>>>
>>>>>> Please review this change:
>>>>>>
>>>>>>    JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>
>>>>>>
>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application
>>>>>> Binary Interface AMD64 Architecture Processor Supplement [1], we
>>>>>> need to use DWARF in .eh_frame or .debug_frame for stack unwinding.
>>>>>>
>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default
>>>>>> since GCC 4.6, so system libraries (e.g. libc) might be compiled
>>>>>> with this feature.
>>>>>>
>>>>>> However, `jhsdb jstack --mixed` does not do so; it uses the base
>>>>>> pointer register (RBP), so some stack frames might be missing.
>>>>>>
>>>>>> I guess JDK-8219201 is caused by the same issue.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> [1]
>>>>>> https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>

From chris.plummer at oracle.com Wed Feb 5 19:15:33 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 5 Feb 2020 11:15:33 -0800
Subject: RFR: 8196729: Add jstatd option to specify RMI connector port
In-Reply-To:
References: <9bc20e75-afc9-2910-9d4d-be07a9dae731@oracle.com>
 <4D95BBC2-5989-401F-8F18-6356C0D0CD05@oracle.com>
Message-ID: <75e2f64a-1073-2bef-da44-fb420b92fc71@oracle.com>

+1

Chris

On 2/5/20 9:36 AM, serguei.spitsyn at oracle.com wrote:
> Hi Daniil,
>
> Looks good.
> Thank you for the update!
>
> Thanks,
> Serguei
>
>
> On 2/4/20 22:00, Daniil Titov wrote:
>> Hi Serguei,
>>
>> Thank you for finding this!
Please review the new version of webrev [1] >> that has it corrected. The new webrev also includes changes in the >> test to >> make sure that all jstatd tests run for both styles of command line >> options. >> ? Testing: Mach5 jobs for sun/tools/jstatd? succeeded.? Tiers1, >> tiers2, tiers3, >> and tiers5 job are in the progress. >> >> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196729/webrev.02/ >> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196729 >> [3] CSR : https://bugs.openjdk.java.net/browse/JDK-8238357 >> >> Thanks, >> Daniil >> >> From: "serguei.spitsyn at oracle.com" >> Date: Tuesday, February 4, 2020 at 7:51 PM >> To: Daniil Titov , >> "serviceability-dev at openjdk.java.net" >> >> Subject: Re: RFR: 8196729: Add jstatd option to specify RMI connector >> port >> >> Hi Daniil, >> >> It looks okay to me in general. >> But I'm puzzled with this part: >> >> http://cr.openjdk.java.net/~dtitov/8196729/webrev.01/src/jdk.jstatd/share/classes/sun/tools/jstatd/Jstatd.java.udiff.html >> >> +??????????? } else if (arg.startsWith("-r")) { >> +??????????????? if (arg.compareTo("-r") != 0) { >> +??????????????????? port = Integer.parseInt(arg.substring(2)); >> +??????????????? } else { >> +??????????????????? argc++; >> +??????????????????? if (argc >= args.length) { >> +??????????????????????? printUsage(); >> +??????????????????????? System.exit(1); >> +??????????????????? } >> +??????????????????? rmiPort = Integer.parseInt(args[argc]); >> +??????????????? } >> >> The option -r is for rmi connection port number. >> Why does this code set the RMI registry port? : >> +??????????????? if (arg.compareTo("-r") != 0) { >> +??????????????????? port = Integer.parseInt(arg.substring(2)); >> >> Thanks, >> Serguei >> >> >> On 1/31/20 13:08, Daniil Titov wrote: >> Please review change [1] that adds a new command line option to >> jstatd tool to specify a RMI connector port. 
>> >> Currently a random port is used that prevents this tool from being >> used behind a firewall or in a container. >> >> New CSR [3] was created for this change and it needs to be reviewed >> as well. >> >> Man pages for jstatd will be updated in a separate issue. >> >> Testing: Mach5 tier1-tier3 and sun/tools/jstatd/* tests succeeded. >> Mach5 tier5 tests are in progress. >> ? [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196729/webrev.01/ >> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196729 >> [3] CSR : https://bugs.openjdk.java.net/browse/JDK-8238357 >> >> Thank you, >> Daniil >> >> >> >> >> >> >> > From suenaga at oss.nttdata.com Thu Feb 6 06:54:40 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 6 Feb 2020 15:54:40 +0900 Subject: PING: RFR: 8234624: jstack mixed mode should refer DWARF In-Reply-To: <6c2164f3-5be6-d12a-5f78-eabb645db4af@oracle.com> References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com> <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com> <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com> <6c2164f3-5be6-d12a-5f78-eabb645db4af@oracle.com> Message-ID: Thanks Serguei! Yasumasa On 2020/02/06 3:24, serguei.spitsyn at oracle.com wrote: > Hi Yasumasa, > > The fix looks good to me. > Thank you for your updates and patience! > > Thanks, > Serguei > > > On 2/2/20 08:37, Yasumasa Suenaga wrote: >> PING: Could you reveiw this change? >> >>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >> >> I believe this change helps troubleshooter to fight to postmortem analysis. >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2020/01/19 3:16, Yasumasa Suenaga wrote: >>> PING: Could you review it? >>> >>> ?? JBS: https://bugs.openjdk.java.net/browse/JDK-8234624 >>> ?? webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/ >>> >>> I updated webrev. 
I discussed with Serguei in off list, and I refactored webrev.02 . >>> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549). >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2019/12/15 10:51, Yasumasa Suenaga wrote: >>>> Hi Serguei, >>>> >>>> Thanks for your comment! >>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev. >>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as Dmitry said. >>>> >>>> ?? http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/ >>>> >>>> This change has been passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487). >>>> >>>> >>>> Thanks, >>>> >>>> Yasumasa >>>> >>>> >>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote: >>>>> Hi Yasumasa, >>>>> >>>>> This is nice move in general. >>>>> Thank you for working on this! >>>>> >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html >>>>> >>>>> 96 long libptr = dbg.findLibPtrByAddress(pc); 97 if (libptr == 0L) { // Java frame 98 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 99 if (rbp == null) { 100 return null; 101 } 102 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 103 } else { // Native frame 104 DwarfParser dwarf; 105 try { 106 dwarf = new DwarfParser(libptr); 107 } catch (DebuggerException e) { 108 Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP); 109 if (rbp == null) { 110 return null; 111 } 112 return new LinuxAMD64CFrame(dbg, rbp, pc, null); 113 } 114 dwarf.processDwarf(pc); 115 Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && 116 !dwarf.isBPOffsetAvailable()) 117 ? 
context.getRegisterAsAddress(AMD64ThreadContext.RBP) 118 : context.getRegisterAsAddress(dwarf.getCFARegister()) 119 .addOffsetTo(dwarf.getCFAOffset()); 120 if (cfa == null) { 121 return null; 122 } 123 return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); 124 } >>>>> >>>>> >>>>> I'd suggest to simplify the logic by refactoring to something like below: >>>>> >>>>> ?????????? long libptr = dbg.findLibPtrByAddress(pc); >>>>> ?????????? Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame >>>>> ?????????? DwarfParser dwarf = null; >>>>> >>>>> ?????????? if (libptr != 0L) { // Native frame >>>>> ???????????? try { >>>>> ?????????????? dwarf = new DwarfParser(libptr); >>>>> ?????????????? dwarf.processDwarf(pc); >>>>> ?????????????? Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) && >>>>> ????????????????????????????? !dwarf.isBPOffsetAvailable()) >>>>> ???????????????????????????????? ? context.getRegisterAsAddress(AMD64ThreadContext.RBP) >>>>> ???????????????????????????????? : context.getRegisterAsAddress(dwarf.getCFARegister()) >>>>> .addOffsetTo(dwarf.getCFAOffset()); >>>>> >>>>> ??????????? } catch (DebuggerException e) { // bail out to Java frame case >>>>> ??????????? } >>>>> ????????? } >>>>> ????????? if (cfa == null) { >>>>> ??????????? return null; >>>>> ????????? } >>>>> ????????? return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf); >>>>> >>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html >>>>> >>>>> 58 long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA() >>>>> >>>>> ?? Better to rename 'ofs' => 'offs'. >>>>> >>>>> 77 nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA()); >>>>> >>>>> ?? Extra space after '-' sign. >>>>> >>>>> 71 private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) { >>>>> >>>>> ?? 
It feels like the logic has to be somehow refactored/simplified as >>>>> ?? several typical fragments appears in slightly different contexts. >>>>> ?? But it is not easy to understand what it is. >>>>> ?? Could you, please, add some comments to key places explaining this logic. >>>>> ?? Then I'll check if it is possible to make it a little bit simpler. >>>>> >>>>> 109 private CFrame javaSender(ThreadContext context) { 110 Address nextCFA; 111 Address nextPC; 112 113 nextPC = getNextPC(false); 114 if (nextPC == null) { 115 return null; 116 } 117 118 DwarfParser nextDwarf = null; 119 long libptr = dbg.findLibPtrByAddress(nextPC); 120 if (libptr != 0L) { // Native frame 121 try { 122 nextDwarf = new DwarfParser(libptr); 123 } catch (DebuggerException e) { 124 nextCFA = getNextCFA(null, context); 125 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 126 } 127 nextDwarf.processDwarf(nextPC); 128 } 129 130 nextCFA = getNextCFA(nextDwarf, context); 131 return (nextCFA == null) ? null 132 : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); 133 } >>>>> >>>>> ??The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC): >>>>> ????? private CFrame javaSender(ThreadContext context) { >>>>> ??????? Address nextPC = getNextPC(false); >>>>> ??????? if (nextPC == null) { >>>>> ????????? return null; >>>>> ??????? } >>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>> ??????? DwarfParser nextDwarf = null; >>>>> >>>>> ??????? if (libptr != 0L) { // Native frame >>>>> ????????? try { >>>>> ??????????? nextDwarf = new DwarfParser(libptr); >>>>> ??????????? nextDwarf.processDwarf(nextPC); >>>>> ????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>> ????????? } >>>>> ??????? } >>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>> ????? 
} >>>>> >>>>> 135 public CFrame sender(ThreadProxy thread) { 136 ThreadContext context = thread.getContext(); 137 138 if (dwarf == null) { // Java frame 139 return javaSender(context); 140 } 141 142 Address nextPC = getNextPC(true); 143 if (nextPC == null) { 144 return null; 145 } 146 147 Address nextCFA; 148 DwarfParser nextDwarf = dwarf; 149 if (!dwarf.isIn(nextPC)) { 150 long libptr = dbg.findLibPtrByAddress(nextPC); 151 if (libptr == 0L) { 152 // Next frame might be Java frame 153 nextCFA = getNextCFA(null, context); 154 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 155 } 156 try { 157 nextDwarf = new DwarfParser(libptr); 158 } catch (DebuggerException e) { 159 nextCFA = getNextCFA(null, context); 160 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null); 161 } 162 } 163 164 nextDwarf.processDwarf(nextPC); 165 nextCFA = getNextCFA(nextDwarf, context); 166 return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, >>>>> nextCFA, nextPC, nextDwarf); 167 } >>>>> >>>>> ??This one can be also simplified a little: >>>>> >>>>> ????? public CFrame sender(ThreadProxy thread) { >>>>> ??????? ThreadContext context = thread.getContext(); >>>>> >>>>> ??????? if (dwarf == null) { // Java frame >>>>> ????????? return javaSender(context); >>>>> ??????? } >>>>> ??????? Address nextPC = getNextPC(true); >>>>> ??????? if (nextPC == null) { >>>>> ????????? return null; >>>>> ??????? } >>>>> ??????? DwarfParser nextDwarf = null; >>>>> ??????? if (!dwarf.isIn(nextPC)) { >>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>> ????????? if (libptr != 0L) { >>>>> ??????????? try { >>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>> ??????????? } >>>>> ????????? } >>>>> ??????? } >>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>> ??????? 
return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>> ????? } >>>>> >>>>> Finally, it looks like just one method could replace both >>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context): >>>>> >>>>> ????? private CFrame commonSender(ThreadProxy thread) { >>>>> ??????? ThreadContext context = thread.getContext(); >>>>> ??????? Address nextPC = getNextPC(false); >>>>> ??????? if (nextPC == null) { >>>>> ????????? return null; >>>>> ??????? } >>>>> ??????? DwarfParser nextDwarf = null; >>>>> >>>>> ??????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>> ??????? if (dwarf == null || !dwarf.isIn(nextPC)) { >>>>> ????????? long libptr = dbg.findLibPtrByAddress(nextPC); >>>>> ????????? if (libptr != 0L) { >>>>> ??????????? try { >>>>> ????????????? nextDwarf = new DwarfParser(libptr); >>>>> ????????????? nextDwarf.processDwarf(nextPC); >>>>> ??????????? } catch (DebuggerException e) { // Bail out to Java frame >>>>> ??????????? } >>>>> ????????? } >>>>> ??????? } >>>>> ??????? Address nextCFA = getNextCFA(nextDwarf, context); >>>>> ??????? return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf); >>>>> ????? } >>>>> >>>>> I'm still reviewing the dwarf parser files. >>>>> >>>>> Thanks, >>>>> Serguei >>>>> >>>>> >>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote: >>>>>> Hi, >>>>>> >>>>>> I refactored LinuxAMD64CFrame.java . It works fine in serviceability/sa tests and >>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923). >>>>>> Could you review new webrev? >>>>>> >>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/ >>>>>> >>>>>> The diff from previous webrev is here: >>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Yasumasa >>>>>> >>>>>> >>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote: >>>>>>> Hi all, >>>>>>> >>>>>>> Please review this change: >>>>>>> >>>>>>> ?? 
JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>
>>>>>>>
>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>>>> for stack unwinding.
>>>>>>>
>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
>>>>>>> library (e.g. libc) might be compiled with this feature.
>>>>>>>
>>>>>>> However `jhsdb jstack --mixed` does not do so, it uses base pointer register (RBP).
>>>>>>> So it might be lack of stack frames.
>>>>>>>
>>>>>>> I guess JDK-8219201 is caused by same issue.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>
>

From richard.reingruber at sap.com Thu Feb 6 12:39:08 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Thu, 6 Feb 2020 12:39:08 +0000
Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
Message-ID:

Hi,

could I please get reviews for this small enhancement:

Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/
Bug: https://bugs.openjdk.java.net/browse/JDK-8238585

The change avoids making all compiled methods on stack not_entrant when switching a java thread to
interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack.

Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations.

Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms.

Thanks, Richard.
See also my question if anyone knows a reason for making the compiled methods not_entrant:
http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html

From matthias.baesken at sap.com Thu Feb 6 16:06:00 2020
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Thu, 6 Feb 2020 16:06:00 +0000
Subject: RFR [XS]: 8238602: remove obsolete functions from libinstrument/FileSystemSupport_md.c
Message-ID:

Hello, the link time section gc (see https://bugs.openjdk.java.net/browse/JDK-8236714 , on linux s390x it prints the removed sections) showed some obsolete / unused functions in FileSystemSupport_md.c :

ld: Removing unused section '.text.pathSeparator' in file '/nightly/output-jdk-dev/support/native/java.instrument/libinstrument/FileSystemSupport_md.o'
ld: Removing unused section '.text.filenameStrcmp' in file '/nightly/output-jdk-dev/support/native/java.instrument/libinstrument/FileSystemSupport_md.o'

They can be cleaned up.

Bug/webrev :

https://bugs.openjdk.java.net/browse/JDK-8238602
http://cr.openjdk.java.net/~mbaesken/webrevs/8238602.0/

Thanks, Matthias

From daniil.x.titov at oracle.com Thu Feb 6 18:26:29 2020
From: daniil.x.titov at oracle.com (Daniil Titov)
Date: Thu, 06 Feb 2020 10:26:29 -0800
Subject: RFR: 8196729: Add jstatd option to specify RMI connector port
In-Reply-To: <75e2f64a-1073-2bef-da44-fb420b92fc71@oracle.com>
References: <9bc20e75-afc9-2910-9d4d-be07a9dae731@oracle.com>
 <4D95BBC2-5989-401F-8F18-6356C0D0CD05@oracle.com>
 <75e2f64a-1073-2bef-da44-fb420b92fc71@oracle.com>
Message-ID: <51118795-16BB-49F9-BA0D-257BFC06223D@oracle.com>

Thank you Chris and Serguei for reviewing this change!

Best regards,
Daniil

On 2/5/20, 11:15 AM, "Chris Plummer" wrote:

    +1

    Chris

    On 2/5/20 9:36 AM, serguei.spitsyn at oracle.com wrote:
    > Hi Daniil,
    >
    > Looks good.
    > Thank you for the update!
> > Thanks, > Serguei > > > On 2/4/20 22:00, Daniil Titov wrote: >> Hi Serguei, >> >> Thank you for finding this! Please review the new version of webrev [1] >> that has it corrected. The new webrev also includes changes in the >> test to >> make sure that all jstatd tests run for both styles of command line >> options. >> Testing: Mach5 jobs for sun/tools/jstatd succeeded. Tiers1, >> tiers2, tiers3, >> and tiers5 job are in the progress. >> >> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196729/webrev.02/ >> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196729 >> [3] CSR : https://bugs.openjdk.java.net/browse/JDK-8238357 >> >> Thanks, >> Daniil >> >> From: "serguei.spitsyn at oracle.com" >> Date: Tuesday, February 4, 2020 at 7:51 PM >> To: Daniil Titov , >> "serviceability-dev at openjdk.java.net" >> >> Subject: Re: RFR: 8196729: Add jstatd option to specify RMI connector >> port >> >> Hi Daniil, >> >> It looks okay to me in general. >> But I'm puzzled with this part: >> >> http://cr.openjdk.java.net/~dtitov/8196729/webrev.01/src/jdk.jstatd/share/classes/sun/tools/jstatd/Jstatd.java.udiff.html >> >> + } else if (arg.startsWith("-r")) { >> + if (arg.compareTo("-r") != 0) { >> + port = Integer.parseInt(arg.substring(2)); >> + } else { >> + argc++; >> + if (argc >= args.length) { >> + printUsage(); >> + System.exit(1); >> + } >> + rmiPort = Integer.parseInt(args[argc]); >> + } >> >> The option -r is for rmi connection port number. >> Why does this code set the RMI registry port? : >> + if (arg.compareTo("-r") != 0) { >> + port = Integer.parseInt(arg.substring(2)); >> >> Thanks, >> Serguei >> >> >> On 1/31/20 13:08, Daniil Titov wrote: >> Please review change [1] that adds a new command line option to >> jstatd tool to specify a RMI connector port. >> >> Currently a random port is used that prevents this tool from being >> used behind a firewall or in a container. >> >> New CSR [3] was created for this change and it needs to be reviewed >> as well. 
>>
>> Man pages for jstatd will be updated in a separate issue.
>>
>> Testing: Mach5 tier1-tier3 and sun/tools/jstatd/* tests succeeded.
>> Mach5 tier5 tests are in progress.
>> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196729/webrev.01/
>> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196729
>> [3] CSR : https://bugs.openjdk.java.net/browse/JDK-8238357
>>
>> Thank you,
>> Daniil
>>
>

From alexey.menkov at oracle.com Thu Feb 6 21:14:05 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Thu, 6 Feb 2020 13:14:05 -0800
Subject: RFR: JDK-8234935: JdwpListenTest.java and JdwpAttachTest.java getting bind failures on Windows 2016 hosts
Message-ID: <4e751737-af60-f79c-cfff-73d300a73e57@oracle.com>

Hi all,

Please review the fix for
https://bugs.openjdk.java.net/browse/JDK-8234935
webrev:
http://cr.openjdk.java.net/~amenkov/jdk15/JdwpTestsTeredo/webrev/

The failures are caused by Teredo clients
(https://en.wikipedia.org/wiki/Teredo_tunneling).
The fix filters out corresponding addresses.

JdwpListenTest and JdwpAttachTest use the same way to get addresses
for testing. As this is not the 1st time the algorithm is updated I
decided to deduplicate the code and move shared code to new base class.
So actual change is the addition of

  71         // Teredo clients cause intermittent errors on listen ("bind failed")
  72         // and attach ("no route to host").
  73         // Teredo is supposed to be a temporary measure, but some test machines have it.
  74         if (isTeredo(addr6)) {
  75             continue;
  76         }

and isTeredo method implementation.
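
For readers without the webrev at hand, a check of this kind could look roughly like the sketch below. It is not taken from the webrev: the class name TeredoCheck and the method body are assumptions, relying only on the fact that Teredo addresses live in the 2001:0000::/32 prefix (RFC 4380). The sample addresses are illustrative only.

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical stand-alone sketch; the real isTeredo in the webrev may differ.
class TeredoCheck {
    // Teredo addresses use the 2001:0000::/32 prefix, so the first four
    // bytes of the 16-byte address are 0x20 0x01 0x00 0x00.
    static boolean isTeredo(Inet6Address addr) {
        byte[] b = addr.getAddress();
        return b[0] == 0x20 && b[1] == 0x01 && b[2] == 0x00 && b[3] == 0x00;
    }

    public static void main(String[] args) throws UnknownHostException {
        // IPv6 literals are parsed locally, no DNS lookup is involved.
        Inet6Address teredo = (Inet6Address)
                InetAddress.getByName("2001:0:4136:e378:8000:63bf:3fff:fdd2");
        Inet6Address other = (Inet6Address)
                InetAddress.getByName("2001:db8::1");
        System.out.println(teredo.getHostAddress() + " teredo=" + isTeredo(teredo));
        System.out.println(other.getHostAddress() + " teredo=" + isTeredo(other));
    }
}
```

In a test that iterates over local addresses, a helper like this would simply be used to skip the matching entries, as in the numbered snippet above.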
--alex

From chris.plummer at oracle.com Thu Feb 6 23:01:15 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 6 Feb 2020 15:01:15 -0800
Subject: RFR: JDK-8234935: JdwpListenTest.java and JdwpAttachTest.java getting bind failures on Windows 2016 hosts
In-Reply-To: <4e751737-af60-f79c-cfff-73d300a73e57@oracle.com>
References: <4e751737-af60-f79c-cfff-73d300a73e57@oracle.com>
Message-ID:

Hi Alex,

When refactoring is big and the bug fix is small, I prefer to see the refactoring done first. It just keeps things cleaner and makes it easier for the reviewer to see the important changes. It also helps anyone looking at this bug or these tests in the future to better recognize what the actual bug fix was, and what was just refactoring. Think if there was another test with this issue, and someone was looking at the diff of this fix to see how to apply it to the other test.

BTW, there is already a Platform.isWindows() API. It should probably be used rather than the check the test is using. It is a slightly different test however, testing for a prefix of "win" rather than "windows" anywhere in the string.

thanks,

Chris

On 2/6/20 1:14 PM, Alex Menkov wrote:
> Hi all,
>
> Please review the fix for
> https://bugs.openjdk.java.net/browse/JDK-8234935
> webrev:
> http://cr.openjdk.java.net/~amenkov/jdk15/JdwpTestsTeredo/webrev/
>
> The failures are caused by Teredo clients
> (https://en.wikipedia.org/wiki/Teredo_tunneling).
> The fix filters out corresponding addresses.
>
> JdwpListenTest and JdwpAttachTest use the same way to get addresses
> for testing. As this is not the 1st time the algorithm is updated I
> decided to deduplicate the code and move shared code to new base class.
> So actual change is the addition of
>
> 71  // Teredo clients cause intermittent errors on listen ("bind failed")
> 72  // and attach ("no route to host").
> 73  // Teredo is supposed to be a temporary measure, but some test
> machines have it.
> 74   if (isTeredo(addr6)) {
> 75
continue;
> 76  }
>
> and isTeredo method implementation.
>
> --alex

From alexey.menkov at oracle.com Thu Feb 6 23:40:44 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Thu, 6 Feb 2020 15:40:44 -0800
Subject: RFR: JDK-8234935: JdwpListenTest.java and JdwpAttachTest.java getting bind failures on Windows 2016 hosts
In-Reply-To:
References: <4e751737-af60-f79c-cfff-73d300a73e57@oracle.com>
Message-ID:

Hi Chris,

Thank you for the review.
So we have 2 ways - create new RFE for refactoring and then fix this bug in updated code.
or just fix this 2 tests without refactoring (the changes in the tests will be identical).
Do you think it makes sense to go #1 or just do #2?

Regarding using Platform.isWindows - it's good for the case, I'll fix it in the next iteration.

--alex

On 02/06/2020 15:01, Chris Plummer wrote:
> Hi Alex,
>
> When refactoring is big and the bug fix is small, I prefer to see the
> refactoring done first. It just keeps things cleaner and makes it easier
> for the reviewer to see the important changes. It also helps anyone
> looking at this bug or these tests in the future to better recognize
> what the actual bug fix was, and what was just refactoring. Think if
> there was another test with this issue, and someone was looking at the
> diff of this fix to see how to apply it to the other test.
>
> BTW, there is already a Platform.isWindows() API. It should probably be
> used rather than the check the test is using. It is a slightly different
> test however, testing for a prefix of "win" rather than "windows"
> anywhere in the string.
>
> thanks,
>
> Chris
>
> On 2/6/20 1:14 PM, Alex Menkov wrote:
>> Hi all,
>>
>> Please review the fix for
>> https://bugs.openjdk.java.net/browse/JDK-8234935
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk15/JdwpTestsTeredo/webrev/
>>
>> The failures are caused by Teredo clients
>> (https://en.wikipedia.org/wiki/Teredo_tunneling).
>> The fix filters out corresponding addresses.
>>
>> JdwpListenTest and JdwpAttachTest use the same way to get addresses
>> for testing. As this is not the 1st time the algorithm is updated I
>> decided to deduplicate the code and move shared code to new base class.
>> So actual change is the addition of
>>
>> 71  // Teredo clients cause intermittent errors on listen ("bind failed")
>> 72  // and attach ("no route to host").
>> 73  // Teredo is supposed to be a temporary measure, but some test
>> machines have it.
>> 74   if (isTeredo(addr6)) {
>> 75    continue;
>> 76  }
>>
>> and isTeredo method implementation.
>>
>> --alex
>
>

From chris.plummer at oracle.com Fri Feb 7 01:31:38 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Thu, 6 Feb 2020 17:31:38 -0800
Subject: RFR: JDK-8234935: JdwpListenTest.java and JdwpAttachTest.java getting bind failures on Windows 2016 hosts
In-Reply-To:
References: <4e751737-af60-f79c-cfff-73d300a73e57@oracle.com>
Message-ID:

Either is fine by me.

Chris

On 2/6/20 3:40 PM, Alex Menkov wrote:
> Hi Chris,
>
> Thank you for the review.
> So we have 2 ways - create new RFE for refactoring and then fix this
> bug in updated code.
> or just fix this 2 tests without refactoring (the changes in the tests
> will be identical).
> Do you think it makes sense to go #1 or just do #2?
>
> Regarding using Platform.isWindows - it's good for the case, I'll fix
> it in the next iteration.
>
> --alex
>
> On 02/06/2020 15:01, Chris Plummer wrote:
>> Hi Alex,
>>
>> When refactoring is big and the bug fix is small, I prefer to see the
>> refactoring done first. It just keeps things cleaner and makes it
>> easier for the reviewer to see the important changes. It also helps
>> anyone looking at this bug or these tests in the future to better
>> recognize what the actual bug fix was, and what was just refactoring.
>> Think if there was another test with this issue, and someone was
>> looking at the diff of this fix to see how to apply it to the other
>> test.
>>
>> BTW, there is already a Platform.isWindows() API. It should probably
>> be used rather than the check the test is using. It is a slightly
>> different test however, testing for a prefix of "win" rather than
>> "windows" anywhere in the string.
>>
>> thanks,
>>
>> Chris
>>
>> On 2/6/20 1:14 PM, Alex Menkov wrote:
>>> Hi all,
>>>
>>> Please review the fix for
>>> https://bugs.openjdk.java.net/browse/JDK-8234935
>>> webrev:
>>> http://cr.openjdk.java.net/~amenkov/jdk15/JdwpTestsTeredo/webrev/
>>>
>>> The failures are caused by Teredo clients
>>> (https://en.wikipedia.org/wiki/Teredo_tunneling).
>>> The fix filters out corresponding addresses.
>>>
>>> JdwpListenTest and JdwpAttachTest use the same way to get addresses
>>> for testing. As this is not the 1st time the algorithm is updated I
>>> decided to deduplicate the code and move shared code to new base class.
>>> So actual change is the addition of
>>>
>>> 71  // Teredo clients cause intermittent errors on listen ("bind
>>> failed")
>>> 72  // and attach ("no route to host").
>>> 73  // Teredo is supposed to be a temporary measure, but some test
>>> machines have it.
>>> 74   if (isTeredo(addr6)) {
>>> 75    continue;
>>> 76  }
>>>
>>> and isTeredo method implementation.
>>>
>>> --alex
>>
>>

From markus.gaisbauer at dynatrace.com Thu Feb 6 18:07:52 2020
From: markus.gaisbauer at dynatrace.com (Gaisbauer, Markus)
Date: Thu, 6 Feb 2020 18:07:52 +0000
Subject: Call new Win32 API SetThreadDescription in os::set_native_thread_name
Message-ID:

Hi,

I am looking for a sponsor who could create a ticket for the following proposal:

Microsoft recently introduced a new API to assign a name to native Windows threads.
https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setthreaddescription

These thread names can be shown by debuggers, C++ profilers, etc. The new API is available since either Windows 10 1607 or Windows Server 2016.

The JVM already tries to set a native thread name both for all internal JVM threads and all Java threads (except main).

But the Windows implementation of os::set_native_thread_name currently uses a weird hack described here.
https://docs.microsoft.com/en-us/visualstudio/debugger/how-to-set-a-thread-name-in-native-code?view=vs-2015&redirectedfrom=MSDN

For this hack, a debugger has to be already attached when a thread starts.

I propose to check in os::set_native_thread_name if SetThreadDescription is available. If yes, call it either instead of or in addition to the current code.

Here is some prototype code that worked for me:

typedef HRESULT(WINAPI *SetThreadDescriptionT)(HANDLE, PCWSTR);

static SetThreadDescriptionT getSetThreadDescriptionT() {
  HMODULE kernel32 = GetModuleHandle("Kernel32.dll");
  return kernel32 ? reinterpret_cast<SetThreadDescriptionT>(GetProcAddress(kernel32, "SetThreadDescription")) : nullptr;
}

static LPWSTR utf8_decode(const char *name) {
  if (name == nullptr) return nullptr;
  int name_len = (int) strlen(name);
  int size_needed = MultiByteToWideChar(CP_UTF8, 0, name, name_len, NULL, 0);
  size_t buffer_len = sizeof(wchar_t) * (size_needed + 1);
  LPWSTR result = (LPWSTR) os::malloc(buffer_len, mtInternal);
  memset(result, 0, buffer_len);
  MultiByteToWideChar(CP_UTF8, 0, name, name_len, result, size_needed);
  return result;
}

void os::set_native_thread_name(const char *name) {

  // First try calling SetThreadDescription available since Windows 10 1607 / Windows Server 2016
  // See: https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setthreaddescription

  static SetThreadDescriptionT SetThreadDescription = getSetThreadDescriptionT();
  if (SetThreadDescription) {
    LPWSTR nameWide = utf8_decode(name);
    if (nameWide != nullptr) {
      SetThreadDescription(GetCurrentThread(), nameWide);
      os::free(nameWide);
    }
    return;
  }

  // fallback
  ...
}

Best regards,
Markus
The contents of this e-mail are intended for the named addressee only.
It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a company registered in Linz whose registered office is at 4020 Linz, Austria, Am Fünfundzwanziger Turm 20

From david.holmes at oracle.com Fri Feb 7 02:00:12 2020
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 7 Feb 2020 12:00:12 +1000
Subject: Call new Win32 API SetThreadDescription in os::set_native_thread_name
In-Reply-To:
References:
Message-ID: <51b20779-a6de-b9ce-08d5-8c155fd2a3f9@oracle.com>

Hi Markus,

Adding hotspot-runtime-dev as runtime owns this area of code.

On 7/02/2020 4:07 am, Gaisbauer, Markus wrote:
> Hi,
>
> I am looking for a sponsor who could create a ticket for the following proposal:

Have you signed the OCA?

https://www.oracle.com/technetwork/community/oca-486395.html

I don't see you listed.

> Microsoft recently introduced a new API to assign a name to native Windows threads.
> https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setthreaddescription
>
> These thread names can be shown by debuggers, C++ profilers, etc. The new API is available since either Windows 10 1607 or Windows Server 2016.

Thanks for that heads up about the new API. I have filed:

https://bugs.openjdk.java.net/browse/JDK-8238649

for that enhancement.

Thanks,
David
-----

> The JVM already tries to set a native thread name both for all internal JVM threads and all Java threads (except main).
>
> But the Windows implementation of os::set_native_thread_name currently uses a weird hack described here.
> https://docs.microsoft.com/en-us/visualstudio/debugger/how-to-set-a-thread-name-in-native-code?view=vs-2015&redirectedfrom=MSDN
>
> For this hack, debugger has to be already attached when a thread starts.
>
> I propose to check in os::set_native_thread_name if SetThreadDescription is available. If yes, either call it instead or in addition to the current code.
>
> Here is some prototype code that worked for me:
>
> typedef HRESULT(WINAPI *SetThreadDescriptionT)(HANDLE, PCWSTR);
>
> static SetThreadDescriptionT getSetThreadDescriptionT() {
>   HMODULE kernel32 = GetModuleHandle("Kernel32.dll");
>   return kernel32 ? reinterpret_cast<SetThreadDescriptionT>(GetProcAddress(kernel32, "SetThreadDescription")) : nullptr;
> }
>
> static LPWSTR utf8_decode(const char *name) {
>   if (name == nullptr) return nullptr;
>   int name_len = (int) strlen(name);
>   int size_needed = MultiByteToWideChar(CP_UTF8, 0, name, name_len, NULL, 0);
>   size_t buffer_len = sizeof(wchar_t) * (size_needed + 1);
>   LPWSTR result = (LPWSTR) os::malloc(buffer_len, mtInternal);
>   memset(result, 0, buffer_len);
>   MultiByteToWideChar(CP_UTF8, 0, name, name_len, result, size_needed);
>   return result;
> }
>
> void os::set_native_thread_name(const char *name) {
>
>   // First try calling SetThreadDescription available since Windows 10 1607 / Windows Server 2016
>   // See: https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setthreaddescription
>
>   static SetThreadDescriptionT SetThreadDescription = getSetThreadDescriptionT();
>   if (SetThreadDescription) {
>     LPWSTR nameWide = utf8_decode(name);
>     if (nameWide != nullptr) {
>       SetThreadDescription(GetCurrentThread(), nameWide);
>       os::free(nameWide);
>     }
>     return;
>   }
>
>   // fallback
>   ...
> }
>
> Best regards,
> Markus
> The contents of this e-mail are intended for the named addressee only. It contains information that may be confidential. Unless you are the named addressee or an authorized designee, you may not copy or use it, or disclose it to anyone else. If you received it in error please notify us immediately and then destroy it.
Dynatrace Austria GmbH (registration number FN 91482h) is a company registered in Linz whose registered office is at 4020 Linz, Austria, Am Fünfundzwanziger Turm 20
>

From vladimir.x.ivanov at oracle.com Fri Feb 7 08:18:47 2020
From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov)
Date: Fri, 7 Feb 2020 11:18:47 +0300
Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
In-Reply-To:
References:
Message-ID: <32f34616-cf17-8caa-5064-455e013e2313@oracle.com>

> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/

Not an expert in JVMTI code base, so can't comment on the actual changes.

From JIT-compilers perspective it looks good.

Best regards,
Vladimir Ivanov

> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585
>
> The change avoids making all compiled methods on stack not_entrant when switching a java thread to
> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack.
>
> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations.
>
> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms.
>
> Thanks, Richard.
>
> See also my question if anyone knows a reason for making the compiled methods not_entrant:
> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html
>

From ralf.schmelter at sap.com Fri Feb 7 12:14:45 2020
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Fri, 7 Feb 2020 12:14:45 +0000
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
Message-ID:

Hi everyone,

this change adds the option to write a hprof heap dump directly gzipped. Currently this is supported for the GC.heap_dump diagnostic command via the "-gz" flag.

Since gzip is not particularly fast when compressing the data, the actual compression will be done in parallel by a number of background threads. The created file itself is not a single gzip stream, but a concatenation of streams of about 1 MB of gzipped data. This makes it easier to parallelize the compression, and it allows at least semi-efficient random access to the created file. I've adjusted the jhat library to be able to parse such a gzipped hprof file directly, without prior decompression.

bugreport: https://bugs.openjdk.java.net/browse/JDK-8237354
webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.0/

If this gets reviewed, a CSR is still needed for the change to the GC.heap_dump command.

Best regards,
Ralf

From zgu at redhat.com Fri Feb 7 15:53:56 2020
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 7 Feb 2020 10:53:56 -0500
Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops
Message-ID:

Hi,

I would like to propose this change, which allows the GC to provide an ObjectMarker during a JVMTI heap walk.

Currently, the JVMTI heap walk uses the oop markword's 'marked' pattern to indicate a 'visited' oop. Unfortunately, this conflicts with Shenandoah, which uses the pattern to indicate 'forwarding'.

When a JVMTI heap walk occurs during some of Shenandoah's concurrent phases (e.g. concurrent evacuation or concurrent reference updating), it can result in a corrupted heap, as it tries to resolve a real oop header as a forwarding pointer.

This patch allows the GC to provide an ObjectMarker for JVMTI to track 'visited' oops, and uses the current implementation as the default, so that it has no impact on GCs other than Shenandoah, which provides its own implementation.
Bug: https://bugs.openjdk.java.net/browse/JDK-8238633 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.00/index.html Test: hotspot_gc vmTestbase_nsk_jdi vmTestbase_nsk_jvmti Thanks, -Zhengyu From serguei.spitsyn at oracle.com Fri Feb 7 18:06:21 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Fri, 7 Feb 2020 10:06:21 -0800 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: References: Message-ID: <2714abb6-e077-c6c4-9dd7-2809604a862c@oracle.com> Hi Richard, It looks good to me. I can't comment on compiled methods non-entrancy. What exact tests do you run to verify the fix? Thanks, Serguei On 2/6/20 04:39, Reingruber, Richard wrote: > Hi, > > could I please get reviews for this small enhancement: > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 > > The change avoids making all compiled methods on stack not_entrant when switching a java thread to > interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. > > Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. > > Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. > > Thanks, Richard. 
> > See also my question if anyone knows a reason for making the compiled methods not_entrant: > http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html From alexey.menkov at oracle.com Fri Feb 7 22:06:17 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Fri, 7 Feb 2020 14:06:17 -0800 Subject: RFR: JDK-8234935: JdwpListenTest.java and JdwpAttachTest.java getting bind failures on Windows 2016 hosts In-Reply-To: References: <4e751737-af60-f79c-cfff-73d300a73e57@oracle.com> Message-ID: Updated webrev: http://cr.openjdk.java.net/~amenkov/jdk15/JdwpTestsTeredo/webrev.02/ I decided to go 2nd way. --alex On 02/06/2020 17:31, Chris Plummer wrote: > Either is fine by me. > > Chris > > On 2/6/20 3:40 PM, Alex Menkov wrote: >> Hi Chris, >> >> Thank you for the review. >> So we have 2 ways - create new RFE for refactoring and then fix this >> bug in updated code. >> or just fix this 2 tests without refactoring (the changes in the tests >> will be identical). >> Do you think it makes sense to go #1 or just do #2? >> >> Regarding using Platform.isWindows - it's good for the case, I'll fix >> it in the next iteration. >> >> --alex >> >> On 02/06/2020 15:01, Chris Plummer wrote: >>> Hi Alex, >>> >>> When refactoring is big and the bug fix is small, I prefer to see the >>> refactoring done first. It just keeps things cleaner and makes it >>> easier for the reviewer to see the important changes. It also helps >>> anyone looking at this bug or these tests in the future to better >>> recognize what the actual bug fix was, and what was just refactoring. >>> Think if there was another test with this issue, and someone was >>> looking at the diff of this fix to see how to apply it to the other >>> test. >>> >>> BTW, there is already a Platform.isWindows() API. It should probably >>> be used rather than the check the test is using. 
It is a slightly >>> different test however, testing for a prefix of "win" rather than >>> "windows" anywhere in the string. >>> >>> thanks, >>> >>> Chris >>> >>> On 2/6/20 1:14 PM, Alex Menkov wrote: >>>> Hi all, >>>> >>>> Please review the fix for >>>> https://bugs.openjdk.java.net/browse/JDK-8234935 >>>> webrev: >>>> http://cr.openjdk.java.net/~amenkov/jdk15/JdwpTestsTeredo/webrev/ >>>> >>>> The failures are caused by Teredo clients >>>> (https://en.wikipedia.org/wiki/Teredo_tunneling). >>>> The fix filters out corresponding addresses. >>>> >>>> JdwpListenTest and JdwpAttachTest use the same way to get addresses >>>> for testing. As this is not the 1st time the algorithm is updated I >>>> decided to deduplicate the code and move shared code to new base class. >>>> So actual change is the addition of >>>> >>>> 71? // Teredo clients cause intermittent errors on listen ("bind >>>> failed") >>>> 72? // and attach ("no route to host"). >>>> 73? // Teredo is supposed to be a temporary measure, but some test >>>> machines have it. >>>> 74?? if (isTeredo(addr6)) { >>>> 75??? continue; >>>> 76? } >>>> >>>> and isTeredo method implementation. >>>> >>>> --alex >>> >>> > > From suenaga at oss.nttdata.com Sat Feb 8 13:45:37 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Sat, 8 Feb 2020 22:45:37 +0900 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: Message-ID: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> Hi Ralf, - diagnosticCommand.cpp You can use `DCmdArgument` for -gz option. If you want to use lesser type (e.g. int, unsigned char), I guess you need to modify GenDCmdArgument class. - heapDumper.cpp _nr_of_threads, _id_to_write, _current in CompressionBackend should be added `volatile` at least. (Other values need to be checked) BTW how much processing time is different between single threaded and multi threaded? Also I want to know what number is set to ParallelGCThreads. 
ParallelGCThreads seems to affect the number of threads used for GZip compression.

Thanks,

Yasumasa

On 2020/02/07 21:14, Schmelter, Ralf wrote:
> Hi everyone,
>
> this change adds the option to write a hprof heap dump directly gzipped. Currently this is supported for the GC.heap_dump diagnostic command via the "-gz" flag.
>
> Since gzip is not particularly fast when compressing the data, the actual compression is done in parallel by a number of background threads.
>
> The created file itself is not a single gzip stream, but a concatenation of streams of about 1 MB of gzipped data. This makes it easier to parallelize the compression and allows at least semi-efficient random access to the created file. I've adjusted the jhat library so that it can parse such a gzipped hprof file directly, without prior decompression.
>
> bugreport: https://bugs.openjdk.java.net/browse/JDK-8237354
>
> webrev: http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.0/
>
> If this gets reviewed, a CSR is still needed for the change to the GC.heap_dump command.
>
> Best regards,
>
> Ralf
>

From christoph.langer at sap.com Mon Feb 10 09:00:30 2020
From: christoph.langer at sap.com (Langer, Christoph)
Date: Mon, 10 Feb 2020 09:00:30 +0000
Subject: RFR [XS]: 8238602: remove obsolete functions from libinstrument/FileSystemSupport_md.c
In-Reply-To:
References:
Message-ID:

Hi Matthias,

I think this removal is ok. I can see that you tested the patch in our CI without regressions. So +1 from my end.

Cheers
Christoph

From: serviceability-dev On Behalf Of Baesken, Matthias
Sent: Donnerstag, 6.
Februar 2020 17:06 To: serviceability-dev at openjdk.java.net Subject: [CAUTION] RFR [XS]: 8238602: remove obsolete functions from libinstrument/FileSystemSupport_md.c Hello, the link time section gc (see https://bugs.openjdk.java.net/browse/JDK-8236714 , on linux s390x it prints the removed sections) showed some obsolete / unused functions in FileSystemSupport_md.c : ld: Removing unused section '.text.pathSeparator' in file '/nightly/output-jdk-dev/support/native/java.instrument/libinstrument/FileSystemSupport_md.o' ld: Removing unused section '.text.filenameStrcmp' in file '/nightly/output-jdk-dev/support/native/java.instrument/libinstrument/FileSystemSupport_md.o' They can be cleaned up. Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8238602 http://cr.openjdk.java.net/~mbaesken/webrevs/8238602.0/ Thanks, Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From richard.reingruber at sap.com Mon Feb 10 11:26:13 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Mon, 10 Feb 2020 11:26:13 +0000 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: <2714abb6-e077-c6c4-9dd7-2809604a862c@oracle.com> References: <2714abb6-e077-c6c4-9dd7-2809604a862c@oracle.com> Message-ID: Hi Vladimir and Serguei, thanks for looking at the change! > What exact tests do you run to verify the fix? The enhancement was tested running the JCK and JTREG tests which include many JVMTI, JDI and JDWP tests. To see if the tests cover this part of the JVMTI implementation I had removed the deoptimization of compiled frames on stack. I found that e.g. 
the following test covers this: vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t012 The test vmTestbase/nsk/jvmti/scenarios/hotswap/HS202/hs202t002/hs202t002.java triggers the guarantee 238 void JvmtiThreadState::invalidate_cur_stack_depth() { 239 guarantee(SafepointSynchronize::is_at_safepoint() || 240 (JavaThread *)Thread::current() == get_thread(), 241 "must be current thread or at safepoint"); 242 243 _cur_stack_depth = UNKNOWN_STACK_DEPTH; 244 } 245 because with the enhancement invalidate_cur_stack_depth() gets called by the VMThread executing the new handshake. So this is covered as well. Thanks again for reviewing. Do I need more reviews or are your reviews enough to push the enhancement? Best regards, Richard. -----Original Message----- From: serguei.spitsyn at oracle.com Sent: Freitag, 7. Februar 2020 19:06 To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant Hi Richard, It looks good to me. I can't comment on compiled methods non-entrancy. What exact tests do you run to verify the fix? Thanks, Serguei On 2/6/20 04:39, Reingruber, Richard wrote: > Hi, > > could I please get reviews for this small enhancement: > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 > > The change avoids making all compiled methods on stack not_entrant when switching a java thread to > interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. > > Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. > > Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. > > Thanks, Richard. 
>
> See also my question if anyone knows a reason for making the compiled methods not_entrant:
> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html

From ralf.schmelter at sap.com Mon Feb 10 15:33:33 2020
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Mon, 10 Feb 2020 15:33:33 +0000
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
In-Reply-To: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com>
References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com>
Message-ID:

Hi Yasumasa,

> You can use `DCmdArgument` for -gz option.

That is what I originally tried. But then you always have to supply a compression level (just specifying -gz doesn't work). Since I would expect most users never to care about the compression level, I switched to a string option, which can handle this pattern.

> _nr_of_threads, _id_to_write, _current in CompressionBackend should be added `volatile` at least.

I don't think that is needed. Apart from the initialization, they are only changed under lock protection.

> BTW how much processing time is different between single threaded and multi threaded?

I've benchmarked an example which creates a ~31 GB uncompressed hprof file, with a VM which doesn't use any background threads. Here are the sizes of the created files, the compression levels and the time spent:

Uncompressed,    31.6 G,   71 sec
gzipped level 1, 7.57 G,  463 sec (x6.5)
gzipped level 3, 7.10 G,  609 sec (x8.6)
gzipped level 6, 6.49 G, 1415 sec (x19.9)

So even the fastest gzip compression makes writing the dump at least 5 times as slow.

> Also I want to know what number is set to ParallelGCThreads.
> ParallelGCThreads seems to affect to thread num for GZip compression.

Originally, I tried to use the WorkGang (CollectedHeap::get_safepoint_workers()) of the GC to do the work. But this wouldn't work because Shenandoah could not iterate the heap from a worker thread.
So I've opted to have the heap dump start the needed threads itself, for the duration of the dump. I've used ParallelGCThreads as the maximum number of threads, since this is what would be used for a GC too. So it should not clog up the machine more than a GC. Maybe it would be even better to additionally limit the threads by the compression level.

Best regards,
Ralf Schmelter

-----Original Message-----
From: Yasumasa Suenaga
Sent: Samstag, 8. Februar 2020 14:46
To: Schmelter, Ralf ; OpenJDK Serviceability
Cc: yasuenag at gmail.com
Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump

Hi Ralf,

- diagnosticCommand.cpp
  You can use `DCmdArgument` for -gz option.
  If you want to use lesser type (e.g. int, unsigned char), I guess you need to modify GenDCmdArgument class.

- heapDumper.cpp
  _nr_of_threads, _id_to_write, _current in CompressionBackend should be added `volatile` at least.
  (Other values need to be checked)

BTW how much processing time is different between single threaded and multi threaded?
Also I want to know what number is set to ParallelGCThreads.
ParallelGCThreads seems to affect the number of threads used for GZip compression.

Thanks,

Yasumasa

From serguei.spitsyn at oracle.com Mon Feb 10 18:11:22 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Mon, 10 Feb 2020 10:11:22 -0800
Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
In-Reply-To:
References: <2714abb6-e077-c6c4-9dd7-2809604a862c@oracle.com>
Message-ID: <9e83b9b2-9fae-7017-0435-549b181fa974@oracle.com>

Hi Richard,

Thank you for the details on testing!
Two reviews should be enough unless anyone else wants to review it as well.
I guess it is good to push.

Thanks,
Serguei

On 2/10/20 03:26, Reingruber, Richard wrote:
> Hi Vladimir and Serguei,
>
> thanks for looking at the change!
>
> > What exact tests do you run to verify the fix?
> > The enhancement was tested running the JCK and JTREG tests which include many JVMTI, JDI and JDWP tests. > > To see if the tests cover this part of the JVMTI implementation I had removed the deoptimization of > compiled frames on stack. I found that e.g. the following test covers this: > > vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t012 > > The test > > vmTestbase/nsk/jvmti/scenarios/hotswap/HS202/hs202t002/hs202t002.java > > triggers the guarantee > > 238 void JvmtiThreadState::invalidate_cur_stack_depth() { > 239 guarantee(SafepointSynchronize::is_at_safepoint() || > 240 (JavaThread *)Thread::current() == get_thread(), > 241 "must be current thread or at safepoint"); > 242 > 243 _cur_stack_depth = UNKNOWN_STACK_DEPTH; > 244 } > 245 > > because with the enhancement invalidate_cur_stack_depth() gets called by the VMThread executing the > new handshake. So this is covered as well. > > Thanks again for reviewing. > > Do I need more reviews or are your reviews enough to push the enhancement? > > Best regards, > Richard. > > -----Original Message----- > From: serguei.spitsyn at oracle.com > Sent: Freitag, 7. Februar 2020 19:06 > To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > Hi Richard, > > It looks good to me. > I can't comment on compiled methods non-entrancy. > > What exact tests do you run to verify the fix? > > Thanks, > Serguei > > > On 2/6/20 04:39, Reingruber, Richard wrote: >> Hi, >> >> could I please get reviews for this small enhancement: >> >> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 >> >> The change avoids making all compiled methods on stack not_entrant when switching a java thread to >> interpreter only execution for jvmti purposes. 
It is sufficient to deoptimize the compiled frames on stack. >> >> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. >> >> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >> >> Thanks, Richard. >> >> See also my question if anyone knows a reason for making the compiled methods not_entrant: >> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html From chris.plummer at oracle.com Mon Feb 10 19:43:51 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 10 Feb 2020 11:43:51 -0800 Subject: RFR: JDK-8234935: JdwpListenTest.java and JdwpAttachTest.java getting bind failures on Windows 2016 hosts In-Reply-To: References: <4e751737-af60-f79c-cfff-73d300a73e57@oracle.com> Message-ID: Hi Alex, The changes look good. Please up the copyright in JdwpListenTest.java. thanks, Chris On 2/7/20 2:06 PM, Alex Menkov wrote: > Updated webrev: > http://cr.openjdk.java.net/~amenkov/jdk15/JdwpTestsTeredo/webrev.02/ > > I decided to go 2nd way. > > --alex > > On 02/06/2020 17:31, Chris Plummer wrote: >> Either is fine by me. >> >> Chris >> >> On 2/6/20 3:40 PM, Alex Menkov wrote: >>> Hi Chris, >>> >>> Thank you for the review. >>> So we have 2 ways - create new RFE for refactoring and then fix this >>> bug in updated code. >>> or just fix this 2 tests without refactoring (the changes in the >>> tests will be identical). >>> Do you think it makes sense to go #1 or just do #2? >>> >>> Regarding using Platform.isWindows - it's good for the case, I'll >>> fix it in the next iteration. >>> >>> --alex >>> >>> On 02/06/2020 15:01, Chris Plummer wrote: >>>> Hi Alex, >>>> >>>> When refactoring is big and the bug fix is small, I prefer to see >>>> the refactoring done first. It just keeps things cleaner and makes >>>> it easier for the reviewer to see the important changes. 
It also >>>> helps anyone looking at this bug or these tests in the future to >>>> better recognize what the actual bug fix was, and what was just >>>> refactoring. Think if there was another test with this issue, and >>>> someone was looking at the diff of this fix to see how to apply it >>>> to the other test. >>>> >>>> BTW, there is already a Platform.isWindows() API. It should >>>> probably be used rather than the check the test is using. It is a >>>> slightly different test however, testing for a prefix of "win" >>>> rather than "windows" anywhere in the string. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 2/6/20 1:14 PM, Alex Menkov wrote: >>>>> Hi all, >>>>> >>>>> Please review the fix for >>>>> https://bugs.openjdk.java.net/browse/JDK-8234935 >>>>> webrev: >>>>> http://cr.openjdk.java.net/~amenkov/jdk15/JdwpTestsTeredo/webrev/ >>>>> >>>>> The failures are caused by Teredo clients >>>>> (https://en.wikipedia.org/wiki/Teredo_tunneling). >>>>> The fix filters out corresponding addresses. >>>>> >>>>> JdwpListenTest and JdwpAttachTest use the same way to get >>>>> addresses for testing. As this is not the 1st time the algorithm >>>>> is updated I decided to deduplicate the code and move shared code >>>>> to new base class. >>>>> So actual change is the addition of >>>>> >>>>> 71? // Teredo clients cause intermittent errors on listen ("bind >>>>> failed") >>>>> 72? // and attach ("no route to host"). >>>>> 73? // Teredo is supposed to be a temporary measure, but some test >>>>> machines have it. >>>>> 74?? if (isTeredo(addr6)) { >>>>> 75??? continue; >>>>> 76? } >>>>> >>>>> and isTeredo method implementation. 
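
The Teredo filter described above can be sketched as follows. This is a hedged illustration, not the code from the webrev: the class name `TeredoFilter` and the helper `hasTeredoPrefix` are hypothetical, and the only fact relied on is that Teredo addresses use the IPv6 prefix 2001:0000::/32 (RFC 4380), so the first four address bytes are 0x20, 0x01, 0x00, 0x00.

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;

final class TeredoFilter {
    /** True if the 16-byte IPv6 address starts with the Teredo prefix 2001:0000::/32. */
    static boolean hasTeredoPrefix(byte[] bytes) {
        return bytes.length == 16
                && bytes[0] == 0x20 && bytes[1] == 0x01
                && bytes[2] == 0x00 && bytes[3] == 0x00;
    }

    /** Convenience overload for addresses obtained from the network stack. */
    static boolean isTeredo(Inet6Address addr) {
        return hasTeredoPrefix(addr.getAddress());
    }

    public static void main(String[] args) throws UnknownHostException {
        // IPv6 literals are parsed locally; no name lookup is performed for them.
        String[] literals = {"2001:0:4136:e378:8000:63bf:3fff:fdd2", "2001:db8::1"};
        for (String literal : literals) {
            InetAddress addr = InetAddress.getByName(literal);
            boolean teredo = addr instanceof Inet6Address && isTeredo((Inet6Address) addr);
            System.out.println(literal + " teredo=" + teredo);
        }
    }
}
```

A test that enumerates local addresses would simply skip every `Inet6Address` for which `isTeredo` returns true.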
>>>>> >>>>> --alex >>>> >>>> >> >> From chris.plummer at oracle.com Mon Feb 10 19:48:29 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 10 Feb 2020 11:48:29 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: <6609ac07-de22-dcf5-6bea-420fdbcdddb8@oracle.com> References: <39200476-D017-4BFD-ABF2-8ABC59ADA8C0@oracle.com> <3edb4c50-588d-d111-2a39-c9859ec64931@oracle.com> <6609ac07-de22-dcf5-6bea-420fdbcdddb8@oracle.com> Message-ID: An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Mon Feb 10 20:56:14 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 10 Feb 2020 12:56:14 -0800 Subject: RFR: JDK-8234935: JdwpListenTest.java and JdwpAttachTest.java getting bind failures on Windows 2016 hosts In-Reply-To: References: <4e751737-af60-f79c-cfff-73d300a73e57@oracle.com> Message-ID: <9ac56c04-1c3a-6ab9-66fd-59c9637840a4@oracle.com> An HTML attachment was scrubbed... URL: From igor.ignatyev at oracle.com Mon Feb 10 21:34:59 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Mon, 10 Feb 2020 13:34:59 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: References: <39200476-D017-4BFD-ABF2-8ABC59ADA8C0@oracle.com> <3edb4c50-588d-d111-2a39-c9859ec64931@oracle.com> <6609ac07-de22-dcf5-6bea-420fdbcdddb8@oracle.com> Message-ID: <42C18C7D-7FD3-4B4F-AFDC-9A16218AD578@oracle.com> Hi Chris, in general it all looks good, I have a few comments (most of them are editorial): in Platform.java: 1. you have doubled spaced at line#238 (b/w boolean and isSignedOSX) 2. as FileNotFoundException is IOException, there is no need to declare the former in the signature of isSignedOSX 3. it's better to pass jdkPath, "bin" and "java" as separate arguments to Path.get, so the code won't depend on file separator 4. 
you are waiting for codesign to finish w/o reading its cout / cerr, which might lead to a deadlock (if codesign exhausts the IO buffer before exiting), so you need to either create two separate threads to read cout and cerr, or redirect these streams to files and read these files afterwards, or just ignore cout/cerr by using Redirect.DISCARD. I'd personally recommend the last option, as the result of codesign can be reliably deduced from its exit code (0 - signed, 1 - verification failed, 2 - wrong arguments, 3 - not all requirements from R are satisfied), and using cout/cerr is somewhat fragile as there is no guarantee the output format won't change.

The rest looks good to me.

-- Igor

> On Feb 10, 2020, at 11:48 AM, Chris Plummer wrote:
>
> Ping #2. It's not that hard of a review. Most of it is the new Platform.isSignedOSX() method, which is well commented and pretty straightforward.
>
> thanks,
>
> Chris
>
> On 2/4/20 5:04 PM, Chris Plummer wrote:
>> Ping!
>>
>> And I decided to push to 15 instead of 14. Will backport to 14 eventually.
>>
>> thanks,
>>
>> Chris
>>
>> On 1/30/20 10:20 PM, Chris Plummer wrote:
>>> Yes, you are correct:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8238196
>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 1/30/20 10:13 PM, Igor Ignatyev wrote:
>>>> Hi Chris,
>>>>
>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 seems to be a webrev from another issue, should it have been http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00/ ?
>>>>
>>>> -- Igor
>>>>
>>>>> On Jan 30, 2020, at 10:10 PM, Chris Plummer wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> Please review the following fix for some SA tests that are failing on Mac OS X 10.14.5 and later:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196
>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00
>>>>>
>>>>> The issue is that SA can't attach to a signed binary starting with 10.14.5.
There is no workaround for this, so these tests are being disabled when it is detected that the binary is signed and we are running on 10.14 or later (I chose all 10.14 releases to simplify the check).
>>>>>
>>>>> Some background may help explain the fix. In order for SA to attach to a live process (not a core file) on OSX, either the attaching process (i.e. the test) has to be run as root, or sudo needs to be supported. However, the only tests that make the sudo check are the 20 or so that use ClhsdbLauncher. The rest all rely on "@requires vm.hasSAandCanAttach" to filter out tests that use SA attach. vm.hasSAandCanAttach only checks if the test is being run as root. Thus all our non-ClhsdbLauncher tests that SA attach to a live process are currently not run unless they are run as root. 8238268 [1] has been filed to address this, making it so all the tests will attempt to use sudo if not run as root.
>>>>>
>>>>> Because of the difference in how ClhsdbLauncher tests and "@requires vm.hasSAandCanAttach" tests check to see if they are runnable, this fix needs to address both types of checks. The common code for both these cases is Platform.shouldSAAttach(), which on OSX basically equates to a check of whether we are running as root. I changed it to also return false if running on a signed binary with 10.14 or later. However, this confused the ClhsdbLauncher use of Platform.shouldSAAttach() somewhat, since it assumed a false result only happens because you are not running as root (in which case it would then check if sudo will work). So ClhsdbLauncher now has to double check that the false result was not because of running a signed binary. If it is signed, it won't do the sudo check. This will get cleaned up with 8238268 [1], which will move the sudo check into Platform.shouldSAAttach().
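
The check discussed in this thread can be sketched roughly as follows. This is an illustration only, not the code from the webrev: the names `SignedBinaryCheck`, `isSigned`, `isAtLeast` and `canAttach` are hypothetical, the `codesign -v` invocation and the parsing of a purely numeric OS version string are assumptions, and the sudo fallback used by ClhsdbLauncher is left out. It discards the child's output with `Redirect.DISCARD` and relies on the exit code alone, as suggested in the review.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

final class SignedBinaryCheck {
    /** Exit status 0 is taken to mean "signed"; the review lists 1 to 3 as failure modes. */
    static boolean isSignedExitCode(int exitCode) {
        return exitCode == 0;
    }

    /** Compares a dotted version such as "10.14.5" against major.minor (assumes numeric parts). */
    static boolean isAtLeast(String osVersion, int major, int minor) {
        String[] parts = osVersion.split("\\.");
        int maj = Integer.parseInt(parts[0]);
        int min = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        return maj > major || (maj == major && min >= minor);
    }

    /** Runs codesign on <jdkPath>/bin/java; only meaningful on macOS. */
    static boolean isSigned(String jdkPath) throws IOException, InterruptedException {
        // Separate path components avoid hard-coding a file separator.
        Path java = Paths.get(jdkPath, "bin", "java");
        Process p = new ProcessBuilder("codesign", "-v", java.toString())
                // Nothing reads the child's output, so discard it to avoid blocking on a full pipe.
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        return isSignedExitCode(p.waitFor());
    }

    /** Simplified decision: never attach to a signed binary on 10.14+, otherwise require root. */
    static boolean canAttach(boolean isRoot, boolean signed, String osVersion) {
        if (signed && isAtLeast(osVersion, 10, 14)) {
            return false;
        }
        return isRoot;
    }

    public static void main(String[] args) {
        // Sample inputs only; codesign itself is not invoked here.
        System.out.println(canAttach(true, true, "10.14.5"));
        System.out.println(canAttach(true, false, "10.14.5"));
    }
}
```

Keeping the exit-code interpretation and the version comparison as pure functions makes them testable on any platform, while `isSigned` stays a thin wrapper around the external tool.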
>>>>>
>>>>> thanks,
>>>>>
>>>>> Chris
>>>>>
>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238268
>>>>>
>>>>
>>>
>>
>

From alexey.menkov at oracle.com Mon Feb 10 21:46:14 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Mon, 10 Feb 2020 13:46:14 -0800
Subject: RFR: JDK-8234935: JdwpListenTest.java and JdwpAttachTest.java getting bind failures on Windows 2016 hosts
In-Reply-To: <9ac56c04-1c3a-6ab9-66fd-59c9637840a4@oracle.com>
References: <4e751737-af60-f79c-cfff-73d300a73e57@oracle.com> <9ac56c04-1c3a-6ab9-66fd-59c9637840a4@oracle.com>
Message-ID: <543f50d2-4abe-931f-e17c-abaf5d70c881@oracle.com>

Thanks, will make all values 0x.. before push

--alex

On 02/10/2020 12:56, serguei.spitsyn at oracle.com wrote:
> Hi Alex,
>
> It looks okay to me.
> Minor:
>
> +    return bytes[0] == 0x20 && bytes[1] == 0x01 && bytes[2] == 00 && bytes[3] == 0;
>
> '00' looks strange, maybe you want something like this:
>
> +    return bytes[0] == 0x20 && bytes[1] == 0x01 && bytes[2] == 0x0 && bytes[3] == 0x0;
>
>
> Thanks,
> Serguei
>
>
> On 2/7/20 14:06, Alex Menkov wrote:
>> Updated webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk15/JdwpTestsTeredo/webrev.02/
>>
>> I decided to go 2nd way.
>>
>> --alex
>>
>> On 02/06/2020 17:31, Chris Plummer wrote:
>>> Either is fine by me.
>>>
>>> Chris
>>>
>>> On 2/6/20 3:40 PM, Alex Menkov wrote:
>>>> Hi Chris,
>>>>
>>>> Thank you for the review.
>>>> So we have 2 ways - create new RFE for refactoring and then fix this
>>>> bug in updated code.
>>>> or just fix this 2 tests without refactoring (the changes in the
>>>> tests will be identical).
>>>> Do you think it makes sense to go #1 or just do #2?
>>>>
>>>> Regarding using Platform.isWindows - it's good for the case, I'll
>>>> fix it in the next iteration.
>>>> >>>> --alex >>>> >>>> On 02/06/2020 15:01, Chris Plummer wrote: >>>>> Hi Alex, >>>>> >>>>> When refactoring is big and the bug fix is small, I prefer to see >>>>> the refactoring done first. It just keeps things cleaner and makes >>>>> it easier for the reviewer to see the important changes. It also >>>>> helps anyone looking at this bug or these tests in the future to >>>>> better recognize what the actual bug fix was, and what was just >>>>> refactoring. Think if there was another test with this issue, and >>>>> someone was looking at the diff of this fix to see how to apply it >>>>> to the other test. >>>>> >>>>> BTW, there is already a Platform.isWindows() API. It should >>>>> probably be used rather than the check the test is using. It is a >>>>> slightly different test however, testing for a prefix of "win" >>>>> rather than "windows" anywhere in the string. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 2/6/20 1:14 PM, Alex Menkov wrote: >>>>>> Hi all, >>>>>> >>>>>> Please review the fix for >>>>>> https://bugs.openjdk.java.net/browse/JDK-8234935 >>>>>> webrev: >>>>>> http://cr.openjdk.java.net/~amenkov/jdk15/JdwpTestsTeredo/webrev/ >>>>>> >>>>>> The failures are caused by Teredo clients >>>>>> (https://en.wikipedia.org/wiki/Teredo_tunneling). >>>>>> The fix filters out corresponding addresses. >>>>>> >>>>>> JdwpListenTest and JdwpAttachTest use the same way to get >>>>>> addresses for testing. As this is not the 1st time the algorithm >>>>>> is updated I decided to deduplicate the code and move shared code >>>>>> to new base class. >>>>>> So actual change is the addition of >>>>>> >>>>>> 71? // Teredo clients cause intermittent errors on listen ("bind >>>>>> failed") >>>>>> 72? // and attach ("no route to host"). >>>>>> 73? // Teredo is supposed to be a temporary measure, but some test >>>>>> machines have it. >>>>>> 74?? if (isTeredo(addr6)) { >>>>>> 75??? continue; >>>>>> 76? } >>>>>> >>>>>> and isTeredo method implementation. 
>>>>>> >>>>>> --alex >>>>> >>>>> >>> >>> > From alexey.menkov at oracle.com Mon Feb 10 21:49:20 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Mon, 10 Feb 2020 13:49:20 -0800 Subject: RFR: JDK-8234935: JdwpListenTest.java and JdwpAttachTest.java getting bind failures on Windows 2016 hosts In-Reply-To: References: <4e751737-af60-f79c-cfff-73d300a73e57@oracle.com> Message-ID: <7d56589c-76b8-c17e-f1e6-1073a15b57d1@oracle.com> Thanks, will do --alex On 02/10/2020 11:43, Chris Plummer wrote: > Hi Alex, > > The changes look good. Please up the copyright in JdwpListenTest.java. > > thanks, > > Chris > > On 2/7/20 2:06 PM, Alex Menkov wrote: >> Updated webrev: >> http://cr.openjdk.java.net/~amenkov/jdk15/JdwpTestsTeredo/webrev.02/ >> >> I decided to go 2nd way. >> >> --alex >> >> On 02/06/2020 17:31, Chris Plummer wrote: >>> Either is fine by me. >>> >>> Chris >>> >>> On 2/6/20 3:40 PM, Alex Menkov wrote: >>>> Hi Chris, >>>> >>>> Thank you for the review. >>>> So we have 2 ways - create new RFE for refactoring and then fix this >>>> bug in updated code. >>>> or just fix this 2 tests without refactoring (the changes in the >>>> tests will be identical). >>>> Do you think it makes sense to go #1 or just do #2? >>>> >>>> Regarding using Platform.isWindows - it's good for the case, I'll >>>> fix it in the next iteration. >>>> >>>> --alex >>>> >>>> On 02/06/2020 15:01, Chris Plummer wrote: >>>>> Hi Alex, >>>>> >>>>> When refactoring is big and the bug fix is small, I prefer to see >>>>> the refactoring done first. It just keeps things cleaner and makes >>>>> it easier for the reviewer to see the important changes. It also >>>>> helps anyone looking at this bug or these tests in the future to >>>>> better recognize what the actual bug fix was, and what was just >>>>> refactoring. Think if there was another test with this issue, and >>>>> someone was looking at the diff of this fix to see how to apply it >>>>> to the other test. 
>>>>> >>>>> BTW, there is already a Platform.isWindows() API. It should >>>>> probably be used rather than the check the test is using. It is a >>>>> slightly different test however, testing for a prefix of "win" >>>>> rather than "windows" anywhere in the string. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 2/6/20 1:14 PM, Alex Menkov wrote: >>>>>> Hi all, >>>>>> >>>>>> Please review the fix for >>>>>> https://bugs.openjdk.java.net/browse/JDK-8234935 >>>>>> webrev: >>>>>> http://cr.openjdk.java.net/~amenkov/jdk15/JdwpTestsTeredo/webrev/ >>>>>> >>>>>> The failures are caused by Teredo clients >>>>>> (https://en.wikipedia.org/wiki/Teredo_tunneling). >>>>>> The fix filters out corresponding addresses. >>>>>> >>>>>> JdwpListenTest and JdwpAttachTest use the same way to get >>>>>> addresses for testing. As this is not the 1st time the algorithm >>>>>> is updated I decided to deduplicate the code and move shared code >>>>>> to new base class. >>>>>> So actual change is the addition of >>>>>> >>>>>> 71? // Teredo clients cause intermittent errors on listen ("bind >>>>>> failed") >>>>>> 72? // and attach ("no route to host"). >>>>>> 73? // Teredo is supposed to be a temporary measure, but some test >>>>>> machines have it. >>>>>> 74?? if (isTeredo(addr6)) { >>>>>> 75??? continue; >>>>>> 76? } >>>>>> >>>>>> and isTeredo method implementation. >>>>>> >>>>>> --alex >>>>> >>>>> >>> >>> > > From david.holmes at oracle.com Mon Feb 10 22:19:40 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 11 Feb 2020 08:19:40 +1000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> Message-ID: <1f2c938f-f9ce-7fb7-7169-94daaf3542a4@oracle.com> Hi Ralf, One part of this caught my eye and now I look at the webrev I have some concerns. 
Introducing new threads to the VM is not something that should be done lightly and it has to be done very carefully - I need to look closer at this aspect. Further when using Mutexes/Monitors in such code you have to be extremely careful about how (or even if) those Mutex/Monitor get deleted. The code you have at present is not safe because you cannot know when other threads have completely exited the Monitor/Mutex code. The last thread to terminate will signal the destructing thread (blocked in wait) then release the monitor, allowing the destructing thread to acquire the monitor and then delete the _lock. But at the point at which the monitor becomes free and the destructor thread is unparked, the terminating thread may be context switched out and remain inside the Monitor code. The destructor thread then deletes the monitor and frees it. When the terminating thread resumes, if it touches any memory associated with the Monitor it could SEGV. To safely delete a Monitor/Mutex you have to know for certain that all threads using it have completely ceased to use it. You cannot use that Monitor/Mutex as the means for determining that. It is a non-trivial problem to solve. Cheers, David ----- On 11/02/2020 1:33 am, Schmelter, Ralf wrote: > Hi Yasumasa, > >> You can use `DCmdArgument` for -gz option. > > That is what I originally tried. But then you always have to supply a compression level (just specifying -gz doesn't work). Since I would expect most users never caring about the compression level, I switched to a string option, which can handle this pattern. > >> _nr_of_threads, _id_to_write, _current in CompressionBackend should be added `volatile` at least. > > I don't think that is needed. Apart from the initialization, they are only changed under lock protection. > >> BTW how much processing time is different between single threaded and multi threaded? 
> > I've benchmarked an example, which creates a ~31 GB uncompressed hprof file, with a VM which doesn't use any background threads. Here are the size of the create files, the compression level and the time spend: > > Uncompressed, 31.6 G, 71 sec > gzipped level 1, 7.57 G, 463 sec (x6.5) > gzipped level 3, 7.10 G, 609 sec (x8.6) > gzipped level 6, 6.49 G, 1415 sec (x19.9) > > So even the fastest gzip compression makes writing the dump at least 5 times as slow. > >> Also I want to know what number is set to ParallelGCThreads. >> ParallelGCThreads seems to affect to thread num for GZip compression. > > Originally, I've tried to use the WorkGang (CollectedHeap:: get_safepoint_workers()) of the GC to do the work. But this wouldn't work because Shenandoah could not iterate the heap from a worker thread. So I've opted to start the needed threads itself for the time of the heap dump. I've used ParallelGCThreads as the maximum number of threads, since this is what would be used for a GC too. So it should not clog up the machine more than a GC. Maybe it would be even better to additionally limit the threads by the compression level. > > Best regards, > Ralf Schmelter > > -----Original Message----- > From: Yasumasa Suenaga > Sent: Samstag, 8. Februar 2020 14:46 > To: Schmelter, Ralf ; OpenJDK Serviceability > Cc: yasuenag at gmail.com > Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump > > Hi Ralf, > > > - diagnosticCommand.cpp > You can use `DCmdArgument` for -gz option. > If you want to use lesser type (e.g. int, unsigned char), I guess you need to modify GenDCmdArgument class. > > - heapDumper.cpp > _nr_of_threads, _id_to_write, _current in CompressionBackend should be added `volatile` at least. > (Other values need to be checked) > > > BTW how much processing time is different between single threaded and multi threaded? > Also I want to know what number is set to ParallelGCThreads. 
> ParallelGCThreads seems to affect to thread num for GZip compression. > > > Thanks, > > Yasumasa > From david.holmes at oracle.com Tue Feb 11 07:44:17 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 11 Feb 2020 17:44:17 +1000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: <1f2c938f-f9ce-7fb7-7169-94daaf3542a4@oracle.com> References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <1f2c938f-f9ce-7fb7-7169-94daaf3542a4@oracle.com> Message-ID: Hi again Ralf, :) A few more comments after taking a closer look at the thread code. On the surface it seems to me this is a case where it would be okay to introduce a subclass of Thread that is neither a JavaThread nor a NonJavaThread. I see little point in subclassing NonJavaThread (via NamedThread) but then overriding pre_run() and post_run() so that you don't do anything that NonJavaThread is supposed to do regarding the NJT iterator capabilities. But we currently expect all threads to fit into one category or another, so this is problematic. :( I think disabling the NJT functionality is also problematic. So not sure what to suggest yet. BTW you extended NamedThread but you never actually set a name AFAICS. For your monitor operations, you should use a MonitorLocker and then call ml->wait() which will do the right thing with respect to "no safepoint checks" without you needing to specify it directly. Cheers, David On 11/02/2020 8:19 am, David Holmes wrote: > Hi Ralf, > > One part of this caught my eye and now I look at the webrev I have some > concerns. Introducing new threads to the VM is not something that should > be done lightly and it has to be done very carefully - I need to look > closer at this aspect. Further when using Mutexes/Monitors in such code > you have to be extremely careful about how (or even if) those > Mutex/Monitor get deleted.
The code you have at present is not safe > because you cannot know when other threads have completely exited the > Monitor/Mutex code. The last thread to terminate will signal the > destructing thread (blocked in wait) then release the monitor, allowing > the destructing thread to acquire the monitor and then delete the _lock. > But at the point at which the monitor becomes free and the destructor > thread is unparked, the terminating thread may be context switched out > and remain inside the Monitor code. The destructor thread then deletes > the monitor and frees it. When the terminating thread resumes, if it > touches any memory associated with the Monitor it could SEGV. > > To safely delete a Monitor/Mutex you have to know for certain that all > threads using it have completely ceased to use it. You cannot use that > Monitor/Mutex as the means for determining that. It is a non-trivial > problem to solve. > > Cheers, > David > ----- > > On 11/02/2020 1:33 am, Schmelter, Ralf wrote: >> Hi Yasumasa, >> >>> You can use `DCmdArgument` for -gz option. >> >> That is what I originally tried. But then you always have to supply a >> compression level (just specifying -gz doesn't work). Since I would >> expect most users never caring about the compression level, I switched >> to a string option, which can handle this pattern. >> >>> _nr_of_threads, _id_to_write, _current in CompressionBackend should >>> be added `volatile` at least. >> >> I don't think that is needed. Apart from the initialization, they are >> only changed under lock protection. >> >>> BTW how much processing time is different between single threaded and >>> multi threaded? >> >> I've benchmarked an example, which creates a ~31 GB uncompressed hprof >> file, with a VM which doesn't use any background threads.
Here are the >> size of the create files, the compression level and the time spend: >> >> Uncompressed, 31.6 G, 71 sec >> gzipped level 1, 7.57 G, 463 sec (x6.5) >> gzipped level 3, 7.10 G, 609 sec (x8.6) >> gzipped level 6, 6.49 G, 1415 sec (x19.9) >> >> So even the fastest gzip compression makes writing the dump at least 5 >> times as slow. >> >>> Also I want to know what number is set to ParallelGCThreads. >>> ParallelGCThreads seems to affect to thread num for GZip compression. >> >> Originally, I've tried to use the WorkGang (CollectedHeap:: >> get_safepoint_workers()) of the GC to do the work. But this wouldn't >> work because Shenandoah could not iterate the heap from a worker >> thread. So I've opted to start the needed threads itself for the time >> of the heap dump. I've used ParallelGCThreads as the maximum number of >> threads, since this is what would be used for a GC too. So it should >> not clog up the machine more than a GC. Maybe it would be even better >> to additionally limit the threads by the compression level. >> >> Best regards, >> Ralf Schmelter >> >> -----Original Message----- >> From: Yasumasa Suenaga >> Sent: Samstag, 8. Februar 2020 14:46 >> To: Schmelter, Ralf ; OpenJDK Serviceability >> >> Cc: yasuenag at gmail.com >> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped >> heap dump >> >> Hi Ralf, >> >> >> - diagnosticCommand.cpp >> You can use `DCmdArgument` for -gz option. >> If you want to use lesser type (e.g. int, unsigned char), I guess >> you need to modify GenDCmdArgument class. >> >> - heapDumper.cpp >> _nr_of_threads, _id_to_write, _current in CompressionBackend should >> be added `volatile` at least. >> (Other values need to be checked) >> >> >> BTW how much processing time is different between single threaded and >> multi threaded? >> Also I want to know what number is set to ParallelGCThreads. >> ParallelGCThreads seems to affect to thread num for GZip compression.
>> >> >> Thanks, >> >> Yasumasa >> From matthias.baesken at sap.com Tue Feb 11 08:50:08 2020 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 11 Feb 2020 08:50:08 +0000 Subject: RFR [XS]: 8238602: remove obsolete functions from libinstrument/FileSystemSupport_md.c In-Reply-To: References: Message-ID: Hi Christoph, thanks for the review! I will push it as XS if no objections show up. Best regards, Matthias From: Langer, Christoph Sent: Montag, 10. Februar 2020 10:01 To: Baesken, Matthias ; serviceability-dev at openjdk.java.net Subject: RE: RFR [XS]: 8238602: remove obsolete functions from libinstrument/FileSystemSupport_md.c Hi Matthias, I think this removal is ok. I can see that you tested the patch in our CI without regressions. So +1 from my end. Cheers Christoph From: serviceability-dev On Behalf Of Baesken, Matthias Sent: Donnerstag, 6. Februar 2020 17:06 To: serviceability-dev at openjdk.java.net Subject: RFR [XS]: 8238602: remove obsolete functions from libinstrument/FileSystemSupport_md.c Hello, the link time section gc (see https://bugs.openjdk.java.net/browse/JDK-8236714 , on linux s390x it prints the removed sections) showed some obsolete / unused functions in FileSystemSupport_md.c: ld: Removing unused section '.text.pathSeparator' in file '/nightly/output-jdk-dev/support/native/java.instrument/libinstrument/FileSystemSupport_md.o' ld: Removing unused section '.text.filenameStrcmp' in file '/nightly/output-jdk-dev/support/native/java.instrument/libinstrument/FileSystemSupport_md.o' They can be cleaned up. Bug/webrev: https://bugs.openjdk.java.net/browse/JDK-8238602 http://cr.openjdk.java.net/~mbaesken/webrevs/8238602.0/ Thanks, Matthias
From ralf.schmelter at sap.com Tue Feb 11 09:16:15 2020 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Tue, 11 Feb 2020 09:16:15 +0000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: <1f2c938f-f9ce-7fb7-7169-94daaf3542a4@oracle.com> References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <1f2c938f-f9ce-7fb7-7169-94daaf3542a4@oracle.com> Message-ID: Hi David, thanks for the feedback. I fear you are right, the current code is not safe. During much of the development I actually used a global lock. But then I thought it would look strange that more than one CompressionBackend would use the same lock. But apart from aesthetics, this would pose no actual problem I can think of. And I could move the code in ~CompressionBackend to a deactivate() method, which would be called at the end of the VM operation. This way the mutex would only be used during the actual VM operation, so there should be no problem even in theory. So the easiest way to fix this would be to use a global lock instead, again. What do you think? Best regards, Ralf -----Original Message----- From: David Holmes Sent: Montag, 10. Februar 2020 23:20 To: Schmelter, Ralf ; Yasumasa Suenaga ; OpenJDK Serviceability Cc: yasuenag at gmail.com Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump Hi Ralf, One part of this caught my eye and now I look at the webrev I have some concerns. Introducing new threads to the VM is not something that should be done lightly and it has to be done very carefully - I need to look closer at this aspect. Further when using Mutexes/Monitors in such code you have to be extremely careful about how (or even if) those Mutex/Monitor get deleted. The code you have at present is not safe because you cannot know when other threads have completely exited the Monitor/Mutex code.
The last thread to terminate will signal the destructing thread (blocked in wait) then release the monitor, allowing the destructing thread to acquire the monitor and then delete the _lock. But at the point at which the monitor becomes free and the destructor thread is unparked, the terminating thread may be context switched out and remain inside the Monitor code. The destructor thread then deletes the monitor and frees it. When the terminating thread resumes, if it touches any memory associated with the Monitor it could SEGV. To safely delete a Monitor/Mutex you have to know for certain that all threads using it have completely ceased to use it. You cannot use that Monitor/Mutex as the means for determining that. It is a non-trivial problem to solve. Cheers, David ----- From suenaga at oss.nttdata.com Tue Feb 11 12:55:26 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Tue, 11 Feb 2020 21:55:26 +0900 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> Message-ID: <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> Hi Ralf, On 2020/02/11 0:33, Schmelter, Ralf wrote: > Hi Yasumasa, > >> You can use `DCmdArgument` for -gz option. > > That is what I originally tried. But then you always have to supply a compression level (just specifying -gz doesn't work). Since I would expect most users never caring about the compression level, I switched to a string option, which can handle this pattern. I think you can modify DCmdArgument::parse_value() to allow the operation without an argument. Or you can add a new impl function for integer types which can handle a default value. >> _nr_of_threads, _id_to_write, _current in CompressionBackend should be added `volatile` at least. > > I don't think that is needed. Apart from the initialization, they are only changed under lock protection. I am concerned about compiler optimization.
They are class members and they are used in a `while(true)` loop. Of course the problem would not appear with every C++ compiler, but I guess it is safer if `volatile` is added. >> BTW how much processing time is different between single threaded and multi threaded? > > I've benchmarked an example, which creates a ~31 GB uncompressed hprof file, with a VM which doesn't use any background threads. Here are the size of the create files, the compression level and the time spend: > > Uncompressed, 31.6 G, 71 sec > gzipped level 1, 7.57 G, 463 sec (x6.5) > gzipped level 3, 7.10 G, 609 sec (x8.6) > gzipped level 6, 6.49 G, 1415 sec (x19.9) > > So even the fastest gzip compression makes writing the dump at least 5 times as slow. > >> Also I want to know what number is set to ParallelGCThreads. >> ParallelGCThreads seems to affect to thread num for GZip compression. > > Originally, I've tried to use the WorkGang (CollectedHeap:: get_safepoint_workers()) of the GC to do the work. But this wouldn't work because Shenandoah could not iterate the heap from a worker thread. So I've opted to start the needed threads itself for the time of the heap dump. I've used ParallelGCThreads as the maximum number of threads, since this is what would be used for a GC too. So it should not clog up the machine more than a GC. Maybe it would be even better to additionally limit the threads by the compression level. Thanks! Yasumasa (ysuenaga) > Best regards, > Ralf Schmelter > > -----Original Message----- > From: Yasumasa Suenaga > Sent: Samstag, 8. Februar 2020 14:46 > To: Schmelter, Ralf ; OpenJDK Serviceability > Cc: yasuenag at gmail.com > Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump > > Hi Ralf, > > > - diagnosticCommand.cpp > You can use `DCmdArgument` for -gz option. > If you want to use lesser type (e.g. int, unsigned char), I guess you need to modify GenDCmdArgument class.
> > - heapDumper.cpp > _nr_of_threads, _id_to_write, _current in CompressionBackend should be added `volatile` at least. > (Other values need to be checked) > > > BTW how much processing time is different between single threaded and multi threaded? > Also I want to know what number is set to ParallelGCThreads. > ParallelGCThreads seems to affect to thread num for GZip compression. > > > Thanks, > > Yasumasa > From richard.reingruber at sap.com Tue Feb 11 13:19:14 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Tue, 11 Feb 2020 13:19:14 +0000 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: <9e83b9b2-9fae-7017-0435-549b181fa974@oracle.com> References: <2714abb6-e077-c6c4-9dd7-2809604a862c@oracle.com> <9e83b9b2-9fae-7017-0435-549b181fa974@oracle.com> Message-ID: Hi Serguei, > Two reviews has to be good enough unless anyone else did not want to > review it as well. > I guess, it is good to push. Ok. I'll wait a little longer and on Thursday I'll push it. Thanks, Richard. -----Original Message----- From: serguei.spitsyn at oracle.com Sent: Montag, 10. Februar 2020 19:11 To: Reingruber, Richard ; Vladimir Ivanov ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant Hi Richard, Thank you for the details on testing! Two reviews has to be good enough unless anyone else did not want to review it as well. I guess, it is good to push. Thanks, Serguei On 2/10/20 03:26, Reingruber, Richard wrote: > Hi Vladimir and Serguei, > > thanks for looking at the change! > > > What exact tests do you run to verify the fix? > > The enhancement was tested running the JCK and JTREG tests which include many JVMTI, JDI and JDWP tests. 
> > To see if the tests cover this part of the JVMTI implementation I had removed the deoptimization of > compiled frames on stack. I found that e.g. the following test covers this: > > vmTestbase/nsk/jvmti/scenarios/events/EM02/em02t012 > > The test > > vmTestbase/nsk/jvmti/scenarios/hotswap/HS202/hs202t002/hs202t002.java > > triggers the guarantee > > 238 void JvmtiThreadState::invalidate_cur_stack_depth() { > 239 guarantee(SafepointSynchronize::is_at_safepoint() || > 240 (JavaThread *)Thread::current() == get_thread(), > 241 "must be current thread or at safepoint"); > 242 > 243 _cur_stack_depth = UNKNOWN_STACK_DEPTH; > 244 } > 245 > > because with the enhancement invalidate_cur_stack_depth() gets called by the VMThread executing the > new handshake. So this is covered as well. > > Thanks again for reviewing. > > Do I need more reviews or are your reviews enough to push the enhancement? > > Best regards, > Richard. > > -----Original Message----- > From: serguei.spitsyn at oracle.com > Sent: Freitag, 7. Februar 2020 19:06 > To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > Hi Richard, > > It looks good to me. > I can't comment on compiled methods non-entrancy. > > What exact tests do you run to verify the fix? > > Thanks, > Serguei > > > On 2/6/20 04:39, Reingruber, Richard wrote: >> Hi, >> >> could I please get reviews for this small enhancement: >> >> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 >> >> The change avoids making all compiled methods on stack not_entrant when switching a java thread to >> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. 
>> >> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. >> >> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >> >> Thanks, Richard. >> >> See also my question if anyone knows a reason for making the compiled methods not_entrant: >> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html From ralf.schmelter at sap.com Tue Feb 11 15:35:46 2020 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Tue, 11 Feb 2020 15:35:46 +0000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> Message-ID: Hi Yasumasa, I think I've tried too much by using the -gz flag as a boolean plus a int value. I've decided to use two options instead: -gz as a boolean option to turn compression on and -gz-level to specify the compression level. E.g. GC.heap_dump -gz -gz-level=3 test.hprof.gz Best regards, Ralf From dean.long at oracle.com Tue Feb 11 17:27:36 2020 From: dean.long at oracle.com (Dean Long) Date: Tue, 11 Feb 2020 09:27:36 -0800 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: References: Message-ID: <29a1b008-a322-a058-fdd6-e8c54cef8e6c@oracle.com> You might want to have some runtime/GC folks look at the handshake changes. dl On 2/6/20 4:39 AM, Reingruber, Richard wrote: > Hi, > > could I please get reviews for this small enhancement: > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 > > The change avoids making all compiled methods on stack not_entrant when switching a java thread to > interpreter only execution for jvmti purposes. 
It is sufficient to deoptimize the compiled frames on stack. > > Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. > > Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. > > Thanks, Richard. > > See also my question if anyone knows a reason for making the compiled methods not_entrant: > http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html From sgehwolf at redhat.com Tue Feb 11 18:04:23 2020 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Tue, 11 Feb 2020 19:04:23 +0100 Subject: [PING] RFR: 8231111: Cgroups v2: Rework Metrics in java.base so as to recognize unified hierarchy In-Reply-To: <97e7ff2b-f2cc-a666-afb8-521c0f5c37e7@oracle.com> References: <75fc377f8d5ca76b7dac02f55db640cbdd305633.camel@redhat.com> <4bf65380bc26cd3bf684d7994b33e66bcb87927b.camel@redhat.com> <6CACCC0D-7F5A-42A3-83F1-746497940CCA@oracle.com> <29544339574e34c4c25cbec0314c26f35e8d1a99.camel@redhat.com> <4CE7C7F6-ABFA-4263-98B2-32BBD5013A3C@oracle.com> <10b5e83bfb7e618e5f5906c8e707057ff8680785.camel@redhat.com> <4a304e2ce72a53859b4e9cc8b21db404a260b531.camel@redhat.com> <97e7ff2b-f2cc-a666-afb8-521c0f5c37e7@oracle.com> Message-ID: <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com> Hi Mandy, Bob, Thanks again for the reviews and patience on this. Sorry it took me so long to get back to this :-/ Updated webrev: Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/webrev/ incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/incremental/webrev/ I've tested this with docker tests on cgroup v1 and via podman on a cgroup v2 system. They pass. I'll be running this through jdk-submit as well. More below. On Tue, 2020-01-21 at 16:09 -0800, Mandy Chung wrote: > Hi Severin, > > Thanks for the update. 
> > On 1/21/20 11:30 AM, Severin Gehwolf wrote: > > Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/webrev/ > > incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/incremental/webrev/ > > > > I have answered my own question. Most of the metrics used to return 0 if unavailable due to IOException reading a file or malformed content and now are changed to return -2 due to error fetching the metric. > > The following are about limits which used to return -1 if unlimited or no limit set. > public long getCpuQuota(); > public long getCpuShares(); > public long getMemoryLimit(); > public long getMemoryAndSwapLimit(); > public long getMemorySoftLimit(); > > With this patch, only getMemoryLimit and getMemoryAndSwapLimit specify to return -1 if unlimited or no limit set. However the implementation does return -1. All of the above specify to return -2 if unavailable due to error fetching the metric. > > I found the implementation quite hard to follow. I spent some time reviewing the code to see if the implementation matches the spec but I can't easily tell yet. For example, > CgroupSubsystemController::getLongValueMatchingLine returns -1 when IOException occurs. > CgroupSubsystemController::getLongEntry returns 0L if IOException occurs. > > CgroupV1SubsystemController::convertStringToLong returns Long.MAX_VALUE > when value overflows This one is intentional. It's mapped back to unlimited via longValOrUnlimited(). The reason for this is that cgroup v1 doesn't have a concept of "unlimited". Unlimited values will be very large numbers in cgroup v1 files. > CgroupV2SubsystemController::convertStringToLong returns -1 when IOException occurs > > CgroupV1Subsystem::getCpuShares return -1 if cpu.shares == 0 or 1024 > CgroupV2Subsystem::getCpuShares returns -1 if cpu.weight == 100 or 0 These two are special cases too. See the implementation note of Metrics.getCpuShares(). In the cgroup v2 case the default value is 100 (as opposed to 1024 in cgroup v1).
That's why unlimited is being returned for those values. > CgroupV2Subsystem::getFromCpuMax returns -1 if error in reading cpu.max or malformed content > CgroupV2Subsystem::sumTokensIOStat returns -2 if IOException error > This is called by getBlkIOServiceCount and getBlkIOServiced > > I think this can be improved and add the documentation to describe > what the methods do. Since Metrics APIs consistently return -2 if > unavailable due to error in fetching the metric, why some utility > methods in *Subsystem and *SubsystemController return -1 upon error > and 0 when unlimited? > > I suspect if the getXXXValue and other methods are clearly documented > with the error cases (possibly renaming the method name if appropriate) > CgroupV1Subsystem and CgroupV2SubSystem will become very explicit > to understand. This should be fixed now. I've gone through the API doc of Metrics.java and have updated it. In general, I've updated it to return -1 if a metric is unavailable (due to an error in reading some files or content being empty), and -2 if not supported. No method returns -2 currently, but that might change and it's good to have some way of saying "not implementable" for this subsystem in the spec. That's my take on it anyway. There is also a new unit test for shared controller logic: TestCgroupSubsystemController.java It exercises various cases of error/success. That is to ensure proper symmetry across the various cases (including IOException). I've also documented static methods in CgroupSubsystemController. Overall, all methods now return the same values for cgroup v1 and cgroup v2 (given the impl nuances) for the various cases. > > CgroupSubsystem.java > > 44 public static final double DOUBLE_RETVAL_NOT_SUPPORTED = LONG_RETVAL_NOT_SUPPORTED; > 49 public static final Boolean BOOL_RETVAL_NOT_SUPPORTED = null; > > They are no longer needed, right? Removed. > > CgroupSubsystemFactory.java > > 89 System.err.println("Warning: Mixed cgroupv1 and cgroupv2 not supported.
Metrics disabled."); > > > I expect this be a System.Logger log Updated. > 114 if (!Integer.valueOf(0).toString().equals(tokens[0])) { > > This can be simplified to if (!"0".equals(tokens[0])) Done, thanks! > LauncherHelper.java > > 407 // Extended cgroupv1 specific metrics > 408 if (c instanceof MetricsCgroupV1) { > 409 MetricsCgroupV1 cgroupV1 = (MetricsCgroupV1)c; > 410 limit = cgroupV1.getKernelMemoryLimit(); > 411 ostream.println(formatLimitString(limit, INDENT + "Kernel Memory Limit: ")); > 412 limit = cgroupV1.getTcpMemoryLimit(); > 413 ostream.println(formatLimitString(limit, INDENT + "TCP Memory Limit: ")); > 414 Boolean value = cgroupV1.isMemoryOOMKillEnabled(); > 415 ostream.println(formatBoolean(value, INDENT + "Out Of Memory Killer Enabled: ")); > 416 value = cgroupV1.isCpuSetMemoryPressureEnabled(); > 417 ostream.println(formatBoolean(value, INDENT + "CPUSet Memory Pressure Enabled: ")); > 418 } > > MetricsCgroupV1 is linux-only. It will fail the compilation when > building on non-linux. One option is to move this code to > src/java.base/linux/share/sun/launcher/CgroupMetrics.java > > Are they continued to be interesting metrics to be output from > -XshowSetting? I wonder if they can simply be dropped from the output. > Bob will have an opinion. I've removed those extra cgroup v1 specific metrics printed via -XshowSettings:system. Not sure what to do with MetricsCgroupV1. It's only used in tests in webrev 10. On the other hand the idea would be for consumers to downcast it to MetricsCgroupV1 if they needed those extra metrics. Thanks, Severin From serguei.spitsyn at oracle.com Tue Feb 11 19:19:55 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 11 Feb 2020 11:19:55 -0800 Subject: RFR [XS]: 8238602: remove obsolete functions from libinstrument/FileSystemSupport_md.c In-Reply-To: References: Message-ID: <5eba2808-811b-15d4-0c81-05051f7c3c7e@oracle.com> Hi Matthias, It looks good. 
Thanks, Serguei On 2/6/20 8:06 AM, Baesken, Matthias wrote: > > Hello, > > the link time section gc (see > https://bugs.openjdk.java.net/browse/JDK-8236714, on linux s390x it > prints the removed sections) showed some obsolete / unused functions > in FileSystemSupport_md.c : > > ld: Removing unused section '.text.pathSeparator' in file > '/nightly/output-jdk-dev/support/native/java.instrument/libinstrument/FileSystemSupport_md.o' > > ld: Removing unused section '.text.filenameStrcmp' in file > '/nightly/output-jdk-dev/support/native/java.instrument/libinstrument/FileSystemSupport_md.o' > > They can be cleaned up. > > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8238602 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8238602.0/ > > Thanks, Matthias > From serguei.spitsyn at oracle.com Tue Feb 11 19:41:52 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 11 Feb 2020 11:41:52 -0800 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> Message-ID: Hi Ralf, I'd suggest for the option format something like this: -gz[=level] where level is an int. The part [=level] is optional. The level is 0 by default (if it is not set). Thanks, Serguei On 2/11/20 7:35 AM, Schmelter, Ralf wrote: > Hi Yasumasa, > > I think I've tried too much by using the -gz flag as a boolean plus a int value. I've decided to use two options instead: -gz as a boolean option to turn compression on and -gz-level to specify the compression level. E.g.
> GC.heap_dump -gz -gz-level=3 test.hprof.gz > > Best regards, > Ralf From serguei.spitsyn at oracle.com Tue Feb 11 19:49:20 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 11 Feb 2020 11:49:20 -0800 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> Message-ID: <01361a9d-2855-db67-a176-73731fada08f@oracle.com> Ralf, I see this feature adds a lot of code. In fact, I'm not sure, it is worth to add this kind of complexity (including new compressing threads) into the VM implementation. What is a real use case behind it? Could this compressing be done separately from VM implementation? Thanks, Serguei On 2/11/20 11:41 AM, serguei.spitsyn at oracle.com wrote: > Hi Ralf, > > I'd suggest for the option format something like this: > ? -gz[=level] > > where level is an int. The part [=level] is optional. > The level is 0 by default (if it is not set). > > Thanks, > Serguei > > On 2/11/20 7:35 AM, Schmelter, Ralf wrote: >> Hi Yasumasa, >> >> I think I've tried too much by using the -gz flag as a boolean plus a >> int value. I've decided to use two options instead: -gz as a boolean >> option to turn compression on and -gz-level to specify the >> compression level. E.g. >> GC.heap_dump -gz -gz-level=3 test.hprof.gz >> >> Best regards, >> Ralf > From larry.cable at oracle.com Tue Feb 11 21:20:19 2020 From: larry.cable at oracle.com (Laurence Cable) Date: Tue, 11 Feb 2020 13:20:19 -0800 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> Message-ID: <5f4382c3-652a-33f4-7b5e-f5529a3eb32b@oracle.com> On 2/11/20 11:41 AM, serguei.spitsyn at oracle.com wrote: > Hi Ralf, > > I'd suggest for the option format something like this: > ? 
-gz[=level] > > where level is an int. The part [=level] is optional. > The level is 0 by default (if it is not set). +1! > > > Thanks, > Serguei > > On 2/11/20 7:35 AM, Schmelter, Ralf wrote: >> Hi Yasumasa, >> >> I think I've tried too much by using the -gz flag as a boolean plus a >> int value. I've decided to use two options instead: -gz as a boolean >> option to turn compression on and -gz-level to specify the >> compression level. E.g. >> GC.heap_dump -gz -gz-level=3 test.hprof.gz >> >> Best regards, >> Ralf > From larry.cable at oracle.com Tue Feb 11 21:21:44 2020 From: larry.cable at oracle.com (Laurence Cable) Date: Tue, 11 Feb 2020 13:21:44 -0800 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: <01361a9d-2855-db67-a176-73731fada08f@oracle.com> References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> Message-ID: <0dbf1e54-58f9-8d26-6ed4-8d7eb15f2d04@oracle.com> On 2/11/20 11:49 AM, serguei.spitsyn at oracle.com wrote: > Ralf, > > I see this feature adds a lot of code. In fact, I'm not sure, it is > worth to add this kind of complexity (including new compressing > threads) into the VM implementation. What is a real use case behind > it? Could this compressing be done separately from VM implementation? > I have to say that the very same thoughts were also occurring to me ... > Thanks, > Serguei > > > On 2/11/20 11:41 AM, serguei.spitsyn at oracle.com wrote: >> Hi Ralf, >> >> I'd suggest for the option format something like this: >> ? -gz[=level] >> >> where level is an int. The part [=level] is optional. >> The level is 0 by default (if it is not set). >> >> Thanks, >> Serguei >> >> On 2/11/20 7:35 AM, Schmelter, Ralf wrote: >>> Hi Yasumasa, >>> >>> I think I've tried too much by using the -gz flag as a boolean plus >>> a int value. 
I've decided to use two options instead: -gz as a >>> boolean option to turn compression on and -gz-level to specify the >>> compression level. E.g. >>> GC.heap_dump -gz -gz-level=3 test.hprof.gz >>> >>> Best regards, >>> Ralf >> > From chris.plummer at oracle.com Tue Feb 11 21:49:27 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 11 Feb 2020 13:49:27 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: <42C18C7D-7FD3-4B4F-AFDC-9A16218AD578@oracle.com> References: <39200476-D017-4BFD-ABF2-8ABC59ADA8C0@oracle.com> <3edb4c50-588d-d111-2a39-c9859ec64931@oracle.com> <6609ac07-de22-dcf5-6bea-420fdbcdddb8@oracle.com> <42C18C7D-7FD3-4B4F-AFDC-9A16218AD578@oracle.com> Message-ID: <380a4ffe-de77-ee92-2ae1-26b73e95d26a@oracle.com> Hi Igor, Here's an updated webrev: http://cr.openjdk.java.net/~cjplummer/8238196/webrev.01/index.html I rebased to JDK 15 and made all the changes you suggested except for (3). I did not think it is necessary since the code is only executed on OSX. However, if you still feel allowing flexibility in the path separator is important, I can add that change too. thanks, Chris On 2/10/20 1:34 PM, Igor Ignatyev wrote: > Hi Chris, > > in general it all looks good, I have a few comments (most of them are > editorial): > in?Platform.java: > 1. you have doubled spaced at line#238 (b/w boolean and?isSignedOSX) > 2. as?FileNotFoundException is?IOException, there is no need to > declare the former in the signature of?isSignedOSX > 3. it's better to pass jdkPath, "bin" and "java" as separate arguments > to Path.get, so the code won't depend on file separator > 4. 
you are waiting for codesign to finish w/o reading its cout / cerr, > which might lead to a deadlock (if?codesign will exhaust IO buffer > before exiting), so you need to either create two separate threads to > read cout and cerr or ?redirect these streams them to files and read > these files afterwards or just ignore cout/cerr by using > Redirect.DISCARD. I'd personally recommend the latter as the result of > codesign can be reliably deduced from its exitcode (0 - signed, 1 - > verification failed, 2 - wrong arguments, 3 - not all requirements > from R are satisfied) and using cout/cerr is somewhat fragile as there > is no guarantee output format won't be changed. > > the rest looks good to me. > > -- Igor > >> On Feb 10, 2020, at 11:48 AM, Chris Plummer > > wrote: >> >> Ping #2. It's not that hard of a review. Most of it is the new >> Platform.isSignedOSX() method, which is well commented and pretty >> straight froward. >> >> thanks, >> >> Chris >> >> On 2/4/20 5:04 PM, Chris Plummer wrote: >>> Ping! >>> >>> And I decided to push to 15 instead of 14. Will backport to 14 >>> eventually. >>> >>> thanks, >>> >>> Chris >>> >>> On 1/30/20 10:20 PM, Chris Plummer wrote: >>>> Yes, you are correct: >>>> >>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00 >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 1/30/20 10:13 PM, Igor Ignatyev wrote: >>>>> Hi Chris, >>>>> >>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00??seems to >>>>> be a webrev from another issue, should it have been? >>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00/??? 
>>>>> >>>>> -- Igor >>>>> >>>>>> On Jan 30, 2020, at 10:10 PM, Chris Plummer >>>>>> > wrote: >>>>>> >>>>>> Hello, >>>>>> >>>>>> Please review the following fix for some SA tests that are >>>>>> failing on Mac OS X 10.14.5 and later: >>>>>> >>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 >>>>>> >>>>>> The issue is that SA can't attach to a signed binary starting >>>>>> with 10.14.5. There is no workaround for this, so these tests are >>>>>> being disabled when it is detected that the binary is signed and >>>>>> we are running on 10.14 or later (I chose all 10.14 releases to >>>>>> simplify the check). >>>>>> >>>>>> Some background may help explain the fix. In order for SA to >>>>>> attach to a live process (not a core file) on OSX, either the >>>>>> attaching process (ie. the test) has to be run as root, or sudo >>>>>> needs to be supported. However, the only tests that make the sudo >>>>>> check are the 20 or so that use ClhsdbLauncher. The rest all rely >>>>>> on "@requires vm.hasSAandCanAttach" to filter out tests that use >>>>>> SA attach. vm.hasSAandCanAttach only checks if the test is being >>>>>> run as root. Thus all our non-ClhsdbLauncher tests that SA attach >>>>>> to a live process are currently not run unless they are run as >>>>>> root. 8238268 [1] has been filed to address this, making it so >>>>>> all the tests will attempt to use sudo if not run as root. >>>>>> >>>>>> Because of the difference in how ClhsdbLauncher tests and >>>>>> "@requires? vm.hasSAandCanAttach" tests check to see if they are >>>>>> runnable, this fix needs to address both types of checks. The >>>>>> common code for both these cases is Platform.shouldSAAttach(), >>>>>> which on OSX basically equates to check to see if we are running >>>>>> as root. I changed it to also return false if running on signed >>>>>> binary with 10.14 or later. 
However, this confused the >>>>>> ClhsdbLauncher use of Platform.shouldSAAttach() somewhat, since >>>>>> it assumed a false result only happens because you are not >>>>>> running as root (in which case it would then check if sudo will >>>>>> work). So ClhsdbLauncher now has double check that the false >>>>>> result was not because of running a signed binary. If it is >>>>>> signed, it won't do the sudo check. This will get cleaned up with >>>>>> 8238268 [1], which will move the sudo check into >>>>>> Platform.shouldSAAttach(). >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238268 >>>>>> >>>>> >>>> >>> >> > From igor.ignatyev at oracle.com Tue Feb 11 21:55:19 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 11 Feb 2020 13:55:19 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: <380a4ffe-de77-ee92-2ae1-26b73e95d26a@oracle.com> References: <39200476-D017-4BFD-ABF2-8ABC59ADA8C0@oracle.com> <3edb4c50-588d-d111-2a39-c9859ec64931@oracle.com> <6609ac07-de22-dcf5-6bea-420fdbcdddb8@oracle.com> <42C18C7D-7FD3-4B4F-AFDC-9A16218AD578@oracle.com> <380a4ffe-de77-ee92-2ae1-26b73e95d26a@oracle.com> Message-ID: Hi Chris, I don't insist on (3), so I'm fine if you don't want to change that part. one thing I'd change though is to restore thread interrupted state at L#266 of Platform.java (no need to publish new webrev) Thanks, -- Igor > On Feb 11, 2020, at 1:49 PM, Chris Plummer wrote: > > Hi Igor, > > Here's an updated webrev: > > http://cr.openjdk.java.net/~cjplummer/8238196/webrev.01/index.html > > I rebased to JDK 15 and made all the changes you suggested except for (3). I did not think it is necessary since the code is only executed on OSX. However, if you still feel allowing flexibility in the path separator is important, I can add that change too. 
> > thanks, > > Chris > > On 2/10/20 1:34 PM, Igor Ignatyev wrote: >> Hi Chris, >> >> in general it all looks good, I have a few comments (most of them are editorial): >> in Platform.java: >> 1. you have doubled spaced at line#238 (b/w boolean and isSignedOSX) >> 2. as FileNotFoundException is IOException, there is no need to declare the former in the signature of isSignedOSX >> 3. it's better to pass jdkPath, "bin" and "java" as separate arguments to Path.get, so the code won't depend on file separator >> 4. you are waiting for codesign to finish w/o reading its cout / cerr, which might lead to a deadlock (if codesign will exhaust IO buffer before exiting), so you need to either create two separate threads to read cout and cerr or redirect these streams them to files and read these files afterwards or just ignore cout/cerr by using Redirect.DISCARD. I'd personally recommend the latter as the result of codesign can be reliably deduced from its exitcode (0 - signed, 1 - verification failed, 2 - wrong arguments, 3 - not all requirements from R are satisfied) and using cout/cerr is somewhat fragile as there is no guarantee output format won't be changed. >> >> the rest looks good to me. >> >> -- Igor >> >>> On Feb 10, 2020, at 11:48 AM, Chris Plummer >> wrote: >>> >>> Ping #2. It's not that hard of a review. Most of it is the new Platform.isSignedOSX() method, which is well commented and pretty straight froward. >>> >>> thanks, >>> >>> Chris >>> >>> On 2/4/20 5:04 PM, Chris Plummer wrote: >>>> Ping! >>>> >>>> And I decided to push to 15 instead of 14. Will backport to 14 eventually. 
>>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 1/30/20 10:20 PM, Chris Plummer wrote: >>>>> Yes, you are correct: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00 >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 1/30/20 10:13 PM, Igor Ignatyev wrote: >>>>>> Hi Chris, >>>>>> >>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00? seems to be a webrev from another issue, should it have been? http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00/? ? >>>>>> >>>>>> -- Igor >>>>>> >>>>>>> On Jan 30, 2020, at 10:10 PM, Chris Plummer >> wrote: >>>>>>> >>>>>>> Hello, >>>>>>> >>>>>>> Please review the following fix for some SA tests that are failing on Mac OS X 10.14.5 and later: >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 >>>>>>> >>>>>>> The issue is that SA can't attach to a signed binary starting with 10.14.5. There is no workaround for this, so these tests are being disabled when it is detected that the binary is signed and we are running on 10.14 or later (I chose all 10.14 releases to simplify the check). >>>>>>> >>>>>>> Some background may help explain the fix. In order for SA to attach to a live process (not a core file) on OSX, either the attaching process (ie. the test) has to be run as root, or sudo needs to be supported. However, the only tests that make the sudo check are the 20 or so that use ClhsdbLauncher. The rest all rely on "@requires vm.hasSAandCanAttach" to filter out tests that use SA attach. vm.hasSAandCanAttach only checks if the test is being run as root. Thus all our non-ClhsdbLauncher tests that SA attach to a live process are currently not run unless they are run as root. 8238268 [1] has been filed to address this, making it so all the tests will attempt to use sudo if not run as root. >>>>>>> >>>>>>> Because of the difference in how ClhsdbLauncher tests and "@requires? 
vm.hasSAandCanAttach" tests check to see if they are runnable, this fix needs to address both types of checks. The common code for both these cases is Platform.shouldSAAttach(), which on OSX basically equates to check to see if we are running as root. I changed it to also return false if running on signed binary with 10.14 or later. However, this confused the ClhsdbLauncher use of Platform.shouldSAAttach() somewhat, since it assumed a false result only happens because you are not running as root (in which case it would then check if sudo will work). So ClhsdbLauncher now has double check that the false result was not because of running a signed binary. If it is signed, it won't do the sudo check. This will get cleaned up with 8238268 [1], which will move the sudo check into Platform.shouldSAAttach(). >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238268 -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From bob.vandette at oracle.com Tue Feb 11 21:59:38 2020
From: bob.vandette at oracle.com (Bob Vandette)
Date: Tue, 11 Feb 2020 16:59:38 -0500
Subject: [PING] RFR: 8231111: Cgroups v2: Rework Metrics in java.base so as to recognize unified hierarchy
In-Reply-To: <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com>
References: <75fc377f8d5ca76b7dac02f55db640cbdd305633.camel@redhat.com> <4bf65380bc26cd3bf684d7994b33e66bcb87927b.camel@redhat.com> <6CACCC0D-7F5A-42A3-83F1-746497940CCA@oracle.com> <29544339574e34c4c25cbec0314c26f35e8d1a99.camel@redhat.com> <4CE7C7F6-ABFA-4263-98B2-32BBD5013A3C@oracle.com> <10b5e83bfb7e618e5f5906c8e707057ff8680785.camel@redhat.com> <4a304e2ce72a53859b4e9cc8b21db404a260b531.camel@redhat.com> <97e7ff2b-f2cc-a666-afb8-521c0f5c37e7@oracle.com> <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com>
Message-ID: <436303A0-4D46-48BD-8EC3-23A7B213C6FA@oracle.com>

I applied your patch to the latest JDK 15 sources and ran the container tests on Oracle Linux 8.1 with podman/cgroupv2 enabled. There were some issues. I'm not sure if it's my setup or not. I also ran the same build on Ubuntu with docker/cgroupv1. I didn't see any failures on the cgroupv1 system.

Here are some notes:

The podman version on OL 8.1 doesn't yet support cgroupv2. I built the latest from sources for this test.

The docker tests take a very long time under podman! Longer than the cgroupv1 run.

cpusets and cpusets.mems are blank on the host, and also in a container if none are specified on the podman/docker run command. On cgroupv1 they are the host values if not specified for the container. Effective cpusets and cpusets.mems are set properly in a container.

HOST OUTPUT:

./java -XshowSettings:system -version
Operating System Metrics:
    Provider: cgroupv2
    Effective CPU Count: 32
    CPU Period: -1
    CPU Quota: -1
    CPU Shares: -1
    List of Processors: N/A
    List of Effective Processors: N/A
    List of Memory Nodes: N/A
    List of Available Memory Nodes: N/A
    Memory Limit: Unlimited
    Memory Soft Limit: Unlimited
    Memory & Swap Limit: Unlimited

CONTAINER OUTPUT:

# podman run -it -v `pwd`:/mnt ubuntu bash
root at 3c6654a3b834:/mnt/jdk/bin# ./java -XshowSettings:system -version
Operating System Metrics:
    Provider: cgroupv2
    Effective CPU Count: 32
    CPU Period: 100000us
    CPU Quota: -1
    CPU Shares: -1
    List of Processors: N/A
    List of Effective Processors, 32 total:
    0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
    List of Memory Nodes: N/A
    List of Available Memory Nodes, 1 total:
    0
    Memory Limit: Unlimited
    Memory Soft Limit: Unlimited
    Memory & Swap Limit: Unlimited

Docker tests fail if /bin/docker is not available in the podman setup. We probably should enhance the docker check to also look for podman.

Two container tests failed:

FAILED: containers/cgroup/PlainRead.java
failed Memory Limit is: -2 instead of unlimited or -1.
This is because memory.max is not found.

FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java
This fails because the nr_periods line doesn't always exist. I think you've got to enable a quota for this to appear (not sure).
Here's the contents:

% more cpu.stat
usage_usec 23974562755
user_usec 22257183568
system_usec 1717379186

CGROUPV2 results on Oracle Linux 8.1 ---------

testing container APIs
FAILED: containers/cgroup/PlainRead.java
Passed: containers/docker/DockerBasicTest.java
Passed: containers/docker/TestCPUAwareness.java
Passed: containers/docker/TestCPUSets.java
Passed: containers/docker/TestJcmdWithSideCar.java
Passed: containers/docker/TestJFREvents.java
Passed: containers/docker/TestJFRNetworkEvents.java
Passed: containers/docker/TestMemoryAwareness.java
Passed: containers/docker/TestMisc.java
Test results: passed: 8; failed: 1
Results written to /export/users/bobv/jdk15/build/jtreg/JTwork
Error: Some tests failed or other problems occurred.

testing jdk.internal.platform APIs
FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java
Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java
Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java
Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java
Passed: jdk/internal/platform/docker/TestSystemMetrics.java
Test results: passed: 4; failed: 1
Results written to /export/users/bobv/jdk15/build/jtreg/JTwork
Error: Some tests failed or other problems occurred.

testing -XshowSettings:system launcher option
Passed: tools/launcher/Settings.java
Test results: passed: 1

Bob.

> On Feb 11, 2020, at 1:04 PM, Severin Gehwolf wrote:
>
> Hi Mandy, Bob,
>
> Thanks again for the reviews and patience on this. Sorry it took me so
> long to get back to this :-/
>
> Updated webrev:
> Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/webrev/
> incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/incremental/webrev/
>
> I've tested this with docker tests on cgroup v1 and via podman on a
> cgroup v2 system. They pass. I'll be running this through jdk-submit as
> well.
>
> More below.
> > On Tue, 2020-01-21 at 16:09 -0800, Mandy Chung wrote: >> Hi Severin, >> >> Thanks for the update. >> >> On 1/21/20 11:30 AM, Severin Gehwolf wrote: >>> Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/webrev/ >>> incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/incremental/webrev/ >>> >> >> I have answered my own question. Most of the metrics used to return 0 if unavailable due to IOException reading a file or malformed content and now are changed to return -2 due to error fetching the metric. >> >> The following are about limits which used to return -1 if unlimited or no limit set. >> public long getCpuQuota(); >> public long getCpuShares(); >> public long getMemoryLimit(); >> public long getMemoryAndSwapLimit(); >> public long getMemorySoftLimit(); >> >> With this patch, only getMemoryLimit and getMemoryAndSwapLimit specify to return -1 if unlimited or no limit set. However the implementation does return -1. All of the above specify to return -2 if unavailable due to error fetching the metric. >> >> I found the implementation quite hard to follow. I spent some time reviewing the code to see if the implementation matches the spec but I can't easily tell yet. For example, >> CgroupSubsystemController::getLongValueMatchingLine returns -1 when IOException occurs. >> CgroupSubsystemController::getLongEntry returns 0L if IOException occurs. >> >> CgroupV1SubsystemController::convertStringToLong returns Long.MAX_VALUE >> when value overflows > > This one is intentional. It's mapped back to unlimited via > longValOrUnlimited(). The reason for this is that cgroup v1 doesn't > have a concept of "unlimited". Unlimited values will be a very large > numbers in cgroup v1 files. 
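
The cgroup v1 handling of "unlimited" described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the webrev: the class name, the `UNLIMITED_MIN` cut-off and the use of `BigInteger` are hypothetical, and only the method names `convertStringToLong` and `longValOrUnlimited` and the -1 return value for "unlimited" are taken from this thread.

```java
import java.math.BigInteger;

// Hypothetical sketch of the idea discussed in this thread; not the webrev code.
final class UnlimitedMappingSketch {
    // -1 is the value the thread uses for "unlimited or no limit set".
    static final long UNLIMITED = -1L;

    // Assumed cut-off: anything above this is treated as "no limit configured".
    // The real threshold, if there is one, may differ.
    static final long UNLIMITED_MIN = 0x7FFFFFFFFF000000L;

    // Parse a decimal value read from a cgroup v1 file. Values that do not fit
    // into a long are clamped to Long.MAX_VALUE instead of failing.
    // Assumes non-negative decimal input, as found in cgroup limit files.
    static long convertStringToLong(String text) {
        BigInteger value = new BigInteger(text.trim());
        BigInteger max = BigInteger.valueOf(Long.MAX_VALUE);
        return value.compareTo(max) > 0 ? Long.MAX_VALUE : value.longValue();
    }

    // cgroup v1 has no explicit "unlimited" token, so very large numbers are
    // mapped back to the UNLIMITED marker.
    static long longValOrUnlimited(long value) {
        return value > UNLIMITED_MIN ? UNLIMITED : value;
    }

    public static void main(String[] args) {
        String[] samples = {"1073741824", "9223372036854771712", "18446744073709551615"};
        for (String raw : samples) {
            System.out.println(raw + " -> " + longValOrUnlimited(convertStringToLong(raw)));
        }
    }
}
```

With this shape, an overflowing value and a merely very large value both end up reported as unlimited, while an ordinary limit such as 1 GiB passes through unchanged.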
> >> CgroupV2SubsystemController::convertStringToLong returns -1 when IOException occurs >> >> CgroupV1Subsystem::getCpuShares returns -1 if cpu.shares == 0 or 1024 >> CgroupV2Subsystem::getCpuShares returns -1 if cpu.weight == 100 or 0 > > These two are special cases too. See the implementation note of > Metrics.getCpuShares(). In the cgroup v2 case the default value is 100 > (versus 1024 in cgroup v1). That's why unlimited is being returned for > those values. > >> CgroupV2Subsystem::getFromCpuMax returns -1 if error in reading cpu.max or malformed content >> CgroupV2Subsystem::sumTokensIOStat returns -2 if IOException error >> This is called by getBlkIOServiceCount and getBlkIOServiced >> >> I think this can be improved and add the documentation to describe >> what the methods do. Since Metrics APIs consistently return -2 if >> unavailable due to error in fetching the metric, why do some utility >> methods in *Subsystem and *SubsystemController return -1 upon error >> and 0 when unlimited? >> >> I suspect if the getXXXValue and other methods are clearly documented >> with the error cases (possibly renaming the method name if appropriate) >> CgroupV1Subsystem and CgroupV2SubSystem will become very explicit >> to understand. > > This should be fixed now. > > I've gone through the API doc of Metrics.java and have updated it. In > general, I've updated it to return -1 if a metric is unavailable (due to > error in reading some files or content being empty), and -2 if not > supported. No method returns -2 currently, but it might change and it's > good to have some way of saying "not implementable" for this subsystem > in the spec. That's my take on it anyway. > > There is also a new unit test for shared controller logic: > TestCgroupSubsystemController.java > > It exercises various cases of error/success. > > That is to ensure proper symmetry across the various cases (including > IOException). I've also documented static methods in > CgroupSubsystemController.
Overall, all methods now return the same > values for cgroup v1 and cgroup v2 (given the impl nuances) for the > various cases. > >> >> CgroupSubsystem.java >> >> 44 public static final double DOUBLE_RETVAL_NOT_SUPPORTED = LONG_RETVAL_NOT_SUPPORTED; >> 49 public static final Boolean BOOL_RETVAL_NOT_SUPPORTED = null; >> >> They are no longer needed, right? > > Removed. > >> >> CgroupSubsystemFactory.java >> >> 89 System.err.println("Warning: Mixed cgroupv1 and cgroupv2 not supported. Metrics disabled."); >> >> >> I expect this be a System.Logger log > > Updated. > >> 114 if (!Integer.valueOf(0).toString().equals(tokens[0])) { >> >> This can be simplified to if (!"0".equals(tokens[0])) > > Done, thanks! > >> LauncherHelper.java >> >> 407 // Extended cgroupv1 specific metrics >> 408 if (c instanceof MetricsCgroupV1) { >> 409 MetricsCgroupV1 cgroupV1 = (MetricsCgroupV1)c; >> 410 limit = cgroupV1.getKernelMemoryLimit(); >> 411 ostream.println(formatLimitString(limit, INDENT + "Kernel Memory Limit: ")); >> 412 limit = cgroupV1.getTcpMemoryLimit(); >> 413 ostream.println(formatLimitString(limit, INDENT + "TCP Memory Limit: ")); >> 414 Boolean value = cgroupV1.isMemoryOOMKillEnabled(); >> 415 ostream.println(formatBoolean(value, INDENT + "Out Of Memory Killer Enabled: ")); >> 416 value = cgroupV1.isCpuSetMemoryPressureEnabled(); >> 417 ostream.println(formatBoolean(value, INDENT + "CPUSet Memory Pressure Enabled: ")); >> 418 } >> >> MetricsCgroupV1 is linux-only. It will fail the compilation when >> building on non-linux. One option is to move this code to >> src/java.base/linux/share/sun/launcher/CgroupMetrics.java >> >> Are they continued to be interesting metrics to be output from >> -XshowSetting? I wonder if they can simply be dropped from the output. >> Bob will have an opinion. > > I've removed those extra cgroup v1 specific metrics printed via > -XshowSettings:system. Not sure what to do with MetricsCgroupV1. It's > only used in tests in webrev 10. 
On the other hand the idea would be > for consumers to downcast it to MetricsCgroupV1 if they needed those > extra metrics. > > Thanks, > Severin > From chris.plummer at oracle.com Tue Feb 11 22:02:48 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 11 Feb 2020 14:02:48 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: References: <39200476-D017-4BFD-ABF2-8ABC59ADA8C0@oracle.com> <3edb4c50-588d-d111-2a39-c9859ec64931@oracle.com> <6609ac07-de22-dcf5-6bea-420fdbcdddb8@oracle.com> <42C18C7D-7FD3-4B4F-AFDC-9A16218AD578@oracle.com> <380a4ffe-de77-ee92-2ae1-26b73e95d26a@oracle.com> Message-ID: <37616cfd-4ad9-bed4-0838-d1238caf4c45@oracle.com> An HTML attachment was scrubbed... URL: From igor.ignatyev at oracle.com Tue Feb 11 22:23:19 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 11 Feb 2020 14:23:19 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: <37616cfd-4ad9-bed4-0838-d1238caf4c45@oracle.com> References: <39200476-D017-4BFD-ABF2-8ABC59ADA8C0@oracle.com> <3edb4c50-588d-d111-2a39-c9859ec64931@oracle.com> <6609ac07-de22-dcf5-6bea-420fdbcdddb8@oracle.com> <42C18C7D-7FD3-4B4F-AFDC-9A16218AD578@oracle.com> <380a4ffe-de77-ee92-2ae1-26b73e95d26a@oracle.com> <37616cfd-4ad9-bed4-0838-d1238caf4c45@oracle.com> Message-ID: <066B2149-8EB6-4835-AA0A-49D3D5B9CFCC@oracle.com> No, I meant calling Thread.currentThread().interrupt(). Calling that restores the interrupted state of the thread, so a user of the Platform class will be able to respond to it appropriately. With your current code, the fact that the thread was interrupted will be lost, and in most cases that is not the right thing to do. -- Igor > On Feb 11, 2020, at 2:02 PM, Chris Plummer wrote: > > Hi Igor, > > I'm not sure what you mean by restore the interrupt state.
Do you mean loop back to the waitFor() call? > > thanks, > > Chris > > On 2/11/20 1:55 PM, Igor Ignatyev wrote: >> Hi Chris, >> >> I don't insist on (3), so I'm fine if you don't want to change that part. one thing I'd change though is to restore thread interrupted state at L#266 of Platform.java (no need to publish new webrev) >> >> Thanks, >> -- Igor >> >>> On Feb 11, 2020, at 1:49 PM, Chris Plummer > wrote: >>> >>> Hi Igor, >>> >>> Here's an updated webrev: >>> >>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.01/index.html >>> >>> I rebased to JDK 15 and made all the changes you suggested except for (3). I did not think it is necessary since the code is only executed on OSX. However, if you still feel allowing flexibility in the path separator is important, I can add that change too. >>> >>> thanks, >>> >>> Chris >>> >>> On 2/10/20 1:34 PM, Igor Ignatyev wrote: >>>> Hi Chris, >>>> >>>> in general it all looks good, I have a few comments (most of them are editorial): >>>> in Platform.java: >>>> 1. you have doubled spaced at line#238 (b/w boolean and isSignedOSX) >>>> 2. as FileNotFoundException is IOException, there is no need to declare the former in the signature of isSignedOSX >>>> 3. it's better to pass jdkPath, "bin" and "java" as separate arguments to Path.get, so the code won't depend on file separator >>>> 4. you are waiting for codesign to finish w/o reading its cout / cerr, which might lead to a deadlock (if codesign will exhaust IO buffer before exiting), so you need to either create two separate threads to read cout and cerr or redirect these streams them to files and read these files afterwards or just ignore cout/cerr by using Redirect.DISCARD. 
I'd personally recommend the latter as the result of codesign can be reliably deduced from its exitcode (0 - signed, 1 - verification failed, 2 - wrong arguments, 3 - not all requirements from R are satisfied) and using cout/cerr is somewhat fragile as there is no guarantee output format won't be changed. >>>> >>>> the rest looks good to me. >>>> >>>> -- Igor >>>> >>>>> On Feb 10, 2020, at 11:48 AM, Chris Plummer >> wrote: >>>>> >>>>> Ping #2. It's not that hard of a review. Most of it is the new Platform.isSignedOSX() method, which is well commented and pretty straight froward. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 2/4/20 5:04 PM, Chris Plummer wrote: >>>>>> Ping! >>>>>> >>>>>> And I decided to push to 15 instead of 14. Will backport to 14 eventually. >>>>>> >>>>>> thanks, >>>>>> >>>>>> Chris >>>>>> >>>>>> On 1/30/20 10:20 PM, Chris Plummer wrote: >>>>>>> Yes, you are correct: >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00 >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 1/30/20 10:13 PM, Igor Ignatyev wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00? seems to be a webrev from another issue, should it have been? http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00/? ? >>>>>>>> >>>>>>>> -- Igor >>>>>>>> >>>>>>>>> On Jan 30, 2020, at 10:10 PM, Chris Plummer >> wrote: >>>>>>>>> >>>>>>>>> Hello, >>>>>>>>> >>>>>>>>> Please review the following fix for some SA tests that are failing on Mac OS X 10.14.5 and later: >>>>>>>>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 >>>>>>>>> >>>>>>>>> The issue is that SA can't attach to a signed binary starting with 10.14.5. 
There is no workaround for this, so these tests are being disabled when it is detected that the binary is signed and we are running on 10.14 or later (I chose all 10.14 releases to simplify the check). >>>>>>>>> >>>>>>>>> Some background may help explain the fix. In order for SA to attach to a live process (not a core file) on OSX, either the attaching process (ie. the test) has to be run as root, or sudo needs to be supported. However, the only tests that make the sudo check are the 20 or so that use ClhsdbLauncher. The rest all rely on "@requires vm.hasSAandCanAttach" to filter out tests that use SA attach. vm.hasSAandCanAttach only checks if the test is being run as root. Thus all our non-ClhsdbLauncher tests that SA attach to a live process are currently not run unless they are run as root. 8238268 [1] has been filed to address this, making it so all the tests will attempt to use sudo if not run as root. >>>>>>>>> >>>>>>>>> Because of the difference in how ClhsdbLauncher tests and "@requires? vm.hasSAandCanAttach" tests check to see if they are runnable, this fix needs to address both types of checks. The common code for both these cases is Platform.shouldSAAttach(), which on OSX basically equates to check to see if we are running as root. I changed it to also return false if running on signed binary with 10.14 or later. However, this confused the ClhsdbLauncher use of Platform.shouldSAAttach() somewhat, since it assumed a false result only happens because you are not running as root (in which case it would then check if sudo will work). So ClhsdbLauncher now has double check that the false result was not because of running a signed binary. If it is signed, it won't do the sudo check. This will get cleaned up with 8238268 [1], which will move the sudo check into Platform.shouldSAAttach(). 
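
The review comments made in this thread about the signed-binary check (build the launcher path from separate components, discard the child's stdout/stderr instead of leaving it unread, decide from the exit code, and restore the interrupt flag) could be combined along the following lines. This is a minimal sketch under stated assumptions, not the code from the webrev: the class and method names are hypothetical, and the exact `codesign` arguments are deliberately left to the caller because they are not shown in this thread.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper illustrating the review comments; not Platform.isSignedOSX().
final class SignedCheckSketch {

    // Comment (3): pass the components separately so no file separator is hard-coded.
    static Path javaLauncher(String jdkPath) {
        return Paths.get(jdkPath, "bin", "java");
    }

    // Comment (4): discard stdout/stderr so the child can never block on a full
    // pipe buffer while the parent is waiting for it to exit.
    static int runDiscardingOutput(String... command) throws IOException {
        Process process = new ProcessBuilder(command)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        try {
            return process.waitFor();
        } catch (InterruptedException e) {
            process.destroy();
            // Restore the interrupted state so callers can still observe it.
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting for " + command[0], e);
        }
    }

    // Per the exit codes listed in the review: 0 means signed, anything else does not.
    static boolean isSignedExitCode(int exitCode) {
        return exitCode == 0;
    }
}
```

A caller would pass the codesign command line it already uses, together with `javaLauncher(jdkPath).toString()`, to `runDiscardingOutput` and feed the result to `isSignedExitCode`.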
>>>>>>>>>
>>>>>>>>> thanks,
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238268
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From david.holmes at oracle.com Tue Feb 11 22:50:51 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 12 Feb 2020 08:50:51 +1000
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
In-Reply-To: <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com>
References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com>
Message-ID: <7330038c-be71-99f4-f285-db762a2dc291@oracle.com>

On 11/02/2020 10:55 pm, Yasumasa Suenaga wrote:
> Hi Ralf,
>
> On 2020/02/11 0:33, Schmelter, Ralf wrote:
>> Hi Yasumasa,
>>
>>>    You can use `DCmdArgument` for -gz option.
>>
>> That is what I originally tried. But then you always have to supply a
>> compression level (just specifying -gz doesn't work). Since I would
>> expect most users never to care about the compression level, I switched
>> to a string option, which can handle this pattern.
>
> I think you can modify DCmdArgument::parse_value() to allow the
> operation without an argument.
> Or you can add a new impl function for integer types which can handle
> a default value.
>
>
>>> _nr_of_threads, _id_to_write, _current in CompressionBackend should
>>> be declared `volatile` at least.
>>
>> I don't think that is needed. Apart from the initialization, they are
>> only changed under lock protection.
>
> I am concerned about compiler optimization.
> They are class members and they are used in a `while(true)` loop.
> Of course the problem would not appear with every C++ compiler, but I guess
> it is safer if `volatile` is added.

As long as the variable is only accessed under the lock, volatile is not needed. If a compiler hoisted accesses outside of locked regions, then all MT code could be broken.
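To make the point about lock-protected fields concrete, here is a minimal sketch. It is written in Java, not the C++ of heapDumper.cpp, and it is only an analogy: the class LockedCounterSketch and its members are hypothetical names and do not correspond to anything in CompressionBackend. The field is deliberately not declared volatile, because every read and write happens while the same lock is held.

```java
import java.util.ArrayList;
import java.util.List;

final class LockedCounterSketch {
    private final Object lock = new Object();

    // Deliberately NOT volatile: it is only ever read or written under 'lock'.
    private long nextId = 0;

    /** Hands out the next id; the read-modify-write is guarded by the lock. */
    long claimNextId() {
        synchronized (lock) {
            return nextId++;
        }
    }

    /** Reads the current value, again under the same lock. */
    long current() {
        synchronized (lock) {
            return nextId;
        }
    }

    /** Starts the given number of threads, each claiming ids, and returns the final count. */
    static long runWorkers(int threads, int claimsPerThread) {
        LockedCounterSketch counter = new LockedCounterSketch();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread w = new Thread(() -> {
                for (int i = 0; i < claimsPerThread; i++) {
                    counter.claimNextId();
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt visible to callers
                throw new RuntimeException(e);
            }
        }
        return counter.current();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 10_000)); // prints 40000
    }
}
```

No update is lost even though the field is a plain long, because acquiring and releasing the lock already orders the accesses. Accessing the field anywhere outside the lock would invalidate that argument.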
David ----- > >>> BTW how much processing time is different between single threaded and >>> multi threaded? >> >> I've benchmarked an example, which creates a ~31 GB uncompressed hprof >> file, with a VM which doesn't use any background threads. Here are the >> size of the create files, the compression level and the time spend: >> >> Uncompressed, 31.6 G, 71 sec >> gzipped level 1, 7.57 G, 463 sec (x6.5) >> gzipped level 3, 7.10 G, 609 sec (x8.6) >> gzipped level 6, 6.49 G, 1415 sec (x19.9) >> >> So even the fastest gzip compression makes writing the dump at least 5 >> times as slow. >> >>> Also I want to know what number is set to ParallelGCThreads. >>> ParallelGCThreads seems to affect to thread num for GZip compression. >> >> Originally, I've tried to use the WorkGang (CollectedHeap:: >> get_safepoint_workers()) of the GC to do the work. But this wouldn't >> work because Shenandoah could not iterate the heap from a worker >> thread. So I've opted to start the needed threads itself for the time >> of the heap dump. I've used ParallelGCThreads as the maximum number of >> threads, since this is what would be used for a GC too. So it should >> not clog up the machine more than a GC. Maybe it would be even better >> to additionally limit the threads by the compression level. > > Thanks! > > Yasumasa (ysuenaga) > > >> Best regards, >> Ralf Schmelter >> >> -----Original Message----- >> From: Yasumasa Suenaga >> Sent: Samstag, 8. Februar 2020 14:46 >> To: Schmelter, Ralf ; OpenJDK Serviceability >> >> Cc: yasuenag at gmail.com >> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped >> heap dump >> >> Hi Ralf, >> >> >> - diagnosticCommand.cpp >> ?? You can use `DCmdArgument` for -gz option. >> ?? If you want to use lesser type (e.g. int, unsigned char), I guess >> you need to modify GenDCmdArgument class. >> >> - heapDumper.cpp >> ?? _nr_of_threads, _id_to_write, _current in CompressionBackend should >> be added `volatile` at least. >> ?? 
(Other values need to be checked) >> >> >> BTW how much processing time is different between single threaded and >> multi threaded? >> Also I want to know what number is set to ParallelGCThreads. >> ParallelGCThreads seems to affect to thread num for GZip compression. >> >> >> Thanks, >> >> Yasumasa >> From serguei.spitsyn at oracle.com Wed Feb 12 00:03:15 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 11 Feb 2020 16:03:15 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: <380a4ffe-de77-ee92-2ae1-26b73e95d26a@oracle.com> References: <39200476-D017-4BFD-ABF2-8ABC59ADA8C0@oracle.com> <3edb4c50-588d-d111-2a39-c9859ec64931@oracle.com> <6609ac07-de22-dcf5-6bea-420fdbcdddb8@oracle.com> <42C18C7D-7FD3-4B4F-AFDC-9A16218AD578@oracle.com> <380a4ffe-de77-ee92-2ae1-26b73e95d26a@oracle.com> Message-ID: <671e8701-2b4c-6124-9d16-898430733dc9@oracle.com> Hi Chris, This looks okay to me. Thanks, Serguei On 2/11/20 1:49 PM, Chris Plummer wrote: > Hi Igor, > > Here's an updated webrev: > > http://cr.openjdk.java.net/~cjplummer/8238196/webrev.01/index.html > > I rebased to JDK 15 and made all the changes you suggested except for > (3). I did not think it is necessary since the code is only executed > on OSX. However, if you still feel allowing flexibility in the path > separator is important, I can add that change too. > > thanks, > > Chris > > On 2/10/20 1:34 PM, Igor Ignatyev wrote: >> Hi Chris, >> >> in general it all looks good, I have a few comments (most of them are >> editorial): >> in?Platform.java: >> 1. you have doubled spaced at line#238 (b/w boolean and?isSignedOSX) >> 2. as?FileNotFoundException is?IOException, there is no need to >> declare the former in the signature of?isSignedOSX >> 3. it's better to pass jdkPath, "bin" and "java" as separate >> arguments to Path.get, so the code won't depend on file separator >> 4. 
you are waiting for codesign to finish w/o reading its cout / >> cerr, which might lead to a deadlock (if?codesign will exhaust IO >> buffer before exiting), so you need to either create two separate >> threads to read cout and cerr or ?redirect these streams them to >> files and read these files afterwards or just ignore cout/cerr by >> using Redirect.DISCARD. I'd personally recommend the latter as the >> result of codesign can be reliably deduced from its exitcode (0 - >> signed, 1 - verification failed, 2 - wrong arguments, 3 - not all >> requirements from R are satisfied) and using cout/cerr is somewhat >> fragile as there is no guarantee output format won't be changed. >> >> the rest looks good to me. >> >> -- Igor >> >>> On Feb 10, 2020, at 11:48 AM, Chris Plummer >>> > wrote: >>> >>> Ping #2. It's not that hard of a review. Most of it is the new >>> Platform.isSignedOSX() method, which is well commented and pretty >>> straight froward. >>> >>> thanks, >>> >>> Chris >>> >>> On 2/4/20 5:04 PM, Chris Plummer wrote: >>>> Ping! >>>> >>>> And I decided to push to 15 instead of 14. Will backport to 14 >>>> eventually. >>>> >>>> thanks, >>>> >>>> Chris >>>> >>>> On 1/30/20 10:20 PM, Chris Plummer wrote: >>>>> Yes, you are correct: >>>>> >>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00 >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 1/30/20 10:13 PM, Igor Ignatyev wrote: >>>>>> Hi Chris, >>>>>> >>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00??seems to >>>>>> be a webrev from another issue, should it have been? >>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00/??? 
>>>>>> >>>>>> -- Igor >>>>>> >>>>>>> On Jan 30, 2020, at 10:10 PM, Chris Plummer >>>>>>> > wrote: >>>>>>> >>>>>>> Hello, >>>>>>> >>>>>>> Please review the following fix for some SA tests that are >>>>>>> failing on Mac OS X 10.14.5 and later: >>>>>>> >>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 >>>>>>> >>>>>>> The issue is that SA can't attach to a signed binary starting >>>>>>> with 10.14.5. There is no workaround for this, so these tests >>>>>>> are being disabled when it is detected that the binary is signed >>>>>>> and we are running on 10.14 or later (I chose all 10.14 releases >>>>>>> to simplify the check). >>>>>>> >>>>>>> Some background may help explain the fix. In order for SA to >>>>>>> attach to a live process (not a core file) on OSX, either the >>>>>>> attaching process (ie. the test) has to be run as root, or sudo >>>>>>> needs to be supported. However, the only tests that make the >>>>>>> sudo check are the 20 or so that use ClhsdbLauncher. The rest >>>>>>> all rely on "@requires vm.hasSAandCanAttach" to filter out tests >>>>>>> that use SA attach. vm.hasSAandCanAttach only checks if the test >>>>>>> is being run as root. Thus all our non-ClhsdbLauncher tests that >>>>>>> SA attach to a live process are currently not run unless they >>>>>>> are run as root. 8238268 [1] has been filed to address this, >>>>>>> making it so all the tests will attempt to use sudo if not run >>>>>>> as root. >>>>>>> >>>>>>> Because of the difference in how ClhsdbLauncher tests and >>>>>>> "@requires? vm.hasSAandCanAttach" tests check to see if they are >>>>>>> runnable, this fix needs to address both types of checks. The >>>>>>> common code for both these cases is Platform.shouldSAAttach(), >>>>>>> which on OSX basically equates to check to see if we are running >>>>>>> as root. I changed it to also return false if running on signed >>>>>>> binary with 10.14 or later. 
However, this confused the >>>>>>> ClhsdbLauncher use of Platform.shouldSAAttach() somewhat, since >>>>>>> it assumed a false result only happens because you are not >>>>>>> running as root (in which case it would then check if sudo will >>>>>>> work). So ClhsdbLauncher now has double check that the false >>>>>>> result was not because of running a signed binary. If it is >>>>>>> signed, it won't do the sudo check. This will get cleaned up >>>>>>> with 8238268 [1], which will move the sudo check into >>>>>>> Platform.shouldSAAttach(). >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238268 >>>>>>> >>>>>> >>>>> >>>> >>> >> > > From chris.plummer at oracle.com Wed Feb 12 02:15:19 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 11 Feb 2020 18:15:19 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: <066B2149-8EB6-4835-AA0A-49D3D5B9CFCC@oracle.com> References: <39200476-D017-4BFD-ABF2-8ABC59ADA8C0@oracle.com> <3edb4c50-588d-d111-2a39-c9859ec64931@oracle.com> <6609ac07-de22-dcf5-6bea-420fdbcdddb8@oracle.com> <42C18C7D-7FD3-4B4F-AFDC-9A16218AD578@oracle.com> <380a4ffe-de77-ee92-2ae1-26b73e95d26a@oracle.com> <37616cfd-4ad9-bed4-0838-d1238caf4c45@oracle.com> <066B2149-8EB6-4835-AA0A-49D3D5B9CFCC@oracle.com> Message-ID: <16754f9b-61a3-15d6-7cb6-8e9895662657@oracle.com> An HTML attachment was scrubbed... 
URL: From igor.ignatyev at oracle.com Wed Feb 12 03:12:02 2020 From: igor.ignatyev at oracle.com (Igor Ignatev) Date: Tue, 11 Feb 2020 19:12:02 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: <16754f9b-61a3-15d6-7cb6-8e9895662657@oracle.com> References: <16754f9b-61a3-15d6-7cb6-8e9895662657@oracle.com> Message-ID: <979CFEC7-538E-46EE-988F-1935AFB8AEFA@oracle.com> rather like this : > } catch (InterruptedException e) { > Thread.currentThread().interrupt(); > return false; // assume not signed > } ? Igor > On Feb 11, 2020, at 6:15 PM, Chris Plummer wrote: > > ? > Like this? > > } catch (InterruptedException e) { > Thread.currentThread().interrupt(); > throw new RuntimeException(e); > } > > Chris > > On 2/11/20 2:23 PM, Igor Ignatyev wrote: >> no, I meant to call Thread.currentThread().interrupt(), calling that will restore interrupted state of the thread, so an user of Platform class will be able to response to it appropriately, w/ your current code, the fact that the thread was interrupted will be missed, and in most cases it is not right thing to do. >> >> -- Igor >> >>> On Feb 11, 2020, at 2:02 PM, Chris Plummer > wrote: >>> >>> Hi Igor, >>> >>> I'm not sure what you mean by restore the interrupt state. Do you mean loop back to the waitFor() call? >>> >>> thanks, >>> >>> Chris >>> >>> On 2/11/20 1:55 PM, Igor Ignatyev wrote: >>>> Hi Chris, >>>> >>>> I don't insist on (3), so I'm fine if you don't want to change that part. one thing I'd change though is to restore thread interrupted state at L#266 of Platform.java (no need to publish new webrev) >>>> >>>> Thanks, >>>> -- Igor >>>> >>>>> On Feb 11, 2020, at 1:49 PM, Chris Plummer > wrote: >>>>> >>>>> Hi Igor, >>>>> >>>>> Here's an updated webrev: >>>>> >>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.01/index.html >>>>> >>>>> I rebased to JDK 15 and made all the changes you suggested except for (3). 
I did not think it is necessary since the code is only executed on OSX. However, if you still feel allowing flexibility in the path separator is important, I can add that change too. >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 2/10/20 1:34 PM, Igor Ignatyev wrote: >>>>>> Hi Chris, >>>>>> >>>>>> in general it all looks good, I have a few comments (most of them are editorial): >>>>>> in Platform.java: >>>>>> 1. you have doubled spaced at line#238 (b/w boolean and isSignedOSX) >>>>>> 2. as FileNotFoundException is IOException, there is no need to declare the former in the signature of isSignedOSX >>>>>> 3. it's better to pass jdkPath, "bin" and "java" as separate arguments to Path.get, so the code won't depend on file separator >>>>>> 4. you are waiting for codesign to finish w/o reading its cout / cerr, which might lead to a deadlock (if codesign will exhaust IO buffer before exiting), so you need to either create two separate threads to read cout and cerr or redirect these streams them to files and read these files afterwards or just ignore cout/cerr by using Redirect.DISCARD. I'd personally recommend the latter as the result of codesign can be reliably deduced from its exitcode (0 - signed, 1 - verification failed, 2 - wrong arguments, 3 - not all requirements from R are satisfied) and using cout/cerr is somewhat fragile as there is no guarantee output format won't be changed. >>>>>> >>>>>> the rest looks good to me. >>>>>> >>>>>> -- Igor >>>>>> >>>>>>> On Feb 10, 2020, at 11:48 AM, Chris Plummer >> wrote: >>>>>>> >>>>>>> Ping #2. It's not that hard of a review. Most of it is the new Platform.isSignedOSX() method, which is well commented and pretty straight froward. >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 2/4/20 5:04 PM, Chris Plummer wrote: >>>>>>>> Ping! >>>>>>>> >>>>>>>> And I decided to push to 15 instead of 14. Will backport to 14 eventually. 
>>>>>>>> >>>>>>>> thanks, >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 1/30/20 10:20 PM, Chris Plummer wrote: >>>>>>>>> Yes, you are correct: >>>>>>>>> >>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00 >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 1/30/20 10:13 PM, Igor Ignatyev wrote: >>>>>>>>>> Hi Chris, >>>>>>>>>> >>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00? seems to be a webrev from another issue, should it have been? http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00/? ? >>>>>>>>>> >>>>>>>>>> -- Igor >>>>>>>>>> >>>>>>>>>>> On Jan 30, 2020, at 10:10 PM, Chris Plummer >> wrote: >>>>>>>>>>> >>>>>>>>>>> Hello, >>>>>>>>>>> >>>>>>>>>>> Please review the following fix for some SA tests that are failing on Mac OS X 10.14.5 and later: >>>>>>>>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 >>>>>>>>>>> >>>>>>>>>>> The issue is that SA can't attach to a signed binary starting with 10.14.5. There is no workaround for this, so these tests are being disabled when it is detected that the binary is signed and we are running on 10.14 or later (I chose all 10.14 releases to simplify the check). >>>>>>>>>>> >>>>>>>>>>> Some background may help explain the fix. In order for SA to attach to a live process (not a core file) on OSX, either the attaching process (ie. the test) has to be run as root, or sudo needs to be supported. However, the only tests that make the sudo check are the 20 or so that use ClhsdbLauncher. The rest all rely on "@requires vm.hasSAandCanAttach" to filter out tests that use SA attach. vm.hasSAandCanAttach only checks if the test is being run as root. Thus all our non-ClhsdbLauncher tests that SA attach to a live process are currently not run unless they are run as root. 
8238268 [1] has been filed to address this, making it so all the tests will attempt to use sudo if not run as root. >>>>>>>>>>> >>>>>>>>>>> Because of the difference in how ClhsdbLauncher tests and "@requires? vm.hasSAandCanAttach" tests check to see if they are runnable, this fix needs to address both types of checks. The common code for both these cases is Platform.shouldSAAttach(), which on OSX basically equates to check to see if we are running as root. I changed it to also return false if running on signed binary with 10.14 or later. However, this confused the ClhsdbLauncher use of Platform.shouldSAAttach() somewhat, since it assumed a false result only happens because you are not running as root (in which case it would then check if sudo will work). So ClhsdbLauncher now has double check that the false result was not because of running a signed binary. If it is signed, it won't do the sudo check. This will get cleaned up with 8238268 [1], which will move the sudo check into Platform.shouldSAAttach(). >>>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238268 >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david.holmes at oracle.com Wed Feb 12 03:22:52 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 12 Feb 2020 13:22:52 +1000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: <7330038c-be71-99f4-f285-db762a2dc291@oracle.com> References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <7330038c-be71-99f4-f285-db762a2dc291@oracle.com> Message-ID: <6f5cbe21-1622-f6f9-29b7-9a657ea83bd2@oracle.com> As this is a decades old FAQ ... On 12/02/2020 8:50 am, David Holmes wrote: > On 11/02/2020 10:55 pm, Yasumasa Suenaga wrote: >> Hi Ralf, >> >> On 2020/02/11 0:33, Schmelter, Ralf wrote: >>> Hi Yasumasa, >>> >>>> ?? 
You can use `DCmdArgument` for -gz option.
>>>
>>> That is what I originally tried. But then you always have to supply a
>>> compression level (just specifying -gz doesn't work). Since I would
>>> expect most users never to care about the compression level, I
>>> switched to a string option, which can handle this pattern.
>>
>> I think you can modify DCmdArgument::parse_value() to allow the
>> operation without an argument.
>> Or you can add a new impl function for integer types which can handle
>> a default value.
>>
>>
>>>> _nr_of_threads, _id_to_write, _current in CompressionBackend should
>>>> be declared `volatile` at least.
>>>
>>> I don't think that is needed. Apart from the initialization, they are
>>> only changed under lock protection.
>>
>> I am concerned about compiler optimization.
>> They are class members and they are used in a `while(true)` loop.
>> Of course the problem would not appear with every C++ compiler, but I
>> guess it is safer if `volatile` is added.
>
> As long as the variable is only accessed under the lock then volatile is
> not needed. If a compiler hoisted accesses outside of locked regions
> then all MT code could be broken.

https://danluu.com/threads-faq/#Q56

And the same applies to compilers for non-POSIX platforms.

Cheers,
David
-----

> David
> -----
>
>
>>
>>>> BTW how much does the processing time differ between single threaded
>>>> and multi threaded?
>>>
>>> I've benchmarked an example, which creates a ~31 GB uncompressed
>>> hprof file, with a VM which doesn't use any background threads. Here
>>> are the sizes of the created files, the compression levels and the
>>> time spent:
>>>
>>> Uncompressed,    31.6 G,   71 sec
>>> gzipped level 1, 7.57 G,  463 sec (x6.5)
>>> gzipped level 3, 7.10 G,  609 sec (x8.6)
>>> gzipped level 6, 6.49 G, 1415 sec (x19.9)
>>>
>>> So even the fastest gzip compression makes writing the dump at least
>>> 5 times as slow.
>>>
>>>> Also I want to know what number is set to ParallelGCThreads.
>>>> ParallelGCThreads seems to affect to thread num for GZip compression. >>> >>> Originally, I've tried to use the WorkGang (CollectedHeap:: >>> get_safepoint_workers()) of the GC to do the work. But this wouldn't >>> work because Shenandoah could not iterate the heap from a worker >>> thread. So I've opted to start the needed threads itself for the time >>> of the heap dump. I've used ParallelGCThreads as the maximum number >>> of threads, since this is what would be used for a GC too. So it >>> should not clog up the machine more than a GC. Maybe it would be even >>> better to additionally limit the threads by the compression level. >> >> Thanks! >> >> Yasumasa (ysuenaga) >> >> >>> Best regards, >>> Ralf Schmelter >>> >>> -----Original Message----- >>> From: Yasumasa Suenaga >>> Sent: Samstag, 8. Februar 2020 14:46 >>> To: Schmelter, Ralf ; OpenJDK Serviceability >>> >>> Cc: yasuenag at gmail.com >>> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped >>> heap dump >>> >>> Hi Ralf, >>> >>> >>> - diagnosticCommand.cpp >>> ?? You can use `DCmdArgument` for -gz option. >>> ?? If you want to use lesser type (e.g. int, unsigned char), I guess >>> you need to modify GenDCmdArgument class. >>> >>> - heapDumper.cpp >>> ?? _nr_of_threads, _id_to_write, _current in CompressionBackend >>> should be added `volatile` at least. >>> ?? (Other values need to be checked) >>> >>> >>> BTW how much processing time is different between single threaded and >>> multi threaded? >>> Also I want to know what number is set to ParallelGCThreads. >>> ParallelGCThreads seems to affect to thread num for GZip compression. 
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>

From suenaga at oss.nttdata.com Wed Feb 12 04:35:40 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Wed, 12 Feb 2020 13:35:40 +0900
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
In-Reply-To:
References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com>
Message-ID: <5e053110-d6eb-56ee-2d76-5c05e097cc44@oss.nttdata.com>

Hi Ralf,

I agree with Serguei and Laurence.
+1 to -gz[=level] only.

Yasumasa

On 2020/02/12 0:35, Schmelter, Ralf wrote:
> Hi Yasumasa,
>
> I think I've tried too much by using the -gz flag as a boolean plus an int value. I've decided to use two options instead: -gz as a boolean option to turn compression on and -gz-level to specify the compression level. E.g.
> GC.heap_dump -gz -gz-level=3 test.hprof.gz
>
> Best regards,
> Ralf
>

From chris.plummer at oracle.com Wed Feb 12 06:03:28 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 11 Feb 2020 22:03:28 -0800
Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later
In-Reply-To: <979CFEC7-538E-46EE-988F-1935AFB8AEFA@oracle.com>
References: <16754f9b-61a3-15d6-7cb6-8e9895662657@oracle.com> <979CFEC7-538E-46EE-988F-1935AFB8AEFA@oracle.com>
Message-ID: <5bddcc55-5340-bea4-8cbb-1338c121d6d4@oracle.com>

An HTML attachment was scrubbed...
URL:

From igor.ignatyev at oracle.com Wed Feb 12 06:07:04 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 11 Feb 2020 22:07:04 -0800
Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later
In-Reply-To: <5bddcc55-5340-bea4-8cbb-1338c121d6d4@oracle.com>
References: <16754f9b-61a3-15d6-7cb6-8e9895662657@oracle.com> <979CFEC7-538E-46EE-988F-1935AFB8AEFA@oracle.com> <5bddcc55-5340-bea4-8cbb-1338c121d6d4@oracle.com>
Message-ID: <86C876E3-6005-4ADE-9224-DEF6A2908F26@oracle.com>

Hi Chris,

That's common practice for any kind of library-ish code: if there is no explicit check of the interrupt status, it will be checked by the next operation that can be interrupted. In this particular case, I agree that rethrowing it as an unchecked exception might be a good alternative.

-- Igor

> On Feb 11, 2020, at 10:03 PM, Chris Plummer wrote:
>
> Hi Igor,
>
> I guess I fail to see the benefit of this. Who is going to check the interrupt status of this thread and do something meaningful with it? It seems we would want to immediately propagate the failure by throwing a RuntimeException. This will work well when called from a test since this is a common way to fail a test. The other use of this code is by VMProps.vmHasSAandCanAttach(). It looks like if a RuntimeException is thrown the right thing will happen when SafeMap.put() catches the exception (it catches all Throwables).
>
> Chris
>
> On 2/11/20 7:12 PM, Igor Ignatev wrote:
>> rather like this:
>>
>>> } catch (InterruptedException e) {
>>>     Thread.currentThread().interrupt();
>>>     return false; // assume not signed
>>> }
>>
>> -- Igor
>>
>>> On Feb 11, 2020, at 6:15 PM, Chris Plummer wrote:
>>>
>>> Like this?
>>> >>> } catch (InterruptedException e) { >>> Thread.currentThread().interrupt(); >>> throw new RuntimeException(e); >>> } >>> >>> Chris >>> >>> On 2/11/20 2:23 PM, Igor Ignatyev wrote: >>>> no, I meant to call Thread.currentThread().interrupt(), calling that will restore interrupted state of the thread, so an user of Platform class will be able to response to it appropriately, w/ your current code, the fact that the thread was interrupted will be missed, and in most cases it is not right thing to do. >>>> >>>> -- Igor >>>> >>>>> On Feb 11, 2020, at 2:02 PM, Chris Plummer > wrote: >>>>> >>>>> Hi Igor, >>>>> >>>>> I'm not sure what you mean by restore the interrupt state. Do you mean loop back to the waitFor() call? >>>>> >>>>> thanks, >>>>> >>>>> Chris >>>>> >>>>> On 2/11/20 1:55 PM, Igor Ignatyev wrote: >>>>>> Hi Chris, >>>>>> >>>>>> I don't insist on (3), so I'm fine if you don't want to change that part. one thing I'd change though is to restore thread interrupted state at L#266 of Platform.java (no need to publish new webrev) >>>>>> >>>>>> Thanks, >>>>>> -- Igor >>>>>> >>>>>>> On Feb 11, 2020, at 1:49 PM, Chris Plummer > wrote: >>>>>>> >>>>>>> Hi Igor, >>>>>>> >>>>>>> Here's an updated webrev: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.01/index.html >>>>>>> >>>>>>> I rebased to JDK 15 and made all the changes you suggested except for (3). I did not think it is necessary since the code is only executed on OSX. However, if you still feel allowing flexibility in the path separator is important, I can add that change too. >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 2/10/20 1:34 PM, Igor Ignatyev wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> in general it all looks good, I have a few comments (most of them are editorial): >>>>>>>> in Platform.java: >>>>>>>> 1. you have doubled spaced at line#238 (b/w boolean and isSignedOSX) >>>>>>>> 2. 
as FileNotFoundException is IOException, there is no need to declare the former in the signature of isSignedOSX >>>>>>>> 3. it's better to pass jdkPath, "bin" and "java" as separate arguments to Path.get, so the code won't depend on file separator >>>>>>>> 4. you are waiting for codesign to finish w/o reading its cout / cerr, which might lead to a deadlock (if codesign will exhaust IO buffer before exiting), so you need to either create two separate threads to read cout and cerr or redirect these streams them to files and read these files afterwards or just ignore cout/cerr by using Redirect.DISCARD. I'd personally recommend the latter as the result of codesign can be reliably deduced from its exitcode (0 - signed, 1 - verification failed, 2 - wrong arguments, 3 - not all requirements from R are satisfied) and using cout/cerr is somewhat fragile as there is no guarantee output format won't be changed. >>>>>>>> >>>>>>>> the rest looks good to me. >>>>>>>> >>>>>>>> -- Igor >>>>>>>> >>>>>>>>> On Feb 10, 2020, at 11:48 AM, Chris Plummer >> wrote: >>>>>>>>> >>>>>>>>> Ping #2. It's not that hard of a review. Most of it is the new Platform.isSignedOSX() method, which is well commented and pretty straight froward. >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 2/4/20 5:04 PM, Chris Plummer wrote: >>>>>>>>>> Ping! >>>>>>>>>> >>>>>>>>>> And I decided to push to 15 instead of 14. Will backport to 14 eventually. >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 1/30/20 10:20 PM, Chris Plummer wrote: >>>>>>>>>>> Yes, you are correct: >>>>>>>>>>> >>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00 >>>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> On 1/30/20 10:13 PM, Igor Ignatyev wrote: >>>>>>>>>>>> Hi Chris, >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00? 
seems to be a webrev from another issue, should it have been? http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00/? ? >>>>>>>>>>>> >>>>>>>>>>>> -- Igor >>>>>>>>>>>> >>>>>>>>>>>>> On Jan 30, 2020, at 10:10 PM, Chris Plummer >> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Hello, >>>>>>>>>>>>> >>>>>>>>>>>>> Please review the following fix for some SA tests that are failing on Mac OS X 10.14.5 and later: >>>>>>>>>>>>> >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 >>>>>>>>>>>>> >>>>>>>>>>>>> The issue is that SA can't attach to a signed binary starting with 10.14.5. There is no workaround for this, so these tests are being disabled when it is detected that the binary is signed and we are running on 10.14 or later (I chose all 10.14 releases to simplify the check). >>>>>>>>>>>>> >>>>>>>>>>>>> Some background may help explain the fix. In order for SA to attach to a live process (not a core file) on OSX, either the attaching process (ie. the test) has to be run as root, or sudo needs to be supported. However, the only tests that make the sudo check are the 20 or so that use ClhsdbLauncher. The rest all rely on "@requires vm.hasSAandCanAttach" to filter out tests that use SA attach. vm.hasSAandCanAttach only checks if the test is being run as root. Thus all our non-ClhsdbLauncher tests that SA attach to a live process are currently not run unless they are run as root. 8238268 [1] has been filed to address this, making it so all the tests will attempt to use sudo if not run as root. >>>>>>>>>>>>> >>>>>>>>>>>>> Because of the difference in how ClhsdbLauncher tests and "@requires? vm.hasSAandCanAttach" tests check to see if they are runnable, this fix needs to address both types of checks. The common code for both these cases is Platform.shouldSAAttach(), which on OSX basically equates to check to see if we are running as root. 
I changed it to also return false if running on a signed binary with 10.14 or later. However, this confused the ClhsdbLauncher use of Platform.shouldSAAttach() somewhat, since it assumed a false result only happens because you are not running as root (in which case it would then check if sudo will work). So ClhsdbLauncher now has to double check that the false result was not because of running a signed binary. If it is signed, it won't do the sudo check. This will get cleaned up with 8238268 [1], which will move the sudo check into Platform.shouldSAAttach().
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238268
>>>>>
>>>>
>>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.plummer at oracle.com Wed Feb 12 06:19:54 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Tue, 11 Feb 2020 22:19:54 -0800
Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later
In-Reply-To: <86C876E3-6005-4ADE-9224-DEF6A2908F26@oracle.com>
References: <16754f9b-61a3-15d6-7cb6-8e9895662657@oracle.com> <979CFEC7-538E-46EE-988F-1935AFB8AEFA@oracle.com> <5bddcc55-5340-bea4-8cbb-1338c121d6d4@oracle.com> <86C876E3-6005-4ADE-9224-DEF6A2908F26@oracle.com>
Message-ID:

An HTML attachment was scrubbed...
URL:

From igor.ignatyev at oracle.com Wed Feb 12 06:30:04 2020
From: igor.ignatyev at oracle.com (Igor Ignatyev)
Date: Tue, 11 Feb 2020 22:30:04 -0800
Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later
In-Reply-To:
References: <16754f9b-61a3-15d6-7cb6-8e9895662657@oracle.com> <979CFEC7-538E-46EE-988F-1935AFB8AEFA@oracle.com> <5bddcc55-5340-bea4-8cbb-1338c121d6d4@oracle.com> <86C876E3-6005-4ADE-9224-DEF6A2908F26@oracle.com>
Message-ID:

I'd say yes, it's better to still call Thread::interrupt.
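The pattern this thread converges on can be sketched roughly as follows. This is not the actual Platform.isSignedOSX() code from the webrev; ProcessExitSketch, exitCodeDiscardingOutput and javaLauncher are hypothetical names chosen for illustration. The sketch combines three of the review points: building the launcher path from separate components, discarding the child's output so that waiting on it cannot block on a full pipe, and restoring the interrupt status before rethrowing as an unchecked exception.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.ProcessBuilder.Redirect;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

final class ProcessExitSketch {

    /** Runs the command, throws away its stdout and stderr, and returns its exit code. */
    static int exitCodeDiscardingOutput(List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command)
                .redirectOutput(Redirect.DISCARD)  // nothing is read, so nothing can fill up
                .redirectError(Redirect.DISCARD);
        try {
            Process p = pb.start();
            return p.waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            // Restore the interrupt status first, then fail fast with an unchecked exception.
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    /** Builds the path of the running JDK's launcher without hard-coding a file separator. */
    static Path javaLauncher() {
        return Paths.get(System.getProperty("java.home"), "bin", "java");
    }

    public static void main(String[] args) {
        // 'java -version' stands in for the real command; only the exit code is inspected.
        int code = exitCodeDiscardingOutput(List.of(javaLauncher().toString(), "-version"));
        System.out.println("exit code: " + code);
    }
}
```

For the real check, the command would be a codesign invocation on the launcher path, and the result would be derived from the exit code alone, as suggested in the review. Redirect.DISCARD requires JDK 9 or later.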
-- Igor > On Feb 11, 2020, at 10:19 PM, Chris Plummer wrote: > > Ok. Should I still call interrupt()? > > Chris > > On 2/11/20 10:07 PM, Igor Ignatyev wrote: >> Hi Chris, >> >> that's a common practice for any kind of library-ish code, if there are no explicit check of interrupt status, it will be checked a by next operation which might be interrupted. in this particular case, I agree rethrowing it as an unchecked exception might be a good alternative. >> >> -- Igor >> >>> On Feb 11, 2020, at 10:03 PM, Chris Plummer > wrote: >>> >>> Hi Igor, >>> >>> I guess I fail to see the benefit of this. Who is going to check the interrupt status of this thread and do something meaningful with it? It seems we would want to immediately propagate the failure by throwing a RuntimeException. This will work well when called from a test since this is a common way to fail a test. The other use of this code is by VMProps.vmHasSAandCanAttach(). It looks like if a RuntimeException is thrown the right thing will happen when SafeMap.put() catches the exception (it catches all Throwables). >>> >>> Chris >>> >>> On 2/11/20 7:12 PM, Igor Ignatev wrote: >>>> rather like this : >>>> >>>>> } catch (InterruptedException e) { >>>>> Thread.currentThread().interrupt(); >>>>> return false; // assume not signed >>>>> } >>>> >>>> ? Igor >>>> >>>>> On Feb 11, 2020, at 6:15 PM, Chris Plummer > wrote: >>>>> >>>>> ? >>>>> Like this? >>>>> >>>>> } catch (InterruptedException e) { >>>>> Thread.currentThread().interrupt(); >>>>> throw new RuntimeException(e); >>>>> } >>>>> >>>>> Chris >>>>> >>>>> On 2/11/20 2:23 PM, Igor Ignatyev wrote: >>>>>> no, I meant to call Thread.currentThread().interrupt(), calling that will restore interrupted state of the thread, so an user of Platform class will be able to response to it appropriately, w/ your current code, the fact that the thread was interrupted will be missed, and in most cases it is not right thing to do. 
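A minimal sketch of the two catch-block variants discussed in this exchange, for comparison. The class name, the Blocking interface, and the method names are hypothetical and are not part of Platform.java; the blocking call stands in for something like Process.waitFor(). Both variants restore the interrupted status first, so a caller further up the stack can still observe it.

```java
final class InterruptStatusSketch {
    // Hypothetical stand-in for a blocking call such as Process.waitFor().
    interface Blocking {
        int await() throws InterruptedException;
    }

    // Variant 1: restore the interrupted status and fall back to "false"
    // (in the thread this meant "assume not signed").
    static boolean zeroExitOrFalse(Blocking call) {
        try {
            return call.await() == 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the status visible to callers
            return false;
        }
    }

    // Variant 2: restore the interrupted status and propagate the failure
    // as an unchecked exception.
    static boolean zeroExitOrThrow(Blocking call) {
        try {
            return call.await() == 0;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Simulate an interrupted wait without starting a real process.
        Blocking interrupted = () -> { throw new InterruptedException("simulated"); };
        boolean result = zeroExitOrFalse(interrupted);
        // Thread.interrupted() reads and clears the status restored above.
        System.out.println("result=" + result + ", interrupted=" + Thread.interrupted());
    }
}
```

Which variant fits better depends on the caller: the first keeps going with a default answer, the second fails fast, which is what a test usually wants.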
>>>>>> >>>>>> -- Igor >>>>>> >>>>>>> On Feb 11, 2020, at 2:02 PM, Chris Plummer > wrote: >>>>>>> >>>>>>> Hi Igor, >>>>>>> >>>>>>> I'm not sure what you mean by restore the interrupt state. Do you mean loop back to the waitFor() call? >>>>>>> >>>>>>> thanks, >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 2/11/20 1:55 PM, Igor Ignatyev wrote: >>>>>>>> Hi Chris, >>>>>>>> >>>>>>>> I don't insist on (3), so I'm fine if you don't want to change that part. one thing I'd change though is to restore thread interrupted state at L#266 of Platform.java (no need to publish new webrev) >>>>>>>> >>>>>>>> Thanks, >>>>>>>> -- Igor >>>>>>>> >>>>>>>>> On Feb 11, 2020, at 1:49 PM, Chris Plummer > wrote: >>>>>>>>> >>>>>>>>> Hi Igor, >>>>>>>>> >>>>>>>>> Here's an updated webrev: >>>>>>>>> >>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.01/index.html >>>>>>>>> >>>>>>>>> I rebased to JDK 15 and made all the changes you suggested except for (3). I did not think it is necessary since the code is only executed on OSX. However, if you still feel allowing flexibility in the path separator is important, I can add that change too. >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 2/10/20 1:34 PM, Igor Ignatyev wrote: >>>>>>>>>> Hi Chris, >>>>>>>>>> >>>>>>>>>> in general it all looks good, I have a few comments (most of them are editorial): >>>>>>>>>> in Platform.java: >>>>>>>>>> 1. you have doubled spaced at line#238 (b/w boolean and isSignedOSX) >>>>>>>>>> 2. as FileNotFoundException is IOException, there is no need to declare the former in the signature of isSignedOSX >>>>>>>>>> 3. it's better to pass jdkPath, "bin" and "java" as separate arguments to Path.get, so the code won't depend on file separator >>>>>>>>>> 4. 
you are waiting for codesign to finish w/o reading its cout / cerr, which might lead to a deadlock (if codesign exhausts the I/O buffer before exiting), so you need to either create two separate threads to read cout and cerr or redirect these streams to files and read these files afterwards or just ignore cout/cerr by using Redirect.DISCARD. I'd personally recommend the latter as the result of codesign can be reliably deduced from its exitcode (0 - signed, 1 - verification failed, 2 - wrong arguments, 3 - not all requirements from R are satisfied) and using cout/cerr is somewhat fragile as there is no guarantee output format won't be changed. >>>>>>>>>> >>>>>>>>>> the rest looks good to me. >>>>>>>>>> >>>>>>>>>> -- Igor >>>>>>>>>> >>>>>>>>>>> On Feb 10, 2020, at 11:48 AM, Chris Plummer >> wrote: >>>>>>>>>>> >>>>>>>>>>> Ping #2. It's not that hard of a review. Most of it is the new Platform.isSignedOSX() method, which is well commented and pretty straightforward. >>>>>>>>>>> >>>>>>>>>>> thanks, >>>>>>>>>>> >>>>>>>>>>> Chris >>>>>>>>>>> >>>>>>>>>>> On 2/4/20 5:04 PM, Chris Plummer wrote: >>>>>>>>>>>> Ping! >>>>>>>>>>>> >>>>>>>>>>>> And I decided to push to 15 instead of 14. Will backport to 14 eventually. >>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 1/30/20 10:20 PM, Chris Plummer wrote: >>>>>>>>>>>>> Yes, you are correct: >>>>>>>>>>>>> >>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00 >>>>>>>>>>>>> >>>>>>>>>>>>> thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>> On 1/30/20 10:13 PM, Igor Ignatyev wrote: >>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 seems to be a webrev from another issue, should it have been http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00/ ? 
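Igor's point 4 above (do not wait on a child process while its output pipes can fill up, and rely on the exit code instead) can be sketched roughly as follows. This is only an illustration under assumptions: the class and method names are hypothetical, and the command to run is passed in by the caller rather than hard-coding any particular codesign arguments.

```java
import java.io.IOException;
import java.lang.ProcessBuilder.Redirect;
import java.util.List;

final class ExitCodeOnlySketch {
    // Runs the given command with stdout and stderr discarded and returns
    // its exit code. Because nothing is buffered in a pipe, waitFor()
    // cannot block on a full output buffer.
    static int runDiscardingOutput(List<String> command)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(command)
                .redirectOutput(Redirect.DISCARD)
                .redirectError(Redirect.DISCARD)
                .start();
        return p.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: ExitCodeOnlySketch <command> [args...]");
            return;
        }
        // The caller decides what each exit code means; for codesign the
        // thread lists 0 as "signed" and non-zero values as failures.
        int exit = runDiscardingOutput(List.of(args));
        System.out.println("exit code: " + exit);
    }
}
```

Redirect.DISCARD is available from Java 9 onwards; on older releases, redirecting to a scratch file would be the closest equivalent.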
>>>>>>>>>>>>>> >>>>>>>>>>>>>> -- Igor >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Jan 30, 2020, at 10:10 PM, Chris Plummer >> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hello, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Please review the following fix for some SA tests that are failing on Mac OS X 10.14.5 and later: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The issue is that SA can't attach to a signed binary starting with 10.14.5. There is no workaround for this, so these tests are being disabled when it is detected that the binary is signed and we are running on 10.14 or later (I chose all 10.14 releases to simplify the check). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Some background may help explain the fix. In order for SA to attach to a live process (not a core file) on OSX, either the attaching process (ie. the test) has to be run as root, or sudo needs to be supported. However, the only tests that make the sudo check are the 20 or so that use ClhsdbLauncher. The rest all rely on "@requires vm.hasSAandCanAttach" to filter out tests that use SA attach. vm.hasSAandCanAttach only checks if the test is being run as root. Thus all our non-ClhsdbLauncher tests that SA attach to a live process are currently not run unless they are run as root. 8238268 [1] has been filed to address this, making it so all the tests will attempt to use sudo if not run as root. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Because of the difference in how ClhsdbLauncher tests and "@requires? vm.hasSAandCanAttach" tests check to see if they are runnable, this fix needs to address both types of checks. The common code for both these cases is Platform.shouldSAAttach(), which on OSX basically equates to check to see if we are running as root. I changed it to also return false if running on signed binary with 10.14 or later. 
However, this confused the ClhsdbLauncher use of Platform.shouldSAAttach() somewhat, since it assumed a false result only happens because you are not running as root (in which case it would then check if sudo will work). So ClhsdbLauncher now has double check that the false result was not because of running a signed binary. If it is signed, it won't do the sudo check. This will get cleaned up with 8238268 [1], which will move the sudo check into Platform.shouldSAAttach(). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238268 >>>>>>> >>>>>> >>>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Wed Feb 12 08:16:13 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 12 Feb 2020 00:16:13 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: References: <16754f9b-61a3-15d6-7cb6-8e9895662657@oracle.com> <979CFEC7-538E-46EE-988F-1935AFB8AEFA@oracle.com> <5bddcc55-5340-bea4-8cbb-1338c121d6d4@oracle.com> <86C876E3-6005-4ADE-9224-DEF6A2908F26@oracle.com> Message-ID: <79c65fd4-166e-98de-9327-ede4e9a20421@oracle.com> An HTML attachment was scrubbed... URL: From stefan.karlsson at oracle.com Wed Feb 12 08:24:43 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 12 Feb 2020 09:24:43 +0100 Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops In-Reply-To: References: Message-ID: <8ea20a15-bdb9-27b0-c306-154f539a3674@oracle.com> Hi Zhengyu, On 2020-02-07 16:53, Zhengyu Gu wrote: > Hi, > > I would like purpose this change that allows GC to provide ObjectMarker > during JVMTI heap walk. > > Currently, JVMTI heap walk uses oop markword's 'marked' pattern to > indicate 'visited' oop. 
> > Unfortunately, it conflicts with Shenandoah, who uses the pattern to > indicate 'forwarding'. When JVMTI heap walk occurs in some of > Shenandoah's concurrent phases (e.g. concurrent evacuation or concurrent > reference updating), it can result in a corrupted heap, as it tries to > resolve a real oop header as a forwarding pointer. > > This patch allows GC to provide ObjectMarker for JVMTI to track > 'visited' oop, and uses current implementation as default, so that it > has no impact on GCs other than Shenandoah, who provides its own > implementation. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8238633 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.00/index.html > Would you mind if I asked you to move the object marker code to its own objectMarker.hpp/cpp files? --- Another suggestion is to move the "marking" out of the visit() function, and rename ObjectMarker::visited() to ObjectMarker::marked(). - if (!ObjectMarker::visited(o)) { - if (!visit(o)) { + if (!marker.object_marker()->visited(o)) { + if (!visit(o, marker.object_marker())) { Would become: + if (!marker.object_marker()->mark(o)) { + if (!visit(o)) { This assert would be unnecessary: -bool VM_HeapWalkOperation::visit(oop o) { +bool VM_HeapWalkOperation::visit(oop o, ObjectMarker* object_marker) { // mark object as visited - assert(!ObjectMarker::visited(o), "can't visit same object more than once"); - ObjectMarker::mark(o); + assert(!object_marker->visited(o), "can't visit same object more than once"); + object_marker->mark(o); The name and comment would match: -// return true if object is marked -inline bool ObjectMarker::visited(oop o) { - return o->mark().is_marked(); -} --- Previously, the calls to 'mark' and 'visited' were inlineable, but now every GC has to take a virtual call when marking the objects. My guess is that this code is slow anyway, and that it doesn't matter too much, but did you measure the effect of that change with, for example, G1? 
Thanks, StefanK > Test: > hotspot_gc > vmTestbase_nsk_jdi > vmTestbase_nsk_jvmti > > Thanks, > > -Zhengyu > > From ralf.schmelter at sap.com Wed Feb 12 08:41:12 2020 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Wed, 12 Feb 2020 08:41:12 +0000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: <1f2c938f-f9ce-7fb7-7169-94daaf3542a4@oracle.com> References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <1f2c938f-f9ce-7fb7-7169-94daaf3542a4@oracle.com> Message-ID: Hi David, > I see little point in subclassing NonJavaThread (via NamedThread) but > then overriding pre_run() and post_run() so that you don't do anything > that NonJavaThread is supposed to do regarding the NJT iterator > capabilities. The problem is the post_run() method of NamedThread calls Thread::clear_thread_current(), which then makes it impossible to delete the thread at least in a debug build, since the code in ~Thread calls os::free_thread() which calls Thread::current()->osthread() in an assert, which obviously will crash. Originally I tried not to use my own threads at all and instead use the WorkGang from CollectedHeap::get_safepoint_workers(). But this ultimately failed because I'm not allowed to iterate the heap in a worker thread on Shenandoah. Additionally ParallelGC did not implement get_safepoint_workers(), but that should not have been a problem. Maybe it is better to try to get this to work (e.g. if I could specify a foreground task when calling run_task(), the problem could be avoided by doing the iteration in the foreground task). But I'm not sure how changes in this area are seen. > For your monitor operations, you should use a MonitorLocker and then > call ml->wait() which will do the right thing with respect to "no > safepoint checks" without you needing to specify it directly. Thanks, will do. Best regards, Ralf -----Original Message----- From: David Holmes Sent: Dienstag, 11. 
Februar 2020 08:44 To: Schmelter, Ralf ; Yasumasa Suenaga ; OpenJDK Serviceability Cc: yasuenag at gmail.com Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump Hi again Ralf, :) A few more comments after taking a closer look at the thread code. On the surface it seems to me this is a case where it would be okay to introduce a subclass of Thread that is not JavaThread nor NonJavaThread. I see little point in subclassing NonJavaThread (via NamedThread) but then overriding pre_run() and post_run() so that you don't do anything that NonJavaThread is supposed to do regarding the NJT iterator capabilities. But we currently expect all threads to fit into one category or another, so this is problematic. :( I think disabling the NJT functionality is also problematic. So not sure what to suggest yet. BTW you extended NamedThread but you never actually set a name AFAICS. ?? For your monitor operations, you should use a MonitorLocker and then call ml->wait() which will do the right thing with respect to "no safepoint checks" without you needing to specify it directly. Cheers, David From ralf.schmelter at sap.com Wed Feb 12 08:52:54 2020 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Wed, 12 Feb 2020 08:52:54 +0000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> Message-ID: Hi Serguei, the problem is that if I make the -gz flag a jlong argument, I cannot just use '-gz'. This causes DCmdArgument::parse_value() to be called with a NULL string, which leads to an error. That is why I used a string argument in my code. But using a string when I really mean an integer seems strange too. Best regards, Ralf -----Original Message----- From: serguei.spitsyn at oracle.com Sent: Dienstag, 11. 
Februar 2020 20:42 To: Schmelter, Ralf ; Yasumasa Suenaga ; OpenJDK Serviceability Cc: yasuenag at gmail.com Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump Hi Ralf, I'd suggest for the option format something like this: ? -gz[=level] where level is an int. The part [=level] is optional. The level is 0 by default (if it is not set). Thanks, Serguei From richard.reingruber at sap.com Wed Feb 12 10:01:44 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Wed, 12 Feb 2020 10:01:44 +0000 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: <29a1b008-a322-a058-fdd6-e8c54cef8e6c@oracle.com> References: <29a1b008-a322-a058-fdd6-e8c54cef8e6c@oracle.com> Message-ID: Ok. I will repost and include hotspot runtime and gc lists. Thanks, Richard. -----Original Message----- From: Dean Long Sent: Dienstag, 11. Februar 2020 18:28 To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant You might want to have some runtime/GC folks look at the handshake changes. dl On 2/6/20 4:39 AM, Reingruber, Richard wrote: > Hi, > > could I please get reviews for this small enhancement: > > Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ > Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 > > The change avoids making all compiled methods on stack not_entrant when switching a java thread to > interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. > > Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. > > Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. 
> > Thanks, Richard. > > See also my question if anyone knows a reason for making the compiled methods not_entrant: > http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html From richard.reingruber at sap.com Wed Feb 12 10:23:27 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Wed, 12 Feb 2020 10:23:27 +0000 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant Message-ID: // Repost including hotspot runtime and gc lists. // Dean Long suggested to do so, because the enhancement replaces a vm operation // with a handshake. // Original thread: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030359.html Hi, could I please get reviews for this small enhancement in hotspot's jvmti implementation: Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 The change avoids making all compiled methods on stack not_entrant when switching a java thread to interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. Thanks, Richard. 
See also my question if anyone knows a reason for making the compiled methods not_entrant: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html From sgehwolf at redhat.com Wed Feb 12 11:22:20 2020 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Wed, 12 Feb 2020 12:22:20 +0100 Subject: [PING] RFR: 8231111: Cgroups v2: Rework Metrics in java.base so as to recognize unified hierarchy In-Reply-To: <436303A0-4D46-48BD-8EC3-23A7B213C6FA@oracle.com> References: <75fc377f8d5ca76b7dac02f55db640cbdd305633.camel@redhat.com> <4bf65380bc26cd3bf684d7994b33e66bcb87927b.camel@redhat.com> <6CACCC0D-7F5A-42A3-83F1-746497940CCA@oracle.com> <29544339574e34c4c25cbec0314c26f35e8d1a99.camel@redhat.com> <4CE7C7F6-ABFA-4263-98B2-32BBD5013A3C@oracle.com> <10b5e83bfb7e618e5f5906c8e707057ff8680785.camel@redhat.com> <4a304e2ce72a53859b4e9cc8b21db404a260b531.camel@redhat.com> <97e7ff2b-f2cc-a666-afb8-521c0f5c37e7@oracle.com> <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com> <436303A0-4D46-48BD-8EC3-23A7B213C6FA@oracle.com> Message-ID: Hi Bob, On Tue, 2020-02-11 at 16:59 -0500, Bob Vandette wrote: > I applied your patch to the latest JDK 15 sources and ran the container > tests on Oracle Linux 8.1 with podman/cgroupv2 enabled. There were some issues. > I'm not sure if it's my setup or not. Looking over them, they appear to be setup issues. Getting the setup working right was tricky for me as well. Aside: podman has a --runtime switch which you could point to a cgroups v2 capable crun binary for example. https://bugs.openjdk.java.net/browse/JDK-8230305 has some examples of how to use it. I also needed some extra setup to get the controllers' delegation to work: https://scrivano.org/2019/02/26/resources-management-with-rootless-containers/ > I also ran the same build on Ubuntu with docker/cgroupv1. I didn't see any failures on > the cgroupv1 system. Great, thanks! > Here are some notes: > > The podman version on OL 8.1 doesn't yet support cgroupv2. 
I > built the latest from sources for this test. > > The docker tests take a very long time under podman! Longer than > the cgroupv1 run. Possibly. I haven't done any fair comparison of the two. > cpusets and cpusets.mems are blank on host and if none are specified > on podman/docker run command. On cgroupv1 they are host values if not > specified for container. Effective cpusets and cpusets.mems are set properly > in a container. Yes, there seems to be slight differences to cgroup v1. You can only set cpusets on containers if the host has the controller enabled, though: # cat /sys/fs/cgroup/cgroup.subtree_control cpuset cpu io memory pids # podman run --memory=300M --cpuset-cpus=0,1 -ti -v `pwd`:/mnt fedora:30 /bin/bash [root at 5d4a4e593a24 /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpuset.cpus 0-1 [root at 5d4a4e593a24 mnt]# ./cgroupsv2-jdk/bin/java -XshowSettings:system -version Operating System Metrics: Provider: cgroupv2 Effective CPU Count: 2 CPU Period: 100000us CPU Quota: -1 CPU Shares: -1 List of Processors, 2 total: 0 1 List of Effective Processors, 2 total: 0 1 List of Memory Nodes: N/A List of Available Memory Nodes, 1 total: 0 Memory Limit: 300.00M Memory Soft Limit: Unlimited Memory & Swap Limit: 600.00M openjdk version "15-internal" 2020-09-15 OpenJDK Runtime Environment (build 15-internal+0-adhoc.sgehwolf.openjdk-head-2) OpenJDK 64-Bit Server VM (build 15-internal+0-adhoc.sgehwolf.openjdk-head-2, mixed mode, sharing) > HOST OUTPUT: > ./java -XshowSettings:system -version > Operating System Metrics: > Provider: cgroupv2 > Effective CPU Count: 32 > CPU Period: -1 > CPU Quota: -1 > CPU Shares: -1 > List of Processors: N/A > List of Effective Processors: N/A > List of Memory Nodes: N/A > List of Available Memory Nodes: N/A > Memory Limit: Unlimited > Memory Soft Limit: Unlimited > Memory & Swap Limit: Unlimited > > CONTAINER OUTPUT: > # podman run -it -v `pwd`:/mnt ubuntu bash > root at 3c6654a3b834:/mnt/jdk/bin# ./java 
-XshowSettings:system -version > Operating System Metrics: > Provider: cgroupv2 > Effective CPU Count: 32 > CPU Period: 100000us > CPU Quota: -1 > CPU Shares: -1 > List of Processors: N/A > List of Effective Processors, 32 total: > 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 > List of Memory Nodes: N/A > List of Available Memory Nodes, 1 total: > 0 > Memory Limit: Unlimited > Memory Soft Limit: Unlimited > Memory & Swap Limit: Unlimited Yes, host and container output differ depending on the configured test system. I remember that I had cpusets working on the host system too at some point, but I forgot what the magic was to get this properly delegated via systemd. > Docker tests fail if /bin/docker is not available in podman setup. We > probably should enhance the docker check to also look for podman. Could you be more specific about this? How do you run docker tests? I use: -e PATH -verbose:summary -Djdk.test.container.command=podman -Djdk.test.docker.image.name=fedora -Djdk.test.docker.image.version=30 with -Djdk.test.container.command=podman it shouldn't need docker. Did you specify that property? > Two container tests failed: > > FAILED: containers/cgroup/PlainRead.java failed Memory Limit is: -2 instead > of unlimited or -1. This is because memory.max is not found. This doesn't fail for me, because I've got memory.max present on host: # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cgroup.controllers cpu io memory pids [root at f31 sgehwolf]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/memory.max max Note, however, this is a hotspot test. So support for it for cgroup v2 came with JDK-8230305. It seems we should return -1 if memory.max is not found (rather than -2, not supported). Could you file a bug for this? It's unrelated to this change. It should be a simple fix. > FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java > This fails because nr_periods line doesn't always exist. 
I think you've got > to enable a quota for this to appear (not sure). Passes for me, but it needs the cpu controller enabled on the test system. # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cgroup.controllers cpu io memory pids # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpu.stat usage_usec 1157537 user_usec 606832 system_usec 550705 nr_periods 0 nr_throttled 0 throttled_usec 0 > Here's the contents: > % more cpu.stat > usage_usec 23974562755 > user_usec 22257183568 > system_usec 1717379186 It suggests you've got the cpu controller disabled: https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html#cpu """ and the following three when the controller is enabled: nr_periods nr_throttled throttled_usec """ > CGROUPV2 results on Oracle Linux 8.1 > --------- > > testing container APIs > > FAILED: containers/cgroup/PlainRead.java > Passed: containers/docker/DockerBasicTest.java > Passed: containers/docker/TestCPUAwareness.java > Passed: containers/docker/TestCPUSets.java > Passed: containers/docker/TestJcmdWithSideCar.java > Passed: containers/docker/TestJFREvents.java > Passed: containers/docker/TestJFRNetworkEvents.java > Passed: containers/docker/TestMemoryAwareness.java > Passed: containers/docker/TestMisc.java > Test results: passed: 8; failed: 1 > Results written to /export/users/bobv/jdk15/build/jtreg/JTwork > Error: Some tests failed or other problems occurred. These are hotspot tests. Covered by JDK-8230305 (hotspot changes). The plain read test passes on a properly configured host system with controller delegation. 
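The cpu.stat difference above (three keys when the cpu controller is disabled, six when it is enabled) is the kind of case a tolerant key lookup has to handle. A minimal sketch follows, assuming a hypothetical helper that is not the JDK's actual CgroupSubsystemController code; the -1 "unavailable" sentinel is likewise an assumption made for this illustration.

```java
import java.util.List;

final class CpuStatSketch {
    // Assumed sentinel for "key missing or value unreadable".
    static final long UNAVAILABLE = -1L;

    // Looks up "key value" in the lines of a flat keyed file such as
    // cpu.stat and returns the value, or UNAVAILABLE if the key is absent
    // or its value is not a number.
    static long valueForKey(List<String> lines, String key) {
        for (String line : lines) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens.length == 2 && tokens[0].equals(key)) {
                try {
                    return Long.parseLong(tokens[1]);
                } catch (NumberFormatException e) {
                    return UNAVAILABLE;
                }
            }
        }
        return UNAVAILABLE;
    }

    public static void main(String[] args) {
        // Contents quoted in the message: controller disabled vs. enabled.
        List<String> disabled = List.of(
                "usage_usec 23974562755",
                "user_usec 22257183568",
                "system_usec 1717379186");
        List<String> enabled = List.of(
                "usage_usec 1157537",
                "user_usec 606832",
                "system_usec 550705",
                "nr_periods 0",
                "nr_throttled 0",
                "throttled_usec 0");
        System.out.println("disabled nr_periods: " + valueForKey(disabled, "nr_periods"));
        System.out.println("enabled nr_periods:  " + valueForKey(enabled, "nr_periods"));
    }
}
```

A test that expects nr_periods to be present would need either the cpu controller enabled on the host or a lookup like this that treats the missing key as unavailable rather than as a failure.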
> testing jdk.internal.platform APIs > > FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java > Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java > Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java > Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java > Passed: jdk/internal/platform/docker/TestSystemMetrics.java > Test results: passed: 4; failed: 1 > Results written to /export/users/bobv/jdk15/build/jtreg/JTwork > Error: Some tests failed or other problems occurred. I believe cgroup/TestCgroupMetrics.java fails due to bad host config. It passes here: [root at f31 jdk-jdk]# rm -rf JTwork/ JTreport && /media/disk/jtreg/bin/jtreg -timeout:4 -jdk:../cgroupsv2-jdk/ -e PATH -verbose:summary -Djdk.test.container.command=podman -Djdk.test.docker.image.name=fedora -Djdk.test.docker.image.version=30 test/jdk/jdk/internal/platform Directory "JTwork" not found: creating Directory "JTreport" not found: creating Passed: jdk/internal/platform/cgroup/TestCgroupMetrics.java Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java Passed: jdk/internal/platform/docker/TestSystemMetrics.java Test results: passed: 5 Report written to /home/sgehwolf/jdk-jdk/JTreport/html/report.html Results written to /home/sgehwolf/jdk-jdk/JTwork JTR files available here: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/jtreg-results/ > testing -XshowSettings:system launcher option > > Passed: tools/launcher/Settings.java > Test results: passed: 1 Thanks for running the tests! FWIW, jdk-submit came back clean. If we could get the initial support of this pushed soon it would be great. I'd be happy to fix any follow- up issues. Thanks, Severin > Bob. > > > On Feb 11, 2020, at 1:04 PM, Severin Gehwolf wrote: > > > > Hi Mandy, Bob, > > > > Thanks again for the reviews and patience on this. 
Sorry it took me so > > long to get back to this :-/ > > > > Updated webrev: > > Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/webrev/ > > incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/incremental/webrev/ > > > > I've tested this with docker tests on cgroup v1 and via podman on a > > cgroup v2 system. They pass. I'll be running this through jdk-submit as > > well. > > > > More below. > > > > On Tue, 2020-01-21 at 16:09 -0800, Mandy Chung wrote: > > > Hi Severin, > > > > > > Thanks for the update. > > > > > > On 1/21/20 11:30 AM, Severin Gehwolf wrote: > > > > Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/webrev/ > > > > incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/incremental/webrev/ > > > > > > > > > > I have answered my own question. Most of the metrics used to return 0 if unavailable due to IOException reading a file or malformed content and now are changed to return -2 due to error fetching the metric. > > > > > > The following are about limits which used to return -1 if unlimited or no limit set. > > > public long getCpuQuota(); > > > public long getCpuShares(); > > > public long getMemoryLimit(); > > > public long getMemoryAndSwapLimit(); > > > public long getMemorySoftLimit(); > > > > > > With this patch, only getMemoryLimit and getMemoryAndSwapLimit specify to return -1 if unlimited or no limit set. However the implementation does return -1. All of the above specify to return -2 if unavailable due to error fetching the metric. > > > > > > I found the implementation quite hard to follow. I spent some time reviewing the code to see if the implementation matches the spec but I can't easily tell yet. For example, > > > CgroupSubsystemController::getLongValueMatchingLine returns -1 when IOException occurs. > > > CgroupSubsystemController::getLongEntry returns 0L if IOException occurs. 
> > > > > > CgroupV1SubsystemController::convertStringToLong returns Long.MAX_VALUE > > > when value overflows > > > > This one is intentional. It's mapped back to unlimited via > > longValOrUnlimited(). The reason for this is that cgroup v1 doesn't > > have a concept of "unlimited". Unlimited values will be a very large > > numbers in cgroup v1 files. > > > > > CgroupV2SubsystemController::convertStringToLong returns -1 when IOException occurs > > > > > > CgroupV1Subsystem::getCpuShares return -1 if cpu.shares == 0 or 1024 > > > CgroupV2Subsystem::getCpuShares returns -1 if cpu.weight == 100 or 0 > > > > These two are special cases too. See the implementation note of > > Metrics.getCpuShares(). In the cgroup v2 case the default value is 100 > > (over 1024 in cgroup v1). That's why unlimited is being returned for > > those values. > > > > > CgroupV2Subsystem::getFromCpuMax returns -1 if error in reading cpu.max or malformed content > > > CgroupV2Subsystem::sumTokensIOStat returns -2 if IOException error > > > This is called by getBlkIOServiceCount and getBlkIOServiced > > > > > > I think this can be improved and add the documentation to describe > > > what the methods do. Since Metrics APIs consistently return -2 if > > > unavailable due to error in fetching the metric, why some utility > > > methods in *Subsystem and *SubsystemController return -1 upon error > > > and 0 when unlimited? > > > > > > I suspect if the getXXXValue and other methods are clearly documented > > > with the error cases (possibly renaming the method name if appropriate) > > > CgroupV1Subsystem and CgroupV2SubSystem will become very explicit > > > to understand. > > > > This should be fixed now. > > > > I've gone through the API doc of Metrics.java and have updated it. In > > general, I've updated it to return -1 if metric is unavailable (due to > > error in reading some files or content being empty), and -2 if not > > supported. 
No method returns -2 currently, but it might change and it's
> > good to have some way of saying "not implementable" for this subsystem
> > in the spec. That's my take on it anyway.
> >
> > There is also a new unit test for shared controller logic:
> > TestCgroupSubsystemController.java
> >
> > It exercises various cases of error/success.
> >
> > That is to ensure proper symmetry across the various cases (including
> > IOException). I've also documented static methods in
> > CgroupSubsystemController. Overall, all methods now return the same
> > values for cgroup v1 and cgroup v2 (given the impl nuances) for the
> > various cases.
> >
> > > CgroupSubsystem.java
> > >
> > > 44 public static final double DOUBLE_RETVAL_NOT_SUPPORTED = LONG_RETVAL_NOT_SUPPORTED;
> > > 49 public static final Boolean BOOL_RETVAL_NOT_SUPPORTED = null;
> > >
> > > They are no longer needed, right?
> >
> > Removed.
> >
> > > CgroupSubsystemFactory.java
> > >
> > > 89 System.err.println("Warning: Mixed cgroupv1 and cgroupv2 not supported. Metrics disabled.");
> > >
> > >
> > > I expect this to be a System.Logger log
> >
> > Updated.
> >
> > > 114 if (!Integer.valueOf(0).toString().equals(tokens[0])) {
> > >
> > > This can be simplified to if (!"0".equals(tokens[0]))
> >
> > Done, thanks!
> > > > > LauncherHelper.java > > > > > > 407 // Extended cgroupv1 specific metrics > > > 408 if (c instanceof MetricsCgroupV1) { > > > 409 MetricsCgroupV1 cgroupV1 = (MetricsCgroupV1)c; > > > 410 limit = cgroupV1.getKernelMemoryLimit(); > > > 411 ostream.println(formatLimitString(limit, INDENT + "Kernel Memory Limit: ")); > > > 412 limit = cgroupV1.getTcpMemoryLimit(); > > > 413 ostream.println(formatLimitString(limit, INDENT + "TCP Memory Limit: ")); > > > 414 Boolean value = cgroupV1.isMemoryOOMKillEnabled(); > > > 415 ostream.println(formatBoolean(value, INDENT + "Out Of Memory Killer Enabled: ")); > > > 416 value = cgroupV1.isCpuSetMemoryPressureEnabled(); > > > 417 ostream.println(formatBoolean(value, INDENT + "CPUSet Memory Pressure Enabled: ")); > > > 418 } > > > > > > MetricsCgroupV1 is linux-only. It will fail the compilation when > > > building on non-linux. One option is to move this code to > > > src/java.base/linux/share/sun/launcher/CgroupMetrics.java > > > > > > Are they continued to be interesting metrics to be output from > > > -XshowSetting? I wonder if they can simply be dropped from the output. > > > Bob will have an opinion. > > > > I've removed those extra cgroup v1 specific metrics printed via > > -XshowSettings:system. Not sure what to do with MetricsCgroupV1. It's > > only used in tests in webrev 10. On the other hand the idea would be > > for consumers to downcast it to MetricsCgroupV1 if they needed those > > extra metrics. 
> >
> > Thanks,
> > Severin
> >

From ralf.schmelter at sap.com Wed Feb 12 12:17:31 2020
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Wed, 12 Feb 2020 12:17:31 +0000
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
In-Reply-To: <01361a9d-2855-db67-a176-73731fada08f@oracle.com>
References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com>
Message-ID:

Hi Serguei,

The use case is being able to get a heap dump from big Java servers. These usually run on machines with a lot of memory and CPUs, but not much disk space (which they don't need apart from some trace files and the server code itself). And if we can get the customer to mount some NFS file system on the machine, it is usually slow. So writing only a third or a fourth of the data is a big win.

Doing the compression outside the VM would either depend on the hprof file being written first (so we would still need the disk space) or require another channel to dump the data (e.g. via socket). But this would add complexity too and would need an external program.

I've compiled 2 release versions on Windows with and without my change. The change adds 14.5k to the server.dll (which is 10.4 MB). Not sure if this is considered acceptable.

Best regards,
Ralf

-----Original Message-----
From: serguei.spitsyn at oracle.com
Sent: Tuesday, 11 February 2020 20:49
To: Schmelter, Ralf ; Yasumasa Suenaga ; OpenJDK Serviceability
Cc: yasuenag at gmail.com
Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump

Ralf,

I see this feature adds a lot of code. In fact, I'm not sure it is worth adding this kind of complexity (including new compressing threads) to the VM implementation. What is the real use case behind it? Could this compression be done separately from the VM implementation?
Thanks,
Serguei

From david.holmes at oracle.com Wed Feb 12 13:58:39 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 12 Feb 2020 23:58:39 +1000
Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later
In-Reply-To: <79c65fd4-166e-98de-9327-ede4e9a20421@oracle.com>
References: <16754f9b-61a3-15d6-7cb6-8e9895662657@oracle.com> <979CFEC7-538E-46EE-988F-1935AFB8AEFA@oracle.com> <5bddcc55-5340-bea4-8cbb-1338c121d6d4@oracle.com> <86C876E3-6005-4ADE-9224-DEF6A2908F26@oracle.com> <79c65fd4-166e-98de-9327-ede4e9a20421@oracle.com>
Message-ID: <3a21aa8f-8081-da59-cb10-87e79f81e2e3@oracle.com>

Hi Chris,

I think you are overthinking this. :)

What you have observed is that the code that actually uses this method does not utilise interrupts, or expect them, so if you artificially inject one in this library method then you see things failing in unexpected ways. That also means that if the thread was interrupted by some other piece of logic then it would also fail in unexpected ways. That doesn't negate your choice to re-assert the interrupt state.

From a library-writing perspective, if you have a method that performs a blocking call that can throw InterruptedException then you generally have three choices:

1. Throw InterruptedException yourself and pass the buck to your callers.
2. Convert the InterruptedException to a more general failure exception - typically an unchecked RuntimeException - for which interruption is but one possible cause; or
3. Catch the InterruptedException and allow the method to complete normally (i.e. not by throwing an exception) but re-assert the interrupt state so that a caller checking for interruption will still see that it occurred.

What you have below is a mix of #2 and #3 - you convert to a generic exception but also re-assert the interrupt state. That's a little unusual.
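
The three choices above can be sketched in Java roughly as follows. This is only an illustration under assumptions: the class name InterruptChoices, the helper blockingCall() (a stand-in for something like Process.waitFor()) and the three method names are hypothetical and are not taken from Platform.java.

```java
public class InterruptChoices {
    // Hypothetical blocking operation, standing in for e.g. Process.waitFor().
    static void blockingCall() throws InterruptedException {
        Thread.sleep(10);
    }

    // Choice 1: declare InterruptedException and let the caller deal with it.
    static void propagate() throws InterruptedException {
        blockingCall();
    }

    // Choice 2: convert to an unchecked exception; interruption is just one
    // possible cause of the failure. The interrupt status stays cleared.
    static void convert() {
        try {
            blockingCall();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    // Choice 3: complete normally, but re-assert the interrupt status so a
    // caller that checks for interruption still sees that it happened.
    static boolean completeNormally() {
        try {
            blockingCall();
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();
        try {
            propagate();
        } catch (InterruptedException e) {
            System.out.println("1: caller handles " + e);
        }

        Thread.currentThread().interrupt();
        try {
            convert();
        } catch (RuntimeException e) {
            System.out.println("2: unchecked failure, cause " + e.getCause());
        }

        Thread.currentThread().interrupt();
        boolean completed = completeNormally();
        System.out.println("3: completed=" + completed
                + ", interrupt status=" + Thread.currentThread().isInterrupted());
    }
}
```

Only the third variant leaves the interrupt status set when it returns; in the first two the status is cleared at the point where the InterruptedException is thrown, and the exception itself carries the information.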

David

On 12/02/2020 6:16 pm, Chris Plummer wrote:
> Hi Igor,
>
> I think it might be best to leave the interrupt() call out. I wanted to see
> what would happen if we ever got an InterruptedException, so I added the
> following to the start of Platform.shouldSAAttach():
>
>         try {
>             throw new InterruptedException();
>         } catch (InterruptedException e) {
>             Thread.currentThread().interrupt();
>             throw new RuntimeException(e);
>         }
>
> At the start of the test run, before any tests are actually run, I see
> the following:
>
> failed to get value for vm.hasSAandCanAttach
> java.lang.RuntimeException: java.lang.InterruptedException
>     at jdk.test.lib.Platform.shouldSAAttach(Platform.java:300)
>     at requires.VMProps.vmHasSAandCanAttach(VMProps.java:327)
>     at requires.VMProps$SafeMap.put(VMProps.java:69)
>     at requires.VMProps.call(VMProps.java:101)
>     at requires.VMProps.call(VMProps.java:57)
>     at com.sun.javatest.regtest.agent.GetJDKProperties.run(GetJDKProperties.java:80)
>     at com.sun.javatest.regtest.agent.GetJDKProperties.main(GetJDKProperties.java:54)
> Caused by: java.lang.InterruptedException
>     at jdk.test.lib.Platform.shouldSAAttach(Platform.java:297)
>     ... 6 more
>
> This seems reasonable.
>
> For each test that checks vm.hasSAandCanAttach I also see:
>
> TEST RESULT: Error. Error evaluating expression: vm.hasSAandCanAttach:
> java.lang.RuntimeException: java.lang.InterruptedException
>
> This too seems reasonable.
>
> For tests that don't check vm.hasSAandCanAttach, but instead make a
> runtime check that calls Platform.shouldSAAttach(), the test fails with:
>
> java.lang.IllegalThreadStateException: process hasn't exited
>     at java.base/java.lang.ProcessImpl.exitValue(ProcessImpl.java:500)
>     at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:380)
>     at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:433)
>     at ClhsdbAttach.main(ClhsdbAttach.java:77)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>     at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
>     at java.base/java.lang.Thread.run(Thread.java:832)
>
> This is a confusing way to fail. The reason it fails this way is because
> stopApp() first calls waitAppTerminate(), which does the following:
>
>     public void waitAppTerminate() {
>         // This code is modeled after tail end of ProcessTools.getOutput().
>         try {
>             appProcess.waitFor();
>             outPumperThread.join();
>             errPumperThread.join();
>         } catch (InterruptedException e) {
>             Thread.currentThread().interrupt();
>             // pass
>         }
>     }
>
> I added an e.printStackTrace() call and see the following:
>
> java.lang.InterruptedException
>     at java.base/java.lang.Object.wait(Native Method)
>     at java.base/java.lang.Object.wait(Object.java:321)
>     at java.base/java.lang.ProcessImpl.waitFor(ProcessImpl.java:474)
>     at jdk.test.lib.apps.LingeredApp.waitAppTerminate(LingeredApp.java:239)
>     at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:380)
>     at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:434)
>
> So the earlier call to interrupt() is resulting in waitAppTerminate()
> not actually waiting for exit. This then results in stopApp() getting
> IllegalThreadStateException when calling Process.exitValue().
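
The effect described in the quoted paragraph above, where a pending interrupt makes a later blocking wait return at once, can be reproduced in isolation. The following is a minimal sketch under assumptions: PendingInterruptDemo, waitForWorker() and workerStillAliveAfterWait() are hypothetical names, and a sleeping thread with Thread.join() stands in for the LingeredApp process and Process.waitFor().

```java
public class PendingInterruptDemo {
    // Same shape as the quoted waitAppTerminate(): swallow the
    // InterruptedException but re-assert the interrupt status.
    static void waitForWorker(Thread worker) {
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns true if the worker was still running when waitForWorker() returned.
    static boolean workerStillAliveAfterWait() {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(60_000); // pretend to be a long-running app
            } catch (InterruptedException e) {
                // asked to stop, just exit
            }
        });
        worker.setDaemon(true);
        worker.start();

        // An earlier piece of code re-asserted the interrupt status ...
        Thread.currentThread().interrupt();
        // ... so join() throws immediately instead of blocking.
        waitForWorker(worker);
        boolean stillAlive = worker.isAlive();

        // Clean up: clear our own status, then stop the worker and wait for it.
        Thread.interrupted();
        worker.interrupt();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return stillAlive;
    }

    public static void main(String[] args) {
        System.out.println("worker still alive after wait: "
                + workerStillAliveAfterWait());
    }
}
```

In this sketch the "wait" comes back while the worker is still running, which mirrors how stopApp() could reach Process.exitValue() before the process had exited.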
>
> If I comment out the call to interrupt() in Platform.shouldSAAttach(), I
> think the failure stack trace is much better:
>
> java.lang.RuntimeException: Test ERROR java.lang.RuntimeException:
> java.lang.InterruptedException
>     at ClhsdbAttach.main(ClhsdbAttach.java:75)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>     at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
>     at java.base/java.lang.Thread.run(Thread.java:832)
> Caused by: java.lang.RuntimeException: java.lang.InterruptedException
>     at jdk.test.lib.Platform.shouldSAAttach(Platform.java:300)
>     at ClhsdbLauncher.run(ClhsdbLauncher.java:199)
>     at ClhsdbAttach.main(ClhsdbAttach.java:71)
>     ... 6 more
> Caused by: java.lang.InterruptedException
>     at jdk.test.lib.Platform.shouldSAAttach(Platform.java:297)
>     ... 8 more
>
> There's still a minor issue with rethrowing the RuntimeException
> encapsulated inside another RuntimeException. That's the fault of the test,
> which is catching all Exceptions and encapsulating them in a
> RuntimeException, even if the Exception itself is already a
> RuntimeException. It should have a catch clause for
> RuntimeException, and just rethrow it without encapsulating it. All the
> Clhsdb tests seem to do this, so that's about 20 places to fix. Probably
> not worth doing unless some other cleanup is being done at the same time.
>
> Chris
>
> On 2/11/20 10:30 PM, Igor Ignatyev wrote:
>> I'd say yes, it's better to still call Thread::interrupt.
>>
>> -- Igor
>>
>>> On Feb 11, 2020, at 10:19 PM, Chris Plummer wrote:
>>>
>>> Ok. Should I still call interrupt()?
>>> >>> Chris >>> >>> On 2/11/20 10:07 PM, Igor Ignatyev wrote: >>>> Hi Chris, >>>> >>>> that's a common practice for any kind of library-ish code, if there >>>> are no explicit check of interrupt status, it will be checked a by >>>> next operation which might be interrupted. in this particular case, >>>> I agree rethrowing it as an unchecked exception might be a good >>>> alternative. >>>> >>>> -- Igor >>>> >>>>> On Feb 11, 2020, at 10:03 PM, Chris Plummer >>>>> > wrote: >>>>> >>>>> Hi Igor, >>>>> >>>>> I guess I fail to see the benefit of this. Who is going to check >>>>> the interrupt status of this thread and do something meaningful >>>>> with it? It seems we would want to immediately propagate the >>>>> failure by throwing a RuntimeException. This will work well when >>>>> called from a test since this is a common way to fail a test. The >>>>> other use of this code is by VMProps.vmHasSAandCanAttach(). It >>>>> looks like if a RuntimeException is thrown the right thing will >>>>> happen when SafeMap.put() catches the exception (it catches all >>>>> Throwables). >>>>> >>>>> Chris >>>>> >>>>> On 2/11/20 7:12 PM, Igor Ignatev wrote: >>>>>> rather like this : >>>>>> >>>>>>> } catch (InterruptedException e) { >>>>>>> ?Thread.currentThread().interrupt(); >>>>>>> ? ?return false; // assume not signed >>>>>>> } >>>>>> >>>>>> ? Igor >>>>>> >>>>>>> On Feb 11, 2020, at 6:15 PM, Chris Plummer >>>>>>> > wrote: >>>>>>> >>>>>>> ? >>>>>>> Like this? >>>>>>> >>>>>>> ??????? } catch (InterruptedException e) { >>>>>>> Thread.currentThread().interrupt(); >>>>>>> ??????????? throw new RuntimeException(e); >>>>>>> ??????? 
} >>>>>>> >>>>>>> Chris >>>>>>> >>>>>>> On 2/11/20 2:23 PM, Igor Ignatyev wrote: >>>>>>>> no, I meant to call Thread.currentThread().interrupt(), calling >>>>>>>> that will restore interrupted state of the thread, so an user of >>>>>>>> Platform class will be able to response to it appropriately, w/ >>>>>>>> your current code, the fact that the thread was interrupted will >>>>>>>> be missed, and in most cases it is not right thing to do. >>>>>>>> >>>>>>>> -- Igor >>>>>>>> >>>>>>>>> On Feb 11, 2020, at 2:02 PM, Chris Plummer >>>>>>>>> > wrote: >>>>>>>>> >>>>>>>>> Hi Igor, >>>>>>>>> >>>>>>>>> I'm not sure what you mean by restore the interrupt state. Do >>>>>>>>> you mean loop back to the waitFor() call? >>>>>>>>> >>>>>>>>> thanks, >>>>>>>>> >>>>>>>>> Chris >>>>>>>>> >>>>>>>>> On 2/11/20 1:55 PM, Igor Ignatyev wrote: >>>>>>>>>> Hi Chris, >>>>>>>>>> >>>>>>>>>> I don't insist on (3), so I'm fine if you don't want to change >>>>>>>>>> that part. one thing I'd change though is to restore thread >>>>>>>>>> interrupted state at L#266 of Platform.java (no need to >>>>>>>>>> publish new webrev) >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> -- Igor >>>>>>>>>> >>>>>>>>>>> On Feb 11, 2020, at 1:49 PM, Chris Plummer >>>>>>>>>>> > >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi Igor, >>>>>>>>>>> >>>>>>>>>>> Here's an updated webrev: >>>>>>>>>>> >>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.01/index.html >>>>>>>>>>> >>>>>>>>>>> I rebased to JDK 15 and made all the changes you suggested >>>>>>>>>>> except for (3). I did not think it is necessary since the >>>>>>>>>>> code is only executed on OSX. However, if you still feel >>>>>>>>>>> allowing flexibility in the path separator is important, I >>>>>>>>>>> can add that change too. 
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> On 2/10/20 1:34 PM, Igor Ignatyev wrote:
>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>
>>>>>>>>>>>> in general it all looks good, I have a few comments (most of
>>>>>>>>>>>> them are editorial):
>>>>>>>>>>>> in Platform.java:
>>>>>>>>>>>> 1. you have double spaces at line#238 (between boolean
>>>>>>>>>>>> and isSignedOSX)
>>>>>>>>>>>> 2. as FileNotFoundException is an IOException, there is no need
>>>>>>>>>>>> to declare the former in the signature of isSignedOSX
>>>>>>>>>>>> 3. it's better to pass jdkPath, "bin" and "java" as separate
>>>>>>>>>>>> arguments to Path.get, so the code won't depend on the file
>>>>>>>>>>>> separator
>>>>>>>>>>>> 4. you are waiting for codesign to finish w/o reading its
>>>>>>>>>>>> cout / cerr, which might lead to a deadlock (if codesign
>>>>>>>>>>>> exhausts the IO buffer before exiting), so you need to
>>>>>>>>>>>> either create two separate threads to read cout and cerr, or
>>>>>>>>>>>> redirect these streams to files and read these files
>>>>>>>>>>>> afterwards, or just ignore cout/cerr by using
>>>>>>>>>>>> Redirect.DISCARD. I'd personally recommend the latter as the
>>>>>>>>>>>> result of codesign can be reliably deduced from its exitcode
>>>>>>>>>>>> (0 - signed, 1 - verification failed, 2 - wrong arguments, 3
>>>>>>>>>>>> - not all requirements from R are satisfied) and using
>>>>>>>>>>>> cout/cerr is somewhat fragile as there is no guarantee the
>>>>>>>>>>>> output format won't be changed.
>>>>>>>>>>>>
>>>>>>>>>>>> the rest looks good to me.
>>>>>>>>>>>>
>>>>>>>>>>>> -- Igor
>>>>>>>>>>>>
>>>>>>>>>>>>> On Feb 10, 2020, at 11:48 AM, Chris Plummer wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Ping #2. It's not that hard of a review. Most of it is the
>>>>>>>>>>>>> new Platform.isSignedOSX() method, which is well commented
>>>>>>>>>>>>> and pretty straightforward.
>>>>>>>>>>>>> >>>>>>>>>>>>> thanks, >>>>>>>>>>>>> >>>>>>>>>>>>> Chris >>>>>>>>>>>>> >>>>>>>>>>>>> On 2/4/20 5:04 PM, Chris Plummer wrote: >>>>>>>>>>>>>> Ping! >>>>>>>>>>>>>> >>>>>>>>>>>>>> And I decided to push to 15 instead of 14. Will backport >>>>>>>>>>>>>> to 14 eventually. >>>>>>>>>>>>>> >>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 1/30/20 10:20 PM, Chris Plummer wrote: >>>>>>>>>>>>>>> Yes, you are correct: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 1/30/20 10:13 PM, Igor Ignatyev wrote: >>>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00? >>>>>>>>>>>>>>>> ?seems >>>>>>>>>>>>>>>> to be a webrev from another issue, should it have >>>>>>>>>>>>>>>> been?http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00/? >>>>>>>>>>>>>>>> ?? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- Igor >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Jan 30, 2020, at 10:10 PM, Chris Plummer >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Hello, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Please review the following fix for some SA tests that >>>>>>>>>>>>>>>>> are failing on Mac OS X 10.14.5 and later: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> The issue is that SA can't attach to a signed binary >>>>>>>>>>>>>>>>> starting with 10.14.5. 
There is no workaround for this, >>>>>>>>>>>>>>>>> so these tests are being disabled when it is detected >>>>>>>>>>>>>>>>> that the binary is signed and we are running on 10.14 >>>>>>>>>>>>>>>>> or later (I chose all 10.14 releases to simplify the >>>>>>>>>>>>>>>>> check). >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Some background may help explain the fix. In order for >>>>>>>>>>>>>>>>> SA to attach to a live process (not a core file) on >>>>>>>>>>>>>>>>> OSX, either the attaching process (ie. the test) has to >>>>>>>>>>>>>>>>> be run as root, or sudo needs to be supported. However, >>>>>>>>>>>>>>>>> the only tests that make the sudo check are the 20 or >>>>>>>>>>>>>>>>> so that use ClhsdbLauncher. The rest all rely on >>>>>>>>>>>>>>>>> "@requires vm.hasSAandCanAttach" to filter out tests >>>>>>>>>>>>>>>>> that use SA attach. vm.hasSAandCanAttach only checks if >>>>>>>>>>>>>>>>> the test is being run as root. Thus all our >>>>>>>>>>>>>>>>> non-ClhsdbLauncher tests that SA attach to a live >>>>>>>>>>>>>>>>> process are currently not run unless they are run as >>>>>>>>>>>>>>>>> root. 8238268 [1] has been filed to address this, >>>>>>>>>>>>>>>>> making it so all the tests will attempt to use sudo if >>>>>>>>>>>>>>>>> not run as root. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Because of the difference in how ClhsdbLauncher tests >>>>>>>>>>>>>>>>> and "@requires? vm.hasSAandCanAttach" tests check to >>>>>>>>>>>>>>>>> see if they are runnable, this fix needs to address >>>>>>>>>>>>>>>>> both types of checks. The common code for both these >>>>>>>>>>>>>>>>> cases is Platform.shouldSAAttach(), which on OSX >>>>>>>>>>>>>>>>> basically equates to check to see if we are running as >>>>>>>>>>>>>>>>> root. I changed it to also return false if running on >>>>>>>>>>>>>>>>> signed binary with 10.14 or later. 
However, this >>>>>>>>>>>>>>>>> confused the ClhsdbLauncher use of >>>>>>>>>>>>>>>>> Platform.shouldSAAttach() somewhat, since it assumed a >>>>>>>>>>>>>>>>> false result only happens because you are not running >>>>>>>>>>>>>>>>> as root (in which case it would then check if sudo will >>>>>>>>>>>>>>>>> work). So ClhsdbLauncher now has double check that the >>>>>>>>>>>>>>>>> false result was not because of running a signed >>>>>>>>>>>>>>>>> binary. If it is signed, it won't do the sudo check. >>>>>>>>>>>>>>>>> This will get cleaned up with 8238268 [1], which will >>>>>>>>>>>>>>>>> move the sudo check into Platform.shouldSAAttach(). >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238268 >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>> >>>> >>> >> > From bob.vandette at oracle.com Wed Feb 12 15:13:09 2020 From: bob.vandette at oracle.com (Bob Vandette) Date: Wed, 12 Feb 2020 10:13:09 -0500 Subject: [PING] RFR: 8231111: Cgroups v2: Rework Metrics in java.base so as to recognize unified hierarchy In-Reply-To: References: <75fc377f8d5ca76b7dac02f55db640cbdd305633.camel@redhat.com> <4bf65380bc26cd3bf684d7994b33e66bcb87927b.camel@redhat.com> <6CACCC0D-7F5A-42A3-83F1-746497940CCA@oracle.com> <29544339574e34c4c25cbec0314c26f35e8d1a99.camel@redhat.com> <4CE7C7F6-ABFA-4263-98B2-32BBD5013A3C@oracle.com> <10b5e83bfb7e618e5f5906c8e707057ff8680785.camel@redhat.com> <4a304e2ce72a53859b4e9cc8b21db404a260b531.camel@redhat.com> <97e7ff2b-f2cc-a666-afb8-521c0f5c37e7@oracle.com> <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com> <436303A0-4D46-48BD-8EC3-23A7B213C6FA@oracle.com> Message-ID: <04BA2357-D5B3-4553-A369-D4790B225503@oracle.com> > On Feb 12, 2020, at 6:22 AM, Severin Gehwolf wrote: > > Hi Bob, > > On Tue, 2020-02-11 at 16:59 -0500, Bob Vandette wrote: >> I applied your patch to the latest JDK 15 sources and ran the container >> tests on Oracle Linux 8.1 
with podman/cgroupv2 enabled. There were some issues.
>> I'm not sure if it's my setup or not.
>
> Looking over them, they appear to be setup issues.
>
> Getting the setup working right was tricky for me as well.
>
> Aside:
> podman has a --runtime switch which you could point to a cgroups v2
> capable crun binary for example.
> https://bugs.openjdk.java.net/browse/JDK-8230305 has some examples of
> how to use it.
>
> I also needed some extra setup to get the controllers' delegation to
> work:
> https://urldefense.com/v3/__https://scrivano.org/2019/02/26/resources-management-with-rootless-containers/__;!!GqivPVa7Brio!Ot7tv-QJgGntRyaimzIRrRh7rqvnzMu9AoOaBWINWNnnpR9LUVNNvGUe9GXp0Ri0Wg$
>
>> I also ran the same build on Ubuntu with docker/cgroupv1. I didn't see any failures on
>> the cgroupv1 system.
>
> Great, thanks!
>
>> Here are some notes:
>>
>> The podman version on OL 8.1 doesn't yet support cgroupv2. I
>> built the latest from sources for this test.
>>
>> The docker tests take a very long time under podman! Longer than
>> the cgroupv1 run.
>
> Possibly. I haven't done any fair comparison of the two.

It's possible that it's trying to access several container hubs. I may be getting some timeouts since I haven't figured out how to specify proxies for podman. I've set a local http_proxy env variable but maybe that's not working.

>
>> cpusets and cpusets.mems are blank on host and if none are specified
>> on podman/docker run command. On cgroupv1 they are host values if not
>> specified for container. Effective cpusets and cpusets.mems are set properly
>> in a container.
>
> Yes, there seem to be slight differences to cgroup v1.
You can only
> set cpusets on containers if the host has the controller enabled,
> though:
>
> # cat /sys/fs/cgroup/cgroup.subtree_control
> cpuset cpu io memory pids
> # podman run --memory=300M --cpuset-cpus=0,1 -ti -v `pwd`:/mnt fedora:30 /bin/bash
> [root at 5d4a4e593a24 /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpuset.cpus
> 0-1
> [root at 5d4a4e593a24 mnt]# ./cgroupsv2-jdk/bin/java -XshowSettings:system -version
> Operating System Metrics:
>     Provider: cgroupv2
>     Effective CPU Count: 2
>     CPU Period: 100000us
>     CPU Quota: -1
>     CPU Shares: -1
>     List of Processors, 2 total:
>     0 1
>     List of Effective Processors, 2 total:
>     0 1
>     List of Memory Nodes: N/A
>     List of Available Memory Nodes, 1 total:
>     0
>     Memory Limit: 300.00M
>     Memory Soft Limit: Unlimited
>     Memory & Swap Limit: 600.00M
>
> openjdk version "15-internal" 2020-09-15
> OpenJDK Runtime Environment (build 15-internal+0-adhoc.sgehwolf.openjdk-head-2)
> OpenJDK 64-Bit Server VM (build 15-internal+0-adhoc.sgehwolf.openjdk-head-2, mixed mode, sharing)
>

In my setup, your command above works correctly. The only thing that doesn't work is if I run the -XshowSettings directly on the host OR I run a container without specifying --cpuset-cpus. The default setup seems wrong.
> >> HOST OUTPUT: >> ./java -XshowSettings:system -version >> Operating System Metrics: >> Provider: cgroupv2 >> Effective CPU Count: 32 >> CPU Period: -1 >> CPU Quota: -1 >> CPU Shares: -1 >> List of Processors: N/A >> List of Effective Processors: N/A >> List of Memory Nodes: N/A >> List of Available Memory Nodes: N/A >> Memory Limit: Unlimited >> Memory Soft Limit: Unlimited >> Memory & Swap Limit: Unlimited >> >> CONTAINER OUTPUT: >> # podman run -it -v `pwd`:/mnt ubuntu bash >> root at 3c6654a3b834:/mnt/jdk/bin# ./java -XshowSettings:system -version >> Operating System Metrics: >> Provider: cgroupv2 >> Effective CPU Count: 32 >> CPU Period: 100000us >> CPU Quota: -1 >> CPU Shares: -1 >> List of Processors: N/A >> List of Effective Processors, 32 total: >> 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 >> List of Memory Nodes: N/A >> List of Available Memory Nodes, 1 total: >> 0 >> Memory Limit: Unlimited >> Memory Soft Limit: Unlimited >> Memory & Swap Limit: Unlimited > > Yes, host and container output differ depending on the configured test > system. I remember that I had cpusets working on the host system too at > some point, but I forgot what the magic was to get this properly > delegated via systemd. > >> Docker tests fail if /bin/docker is not available in podman setup. We >> probably should enhance the docker check to also look for podman. > > Could you be more specific about this? How do you run docker tests? I > use: > > -e PATH -verbose:summary -Djdk.test.container.command=podman -Djdk.test.docker.image.name=fedora -Djdk.test.docker.image.version=30 > > with -Djdk.test.container.command=podman it shouldn't need docker. Did > you specify that property? I thought that most podman setups try to provide docker compatibility. I did not try using the properties. In order to make our automation work cleaner, I was hoping that we could always just execute docker. 
>
>> Two container tests failed:
>>
>> FAILED: containers/cgroup/PlainRead.java failed Memory Limit is: -2 instead
>> of unlimited or -1. This is because memory.max is not found.
>
> This doesn't fail for me, because I've got memory.max present on host:
>
> # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cgroup.controllers
> cpu io memory pids
> [root at f31 sgehwolf]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/memory.max
> max
>
> Note, however, this is a hotspot test. So support for it for cgroup v2
> came with JDK-8230305. It seems we should return -1 if memory.max is
> not found (over -2, not supported). Could you file a bug for this? It's
> unrelated to this change. It should be a simple fix.

Here are my controllers:

cpuset cpu io memory pids rdma

This seems like another case where this file doesn't exist in the path we form on the host.

[0.035s][debug][os,container] /sys/fs/cgroup/user.slice/user-23603.slice/session-14.scope/cpu.max failed, No such file or directory

It does exist here:

/sys/fs/cgroup/user.slice

so maybe it's a delegation issue.

>
>> FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java
>> This fails because nr_periods line doesn't always exist. I think you've got
>> to enable a quota for this to appear (not sure).
>
> Passes for me, but it needs the cpu controller enabled on the test
> system.
>
> # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cgroup.controllers
> cpu io memory pids
> # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpu.stat
> usage_usec 1157537
> user_usec 606832
> system_usec 550705
> nr_periods 0
> nr_throttled 0
> throttled_usec 0
>
>> Here's the contents:
>> % more cpu.stat
>> usage_usec 23974562755
>> user_usec 22257183568
>> system_usec 1717379186
>
> It suggests you've got the cpu controller disabled:
> https://urldefense.com/v3/__https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html*cpu__;Iw!!GqivPVa7Brio!Ot7tv-QJgGntRyaimzIRrRh7rqvnzMu9AoOaBWINWNnnpR9LUVNNvGUe9GWZKkNZMA$
>
> """
> and the following three when the controller is enabled:
>
>
> nr_periods
> nr_throttled
> throttled_usec
> """

I do have the cpu controller enabled. I'll look into your fixes for delegation issues and try again.

Bob.

>
>> CGROUPV2 results on Oracle Linux 8.1
>> ---------
>>
>> testing container APIs
>>
>> FAILED: containers/cgroup/PlainRead.java
>> Passed: containers/docker/DockerBasicTest.java
>> Passed: containers/docker/TestCPUAwareness.java
>> Passed: containers/docker/TestCPUSets.java
>> Passed: containers/docker/TestJcmdWithSideCar.java
>> Passed: containers/docker/TestJFREvents.java
>> Passed: containers/docker/TestJFRNetworkEvents.java
>> Passed: containers/docker/TestMemoryAwareness.java
>> Passed: containers/docker/TestMisc.java
>> Test results: passed: 8; failed: 1
>> Results written to /export/users/bobv/jdk15/build/jtreg/JTwork
>> Error: Some tests failed or other problems occurred.
>
> These are hotspot tests. Covered by JDK-8230305 (hotspot changes). The
> plain read test passes on a properly configured host system with
> controller delegation.
> >> testing jdk.internal.platform APIs >> >> FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java >> Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java >> Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java >> Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java >> Passed: jdk/internal/platform/docker/TestSystemMetrics.java >> Test results: passed: 4; failed: 1 >> Results written to /export/users/bobv/jdk15/build/jtreg/JTwork >> Error: Some tests failed or other problems occurred. > > I believe cgroup/TestCgroupMetrics.java fails due to bad host config. > It passes here: > > [root at f31 jdk-jdk]# rm -rf JTwork/ JTreport && /media/disk/jtreg/bin/jtreg -timeout:4 -jdk:../cgroupsv2-jdk/ -e PATH -verbose:summary -Djdk.test.container.command=podman -Djdk.test.docker.image.name=fedora -Djdk.test.docker.image.version=30 test/jdk/jdk/internal/platform > Directory "JTwork" not found: creating > Directory "JTreport" not found: creating > Passed: jdk/internal/platform/cgroup/TestCgroupMetrics.java > Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java > Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java > Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java > Passed: jdk/internal/platform/docker/TestSystemMetrics.java > Test results: passed: 5 > Report written to /home/sgehwolf/jdk-jdk/JTreport/html/report.html > Results written to /home/sgehwolf/jdk-jdk/JTwork > > JTR files available here: > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/jtreg-results/ > >> testing -XshowSettings:system launcher option >> >> Passed: tools/launcher/Settings.java >> Test results: passed: 1 > > Thanks for running the tests! > > FWIW, jdk-submit came back clean. If we could get the initial support > of this pushed soon it would be great. I'd be happy to fix any follow- > up issues. > > Thanks, > Severin > >> Bob. 
>> >>> On Feb 11, 2020, at 1:04 PM, Severin Gehwolf wrote: >>> >>> Hi Mandy, Bob, >>> >>> Thanks again for the reviews and patience on this. Sorry it took me so >>> long to get back to this :-/ >>> >>> Updated webrev: >>> Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/webrev/ >>> incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/incremental/webrev/ >>> >>> I've tested this with docker tests on cgroup v1 and via podman on a >>> cgroup v2 system. They pass. I'll be running this through jdk-submit as >>> well. >>> >>> More below. >>> >>> On Tue, 2020-01-21 at 16:09 -0800, Mandy Chung wrote: >>>> Hi Severin, >>>> >>>> Thanks for the update. >>>> >>>> On 1/21/20 11:30 AM, Severin Gehwolf wrote: >>>>> Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/webrev/ >>>>> incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/incremental/webrev/ >>>>> >>>> >>>> I have answered my own question. Most of the metrics used to return 0 if unavailable due to IOException reading a file or malformed content and now are changed to return -2 due to error fetching the metric. >>>> >>>> The following are about limits which used to return -1 if unlimited or no limit set. >>>> public long getCpuQuota(); >>>> public long getCpuShares(); >>>> public long getMemoryLimit(); >>>> public long getMemoryAndSwapLimit(); >>>> public long getMemorySoftLimit(); >>>> >>>> With this patch, only getMemoryLimit and getMemoryAndSwapLimit specify to return -1 if unlimited or no limit set. However the implementation does return -1. All of the above specify to return -2 if unavailable due to error fetching the metric. >>>> >>>> I found the implementation quite hard to follow. I spent some time reviewing the code to see if the implementation matches the spec but I can't easily tell yet. For example, >>>> CgroupSubsystemController::getLongValueMatchingLine returns -1 when IOException occurs. 
>>>> CgroupSubsystemController::getLongEntry returns 0L if IOException occurs. >>>> >>>> CgroupV1SubsystemController::convertStringToLong returns Long.MAX_VALUE >>>> when value overflows >>> >>> This one is intentional. It's mapped back to unlimited via >>> longValOrUnlimited(). The reason for this is that cgroup v1 doesn't >>> have a concept of "unlimited". Unlimited values will be a very large >>> numbers in cgroup v1 files. >>> >>>> CgroupV2SubsystemController::convertStringToLong returns -1 when IOException occurs >>>> >>>> CgroupV1Subsystem::getCpuShares return -1 if cpu.shares == 0 or 1024 >>>> CgroupV2Subsystem::getCpuShares returns -1 if cpu.weight == 100 or 0 >>> >>> These two are special cases too. See the implementation note of >>> Metrics.getCpuShares(). In the cgroup v2 case the default value is 100 >>> (over 1024 in cgroup v1). That's why unlimited is being returned for >>> those values. >>> >>>> CgroupV2Subsystem::getFromCpuMax returns -1 if error in reading cpu.max or malformed content >>>> CgroupV2Subsystem::sumTokensIOStat returns -2 if IOException error >>>> This is called by getBlkIOServiceCount and getBlkIOServiced >>>> >>>> I think this can be improved and add the documentation to describe >>>> what the methods do. Since Metrics APIs consistently return -2 if >>>> unavailable due to error in fetching the metric, why some utility >>>> methods in *Subsystem and *SubsystemController return -1 upon error >>>> and 0 when unlimited? >>>> >>>> I suspect if the getXXXValue and other methods are clearly documented >>>> with the error cases (possibly renaming the method name if appropriate) >>>> CgroupV1Subsystem and CgroupV2SubSystem will become very explicit >>>> to understand. >>> >>> This should be fixed now. >>> >>> I've gone through the API doc of Metrics.java and have updated it. In >>> general, I've updated it to return -1 if metric is unavailable (due to >>> error in reading some files or content being empty), and -2 if not >>> supported. 
No method returns -2 currently, but it might change and it's >>> good to have some way of saying "not implementable" for this subsystem >>> in the spec. That's my take on it anyway. >>> >>> There is also a new unit test for shared controller logic: >>> TestCgroupSubsystemController.java >>> >>> It exercises various cases of error/success. >>> >>> That is to ensure proper symmetry across the various cases (including >>> IOException). I've also documented static methods in >>> CgroupSubsystemController. Overall, all methods now return the same >>> values for cgroup v1 and cgroup v2 (given the impl nuances) for the >>> various cases. >>> >>>> CgroupSubsystem.java >>>> >>>> 44 public static final double DOUBLE_RETVAL_NOT_SUPPORTED = LONG_RETVAL_NOT_SUPPORTED; >>>> 49 public static final Boolean BOOL_RETVAL_NOT_SUPPORTED = null; >>>> >>>> They are no longer needed, right? >>> >>> Removed. >>> >>>> CgroupSubsystemFactory.java >>>> >>>> 89 System.err.println("Warning: Mixed cgroupv1 and cgroupv2 not supported. Metrics disabled."); >>>> >>>> >>>> I expect this to be a System.Logger log >>> >>> Updated. >>> >>>> 114 if (!Integer.valueOf(0).toString().equals(tokens[0])) { >>>> >>>> This can be simplified to if (!"0".equals(tokens[0])) >>> >>> Done, thanks!
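The return-value convention described in this review (-1 when a metric is unavailable or unlimited, -2 reserved for "not supported", and very large cgroup v1 values treated as unlimited) can be illustrated with a minimal Java sketch. The class and method names below are hypothetical and are not taken from the webrev; treating the literal `max` as the cgroup v2 "no limit" marker is an assumption based on the `memory.max` output quoted in this thread.

```java
import java.math.BigInteger;

// Hypothetical helper: names are illustrative, not the actual JDK classes.
final class LimitValues {
    static final long UNLIMITED_OR_UNAVAILABLE = -1L; // unlimited, or metric could not be read
    static final long NOT_SUPPORTED = -2L;            // reserved: metric not implementable

    // cgroup v1 style: no "unlimited" keyword, only very large numbers.
    static long parseV1(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return UNLIMITED_OR_UNAVAILABLE;
        }
        try {
            BigInteger value = new BigInteger(raw.trim());
            if (value.signum() < 0) {
                return UNLIMITED_OR_UNAVAILABLE;
            }
            BigInteger max = BigInteger.valueOf(Long.MAX_VALUE);
            // Clamp on overflow, then map the clamp value back to "unlimited".
            long clamped = value.compareTo(max) >= 0 ? Long.MAX_VALUE : value.longValue();
            return clamped == Long.MAX_VALUE ? UNLIMITED_OR_UNAVAILABLE : clamped;
        } catch (NumberFormatException e) {
            return UNLIMITED_OR_UNAVAILABLE; // malformed content counts as unavailable
        }
    }

    // cgroup v2 style: the literal "max" means no limit is set (assumption, see lead-in).
    static long parseV2(String raw) {
        if (raw == null) {
            return UNLIMITED_OR_UNAVAILABLE;
        }
        String s = raw.trim();
        if (s.isEmpty() || s.equals("max")) {
            return UNLIMITED_OR_UNAVAILABLE;
        }
        try {
            return Long.parseLong(s);
        } catch (NumberFormatException e) {
            return UNLIMITED_OR_UNAVAILABLE;
        }
    }
}
```

Keeping both parsers on the same sentinel values is what lets callers treat cgroup v1 and cgroup v2 results uniformly, which appears to be the symmetry the new unit test is meant to guard.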
>>> >>>> LauncherHelper.java >>>> >>>> 407 // Extended cgroupv1 specific metrics >>>> 408 if (c instanceof MetricsCgroupV1) { >>>> 409 MetricsCgroupV1 cgroupV1 = (MetricsCgroupV1)c; >>>> 410 limit = cgroupV1.getKernelMemoryLimit(); >>>> 411 ostream.println(formatLimitString(limit, INDENT + "Kernel Memory Limit: ")); >>>> 412 limit = cgroupV1.getTcpMemoryLimit(); >>>> 413 ostream.println(formatLimitString(limit, INDENT + "TCP Memory Limit: ")); >>>> 414 Boolean value = cgroupV1.isMemoryOOMKillEnabled(); >>>> 415 ostream.println(formatBoolean(value, INDENT + "Out Of Memory Killer Enabled: ")); >>>> 416 value = cgroupV1.isCpuSetMemoryPressureEnabled(); >>>> 417 ostream.println(formatBoolean(value, INDENT + "CPUSet Memory Pressure Enabled: ")); >>>> 418 } >>>> >>>> MetricsCgroupV1 is linux-only. It will fail the compilation when >>>> building on non-linux. One option is to move this code to >>>> src/java.base/linux/share/sun/launcher/CgroupMetrics.java >>>> >>>> Are they continued to be interesting metrics to be output from >>>> -XshowSetting? I wonder if they can simply be dropped from the output. >>>> Bob will have an opinion. >>> >>> I've removed those extra cgroup v1 specific metrics printed via >>> -XshowSettings:system. Not sure what to do with MetricsCgroupV1. It's >>> only used in tests in webrev 10. On the other hand the idea would be >>> for consumers to downcast it to MetricsCgroupV1 if they needed those >>> extra metrics. 
>>> >>> Thanks, >>> Severin >>> > From larry.cable at oracle.com Wed Feb 12 17:02:30 2020 From: larry.cable at oracle.com (Laurence Cable) Date: Wed, 12 Feb 2020 09:02:30 -0800 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> Message-ID: <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> On 2/12/20 4:17 AM, Schmelter, Ralf wrote: > Hi Serguei, > > the use case is being able to get a heap dump from big Java servers. These usually run on machines with a lot of memory and CPUs, but not much disk space (which they don't need apart from some trace files and the server code itself). And if we can get the customer to mount some NFS file system on the machine, it is usually slow. So writing only a third or fourth of the data is a big win. > > Doing the compression outside the VM would either depend on the hprof file written first (so we would still need the disk space) or have another channel to dump the data (e.g. via socket). or named pipe > But this would add complexity too and needs an external program. agreed > > I've compiled 2 release versions on Windows with and without my change. The change adds 14.5k to the server.dll (which is 10.4 MB). Not sure if this is considered acceptable. but what is the performance impact of this? > > Best regards, > Ralf > > -----Original Message----- > From: serguei.spitsyn at oracle.com > Sent: Tuesday, 11 February 2020 20:49 > To: Schmelter, Ralf ; Yasumasa Suenaga ; OpenJDK Serviceability > Cc: yasuenag at gmail.com > Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump > > Ralf, > > I see this feature adds a lot of code. In fact, I'm not sure it is > worth adding this kind of complexity (including new compressing threads) > to the VM implementation. What is a real use case behind it?
Could
> this compression be done separately from the VM implementation?
>
> Thanks,
> Serguei

From bob.vandette at oracle.com Wed Feb 12 17:29:19 2020
From: bob.vandette at oracle.com (Bob Vandette)
Date: Wed, 12 Feb 2020 12:29:19 -0500
Subject: [PING] RFR: 8231111: Cgroups v2: Rework Metrics in java.base so as to recognize unified hierarchy
In-Reply-To: <04BA2357-D5B3-4553-A369-D4790B225503@oracle.com>
References: <75fc377f8d5ca76b7dac02f55db640cbdd305633.camel@redhat.com> <4bf65380bc26cd3bf684d7994b33e66bcb87927b.camel@redhat.com> <6CACCC0D-7F5A-42A3-83F1-746497940CCA@oracle.com> <29544339574e34c4c25cbec0314c26f35e8d1a99.camel@redhat.com> <4CE7C7F6-ABFA-4263-98B2-32BBD5013A3C@oracle.com> <10b5e83bfb7e618e5f5906c8e707057ff8680785.camel@redhat.com> <4a304e2ce72a53859b4e9cc8b21db404a260b531.camel@redhat.com> <97e7ff2b-f2cc-a666-afb8-521c0f5c37e7@oracle.com> <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com> <436303A0-4D46-48BD-8EC3-23A7B213C6FA@oracle.com> <04BA2357-D5B3-4553-A369-D4790B225503@oracle.com>
Message-ID: <98615975-1586-48CA-BDF9-5E2692B7B77B@oracle.com>

I applied the delegation change that you recommended and now all container tests pass.

Here's the change that I applied.

echo +cpu +cpuset +io +memory +pids > /sys/fs/cgroup/cgroup.subtree_control
echo +cpu +cpuset +io +memory +pids > /sys/fs/cgroup/user.slice/cgroup.subtree_control
echo +cpu +cpuset +io +memory +pids > /sys/fs/cgroup/user.slice/user-23603.slice/cgroup.subtree_control

Although all the tests now pass, the values for cpusets and cpusets.mems still do not show up on the host
or container unless limits are specified. That's the only remaining minor difference from cgroupv1 that I can see.
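Whether the delegation change took effect can be checked by comparing the controllers listed in a cgroup's `cgroup.controllers` file against the ones the tests rely on. The following is a minimal Java sketch of that comparison; the class and method names are hypothetical, and it works on the file's text so that it does not depend on any particular host layout.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: reports which required controllers are not delegated.
final class ControllerCheck {
    // controllersFileContent is the text of a cgroup.controllers file,
    // e.g. "cpu io memory pids" (space-separated controller names).
    static Set<String> missing(String controllersFileContent, Collection<String> required) {
        Set<String> enabled =
                new HashSet<>(Arrays.asList(controllersFileContent.trim().split("\\s+")));
        Set<String> result = new TreeSet<>(required); // sorted for stable output
        result.removeAll(enabled);
        return result;
    }
}
```

For example, with the content `cpu io memory pids` and the required set `cpu cpuset io memory pids`, the sketch reports `cpuset` as missing, which would match the blank cpuset values observed on a host without that controller delegated.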
testing container APIs Directory "JTwork" not found: creating Passed: containers/cgroup/PlainRead.java Passed: containers/docker/DockerBasicTest.java Passed: containers/docker/TestCPUAwareness.java Passed: containers/docker/TestCPUSets.java Passed: containers/docker/TestJcmdWithSideCar.java Passed: containers/docker/TestJFREvents.java Passed: containers/docker/TestJFRNetworkEvents.java Passed: containers/docker/TestMemoryAwareness.java Passed: containers/docker/TestMisc.java Test results: passed: 9 Results written to /export/users/bobv/jdk15/build/jtreg/JTwork testing jdk.internal.platform APIs Passed: jdk/internal/platform/cgroup/TestCgroupMetrics.java Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java Passed: jdk/internal/platform/docker/TestSystemMetrics.java Test results: passed: 5 Results written to /export/users/bobv/jdk15/build/jtreg/JTwork testing -XshowSettings:system launcher option Passed: tools/launcher/Settings.java Test results: passed: 1 Results written to /export/users/bobv/jdk15/build/jtreg/JTwork Bob. > On Feb 12, 2020, at 10:13 AM, Bob Vandette wrote: > > >> On Feb 12, 2020, at 6:22 AM, Severin Gehwolf wrote: >> >> Hi Bob, >> >> On Tue, 2020-02-11 at 16:59 -0500, Bob Vandette wrote: >>> I applied your patch to the latest JDK 15 sources and ran the container >>> tests on Oracle Linux 8.1 with podman/cgroupv2 enabled. There were some issues. >>> I?m not sure if its my setup or not. >> >> Looking over them, they appear to be setup issues. >> >> Getting the setup working right was tricky for me as well. >> >> Aside: >> podman has a --runtime switch which you could point to a cgroups v2 >> capable crun binary for example. >> https://bugs.openjdk.java.net/browse/JDK-8230305 has some examples of >> how to use it. 
>> >> I also needed some extra setup to get the controllers' delegation to >> work: >> https://scrivano.org/2019/02/26/resources-management-with-rootless-containers/ >> >>> I also ran the same build on Ubuntu with docker/cgroupv1. I didn't see any failures on >>> the cgroupv1 system. >> >> Great, thanks! >> >>> Here are some notes: >>> >>> The podman version on OL 8.1 doesn't yet support cgroupv2. I >>> built the latest from sources for this test. >>> >>> The docker tests take a very long time under podman! Longer than >>> the cgroupv1 run. >> >> Possibly. I haven't done any fair comparison of the two. > > It's possible that it's trying to access several container hubs. I may be getting some timeouts > since I haven't figured out how to specify proxies for podman. I've set a local http_proxy env variable > but maybe that's not working. > >> >>> cpusets and cpusets.mems are blank on host and if none are specified >>> on podman/docker run command. On cgroupv1 they are host values if not >>> specified for container. Effective cpusets and cpusets.mems are set properly >>> in a container. >> >> Yes, there seem to be slight differences from cgroup v1.
You can only >> set cpusets on containers if the host has the controller enabled, >> though: >> >> # cat /sys/fs/cgroup/cgroup.subtree_control >> cpuset cpu io memory pids >> # podman run --memory=300M --cpuset-cpus=0,1 -ti -v `pwd`:/mnt fedora:30 /bin/bash >> [root@5d4a4e593a24 /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpuset.cpus >> 0-1 >> [root@5d4a4e593a24 mnt]# ./cgroupsv2-jdk/bin/java -XshowSettings:system -version >> Operating System Metrics: >> Provider: cgroupv2 >> Effective CPU Count: 2 >> CPU Period: 100000us >> CPU Quota: -1 >> CPU Shares: -1 >> List of Processors, 2 total: >> 0 1 >> List of Effective Processors, 2 total: >> 0 1 >> List of Memory Nodes: N/A >> List of Available Memory Nodes, 1 total: >> 0 >> Memory Limit: 300.00M >> Memory Soft Limit: Unlimited >> Memory & Swap Limit: 600.00M >> >> openjdk version "15-internal" 2020-09-15 >> OpenJDK Runtime Environment (build 15-internal+0-adhoc.sgehwolf.openjdk-head-2) >> OpenJDK 64-Bit Server VM (build 15-internal+0-adhoc.sgehwolf.openjdk-head-2, mixed mode, sharing) >> > > In my setup, your command above works correctly. The only thing that doesn't work is if I run the -XshowSettings > directly on the host OR I run a container without specifying --cpuset-cpus. The default setup seems wrong.
> >> >>> HOST OUTPUT: >>> ./java -XshowSettings:system -version >>> Operating System Metrics: >>> Provider: cgroupv2 >>> Effective CPU Count: 32 >>> CPU Period: -1 >>> CPU Quota: -1 >>> CPU Shares: -1 >>> List of Processors: N/A >>> List of Effective Processors: N/A >>> List of Memory Nodes: N/A >>> List of Available Memory Nodes: N/A >>> Memory Limit: Unlimited >>> Memory Soft Limit: Unlimited >>> Memory & Swap Limit: Unlimited >>> >>> CONTAINER OUTPUT: >>> # podman run -it -v `pwd`:/mnt ubuntu bash >>> root at 3c6654a3b834:/mnt/jdk/bin# ./java -XshowSettings:system -version >>> Operating System Metrics: >>> Provider: cgroupv2 >>> Effective CPU Count: 32 >>> CPU Period: 100000us >>> CPU Quota: -1 >>> CPU Shares: -1 >>> List of Processors: N/A >>> List of Effective Processors, 32 total: >>> 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 >>> List of Memory Nodes: N/A >>> List of Available Memory Nodes, 1 total: >>> 0 >>> Memory Limit: Unlimited >>> Memory Soft Limit: Unlimited >>> Memory & Swap Limit: Unlimited >> >> Yes, host and container output differ depending on the configured test >> system. I remember that I had cpusets working on the host system too at >> some point, but I forgot what the magic was to get this properly >> delegated via systemd. >> >>> Docker tests fail if /bin/docker is not available in podman setup. We >>> probably should enhance the docker check to also look for podman. >> >> Could you be more specific about this? How do you run docker tests? I >> use: >> >> -e PATH -verbose:summary -Djdk.test.container.command=podman -Djdk.test.docker.image.name=fedora -Djdk.test.docker.image.version=30 >> >> with -Djdk.test.container.command=podman it shouldn't need docker. Did >> you specify that property? > > I thought that most podman setups try to provide docker compatibility. I did not try using the properties. 
> In order to make our automation work cleaner, I was hoping that we could always just execute docker. > >> >>> Two container tests failed: >>> >>> FAILED: containers/cgroup/PlainRead.java failed Memory Limit is: -2 instead >>> of unlimited or -1. This is because memory.max is not found. >> >> This doesn't fail for me, because I've got memory.max present on host: >> >> # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cgroup.controllers >> cpu io memory pids >> [root@f31 sgehwolf]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/memory.max >> max >> >> Note, however, this is a hotspot test. So support for it for cgroup v2 >> came with JDK-8230305. It seems we should return -1 if memory.max is >> not found (rather than -2, not supported). Could you file a bug for this? It's >> unrelated to this change. It should be a simple fix. > > Here are my controllers: > cpuset cpu io memory pids rdma > > This seems like another case where this file doesn't exist in the path we form on the host. > > [0.035s][debug][os,container] /sys/fs/cgroup/user.slice/user-23603.slice/session-14.scope/cpu.max failed, No such file or directory > > It does exist here: /sys/fs/cgroup/user.slice so maybe it's a delegation issue. > >> >>> FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java >>> This fails because nr_periods line doesn't always exist. I think you've got >>> to enable a quota for this to appear (not sure). >> >> Passes for me, but it needs the cpu controller enabled on the test >> system.
>> >> # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cgroup.controllers >> cpu io memory pids >> # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' -f3)/cpu.stat >> usage_usec 1157537 >> user_usec 606832 >> system_usec 550705 >> nr_periods 0 >> nr_throttled 0 >> throttled_usec 0 >> >>> Here?s the contents: >>> % more cpu.stat >>> usage_usec 23974562755 >>> user_usec 22257183568 >>> system_usec 1717379186 >> >> It suggests you've got the cpu controller disabled: >> https://urldefense.com/v3/__https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html*cpu__;Iw!!GqivPVa7Brio!Ot7tv-QJgGntRyaimzIRrRh7rqvnzMu9AoOaBWINWNnnpR9LUVNNvGUe9GWZKkNZMA$ >> >> """ >> and the following three when the controller is enabled: >> >> >> nr_periods >> nr_throttled >> throttled_usec >> ?"" > > I do have the cpu controller enabled. I?ll look into your fixes for delegation issues and try again. > > Bob. > >> >>> CGROUPV2 results on Oracle Linux 8.1 >>> --------- >>> >>> testing container APIs >>> >>> FAILED: containers/cgroup/PlainRead.java >>> Passed: containers/docker/DockerBasicTest.java >>> Passed: containers/docker/TestCPUAwareness.java >>> Passed: containers/docker/TestCPUSets.java >>> Passed: containers/docker/TestJcmdWithSideCar.java >>> Passed: containers/docker/TestJFREvents.java >>> Passed: containers/docker/TestJFRNetworkEvents.java >>> Passed: containers/docker/TestMemoryAwareness.java >>> Passed: containers/docker/TestMisc.java >>> Test results: passed: 8; failed: 1 >>> Results written to /export/users/bobv/jdk15/build/jtreg/JTwork >>> Error: Some tests failed or other problems occurred. >> >> These are hotspot tests. Covered by JDK-8230305 (hotspot changes). The >> plain read test passes on a properly configured host system with >> controller delegation. 
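The `cpu.stat` difference discussed above (the `nr_periods`, `nr_throttled` and `throttled_usec` lines only appear when the cpu controller is enabled) is the kind of case a key lookup has to tolerate. Below is a minimal Java sketch of such a lookup that returns -1 when the key is absent or malformed; the class and method names are hypothetical and this is not the JDK's own implementation.

```java
// Hypothetical helper: looks up one "key value" entry in cpu.stat-style content.
final class CpuStat {
    static final long UNAVAILABLE = -1L;

    // content is the text of the file, one "key value" pair per line.
    static long value(String content, String key) {
        for (String line : content.split("\\R")) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens.length == 2 && tokens[0].equals(key)) {
                try {
                    return Long.parseLong(tokens[1]);
                } catch (NumberFormatException e) {
                    return UNAVAILABLE; // malformed value
                }
            }
        }
        return UNAVAILABLE; // key not present, e.g. cpu controller not enabled
    }
}
```

With the three-line `cpu.stat` shown for the failing host, a lookup of `nr_periods` under this sketch yields -1 instead of failing, while `usage_usec` still resolves to its recorded value.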
>> >>> testing jdk.internal.platform APIs >>> >>> FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java >>> Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java >>> Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java >>> Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java >>> Passed: jdk/internal/platform/docker/TestSystemMetrics.java >>> Test results: passed: 4; failed: 1 >>> Results written to /export/users/bobv/jdk15/build/jtreg/JTwork >>> Error: Some tests failed or other problems occurred. >> >> I believe cgroup/TestCgroupMetrics.java fails due to bad host config. >> It passes here: >> >> [root at f31 jdk-jdk]# rm -rf JTwork/ JTreport && /media/disk/jtreg/bin/jtreg -timeout:4 -jdk:../cgroupsv2-jdk/ -e PATH -verbose:summary -Djdk.test.container.command=podman -Djdk.test.docker.image.name=fedora -Djdk.test.docker.image.version=30 test/jdk/jdk/internal/platform >> Directory "JTwork" not found: creating >> Directory "JTreport" not found: creating >> Passed: jdk/internal/platform/cgroup/TestCgroupMetrics.java >> Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java >> Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java >> Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java >> Passed: jdk/internal/platform/docker/TestSystemMetrics.java >> Test results: passed: 5 >> Report written to /home/sgehwolf/jdk-jdk/JTreport/html/report.html >> Results written to /home/sgehwolf/jdk-jdk/JTwork >> >> JTR files available here: >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/jtreg-results/ >> >>> testing -XshowSettings:system launcher option >>> >>> Passed: tools/launcher/Settings.java >>> Test results: passed: 1 >> >> Thanks for running the tests! >> >> FWIW, jdk-submit came back clean. If we could get the initial support >> of this pushed soon it would be great. I'd be happy to fix any follow- >> up issues. >> >> Thanks, >> Severin >> >>> Bob. 
>>> >>>> On Feb 11, 2020, at 1:04 PM, Severin Gehwolf wrote: >>>> >>>> Hi Mandy, Bob, >>>> >>>> Thanks again for the reviews and patience on this. Sorry it took me so >>>> long to get back to this :-/ >>>> >>>> Updated webrev: >>>> Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/webrev/ >>>> incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/incremental/webrev/ >>>> >>>> I've tested this with docker tests on cgroup v1 and via podman on a >>>> cgroup v2 system. They pass. I'll be running this through jdk-submit as >>>> well. >>>> >>>> More below. >>>> >>>> On Tue, 2020-01-21 at 16:09 -0800, Mandy Chung wrote: >>>>> Hi Severin, >>>>> >>>>> Thanks for the update. >>>>> >>>>> On 1/21/20 11:30 AM, Severin Gehwolf wrote: >>>>>> Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/webrev/ >>>>>> incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/incremental/webrev/ >>>>>> >>>>> >>>>> I have answered my own question. Most of the metrics used to return 0 if unavailable due to IOException reading a file or malformed content and now are changed to return -2 due to error fetching the metric. >>>>> >>>>> The following are about limits which used to return -1 if unlimited or no limit set. >>>>> public long getCpuQuota(); >>>>> public long getCpuShares(); >>>>> public long getMemoryLimit(); >>>>> public long getMemoryAndSwapLimit(); >>>>> public long getMemorySoftLimit(); >>>>> >>>>> With this patch, only getMemoryLimit and getMemoryAndSwapLimit specify to return -1 if unlimited or no limit set. However the implementation does return -1. All of the above specify to return -2 if unavailable due to error fetching the metric. >>>>> >>>>> I found the implementation quite hard to follow. I spent some time reviewing the code to see if the implementation matches the spec but I can't easily tell yet. For example, >>>>> CgroupSubsystemController::getLongValueMatchingLine returns -1 when IOException occurs. 
>>>>> CgroupSubsystemController::getLongEntry returns 0L if IOException occurs. >>>>> >>>>> CgroupV1SubsystemController::convertStringToLong returns Long.MAX_VALUE >>>>> when value overflows >>>> >>>> This one is intentional. It's mapped back to unlimited via >>>> longValOrUnlimited(). The reason for this is that cgroup v1 doesn't >>>> have a concept of "unlimited". Unlimited values will be a very large >>>> numbers in cgroup v1 files. >>>> >>>>> CgroupV2SubsystemController::convertStringToLong returns -1 when IOException occurs >>>>> >>>>> CgroupV1Subsystem::getCpuShares return -1 if cpu.shares == 0 or 1024 >>>>> CgroupV2Subsystem::getCpuShares returns -1 if cpu.weight == 100 or 0 >>>> >>>> These two are special cases too. See the implementation note of >>>> Metrics.getCpuShares(). In the cgroup v2 case the default value is 100 >>>> (over 1024 in cgroup v1). That's why unlimited is being returned for >>>> those values. >>>> >>>>> CgroupV2Subsystem::getFromCpuMax returns -1 if error in reading cpu.max or malformed content >>>>> CgroupV2Subsystem::sumTokensIOStat returns -2 if IOException error >>>>> This is called by getBlkIOServiceCount and getBlkIOServiced >>>>> >>>>> I think this can be improved and add the documentation to describe >>>>> what the methods do. Since Metrics APIs consistently return -2 if >>>>> unavailable due to error in fetching the metric, why some utility >>>>> methods in *Subsystem and *SubsystemController return -1 upon error >>>>> and 0 when unlimited? >>>>> >>>>> I suspect if the getXXXValue and other methods are clearly documented >>>>> with the error cases (possibly renaming the method name if appropriate) >>>>> CgroupV1Subsystem and CgroupV2SubSystem will become very explicit >>>>> to understand. >>>> >>>> This should be fixed now. >>>> >>>> I've gone through the API doc of Metrics.java and have updated it. 
In >>>> general, I've updated it to return -1 if metric is unavailable (due to >>>> error in reading some files or content being empty), and -2 if not >>>> supported. No method returns -2 currently, but it might change and it's >>>> good to have some way of saying "not implementable" for this subsystem >>>> in the spec. That's my take on it anyway. >>>> >>>> There is also a new unit test for shared controller logic: >>>> TestCgroupSubsystemController.java >>>> >>>> It exercises various cases of error/success. >>>> >>>> That is to ensure proper symmetry across the various cases (including >>>> IOException). I've also documented static methods in >>>> CgroupSubsystemController. Overall, all methods now return the same >>>> values for cgroup v1 and cgroup v2 (given the impl nuances) for the >>>> various cases. >>>> >>>>> CgroupSubsystem.java >>>>> >>>>> 44 public static final double DOUBLE_RETVAL_NOT_SUPPORTED = LONG_RETVAL_NOT_SUPPORTED; >>>>> 49 public static final Boolean BOOL_RETVAL_NOT_SUPPORTED = null; >>>>> >>>>> They are no longer needed, right? >>>> >>>> Removed. >>>> >>>>> CgroupSubsystemFactory.java >>>>> >>>>> 89 System.err.println("Warning: Mixed cgroupv1 and cgroupv2 not supported. Metrics disabled."); >>>>> >>>>> >>>>> I expect this to be a System.Logger log >>>> >>>> Updated. >>>> >>>>> 114 if (!Integer.valueOf(0).toString().equals(tokens[0])) { >>>>> >>>>> This can be simplified to if (!"0".equals(tokens[0])) >>>> >>>> Done, thanks!
>>>> >>>>> LauncherHelper.java >>>>> >>>>> 407 // Extended cgroupv1 specific metrics >>>>> 408 if (c instanceof MetricsCgroupV1) { >>>>> 409 MetricsCgroupV1 cgroupV1 = (MetricsCgroupV1)c; >>>>> 410 limit = cgroupV1.getKernelMemoryLimit(); >>>>> 411 ostream.println(formatLimitString(limit, INDENT + "Kernel Memory Limit: ")); >>>>> 412 limit = cgroupV1.getTcpMemoryLimit(); >>>>> 413 ostream.println(formatLimitString(limit, INDENT + "TCP Memory Limit: ")); >>>>> 414 Boolean value = cgroupV1.isMemoryOOMKillEnabled(); >>>>> 415 ostream.println(formatBoolean(value, INDENT + "Out Of Memory Killer Enabled: ")); >>>>> 416 value = cgroupV1.isCpuSetMemoryPressureEnabled(); >>>>> 417 ostream.println(formatBoolean(value, INDENT + "CPUSet Memory Pressure Enabled: ")); >>>>> 418 } >>>>> >>>>> MetricsCgroupV1 is linux-only. It will fail the compilation when >>>>> building on non-linux. One option is to move this code to >>>>> src/java.base/linux/share/sun/launcher/CgroupMetrics.java >>>>> >>>>> Are they continued to be interesting metrics to be output from >>>>> -XshowSetting? I wonder if they can simply be dropped from the output. >>>>> Bob will have an opinion. >>>> >>>> I've removed those extra cgroup v1 specific metrics printed via >>>> -XshowSettings:system. Not sure what to do with MetricsCgroupV1. It's >>>> only used in tests in webrev 10. On the other hand the idea would be >>>> for consumers to downcast it to MetricsCgroupV1 if they needed those >>>> extra metrics. 
>>>> >>>> Thanks, >>>> Severin >>>> >> > From sgehwolf at redhat.com Wed Feb 12 17:44:53 2020 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Wed, 12 Feb 2020 18:44:53 +0100 Subject: [PING] RFR: 8231111: Cgroups v2: Rework Metrics in java.base so as to recognize unified hierarchy In-Reply-To: <98615975-1586-48CA-BDF9-5E2692B7B77B@oracle.com> References: <75fc377f8d5ca76b7dac02f55db640cbdd305633.camel@redhat.com> <4bf65380bc26cd3bf684d7994b33e66bcb87927b.camel@redhat.com> <6CACCC0D-7F5A-42A3-83F1-746497940CCA@oracle.com> <29544339574e34c4c25cbec0314c26f35e8d1a99.camel@redhat.com> <4CE7C7F6-ABFA-4263-98B2-32BBD5013A3C@oracle.com> <10b5e83bfb7e618e5f5906c8e707057ff8680785.camel@redhat.com> <4a304e2ce72a53859b4e9cc8b21db404a260b531.camel@redhat.com> <97e7ff2b-f2cc-a666-afb8-521c0f5c37e7@oracle.com> <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com> <436303A0-4D46-48BD-8EC3-23A7B213C6FA@oracle.com> <04BA2357-D5B3-4553-A369-D4790B225503@oracle.com> <98615975-1586-48CA-BDF9-5E2692B7B77B@oracle.com> Message-ID: <1186ee103e51be7434492c919efb16787a14b9cc.camel@redhat.com> Hi, On Wed, 2020-02-12 at 12:29 -0500, Bob Vandette wrote: > I applied the delegation change that you recommended and now all container tests pass. > > Here?s the change that I applied. > > echo +cpu +cpuset +io +memory +pids > /sys/fs/cgroup/cgroup.subtree_control > echo +cpu +cpuset +io +memory +pids > /sys/fs/cgroup/user.slice/cgroup.subtree_control > echo +cpu +cpuset +io +memory +pids > /sys/fs/cgroup/user.slice/user-23603.slice/cgroup.subtree_control > > Although all the tests now pass, the values for cpusets and cpusets.mems still do not show up on the host > or container unless limits are specified. That?s the only remaining minor difference from cgroupv1 that I can see. 
> > testing container APIs > Directory "JTwork" not found: creating > Passed: containers/cgroup/PlainRead.java > Passed: containers/docker/DockerBasicTest.java > Passed: containers/docker/TestCPUAwareness.java > Passed: containers/docker/TestCPUSets.java > Passed: containers/docker/TestJcmdWithSideCar.java > Passed: containers/docker/TestJFREvents.java > Passed: containers/docker/TestJFRNetworkEvents.java > Passed: containers/docker/TestMemoryAwareness.java > Passed: containers/docker/TestMisc.java > Test results: passed: 9 > Results written to /export/users/bobv/jdk15/build/jtreg/JTwork > testing jdk.internal.platform APIs > Passed: jdk/internal/platform/cgroup/TestCgroupMetrics.java > Passed: jdk/internal/platform/cgroup/TestCgroupSubsystemController.java > Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java > Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java > Passed: jdk/internal/platform/docker/TestSystemMetrics.java > Test results: passed: 5 > Results written to /export/users/bobv/jdk15/build/jtreg/JTwork > testing -XshowSettings:system launcher option > Passed: tools/launcher/Settings.java > Test results: passed: 1 > Results written to /export/users/bobv/jdk15/build/jtreg/JTwork Excellent, thanks! Mandy: Any more comments? Thanks in advance! Cheers, Severin > Bob. > > > On Feb 12, 2020, at 10:13 AM, Bob Vandette > > wrote: > > > > > > > On Feb 12, 2020, at 6:22 AM, Severin Gehwolf > > > wrote: > > > > > > Hi Bob, > > > > > > On Tue, 2020-02-11 at 16:59 -0500, Bob Vandette wrote: > > > > I applied your patch to the latest JDK 15 sources and ran the > > > > container > > > > tests on Oracle Linux 8.1 with podman/cgroupv2 enabled. There > > > > were some issues. > > > > I?m not sure if its my setup or not. > > > > > > Looking over them, they appear to be setup issues. > > > > > > Getting the setup working right was tricky for me as well. 
> > > > > > Aside: > > > podman has a --runtime switch which you could point to a cgroups > > > v2 > > > capable crun binary for example. > > > https://bugs.openjdk.java.net/browse/JDK-8230305 has some > > > examples of > > > how to use it. > > > > > > I also needed some extra setup to get the controllers' delegation > > > to > > > work: > > > https://urldefense.com/v3/__https://scrivano.org/2019/02/26/resources-management-with-rootless-containers/__;!!GqivPVa7Brio!Ot7tv-QJgGntRyaimzIRrRh7rqvnzMu9AoOaBWINWNnnpR9LUVNNvGUe9GXp0Ri0Wg$ > > > > > > > I also ran the same build on Ubuntu with docker/cgroupv1. I > > > > didn't see any failures on > > > > the cgroupv1 system. > > > > > > Great, thanks! > > > > > > > Here are some notes: > > > > > > > > The podman version on OL 8.1 doesn't yes support cgroupv2. I > > > > built the latest from sources for this test. > > > > > > > > The docker tests take a very long time under podman! Longer > > > > than > > > > the cgroupv1 run. > > > > > > Possibly. I haven't done any fair comparison of the two. > > > > It?s possible that it?s trying to access several container hubs. I > > may be getting some timeouts > > since I haven?t figured out how to specify proxies for > > podman. I?ve set a local http_proxy env variable > > but maybe that?s not working. > > > > > > cpusets and cpusets.mems are blank on host and if none are > > > > specified > > > > on podman/docker run command. On cgroupv1 they are host values > > > > if not > > > > specified for container. Effective cpusets and cpusets.mems > > > > are set properly > > > > in a container. > > > > > > Yes, there seems to be slight differences to cgroup v1. 
You can > > > only > > > set cpusets on containers if the host has the controller enabled, > > > though: > > > > > > # cat /sys/fs/cgroup/cgroup.subtree_control > > > cpuset cpu io memory pids > > > # podman run --memory=300M --cpuset-cpus=0,1 -ti -v `pwd`:/mnt > > > fedora:30 /bin/bash > > > [root at 5d4a4e593a24 /]# cat /sys/fs/cgroup$(cat /proc/self/cgroup > > > | cut -d':' -f3)/cpuset.cpus > > > 0-1 > > > [root at 5d4a4e593a24 mnt]# ./cgroupsv2-jdk/bin/java > > > -XshowSettings:system -version > > > Operating System Metrics: > > > Provider: cgroupv2 > > > Effective CPU Count: 2 > > > CPU Period: 100000us > > > CPU Quota: -1 > > > CPU Shares: -1 > > > List of Processors, 2 total: > > > 0 1 > > > List of Effective Processors, 2 total: > > > 0 1 > > > List of Memory Nodes: N/A > > > List of Available Memory Nodes, 1 total: > > > 0 > > > Memory Limit: 300.00M > > > Memory Soft Limit: Unlimited > > > Memory & Swap Limit: 600.00M > > > > > > openjdk version "15-internal" 2020-09-15 > > > OpenJDK Runtime Environment (build 15-internal+0- > > > adhoc.sgehwolf.openjdk-head-2) > > > OpenJDK 64-Bit Server VM (build 15-internal+0- > > > adhoc.sgehwolf.openjdk-head-2, mixed mode, sharing) > > > > > > > In my setup, your command above works correctly. The only thing > > that doesn't work is if I run the -XshowSettings > > directly on the host OR I run a container without specifying > > --cpuset-cpus. The default setup seems wrong.
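For anyone reproducing this setup, here is a minimal sketch of how one might check that the controllers mentioned above are enabled for child cgroups before running the container tests. It relies only on the fact, visible in the output quoted above, that cgroup.subtree_control holds a space-separated list of controller names. The class name SubtreeControlCheck and the method missingControllers are hypothetical and are not part of the patch under review.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, for illustration only.
public class SubtreeControlCheck {

    /** Returns the required controllers that do not appear in a subtree_control listing. */
    public static Set<String> missingControllers(String subtreeControl, List<String> required) {
        // The file content is a whitespace-separated list such as "cpuset cpu io memory pids".
        Set<String> enabled = new LinkedHashSet<>(Arrays.asList(subtreeControl.trim().split("\\s+")));
        Set<String> missing = new LinkedHashSet<>(required);
        missing.removeAll(enabled);
        return missing;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the unified hierarchy is mounted at /sys/fs/cgroup, as in this thread.
        Path file = Paths.get("/sys/fs/cgroup/cgroup.subtree_control");
        if (!Files.isReadable(file)) {
            System.out.println(file + " is not readable; is a cgroup v2 hierarchy mounted there?");
            return;
        }
        String content = new String(Files.readAllBytes(file));
        System.out.println("missing controllers: "
                + missingControllers(content, List.of("cpu", "cpuset", "io", "memory", "pids")));
    }
}
```

The same check would have to be repeated for each intermediate cgroup (for example the user.slice level) to cover the whole delegation chain; the sketch only looks at the root.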
> > > > > > HOST OUTPUT: > > > > ./java -XshowSettings:system -version > > > > Operating System Metrics: > > > > Provider: cgroupv2 > > > > Effective CPU Count: 32 > > > > CPU Period: -1 > > > > CPU Quota: -1 > > > > CPU Shares: -1 > > > > List of Processors: N/A > > > > List of Effective Processors: N/A > > > > List of Memory Nodes: N/A > > > > List of Available Memory Nodes: N/A > > > > Memory Limit: Unlimited > > > > Memory Soft Limit: Unlimited > > > > Memory & Swap Limit: Unlimited > > > > > > > > CONTAINER OUTPUT: > > > > # podman run -it -v `pwd`:/mnt ubuntu bash > > > > root at 3c6654a3b834:/mnt/jdk/bin# ./java -XshowSettings:system > > > > -version > > > > Operating System Metrics: > > > > Provider: cgroupv2 > > > > Effective CPU Count: 32 > > > > CPU Period: 100000us > > > > CPU Quota: -1 > > > > CPU Shares: -1 > > > > List of Processors: N/A > > > > List of Effective Processors, 32 total: > > > > 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 > > > > 24 25 26 27 28 29 30 31 > > > > List of Memory Nodes: N/A > > > > List of Available Memory Nodes, 1 total: > > > > 0 > > > > Memory Limit: Unlimited > > > > Memory Soft Limit: Unlimited > > > > Memory & Swap Limit: Unlimited > > > > > > Yes, host and container output differ depending on the configured > > > test > > > system. I remember that I had cpusets working on the host system > > > too at > > > some point, but I forgot what the magic was to get this properly > > > delegated via systemd. > > > > > > > Docker tests fail if /bin/docker is not available in podman > > > > setup. We > > > > probably should enhance the docker check to also look for > > > > podman. > > > > > > Could you be more specific about this? How do you run docker > > > tests? 
I > > > use: > > > > > > -e PATH -verbose:summary -Djdk.test.container.command=podman > > > -Djdk.test.docker.image.name=fedora > > > -Djdk.test.docker.image.version=30 > > > > > > with -Djdk.test.container.command=podman it shouldn't need > > > docker. Did > > > you specify that property? > > > > I thought that most podman setups try to provide docker > > compatibility. I did not try using the properties. > > In order to make our automation work cleaner, I was hoping that we > > could always just execute docker. > > > > > > Two container tests failed: > > > > > > > > FAILED: containers/cgroup/PlainRead.java failed Memory Limit > > > > is: -2 instead > > > > of unlimited or -1. This is because memory.max is not found. > > > > > > This doesn't fail for me, because I've got memory.max present on > > > host: > > > > > > # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' > > > -f3)/cgroup.controllers > > > cpu io memory pids > > > [root at f31 sgehwolf]# cat /sys/fs/cgroup$(cat /proc/self/cgroup | > > > cut -d':' -f3)/memory.max > > > max > > > > > > Note, however, this is a hotspot test. So support for it for > > > cgroup v2 > > > came with JDK-8230305. It seems we should return -1 if memory.max > > > is > > > not found (over -2, not supported). Could you file a bug for > > > this? It's > > > unrelated to this change. It should be a simple fix. > > > > Here are my controllers: > > cpuset cpu io memory pids rdma > > > > This seems like another case where this file doesn't exist in the > > path we form on the host. > > > > [0.035s][debug][os,container] /sys/fs/cgroup/user.slice/user- > > 23603.slice/session-14.scope/cpu.max failed, No such file or > > directory > > > > It does exist here: /sys/fs/cgroup/user.slice so maybe it's a > > delegation issue. > > > > > > FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java > > > > This fails because nr_periods line doesn't always exist.
I > > > > think you've got > > > > to enable a quota for this to appear (not sure). > > > > > > Passes for me, but it needs the cpu controller enabled on the > > > test > > > system. > > > > > > # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' > > > -f3)/cgroup.controllers > > > cpu io memory pids > > > # cat /sys/fs/cgroup$(cat /proc/self/cgroup | cut -d':' > > > -f3)/cpu.stat > > > usage_usec 1157537 > > > user_usec 606832 > > > system_usec 550705 > > > nr_periods 0 > > > nr_throttled 0 > > > throttled_usec 0 > > > > > > > Here's the contents: > > > > % more cpu.stat > > > > usage_usec 23974562755 > > > > user_usec 22257183568 > > > > system_usec 1717379186 > > > > > > It suggests you've got the cpu controller disabled: > > > https://urldefense.com/v3/__https://www.kernel.org/doc/html/latest/admin-guide/cgroup-v2.html*cpu__;Iw!!GqivPVa7Brio!Ot7tv-QJgGntRyaimzIRrRh7rqvnzMu9AoOaBWINWNnnpR9LUVNNvGUe9GWZKkNZMA$ > > > > > > """ > > > and the following three when the controller is enabled: > > > > > > > > > nr_periods > > > nr_throttled > > > throttled_usec > > > """ > > > > I do have the cpu controller enabled. I'll look into your fixes > > for delegation issues and try again. > > > > Bob. > > > > > > CGROUPV2 results on Oracle Linux 8.1 > > > > --------- > > > > > > > > testing container APIs > > > > > > > > FAILED: containers/cgroup/PlainRead.java > > > > Passed: containers/docker/DockerBasicTest.java > > > > Passed: containers/docker/TestCPUAwareness.java > > > > Passed: containers/docker/TestCPUSets.java > > > > Passed: containers/docker/TestJcmdWithSideCar.java > > > > Passed: containers/docker/TestJFREvents.java > > > > Passed: containers/docker/TestJFRNetworkEvents.java > > > > Passed: containers/docker/TestMemoryAwareness.java > > > > Passed: containers/docker/TestMisc.java > > > > Test results: passed: 8; failed: 1 > > > > Results written to /export/users/bobv/jdk15/build/jtreg/JTwork > > > > Error: Some tests failed or other problems occurred.
> > > > > > These are hotspot tests. Covered by JDK-8230305 (hotspot > > > changes). The > > > plain read test passes on a properly configured host system with > > > controller delegation. > > > > > > > testing jdk.internal.platform APIs > > > > > > > > FAILED: jdk/internal/platform/cgroup/TestCgroupMetrics.java > > > > Passed: > > > > jdk/internal/platform/cgroup/TestCgroupSubsystemController.java > > > > Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java > > > > Passed: > > > > jdk/internal/platform/docker/TestDockerMemoryMetrics.java > > > > Passed: jdk/internal/platform/docker/TestSystemMetrics.java > > > > Test results: passed: 4; failed: 1 > > > > Results written to /export/users/bobv/jdk15/build/jtreg/JTwork > > > > Error: Some tests failed or other problems occurred. > > > > > > I believe cgroup/TestCgroupMetrics.java fails due to bad host > > > config. > > > It passes here: > > > > > > [root at f31 jdk-jdk]# rm -rf JTwork/ JTreport && > > > /media/disk/jtreg/bin/jtreg -timeout:4 -jdk:../cgroupsv2-jdk/ -e > > > PATH -verbose:summary -Djdk.test.container.command=podman > > > -Djdk.test.docker.image.name=fedora > > > -Djdk.test.docker.image.version=30 test/jdk/jdk/internal/platform > > > Directory "JTwork" not found: creating > > > Directory "JTreport" not found: creating > > > Passed: jdk/internal/platform/cgroup/TestCgroupMetrics.java > > > Passed: > > > jdk/internal/platform/cgroup/TestCgroupSubsystemController.java > > > Passed: jdk/internal/platform/docker/TestDockerCpuMetrics.java > > > Passed: jdk/internal/platform/docker/TestDockerMemoryMetrics.java > > > Passed: jdk/internal/platform/docker/TestSystemMetrics.java > > > Test results: passed: 5 > > > Report written to /home/sgehwolf/jdk- > > > jdk/JTreport/html/report.html > > > Results written to /home/sgehwolf/jdk-jdk/JTwork > > > > > > JTR files available here: > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/jtreg-results/ > > > > > > > testing -XshowSettings:system 
launcher option > > > > > > > > Passed: tools/launcher/Settings.java > > > > Test results: passed: 1 > > > > > > Thanks for running the tests! > > > > > > FWIW, jdk-submit came back clean. If we could get the initial > > > support > > > of this pushed soon it would be great. I'd be happy to fix any > > > follow- > > > up issues. > > > > > > Thanks, > > > Severin > > > > > > > Bob. > > > > > > > > > On Feb 11, 2020, at 1:04 PM, Severin Gehwolf < > > > > > sgehwolf at redhat.com> wrote: > > > > > > > > > > Hi Mandy, Bob, > > > > > > > > > > Thanks again for the reviews and patience on this. Sorry it > > > > > took me so > > > > > long to get back to this :-/ > > > > > > > > > > Updated webrev: > > > > > Full: > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/webrev/ > > > > > incremental: > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/incremental/webrev/ > > > > > > > > > > I've tested this with docker tests on cgroup v1 and via > > > > > podman on a > > > > > cgroup v2 system. They pass. I'll be running this through > > > > > jdk-submit as > > > > > well. > > > > > > > > > > More below. > > > > > > > > > > On Tue, 2020-01-21 at 16:09 -0800, Mandy Chung wrote: > > > > > > Hi Severin, > > > > > > > > > > > > Thanks for the update. > > > > > > > > > > > > On 1/21/20 11:30 AM, Severin Gehwolf wrote: > > > > > > > Full: > > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/webrev/ > > > > > > > incremental: > > > > > > > http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/incremental/webrev/ > > > > > > > > > > > > > > > > > > > I have answered my own question. Most of the metrics used > > > > > > to return 0 if unavailable due to IOException reading a > > > > > > file or malformed content and now are changed to return -2 > > > > > > due to error fetching the metric. > > > > > > > > > > > > The following are about limits which used to return -1 if > > > > > > unlimited or no limit set. 
> > > > > > public long getCpuQuota(); > > > > > > public long getCpuShares(); > > > > > > public long getMemoryLimit(); > > > > > > public long getMemoryAndSwapLimit(); > > > > > > public long getMemorySoftLimit(); > > > > > > > > > > > > With this patch, only getMemoryLimit and > > > > > > getMemoryAndSwapLimit specify to return -1 if unlimited or > > > > > > no limit set. However the implementation does return > > > > > > -1. All of the above specify to return -2 if unavailable > > > > > > due to error fetching the metric. > > > > > > > > > > > > I found the implementation quite hard to follow. I spent > > > > > > some time reviewing the code to see if the implementation > > > > > > matches the spec but I can't easily tell yet. For > > > > > > example, > > > > > > CgroupSubsystemController::getLongValueMatchingLine returns > > > > > > -1 when IOException occurs. > > > > > > CgroupSubsystemController::getLongEntry returns 0L if > > > > > > IOException occurs. > > > > > > > > > > > > CgroupV1SubsystemController::convertStringToLong returns > > > > > > Long.MAX_VALUE > > > > > > when value overflows > > > > > > > > > > This one is intentional. It's mapped back to unlimited via > > > > > longValOrUnlimited(). The reason for this is that cgroup v1 > > > > > doesn't > > > > > have a concept of "unlimited". Unlimited values will be a > > > > > very large > > > > > numbers in cgroup v1 files. > > > > > > > > > > > CgroupV2SubsystemController::convertStringToLong returns -1 > > > > > > when IOException occurs > > > > > > > > > > > > CgroupV1Subsystem::getCpuShares return -1 if cpu.shares == > > > > > > 0 or 1024 > > > > > > CgroupV2Subsystem::getCpuShares returns -1 if cpu.weight == > > > > > > 100 or 0 > > > > > > > > > > These two are special cases too. See the implementation note > > > > > of > > > > > Metrics.getCpuShares(). In the cgroup v2 case the default > > > > > value is 100 > > > > > (over 1024 in cgroup v1). 
That's why unlimited is being > > > > > returned for > > > > > those values. > > > > > > > > > > > CgroupV2Subsystem::getFromCpuMax returns -1 if error in > > > > > > reading cpu.max or malformed content > > > > > > CgroupV2Subsystem::sumTokensIOStat returns -2 if > > > > > > IOException error > > > > > > This is called by getBlkIOServiceCount and getBlkIOServiced > > > > > > > > > > > > I think this can be improved and add the documentation to > > > > > > describe > > > > > > what the methods do. Since Metrics APIs consistently > > > > > > return -2 if > > > > > > unavailable due to error in fetching the metric, why some > > > > > > utility > > > > > > methods in *Subsystem and *SubsystemController return -1 > > > > > > upon error > > > > > > and 0 when unlimited? > > > > > > > > > > > > I suspect if the getXXXValue and other methods are clearly > > > > > > documented > > > > > > with the error cases (possibly renaming the method name if > > > > > > appropriate) > > > > > > CgroupV1Subsystem and CgroupV2Subsystem will become very > > > > > > explicit > > > > > > to understand. > > > > > > > > > > This should be fixed now. > > > > > > > > > > I've gone through the API doc of Metrics.java and have > > > > > updated it. In > > > > > general, I've updated it to return -1 if metric is > > > > > unavailable (due to > > > > > error in reading some files or content being empty), and -2 > > > > > if not > > > > > supported. No method returns -2 currently, but it might > > > > > change and it's > > > > > good to have some way of saying "not implementable" for this > > > > > subsystem > > > > > in the spec. That's my take on it anyway. > > > > > > > > > > There is also a new unit test for shared controller logic: > > > > > TestCgroupSubsystemController.java > > > > > > > > > > It exercises various cases of error/success. > > > > > > > > > > That is to ensure proper symmetry across the various cases > > > > > (including > > > > > IOException).
I've also documented static methods in > > > > > CgroupSubsystemController. Overall, all methods now return > > > > > the same > > > > > values for cgroup v1 and cgroup v2 (given the impl nuances) > > > > > for the > > > > > various cases. > > > > > > > > > > > CgroupSubsystem.java > > > > > > > > > > > > 44 public static final double > > > > > > DOUBLE_RETVAL_NOT_SUPPORTED = LONG_RETVAL_NOT_SUPPORTED; > > > > > > 49 public static final Boolean > > > > > > BOOL_RETVAL_NOT_SUPPORTED = null; > > > > > > > > > > > > They are no longer needed, right? > > > > > > > > > > Removed. > > > > > > > > > > > CgroupSubsystemFactory.java > > > > > > > > > > > > 89 System.err.println("Warning: Mixed cgroupv1 > > > > > > and cgroupv2 not supported. Metrics disabled."); > > > > > > > > > > > > > > > > > > I expect this be a System.Logger log > > > > > > > > > > Updated. > > > > > > > > > > > 114 if > > > > > > (!Integer.valueOf(0).toString().equals(tokens[0])) { > > > > > > > > > > > > This can be simplified to if (!"0".equals(tokens[0])) > > > > > > > > > > Done, thanks! 
> > > > > > > > > > > LauncherHelper.java > > > > > > > > > > > > 407 // Extended cgroupv1 specific metrics > > > > > > 408 if (c instanceof MetricsCgroupV1) { > > > > > > 409 MetricsCgroupV1 cgroupV1 = > > > > > > (MetricsCgroupV1)c; > > > > > > 410 limit = cgroupV1.getKernelMemoryLimit(); > > > > > > 411 ostream.println(formatLimitString(limit, > > > > > > INDENT + "Kernel Memory Limit: ")); > > > > > > 412 limit = cgroupV1.getTcpMemoryLimit(); > > > > > > 413 ostream.println(formatLimitString(limit, > > > > > > INDENT + "TCP Memory Limit: ")); > > > > > > 414 Boolean value = > > > > > > cgroupV1.isMemoryOOMKillEnabled(); > > > > > > 415 ostream.println(formatBoolean(value, INDENT > > > > > > + "Out Of Memory Killer Enabled: ")); > > > > > > 416 value = > > > > > > cgroupV1.isCpuSetMemoryPressureEnabled(); > > > > > > 417 ostream.println(formatBoolean(value, INDENT > > > > > > + "CPUSet Memory Pressure Enabled: ")); > > > > > > 418 } > > > > > > > > > > > > MetricsCgroupV1 is linux-only. It will fail the > > > > > > compilation when > > > > > > building on non-linux. One option is to move this code to > > > > > > src/java.base/linux/share/sun/launcher/CgroupMetrics.java > > > > > > > > > > > > Are they continued to be interesting metrics to be output > > > > > > from > > > > > > -XshowSetting? I wonder if they can simply be dropped from > > > > > > the output. > > > > > > Bob will have an opinion. > > > > > > > > > > I've removed those extra cgroup v1 specific metrics printed > > > > > via > > > > > -XshowSettings:system. Not sure what to do with > > > > > MetricsCgroupV1. It's > > > > > only used in tests in webrev 10. On the other hand the idea > > > > > would be > > > > > for consumers to downcast it to MetricsCgroupV1 if they > > > > > needed those > > > > > extra metrics. 
> > > > > > > > > > Thanks, > > > > > Severin > > > > > From chris.plummer at oracle.com Wed Feb 12 19:02:19 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 12 Feb 2020 11:02:19 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: <3a21aa8f-8081-da59-cb10-87e79f81e2e3@oracle.com> References: <16754f9b-61a3-15d6-7cb6-8e9895662657@oracle.com> <979CFEC7-538E-46EE-988F-1935AFB8AEFA@oracle.com> <5bddcc55-5340-bea4-8cbb-1338c121d6d4@oracle.com> <86C876E3-6005-4ADE-9224-DEF6A2908F26@oracle.com> <79c65fd4-166e-98de-9327-ede4e9a20421@oracle.com> <3a21aa8f-8081-da59-cb10-87e79f81e2e3@oracle.com> Message-ID: Hi David, > What you have below is a mix of #2 and #3 - you convert to a generic > exception but also re-assert the interrupt state. That's a little > unusual. That what I also thought which is why I was suggesting not doing the interrupt() call and only throwing the RuntimeException. I agree that doing both does not make sense, and in general doing (3) does not make sense if the caller is not setup to properly check the interrupt state. Chris On 2/12/20 5:58 AM, David Holmes wrote: > Hi Chris, > > I think you are overthinking this. :) > > What you have observed is that the code that actually uses this method > does not utilise interrupts, or expect them, so if you artifically > inject one in this library method then you see things failing in > unexpected ways. That also means that if the thread was interrupted by > some other piece of logic then it would also fail in unexpected ways. > That doesn't negate your choice to re-assert the interrupt state. > > From a library writing perspective if you have a method that performs > a blocking call that can throw InterruptedException then you generally > have three choices: > > 1. Throw InterruptedException yourself and pass the buck to your callers. > 2. 
Convert the InterruptedException to a more general failure > exception - typically an unchecked RuntimeException - for which > interruption is but one possible cause; or > 3. Catch the InterruptedException and allow the method to complete > normally (i.e. not by throwing an exception) but re-assert the > interrupt state so that a caller checking for interruption will still > see that it occurred. > > What you have below is a mix of #2 and #3 - you convert to a generic > exception but also re-assert the interrupt state. That's a little > unusual. > > David > > > On 12/02/2020 6:16 pm, Chris Plummer wrote: >> Hi Igor, >> >> I think it might be best to leave the interrupt() call out. I wanted to see >> what would happen if we ever got an InterruptedException, so I added >> the following to the start of Platform.shouldSAAttach(): >> >> try { >> throw new InterruptedException(); >> } catch (InterruptedException e) { >> Thread.currentThread().interrupt(); >> throw new RuntimeException(e); >> } >> >> At the start of the test run, before any tests are actually run, I >> see the following: >> >> failed to get value for vm.hasSAandCanAttach >> java.lang.RuntimeException: java.lang.InterruptedException >> at jdk.test.lib.Platform.shouldSAAttach(Platform.java:300) >> at requires.VMProps.vmHasSAandCanAttach(VMProps.java:327) >> at requires.VMProps$SafeMap.put(VMProps.java:69) >> at requires.VMProps.call(VMProps.java:101) >> at requires.VMProps.call(VMProps.java:57) >> at >> com.sun.javatest.regtest.agent.GetJDKProperties.run(GetJDKProperties.java:80) >> at >> com.sun.javatest.regtest.agent.GetJDKProperties.main(GetJDKProperties.java:54) >> Caused by: java.lang.InterruptedException >> at jdk.test.lib.Platform.shouldSAAttach(Platform.java:297) >> ... 6 more >> >> This seems reasonable. >> >> For each test that checks vm.hasSAandCanAttach I also see: >> >> TEST RESULT: Error.
Error evaluating expression: >> vm.hasSAandCanAttach: java.lang.RuntimeException: >> java.lang.InterruptedException >> >> This too seems reasonable. >> >> For tests that don't check vm.hasSAandCanAttach, but instead make a >> runtime check that calls Platform.shouldSAAttach(), the test fails with: >> >> java.lang.IllegalThreadStateException: process hasn't exited >> ???? at java.base/java.lang.ProcessImpl.exitValue(ProcessImpl.java:500) >> ???? at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:380) >> ???? at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:433) >> ???? at ClhsdbAttach.main(ClhsdbAttach.java:77) >> ???? at >> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native >> Method) >> ???? at >> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> ???? at >> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> ???? at java.base/java.lang.reflect.Method.invoke(Method.java:564) >> ???? at >> com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) >> ???? at java.base/java.lang.Thread.run(Thread.java:832) >> >> This is a confusing way to fail. The reason it fails this way is >> because stopApp() first calls waitAppTerminiate(), which does the >> following: >> >> ???? public void waitAppTerminate() { >> ???????? // This code is modeled after tail end of >> ProcessTools.getOutput(). >> ???????? try { >> ???????????? appProcess.waitFor(); >> ???????????? outPumperThread.join(); >> ???????????? errPumperThread.join(); >> ???????? } catch (InterruptedException e) { >> ???????????? Thread.currentThread().interrupt(); >> ???????????? // pass >> ???????? } >> ???? } >> >> I added an e.printStackTrace() call and see the following: >> >> java.lang.InterruptedException >> ???? at java.base/java.lang.Object.wait(Native Method) >> ???? at java.base/java.lang.Object.wait(Object.java:321) >> ???? 
at java.base/java.lang.ProcessImpl.waitFor(ProcessImpl.java:474) >> at >> jdk.test.lib.apps.LingeredApp.waitAppTerminate(LingeredApp.java:239) >> at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:380) >> at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:434) >> >> So the earlier call to interrupt() is resulting in waitAppTerminate() >> not actually waiting for exit. This then results in stopApp() getting >> IllegalThreadStateException when calling Process.exitValue(). >> >> If I comment out the call to interrupt() in >> Platform.shouldSAAttach(), I think the failure stack trace is much >> better: >> >> java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: >> java.lang.InterruptedException >> at ClhsdbAttach.main(ClhsdbAttach.java:75) >> at >> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native >> Method) >> at >> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >> at >> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >> at java.base/java.lang.reflect.Method.invoke(Method.java:564) >> at >> com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) >> at java.base/java.lang.Thread.run(Thread.java:832) >> Caused by: java.lang.RuntimeException: java.lang.InterruptedException >> at jdk.test.lib.Platform.shouldSAAttach(Platform.java:300) >> at ClhsdbLauncher.run(ClhsdbLauncher.java:199) >> at ClhsdbAttach.main(ClhsdbAttach.java:71) >> ... 6 more >> Caused by: java.lang.InterruptedException >> at jdk.test.lib.Platform.shouldSAAttach(Platform.java:297) >> ... 8 more >> >> There's still a minor issue with rethrowing the RuntimeException >> encapsulated inside another RuntimeException.
That's the fault of the >> test which is catching all Exceptions and encapsulating them in a >> RuntimeException, even if the Exception itself is already a >> RuntimeException. It should have a catch clause for >> RuntimeException, and just rethrow it without encapsulating it. All >> the Clhsdb tests seem to do this, so that's about 20 places to fix. >> Probably not worth doing unless some other cleanup is being done at >> the same time. >> >> Chris >> >> On 2/11/20 10:30 PM, Igor Ignatyev wrote: >>> I'd say yes, it's better to still call Thread::interrupt. >>> >>> -- Igor >>> >>>> On Feb 11, 2020, at 10:19 PM, Chris Plummer wrote: >>>> >>>> Ok. Should I still call interrupt()? >>>> >>>> Chris >>>> >>>> On 2/11/20 10:07 PM, Igor Ignatyev wrote: >>>>> Hi Chris, >>>>> >>>>> that's a common practice for any kind of library-ish code, if >>>>> there is no explicit check of interrupt status, it will be >>>>> checked by the next operation which might be interrupted. In this >>>>> particular case, I agree rethrowing it as an unchecked exception >>>>> might be a good alternative. >>>>> >>>>> -- Igor >>>>> >>>>>> On Feb 11, 2020, at 10:03 PM, Chris Plummer wrote: >>>>>> >>>>>> Hi Igor, >>>>>> >>>>>> I guess I fail to see the benefit of this. Who is going to check >>>>>> the interrupt status of this thread and do something meaningful >>>>>> with it? It seems we would want to immediately propagate the >>>>>> failure by throwing a RuntimeException. This will work well when >>>>>> called from a test since this is a common way to fail a test. The >>>>>> other use of this code is by VMProps.vmHasSAandCanAttach(). It >>>>>> looks like if a RuntimeException is thrown the right thing will >>>>>> happen when SafeMap.put() catches the exception (it catches all >>>>>> Throwables).
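To make the alternatives discussed in this thread concrete, here is a minimal, self-contained sketch. It is not the code from the webrev: the class InterruptHandlingSketch and its methods are hypothetical, and Thread.sleep() stands in for the blocking Process.waitFor() call. One method wraps the InterruptedException in a RuntimeException only; the other completes normally and re-asserts the interrupt state. The main method also shows the effect reported for waitAppTerminate(): once the flag has been re-asserted, the next blocking call on the same thread is interrupted immediately.

```java
// Hypothetical illustration, not the Platform.java code under review.
public class InterruptHandlingSketch {

    /** Report the interruption as a generic unchecked failure; the interrupt flag stays cleared. */
    public static void blockAndWrap(long millis) {
        try {
            Thread.sleep(millis); // stand-in for a blocking call such as Process.waitFor()
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    /** Complete normally, but re-assert the interrupt state for callers that check it. */
    public static boolean blockAndRestore(long millis) {
        try {
            Thread.sleep(millis);
            return true;  // the blocking call ran to completion
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false; // interrupted; the flag is set again for the caller
        }
    }

    public static void main(String[] args) {
        // Simulate a pending interrupt, then use the wrapping variant.
        Thread.currentThread().interrupt();
        try {
            blockAndWrap(50);
        } catch (RuntimeException e) {
            System.out.println("cause is InterruptedException: "
                    + (e.getCause() instanceof InterruptedException));
        }
        System.out.println("flag after wrapping: " + Thread.currentThread().isInterrupted());

        // Same situation with the restoring variant.
        Thread.currentThread().interrupt();
        System.out.println("first call completed: " + blockAndRestore(50));
        // The flag is still set, so a later blocking call is interrupted as well.
        System.out.println("second call completed: " + blockAndRestore(50));

        Thread.interrupted(); // clear the flag before exiting
    }
}
```

Restoring the flag only helps if some caller eventually checks it; otherwise it surfaces later in an unrelated blocking call, which is the confusing failure mode described above.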
>>>>>> >>>>>> Chris >>>>>> >>>>>> On 2/11/20 7:12 PM, Igor Ignatev wrote: >>>>>>> rather like this : >>>>>>> >>>>>>>> } catch (InterruptedException e) { >>>>>>>> ?Thread.currentThread().interrupt(); >>>>>>>> ? ?return false; // assume not signed >>>>>>>> } >>>>>>> >>>>>>> ? Igor >>>>>>> >>>>>>>> On Feb 11, 2020, at 6:15 PM, Chris Plummer >>>>>>>> > >>>>>>>> wrote: >>>>>>>> >>>>>>>> ? >>>>>>>> Like this? >>>>>>>> >>>>>>>> ??????? } catch (InterruptedException e) { >>>>>>>> Thread.currentThread().interrupt(); >>>>>>>> ??????????? throw new RuntimeException(e); >>>>>>>> ??????? } >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 2/11/20 2:23 PM, Igor Ignatyev wrote: >>>>>>>>> no, I meant to call Thread.currentThread().interrupt(), >>>>>>>>> calling that will restore interrupted state of the thread, so >>>>>>>>> an user of Platform class will be able to response to it >>>>>>>>> appropriately, w/ your current code, the fact that the thread >>>>>>>>> was interrupted will be missed, and in most cases it is not >>>>>>>>> right thing to do. >>>>>>>>> >>>>>>>>> -- Igor >>>>>>>>> >>>>>>>>>> On Feb 11, 2020, at 2:02 PM, Chris Plummer >>>>>>>>>> > >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Hi Igor, >>>>>>>>>> >>>>>>>>>> I'm not sure what you mean by restore the interrupt state. Do >>>>>>>>>> you mean loop back to the waitFor() call? >>>>>>>>>> >>>>>>>>>> thanks, >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 2/11/20 1:55 PM, Igor Ignatyev wrote: >>>>>>>>>>> Hi Chris, >>>>>>>>>>> >>>>>>>>>>> I don't insist on (3), so I'm fine if you don't want to >>>>>>>>>>> change that part. 
one thing I'd change though is to restore >>>>>>>>>>> thread interrupted state at L#266 of Platform.java (no need >>>>>>>>>>> to publish new webrev) >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> -- Igor >>>>>>>>>>> >>>>>>>>>>>> On Feb 11, 2020, at 1:49 PM, Chris Plummer wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi Igor, >>>>>>>>>>>> >>>>>>>>>>>> Here's an updated webrev: >>>>>>>>>>>> >>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.01/index.html >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I rebased to JDK 15 and made all the changes you suggested >>>>>>>>>>>> except for (3). I did not think it is necessary since the >>>>>>>>>>>> code is only executed on OSX. However, if you still feel >>>>>>>>>>>> allowing flexibility in the path separator is important, I >>>>>>>>>>>> can add that change too. >>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 2/10/20 1:34 PM, Igor Ignatyev wrote: >>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>> >>>>>>>>>>>>> in general it all looks good, I have a few comments (most >>>>>>>>>>>>> of them are editorial): >>>>>>>>>>>>> in Platform.java: >>>>>>>>>>>>> 1. you have a doubled space at line#238 (b/w boolean >>>>>>>>>>>>> and isSignedOSX) >>>>>>>>>>>>> 2. as FileNotFoundException is an IOException, there is no >>>>>>>>>>>>> need to declare the former in the signature of isSignedOSX >>>>>>>>>>>>> 3. it's better to pass jdkPath, "bin" and "java" as >>>>>>>>>>>>> separate arguments to Path.get, so the code won't depend >>>>>>>>>>>>> on file separator >>>>>>>>>>>>> 4.
>>>>>>>>>>>>> you are waiting for codesign to finish w/o reading its
>>>>>>>>>>>>> stdout / stderr, which might lead to a deadlock (if codesign
>>>>>>>>>>>>> exhausts the IO buffer before exiting), so you need to
>>>>>>>>>>>>> either create two separate threads to read stdout and stderr,
>>>>>>>>>>>>> or redirect these streams to files and read these
>>>>>>>>>>>>> files afterwards, or just ignore stdout/stderr by using
>>>>>>>>>>>>> Redirect.DISCARD. I'd personally recommend the last option as
>>>>>>>>>>>>> the result of codesign can be reliably deduced from its
>>>>>>>>>>>>> exit code (0 - signed, 1 - verification failed, 2 - wrong
>>>>>>>>>>>>> arguments, 3 - not all requirements from R are satisfied)
>>>>>>>>>>>>> and using stdout/stderr is somewhat fragile as there is no
>>>>>>>>>>>>> guarantee the output format won't be changed.
>>>>>>>>>>>>>
>>>>>>>>>>>>> the rest looks good to me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> -- Igor
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Feb 10, 2020, at 11:48 AM, Chris Plummer wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ping #2. It's not that hard of a review. Most of it is
>>>>>>>>>>>>>> the new Platform.isSignedOSX() method, which is well
>>>>>>>>>>>>>> commented and pretty straightforward.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 2/4/20 5:04 PM, Chris Plummer wrote:
>>>>>>>>>>>>>>> Ping!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> And I decided to push to 15 instead of 14. Will backport
>>>>>>>>>>>>>>> to 14 eventually.
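A minimal sketch of the option Igor recommends in point 4 above: discard the child's stdout and stderr so it can never block on a full pipe, and decide from the exit code alone. The names QuietProcess, runQuietly and javaLauncher are made up for illustration and are not taken from the webrev; ProcessBuilder.Redirect.DISCARD is the standard API (Java 9 and later). The launcher path is built from separate components, in the spirit of point 3.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, for illustration only.
final class QuietProcess {
    // Runs the command with stdout and stderr discarded and returns its exit code.
    // Because nothing is piped back to the parent, the child cannot stall on a
    // full output buffer while the parent sits in waitFor().
    static int runQuietly(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);
        pb.redirectError(ProcessBuilder.Redirect.DISCARD);
        return pb.start().waitFor();
    }

    // Path to the java launcher of the running JDK, built without hard-coding
    // a file separator.
    static Path javaLauncher() {
        return Paths.get(System.getProperty("java.home"), "bin", "java");
    }
}
```

For the codesign case the command list would be whatever invocation Platform.isSignedOSX() uses; its exact arguments are not shown in this thread, so they are not reproduced here.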
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 1/30/20 10:20 PM, Chris Plummer wrote:
>>>>>>>>>>>>>>>> Yes, you are correct:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196
>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 1/30/20 10:13 PM, Igor Ignatyev wrote:
>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 seems
>>>>>>>>>>>>>>>>> to be a webrev from another issue, should it have
>>>>>>>>>>>>>>>>> been http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00/ ?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -- Igor
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Jan 30, 2020, at 10:10 PM, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please review the following fix for some SA tests
>>>>>>>>>>>>>>>>>> that are failing on Mac OS X 10.14.5 and later:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The issue is that SA can't attach to a signed binary
>>>>>>>>>>>>>>>>>> starting with 10.14.5. There is no workaround for
>>>>>>>>>>>>>>>>>> this, so these tests are being disabled when it is
>>>>>>>>>>>>>>>>>> detected that the binary is signed and we are running
>>>>>>>>>>>>>>>>>> on 10.14 or later (I chose all 10.14 releases to
>>>>>>>>>>>>>>>>>> simplify the check).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Some background may help explain the fix.
>>>>>>>>>>>>>>>>>> In order
>>>>>>>>>>>>>>>>>> for SA to attach to a live process (not a core file)
>>>>>>>>>>>>>>>>>> on OSX, either the attaching process (i.e. the test)
>>>>>>>>>>>>>>>>>> has to be run as root, or sudo needs to be supported.
>>>>>>>>>>>>>>>>>> However, the only tests that make the sudo check are
>>>>>>>>>>>>>>>>>> the 20 or so that use ClhsdbLauncher. The rest all
>>>>>>>>>>>>>>>>>> rely on "@requires vm.hasSAandCanAttach" to filter
>>>>>>>>>>>>>>>>>> out tests that use SA attach. vm.hasSAandCanAttach
>>>>>>>>>>>>>>>>>> only checks if the test is being run as root. Thus
>>>>>>>>>>>>>>>>>> all our non-ClhsdbLauncher tests that SA attach to a
>>>>>>>>>>>>>>>>>> live process are currently not run unless they are
>>>>>>>>>>>>>>>>>> run as root. 8238268 [1] has been filed to address
>>>>>>>>>>>>>>>>>> this, making it so all the tests will attempt to use
>>>>>>>>>>>>>>>>>> sudo if not run as root.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Because of the difference in how ClhsdbLauncher tests
>>>>>>>>>>>>>>>>>> and "@requires vm.hasSAandCanAttach" tests check to
>>>>>>>>>>>>>>>>>> see if they are runnable, this fix needs to address
>>>>>>>>>>>>>>>>>> both types of checks. The common code for both these
>>>>>>>>>>>>>>>>>> cases is Platform.shouldSAAttach(), which on OSX
>>>>>>>>>>>>>>>>>> basically equates to checking whether we are running
>>>>>>>>>>>>>>>>>> as root. I changed it to also return false if running
>>>>>>>>>>>>>>>>>> on a signed binary with 10.14 or later. However, this
>>>>>>>>>>>>>>>>>> confused the ClhsdbLauncher use of
>>>>>>>>>>>>>>>>>> Platform.shouldSAAttach() somewhat, since it assumed
>>>>>>>>>>>>>>>>>> a false result only happens because you are not
>>>>>>>>>>>>>>>>>> running as root (in which case it would then check if
>>>>>>>>>>>>>>>>>> sudo will work). So ClhsdbLauncher now has to double
>>>>>>>>>>>>>>>>>> check that the false result was not because of
>>>>>>>>>>>>>>>>>> running a signed binary.
>>>>>>>>>>>>>>>>>> If it is signed, it won't do
>>>>>>>>>>>>>>>>>> the sudo check. This will get cleaned up with 8238268
>>>>>>>>>>>>>>>>>> [1], which will move the sudo check into
>>>>>>>>>>>>>>>>>> Platform.shouldSAAttach().
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238268
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>

From mandy.chung at oracle.com  Wed Feb 12 19:32:00 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Wed, 12 Feb 2020 11:32:00 -0800
Subject: [PING] RFR: 8231111: Cgroups v2: Rework Metrics in java.base so as to recognize unified hierarchy
In-Reply-To: <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com>
References: <75fc377f8d5ca76b7dac02f55db640cbdd305633.camel@redhat.com>
 <4bf65380bc26cd3bf684d7994b33e66bcb87927b.camel@redhat.com>
 <6CACCC0D-7F5A-42A3-83F1-746497940CCA@oracle.com>
 <29544339574e34c4c25cbec0314c26f35e8d1a99.camel@redhat.com>
 <4CE7C7F6-ABFA-4263-98B2-32BBD5013A3C@oracle.com>
 <10b5e83bfb7e618e5f5906c8e707057ff8680785.camel@redhat.com>
 <4a304e2ce72a53859b4e9cc8b21db404a260b531.camel@redhat.com>
 <97e7ff2b-f2cc-a666-afb8-521c0f5c37e7@oracle.com>
 <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com>
Message-ID:

I'll take a look in the next couple of days. I was out the last few days and am catching up on other things.

Mandy

On 2/11/20 10:04 AM, Severin Gehwolf wrote:
> Hi Mandy, Bob,
>
> Thanks again for the reviews and patience on this. Sorry it took me so
> long to get back to this :-/
>
> Updated webrev:
> Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/webrev/
> incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/incremental/webrev/
>
> I've tested this with docker tests on cgroup v1 and via podman on a
> cgroup v2 system. They pass. I'll be running this through jdk-submit as
> well.
>
> More below.
>
> On Tue, 2020-01-21 at 16:09 -0800, Mandy Chung wrote:
>> Hi Severin,
>>
>> Thanks for the update.
>>
>> On 1/21/20 11:30 AM, Severin Gehwolf wrote:
>>> Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/webrev/
>>> incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/09/incremental/webrev/
>>>
>>
>> I have answered my own question. Most of the metrics used to return 0 if unavailable due to IOException reading a file or malformed content and now are changed to return -2 due to error fetching the metric.
>>
>> The following are about limits which used to return -1 if unlimited or no limit set.
>>     public long getCpuQuota();
>>     public long getCpuShares();
>>     public long getMemoryLimit();
>>     public long getMemoryAndSwapLimit();
>>     public long getMemorySoftLimit();
>>
>> With this patch, only getMemoryLimit and getMemoryAndSwapLimit specify to return -1 if unlimited or no limit set. However the implementation does return -1. All of the above specify to return -2 if unavailable due to error fetching the metric.
>>
>> I found the implementation quite hard to follow. I spent some time reviewing the code to see if the implementation matches the spec but I can't easily tell yet. For example,
>> CgroupSubsystemController::getLongValueMatchingLine returns -1 when IOException occurs.
>> CgroupSubsystemController::getLongEntry returns 0L if IOException occurs.
>>
>> CgroupV1SubsystemController::convertStringToLong returns Long.MAX_VALUE
>> when value overflows
> This one is intentional. It's mapped back to unlimited via
> longValOrUnlimited(). The reason for this is that cgroup v1 doesn't
> have a concept of "unlimited". Unlimited values will be very large
> numbers in cgroup v1 files.
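The mapping Severin describes can be sketched roughly as follows. This is not the code from the webrev: the class name, the UNLIMITED_MIN cut-off and the handling of malformed input are assumptions made for illustration. Only the two method names and the Long.MAX_VALUE and -1 results come from the discussion above.

```java
import java.math.BigInteger;

// Illustrative sketch only; not the JDK implementation.
final class CgroupV1LimitSketch {
    static final long UNLIMITED = -1L;
    // Assumed cut-off: any value above this is treated as "no limit set".
    static final long UNLIMITED_MIN = 0x7FFFFFFFFF000000L;

    // Parses a value read from a cgroup v1 file. Values that do not fit in a
    // long are clamped, so an overflowing value becomes Long.MAX_VALUE.
    // Malformed input raises NumberFormatException (an assumption of this sketch).
    static long convertStringToLong(String raw) {
        BigInteger value = new BigInteger(raw.trim());
        if (value.compareTo(BigInteger.valueOf(Long.MAX_VALUE)) > 0) {
            return Long.MAX_VALUE;
        }
        if (value.compareTo(BigInteger.valueOf(Long.MIN_VALUE)) < 0) {
            return Long.MIN_VALUE;
        }
        return value.longValue();
    }

    // Maps "very large" cgroup v1 values back to the unlimited marker.
    static long longValOrUnlimited(long value) {
        return value > UNLIMITED_MIN ? UNLIMITED : value;
    }
}
```

With this shape, a file containing a number larger than Long.MAX_VALUE and a file containing an ordinary very large limit both end up reported as -1, while a 1 GiB limit is returned unchanged.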
>
>> CgroupV2SubsystemController::convertStringToLong returns -1 when IOException occurs
>>
>> CgroupV1Subsystem::getCpuShares returns -1 if cpu.shares == 0 or 1024
>> CgroupV2Subsystem::getCpuShares returns -1 if cpu.weight == 100 or 0
> These two are special cases too. See the implementation note of
> Metrics.getCpuShares(). In the cgroup v2 case the default value is 100
> (versus 1024 in cgroup v1). That's why unlimited is being returned for
> those values.
>
>> CgroupV2Subsystem::getFromCpuMax returns -1 if error in reading cpu.max or malformed content
>> CgroupV2Subsystem::sumTokensIOStat returns -2 if IOException error
>> This is called by getBlkIOServiceCount and getBlkIOServiced
>>
>> I think this can be improved, and documentation added to describe
>> what the methods do. Since Metrics APIs consistently return -2 if
>> unavailable due to error in fetching the metric, why do some utility
>> methods in *Subsystem and *SubsystemController return -1 upon error
>> and 0 when unlimited?
>>
>> I suspect if the getXXXValue and other methods are clearly documented
>> with the error cases (possibly renaming the method name if appropriate)
>> CgroupV1Subsystem and CgroupV2Subsystem will become much easier
>> to understand.
> This should be fixed now.
>
> I've gone through the API doc of Metrics.java and have updated it. In
> general, I've updated it to return -1 if a metric is unavailable (due to
> an error in reading some files or content being empty), and -2 if not
> supported. No method returns -2 currently, but it might change and it's
> good to have some way of saying "not implementable" for this subsystem
> in the spec. That's my take on it anyway.
>
> There is also a new unit test for shared controller logic:
> TestCgroupSubsystemController.java
>
> It exercises various cases of error/success.
>
> That is to ensure proper symmetry across the various cases (including
> IOException). I've also documented static methods in
> CgroupSubsystemController.
> Overall, all methods now return the same
> values for cgroup v1 and cgroup v2 (given the impl nuances) for the
> various cases.
>
>> CgroupSubsystem.java
>>
>>   44     public static final double DOUBLE_RETVAL_NOT_SUPPORTED = LONG_RETVAL_NOT_SUPPORTED;
>>   49     public static final Boolean BOOL_RETVAL_NOT_SUPPORTED = null;
>>
>> They are no longer needed, right?
> Removed.
>
>> CgroupSubsystemFactory.java
>>
>>   89     System.err.println("Warning: Mixed cgroupv1 and cgroupv2 not supported. Metrics disabled.");
>>
>>
>> I expect this to be a System.Logger log
> Updated.
>
>>  114     if (!Integer.valueOf(0).toString().equals(tokens[0])) {
>>
>> This can be simplified to if (!"0".equals(tokens[0]))
> Done, thanks!
>
>> LauncherHelper.java
>>
>>  407         // Extended cgroupv1 specific metrics
>>  408         if (c instanceof MetricsCgroupV1) {
>>  409             MetricsCgroupV1 cgroupV1 = (MetricsCgroupV1)c;
>>  410             limit = cgroupV1.getKernelMemoryLimit();
>>  411             ostream.println(formatLimitString(limit, INDENT + "Kernel Memory Limit: "));
>>  412             limit = cgroupV1.getTcpMemoryLimit();
>>  413             ostream.println(formatLimitString(limit, INDENT + "TCP Memory Limit: "));
>>  414             Boolean value = cgroupV1.isMemoryOOMKillEnabled();
>>  415             ostream.println(formatBoolean(value, INDENT + "Out Of Memory Killer Enabled: "));
>>  416             value = cgroupV1.isCpuSetMemoryPressureEnabled();
>>  417             ostream.println(formatBoolean(value, INDENT + "CPUSet Memory Pressure Enabled: "));
>>  418         }
>>
>> MetricsCgroupV1 is linux-only. It will fail the compilation when
>> building on non-linux. One option is to move this code to
>> src/java.base/linux/share/sun/launcher/CgroupMetrics.java
>>
>> Do they continue to be interesting metrics to output from
>> -XshowSettings? I wonder if they can simply be dropped from the output.
>> Bob will have an opinion.
> I've removed those extra cgroup v1 specific metrics printed via
> -XshowSettings:system. Not sure what to do with MetricsCgroupV1. It's
> only used in tests in webrev 10.
> On the other hand the idea would be
> for consumers to downcast it to MetricsCgroupV1 if they needed those
> extra metrics.
>
> Thanks,
> Severin
>

From alexey.menkov at oracle.com  Wed Feb 12 21:28:00 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 12 Feb 2020 13:28:00 -0800
Subject: RFR: JDK-8238710: LingeredApp doesn't log stdout/stderr if exits with non-zero code
Message-ID: <0d5df909-bb4b-00ce-b2b9-babdb5311bb6@oracle.com>

Hi all,

Please review small fix for
https://bugs.openjdk.java.net/browse/JDK-8238710
webrev:
http://cr.openjdk.java.net/~amenkov/jdk15/LingeredApp_log_error/webrev/

--alex

From chris.plummer at oracle.com  Wed Feb 12 21:53:00 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 12 Feb 2020 13:53:00 -0800
Subject: RFR: JDK-8238710: LingeredApp doesn't log stdout/stderr if exits with non-zero code
In-Reply-To: <0d5df909-bb4b-00ce-b2b9-babdb5311bb6@oracle.com>
References: <0d5df909-bb4b-00ce-b2b9-babdb5311bb6@oracle.com>
Message-ID: <24fec2c9-fcab-522a-fbf0-bab35a86e611@oracle.com>

Hi Alex,

Thanks for doing this. Not having output from a spawned process that failed is an issue with more than just LingeredApp tests. This is a good start in getting those fixed.

I'm a little unclear on one part of your fix. Why did you move the "appProcess != null" check into finishApp()? You already make that check in stopApp(). If anything it looks like that check should have been there before your changes, but is no longer needed after your changes.

thanks,

Chris

On 2/12/20 1:28 PM, Alex Menkov wrote:
> Hi all,
>
> Please review small fix for
> https://bugs.openjdk.java.net/browse/JDK-8238710
> webrev:
> http://cr.openjdk.java.net/~amenkov/jdk15/LingeredApp_log_error/webrev/
>
> --alex

From alexey.menkov at oracle.com  Wed Feb 12 21:58:08 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Wed, 12 Feb 2020 13:58:08 -0800
Subject: RFR: JDK-8238710: LingeredApp doesn't log stdout/stderr if exits with non-zero code
In-Reply-To: <24fec2c9-fcab-522a-fbf0-bab35a86e611@oracle.com>
References: <0d5df909-bb4b-00ce-b2b9-babdb5311bb6@oracle.com>
 <24fec2c9-fcab-522a-fbf0-bab35a86e611@oracle.com>
Message-ID:

Hi Chris,

thanks for the review.
finishApp is also called from the startApp(String... cmd) method, where appProcess may not be initialized yet. In that case finishApp would throw an NPE (calling appProcess.exitValue()).

--alex

On 02/12/2020 13:53, Chris Plummer wrote:
> Hi Alex,
>
> Thanks for doing this. Not having output from a spawned process that
> failed is an issue with more than just LingeredApp tests. This is a good
> start in getting those fixed.
>
> I'm a little unclear on one part of your fix. Why did you move the
> "appProcess != null" check into finishApp()? You already make that check in
> stopApp(). If anything it looks like that check should have been there
> before your changes, but is no longer needed after your changes.
>
> thanks,
>
> Chris
>
> On 2/12/20 1:28 PM, Alex Menkov wrote:
>> Hi all,
>>
>> Please review small fix for
>> https://bugs.openjdk.java.net/browse/JDK-8238710
>> webrev:
>> http://cr.openjdk.java.net/~amenkov/jdk15/LingeredApp_log_error/webrev/
>>
>> --alex
>

From david.holmes at oracle.com  Wed Feb 12 22:15:08 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 13 Feb 2020 08:15:08 +1000
Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later
In-Reply-To:
References: <16754f9b-61a3-15d6-7cb6-8e9895662657@oracle.com>
 <979CFEC7-538E-46EE-988F-1935AFB8AEFA@oracle.com>
 <5bddcc55-5340-bea4-8cbb-1338c121d6d4@oracle.com>
 <86C876E3-6005-4ADE-9224-DEF6A2908F26@oracle.com>
 <79c65fd4-166e-98de-9327-ede4e9a20421@oracle.com>
 <3a21aa8f-8081-da59-cb10-87e79f81e2e3@oracle.com>
Message-ID:

On 13/02/2020 5:02 am, Chris Plummer wrote:
> Hi David,
>
>> What you have below is a mix of #2 and #3 - you convert to a generic
>> exception but also re-assert the interrupt state. That's a little
>> unusual.
> That's what I also thought, which is why I was suggesting not doing the
> interrupt() call and only throwing the RuntimeException. I agree that
> doing both does not make sense, and in general doing (3) does not make
> sense if the caller is not set up to properly check the interrupt state.

From a library writer's perspective you should have zero knowledge of the caller, and doing (3) makes perfect sense. Remember that you will only re-assert the interrupt() if you get the InterruptedException in the first place, which means that some other code already issued the initial interrupt(). Whether this code should be treated as general purpose library code is another matter. Personally, in this case I think I'd use (2) and make interruption a failure mode.

David

> Chris
>
>
> On 2/12/20 5:58 AM, David Holmes wrote:
>> Hi Chris,
>>
>> I think you are overthinking this.
>> :)
>>
>> What you have observed is that the code that actually uses this method
>> does not utilise interrupts, or expect them, so if you artificially
>> inject one in this library method then you see things failing in
>> unexpected ways. That also means that if the thread was interrupted by
>> some other piece of logic then it would also fail in unexpected ways.
>> That doesn't negate your choice to re-assert the interrupt state.
>>
>> From a library writing perspective if you have a method that performs
>> a blocking call that can throw InterruptedException then you generally
>> have three choices:
>>
>> 1. Throw InterruptedException yourself and pass the buck to your callers.
>> 2. Convert the InterruptedException to a more general failure
>> exception - typically an unchecked RuntimeException - for which
>> interruption is but one possible cause; or
>> 3. Catch the InterruptedException and allow the method to complete
>> normally (i.e. not by throwing an exception) but re-assert the
>> interrupt state so that a caller checking for interruption will still
>> see that it occurred.
>>
>> What you have below is a mix of #2 and #3 - you convert to a generic
>> exception but also re-assert the interrupt state. That's a little
>> unusual.
>>
>> David
>>
>>
>> On 12/02/2020 6:16 pm, Chris Plummer wrote:
>>> Hi Igor,
>>>
>>> I think it might be best to leave the interrupt() call out. I wanted to see
>>> what would happen if we ever got an InterruptedException, so I added
>>> the following to the start of Platform.shouldSAAttach():
>>>
>>>         try {
>>>             throw new InterruptedException();
>>>         } catch (InterruptedException e) {
>>>             Thread.currentThread().interrupt();
>>>             throw new RuntimeException(e);
>>>         }
>>>
>>> At the start of the test run, before any tests are actually run, I
>>> see the following:
>>>
>>> failed to get value for vm.hasSAandCanAttach
>>> java.lang.RuntimeException: java.lang.InterruptedException
>>>     at jdk.test.lib.Platform.shouldSAAttach(Platform.java:300)
>>>     at requires.VMProps.vmHasSAandCanAttach(VMProps.java:327)
>>>     at requires.VMProps$SafeMap.put(VMProps.java:69)
>>>     at requires.VMProps.call(VMProps.java:101)
>>>     at requires.VMProps.call(VMProps.java:57)
>>>     at com.sun.javatest.regtest.agent.GetJDKProperties.run(GetJDKProperties.java:80)
>>>     at com.sun.javatest.regtest.agent.GetJDKProperties.main(GetJDKProperties.java:54)
>>> Caused by: java.lang.InterruptedException
>>>     at jdk.test.lib.Platform.shouldSAAttach(Platform.java:297)
>>>     ... 6 more
>>>
>>> This seems reasonable.
>>>
>>> For each test that checks vm.hasSAandCanAttach I also see:
>>>
>>> TEST RESULT: Error. Error evaluating expression:
>>> vm.hasSAandCanAttach: java.lang.RuntimeException:
>>> java.lang.InterruptedException
>>>
>>> This too seems reasonable.
>>>
>>> For tests that don't check vm.hasSAandCanAttach, but instead make a
>>> runtime check that calls Platform.shouldSAAttach(), the test fails with:
>>>
>>> java.lang.IllegalThreadStateException: process hasn't exited
>>>     at java.base/java.lang.ProcessImpl.exitValue(ProcessImpl.java:500)
>>>     at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:380)
>>>     at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:433)
>>>     at ClhsdbAttach.main(ClhsdbAttach.java:77)
>>>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>>>     at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
>>>     at java.base/java.lang.Thread.run(Thread.java:832)
>>>
>>> This is a confusing way to fail. The reason it fails this way is
>>> because stopApp() first calls waitAppTerminate(), which does the
>>> following:
>>>
>>>     public void waitAppTerminate() {
>>>         // This code is modeled after tail end of ProcessTools.getOutput().
>>>         try {
>>>             appProcess.waitFor();
>>>             outPumperThread.join();
>>>             errPumperThread.join();
>>>         } catch (InterruptedException e) {
>>>             Thread.currentThread().interrupt();
>>>             // pass
>>>         }
>>>     }
>>>
>>> I added an e.printStackTrace() call and see the following:
>>>
>>> java.lang.InterruptedException
>>>     at java.base/java.lang.Object.wait(Native Method)
>>>     at java.base/java.lang.Object.wait(Object.java:321)
>>>     at java.base/java.lang.ProcessImpl.waitFor(ProcessImpl.java:474)
>>>     at jdk.test.lib.apps.LingeredApp.waitAppTerminate(LingeredApp.java:239)
>>>     at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:380)
>>>     at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:434)
>>>
>>> So the earlier call to interrupt() is resulting in waitAppTerminate()
>>> not actually waiting for exit. This then results in stopApp() getting
>>> IllegalThreadStateException when calling Process.exitValue().
>>>
>>> If I comment out the call to interrupt() in
>>> Platform.shouldSAAttach(), I think the failure stack trace is much
>>> better:
>>>
>>> java.lang.RuntimeException: Test ERROR java.lang.RuntimeException:
>>> java.lang.InterruptedException
>>>     at ClhsdbAttach.main(ClhsdbAttach.java:75)
>>>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>     at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>>>     at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
>>>     at java.base/java.lang.Thread.run(Thread.java:832)
>>> Caused by: java.lang.RuntimeException: java.lang.InterruptedException
>>>     at jdk.test.lib.Platform.shouldSAAttach(Platform.java:300)
>>>     at ClhsdbLauncher.run(ClhsdbLauncher.java:199)
>>>     at ClhsdbAttach.main(ClhsdbAttach.java:71)
>>>     ... 6 more
>>> Caused by: java.lang.InterruptedException
>>>     at jdk.test.lib.Platform.shouldSAAttach(Platform.java:297)
>>>     ... 8 more
>>>
>>> There's still a minor issue with rethrowing the RuntimeException
>>> encapsulated inside another RuntimeException. That's the fault of the
>>> test, which is catching all Exceptions and encapsulating them in a
>>> RuntimeException, even if the Exception itself is already a
>>> RuntimeException. It should have a catch clause for
>>> RuntimeException, and just rethrow it without encapsulating it. All
>>> the Clhsdb tests seem to do this, so that's about 20 places to fix.
>>> Probably not worth doing unless some other cleanup is being done at
>>> the same time.
>>>
>>> Chris
>>>
>>> On 2/11/20 10:30 PM, Igor Ignatyev wrote:
>>>> I'd say yes, it's better to still call Thread::interrupt.
>>>>
>>>> -- Igor
>>>>
>>>>> On Feb 11, 2020, at 10:19 PM, Chris Plummer wrote:
>>>>>
>>>>> Ok. Should I still call interrupt()?
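The effect Chris describes, and options (2) and (3) from David's list, can be reproduced in isolation. The sketch below is illustrative only: the class and method names are invented, and Thread.sleep() stands in for Process.waitFor(). It shows that once the interrupt status has been re-asserted, the next blocking call on that thread throws InterruptedException immediately instead of waiting.

```java
// Illustrative sketch only; names are hypothetical.
final class InterruptSketch {
    // Option (2): convert to an unchecked failure and do not re-assert the flag.
    static void blockOrFail(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    // Option (3): complete normally but re-assert the interrupt status.
    // Returns false if the wait was cut short by an interrupt.
    static boolean blockQuietly(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Mimics the mixed version: an earlier catch block left the flag set, so a
    // later blocking call (waitAppTerminate() in the thread above) does not wait.
    static boolean nextBlockingCallIsCutShort() {
        Thread.currentThread().interrupt();          // flag left behind earlier
        boolean completed = blockQuietly(50);        // returns at once, flag re-asserted
        boolean stillFlagged = Thread.interrupted(); // reads and clears the flag
        return !completed && stillFlagged;
    }
}
```

Which option fits depends on whether any caller actually checks the interrupt status; in the test library discussed here nothing does, which is why the re-asserted flag surfaced later as IllegalThreadStateException.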
>>>>>
>>>>> Chris
>>>>>
>>>>> On 2/11/20 10:07 PM, Igor Ignatyev wrote:
>>>>>> Hi Chris,
>>>>>>
>>>>>> that's a common practice for any kind of library-ish code: if
>>>>>> there is no explicit check of interrupt status, it will be
>>>>>> checked by the next operation which might be interrupted. in this
>>>>>> particular case, I agree rethrowing it as an unchecked exception
>>>>>> might be a good alternative.
>>>>>>
>>>>>> -- Igor
>>>>>>
>>>>>>> On Feb 11, 2020, at 10:03 PM, Chris Plummer wrote:
>>>>>>>
>>>>>>> Hi Igor,
>>>>>>>
>>>>>>> I guess I fail to see the benefit of this. Who is going to check
>>>>>>> the interrupt status of this thread and do something meaningful
>>>>>>> with it? It seems we would want to immediately propagate the
>>>>>>> failure by throwing a RuntimeException. This will work well when
>>>>>>> called from a test since this is a common way to fail a test. The
>>>>>>> other use of this code is by VMProps.vmHasSAandCanAttach(). It
>>>>>>> looks like if a RuntimeException is thrown the right thing will
>>>>>>> happen when SafeMap.put() catches the exception (it catches all
>>>>>>> Throwables).
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 2/11/20 7:12 PM, Igor Ignatev wrote:
>>>>>>>> rather like this:
>>>>>>>>
>>>>>>>>> } catch (InterruptedException e) {
>>>>>>>>>     Thread.currentThread().interrupt();
>>>>>>>>>     return false; // assume not signed
>>>>>>>>> }
>>>>>>>>
>>>>>>>> -- Igor
>>>>>>>>
>>>>>>>>> On Feb 11, 2020, at 6:15 PM, Chris Plummer wrote:
>>>>>>>>>
>>>>>>>>> Like this?
>>>>>>>>>
>>>>>>>>>         } catch (InterruptedException e) {
>>>>>>>>>             Thread.currentThread().interrupt();
>>>>>>>>>             throw new RuntimeException(e);
>>>>>>>>>         }
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 2/11/20 2:23 PM, Igor Ignatyev wrote:
>>>>>>>>>> no, I meant to call Thread.currentThread().interrupt();
>>>>>>>>>> calling that will restore the interrupted state of the thread, so
>>>>>>>>>> a user of the Platform class will be able to respond to it
>>>>>>>>>> appropriately. w/ your current code, the fact that the thread
>>>>>>>>>> was interrupted will be missed, and in most cases that is not
>>>>>>>>>> the right thing to do.
>>>>>>>>>>
>>>>>>>>>> -- Igor
>>>>>>>>>>
>>>>>>>>>>> On Feb 11, 2020, at 2:02 PM, Chris Plummer wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi Igor,
>>>>>>>>>>>
>>>>>>>>>>> I'm not sure what you mean by restore the interrupt state. Do
>>>>>>>>>>> you mean loop back to the waitFor() call?
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> On 2/11/20 1:55 PM, Igor Ignatyev wrote:
>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>
>>>>>>>>>>>> I don't insist on (3), so I'm fine if you don't want to
>>>>>>>>>>>> change that part. one thing I'd change though is to restore
>>>>>>>>>>>> the thread's interrupted state at L#266 of Platform.java (no need
>>>>>>>>>>>> to publish a new webrev)
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> -- Igor
>>>>>>>>>>>>
>>>>>>>>>>>>> On Feb 11, 2020, at 1:49 PM, Chris Plummer wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Igor,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Here's an updated webrev:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.01/index.html
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I rebased to JDK 15 and made all the changes you suggested
>>>>>>>>>>>>> except for (3). I did not think it is necessary since the
>>>>>>>>>>>>> code is only executed on OSX. However, if you still feel
>>>>>>>>>>>>> allowing flexibility in the path separator is important, I
>>>>>>>>>>>>> can add that change too.
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2/10/20 1:34 PM, Igor Ignatyev wrote:
>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> in general it all looks good, I have a few comments (most
>>>>>>>>>>>>>> of them are editorial):
>>>>>>>>>>>>>> in Platform.java:
>>>>>>>>>>>>>> 1. you have a double space at line#238 (b/w boolean
>>>>>>>>>>>>>> and isSignedOSX)
>>>>>>>>>>>>>> 2. as FileNotFoundException is an IOException, there is no
>>>>>>>>>>>>>> need to declare the former in the signature of isSignedOSX
>>>>>>>>>>>>>> 3. it's better to pass jdkPath, "bin" and "java" as
>>>>>>>>>>>>>> separate arguments to Paths.get, so the code won't depend
>>>>>>>>>>>>>> on the file separator
>>>>>>>>>>>>>> 4. you are waiting for codesign to finish w/o reading its
>>>>>>>>>>>>>> stdout / stderr, which might lead to a deadlock (if codesign
>>>>>>>>>>>>>> exhausts the IO buffer before exiting), so you need to
>>>>>>>>>>>>>> either create two separate threads to read stdout and stderr,
>>>>>>>>>>>>>> or redirect these streams to files and read these
>>>>>>>>>>>>>> files afterwards, or just ignore stdout/stderr by using
>>>>>>>>>>>>>> Redirect.DISCARD. I'd personally recommend the last option as
>>>>>>>>>>>>>> the result of codesign can be reliably deduced from its
>>>>>>>>>>>>>> exit code (0 - signed, 1 - verification failed, 2 - wrong
>>>>>>>>>>>>>> arguments, 3 - not all requirements from R are satisfied)
>>>>>>>>>>>>>> and using stdout/stderr is somewhat fragile as there is no
>>>>>>>>>>>>>> guarantee the output format won't be changed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> the rest looks good to me.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -- Igor
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Feb 10, 2020, at 11:48 AM, Chris Plummer wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ping #2. It's not that hard of a review.
>>>>>>>>>>>>>>> Most of it is
>>>>>>>>>>>>>>> the new Platform.isSignedOSX() method, which is well
>>>>>>>>>>>>>>> commented and pretty straightforward.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2/4/20 5:04 PM, Chris Plummer wrote:
>>>>>>>>>>>>>>>> Ping!
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> And I decided to push to 15 instead of 14. Will backport
>>>>>>>>>>>>>>>> to 14 eventually.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 1/30/20 10:20 PM, Chris Plummer wrote:
>>>>>>>>>>>>>>>>> Yes, you are correct:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196
>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 1/30/20 10:13 PM, Igor Ignatyev wrote:
>>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 seems
>>>>>>>>>>>>>>>>>> to be a webrev from another issue, should it have
>>>>>>>>>>>>>>>>>> been http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00/ ?
>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> -- Igor >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Jan 30, 2020, at 10:10 PM, Chris Plummer >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hello, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Please review the following fix for some SA tests >>>>>>>>>>>>>>>>>>> that are failing on Mac OS X 10.14.5 and later: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The issue is that SA can't attach to a signed binary >>>>>>>>>>>>>>>>>>> starting with 10.14.5. There is no workaround for >>>>>>>>>>>>>>>>>>> this, so these tests are being disabled when it is >>>>>>>>>>>>>>>>>>> detected that the binary is signed and we are running >>>>>>>>>>>>>>>>>>> on 10.14 or later (I chose all 10.14 releases to >>>>>>>>>>>>>>>>>>> simplify the check). >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Some background may help explain the fix. In order >>>>>>>>>>>>>>>>>>> for SA to attach to a live process (not a core file) >>>>>>>>>>>>>>>>>>> on OSX, either the attaching process (ie. the test) >>>>>>>>>>>>>>>>>>> has to be run as root, or sudo needs to be supported. >>>>>>>>>>>>>>>>>>> However, the only tests that make the sudo check are >>>>>>>>>>>>>>>>>>> the 20 or so that use ClhsdbLauncher. The rest all >>>>>>>>>>>>>>>>>>> rely on "@requires vm.hasSAandCanAttach" to filter >>>>>>>>>>>>>>>>>>> out tests that use SA attach. vm.hasSAandCanAttach >>>>>>>>>>>>>>>>>>> only checks if the test is being run as root. Thus >>>>>>>>>>>>>>>>>>> all our non-ClhsdbLauncher tests that SA attach to a >>>>>>>>>>>>>>>>>>> live process are currently not run unless they are >>>>>>>>>>>>>>>>>>> run as root. 
8238268 [1] has been filed to address >>>>>>>>>>>>>>>>>>> this, making it so all the tests will attempt to use >>>>>>>>>>>>>>>>>>> sudo if not run as root. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Because of the difference in how ClhsdbLauncher tests >>>>>>>>>>>>>>>>>>> and "@requires? vm.hasSAandCanAttach" tests check to >>>>>>>>>>>>>>>>>>> see if they are runnable, this fix needs to address >>>>>>>>>>>>>>>>>>> both types of checks. The common code for both these >>>>>>>>>>>>>>>>>>> cases is Platform.shouldSAAttach(), which on OSX >>>>>>>>>>>>>>>>>>> basically equates to check to see if we are running >>>>>>>>>>>>>>>>>>> as root. I changed it to also return false if running >>>>>>>>>>>>>>>>>>> on signed binary with 10.14 or later. However, this >>>>>>>>>>>>>>>>>>> confused the ClhsdbLauncher use of >>>>>>>>>>>>>>>>>>> Platform.shouldSAAttach() somewhat, since it assumed >>>>>>>>>>>>>>>>>>> a false result only happens because you are not >>>>>>>>>>>>>>>>>>> running as root (in which case it would then check if >>>>>>>>>>>>>>>>>>> sudo will work). So ClhsdbLauncher now has double >>>>>>>>>>>>>>>>>>> check that the false result was not because of >>>>>>>>>>>>>>>>>>> running a signed binary. If it is signed, it won't do >>>>>>>>>>>>>>>>>>> the sudo check. This will get cleaned up with 8238268 >>>>>>>>>>>>>>>>>>> [1], which will move the sudo check into >>>>>>>>>>>>>>>>>>> Platform.shouldSAAttach(). 
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238268
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>
>

From chris.plummer at oracle.com Wed Feb 12 23:45:32 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 12 Feb 2020 15:45:32 -0800
Subject: RFR: JDK-8238710: LingeredApp doesn't log stdout/stderr if exits with non-zero code
In-Reply-To:
References: <0d5df909-bb4b-00ce-b2b9-babdb5311bb6@oracle.com> <24fec2c9-fcab-522a-fbf0-bab35a86e611@oracle.com>
Message-ID:

Ok. LGTM.

Chris

On 2/12/20 1:58 PM, Alex Menkov wrote:
> Hi Chris,
>
> thanks for the review.
> finishApp is also called from the startApp(String... cmd) method,
> and appProcess may not be initialized there.
> In that case finishApp will throw an NPE (calling appProcess.exitValue())
>
> --alex
>
> On 02/12/2020 13:53, Chris Plummer wrote:
>> Hi Alex,
>>
>> Thanks for doing this. Not having output from a spawned process that
>> failed is an issue with more than just LingeredApp tests. This is a
>> good start in getting those fixed.
>>
>> I'm a little unclear on one part of your fix. Why did you move the
>> "appProcess != null" check into finishApp()? You already make that check in
>> stopApp(). If anything it looks like that check should have been
>> there before your changes, but is no longer needed after your changes.
>> >> thanks, >> >> Chris >> >> On 2/12/20 1:28 PM, Alex Menkov wrote: >>> Hi all, >>> >>> Please review small fix for >>> https://bugs.openjdk.java.net/browse/JDK-8238710 >>> webrev: >>> http://cr.openjdk.java.net/~amenkov/jdk15/LingeredApp_log_error/webrev/ >>> >>> --alex >> From chris.plummer at oracle.com Wed Feb 12 23:48:59 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Wed, 12 Feb 2020 15:48:59 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: References: <16754f9b-61a3-15d6-7cb6-8e9895662657@oracle.com> <979CFEC7-538E-46EE-988F-1935AFB8AEFA@oracle.com> <5bddcc55-5340-bea4-8cbb-1338c121d6d4@oracle.com> <86C876E3-6005-4ADE-9224-DEF6A2908F26@oracle.com> <79c65fd4-166e-98de-9327-ede4e9a20421@oracle.com> <3a21aa8f-8081-da59-cb10-87e79f81e2e3@oracle.com> Message-ID: <251c4034-ca59-b78b-f330-952bfea8b47e@oracle.com> On 2/12/20 2:15 PM, David Holmes wrote: > On 13/02/2020 5:02 am, Chris Plummer wrote: >> Hi David, >> >>> What you have below is a mix of #2 and #3 - you convert to a generic >>> exception but also re-assert the interrupt state. That's a little >>> unusual. >> That what I also thought which is why I was suggesting not doing the >> interrupt() call and only throwing the RuntimeException. I agree that >> doing both does not make sense, and in general doing (3) does not >> make sense if the caller is not setup to properly check the interrupt >> state. > > From a library writer perspective you should have zero knowledge of > the caller and doing (3) makes perfect sense. Remember that you will > only re-assert the interrupt() if you get the InterruptedException in > the first place, which means that some other code already issued the > initial interrupt(). > > Whether this code shoud be treated as general purpose library code is > another matter. > > Personally, in this case I think I'd use (2) and make interruption a > failure mode. Ok. 
We're in agreement here. Igor, are you ok with this?

try {
    ...
} catch (InterruptedException e) {
    throw new RuntimeException(e);
}

thanks,

Chris
>
> David
>
>> Chris
>>
>>
>> On 2/12/20 5:58 AM, David Holmes wrote:
>>> Hi Chris,
>>>
>>> I think you are overthinking this. :)
>>>
>>> What you have observed is that the code that actually uses this
>>> method does not utilise interrupts, or expect them, so if you
>>> artificially inject one in this library method then you see things
>>> failing in unexpected ways. That also means that if the thread was
>>> interrupted by some other piece of logic then it would also fail in
>>> unexpected ways. That doesn't negate your choice to re-assert the
>>> interrupt state.
>>>
>>> From a library writing perspective if you have a method that
>>> performs a blocking call that can throw InterruptedException then
>>> you generally have three choices:
>>>
>>> 1. Throw InterruptedException yourself and pass the buck to your
>>> callers.
>>> 2. Convert the InterruptedException to a more general failure
>>> exception - typically an unchecked RuntimeException - for which
>>> interruption is but one possible cause; or
>>> 3. Catch the InterruptedException and allow the method to complete
>>> normally (i.e. not by throwing an exception) but re-assert the
>>> interrupt state so that a caller checking for interruption will
>>> still see that it occurred.
>>>
>>> What you have below is a mix of #2 and #3 - you convert to a generic
>>> exception but also re-assert the interrupt state. That's a little
>>> unusual.
>>>
>>> David
>>>
>>>
>>> On 12/02/2020 6:16 pm, Chris Plummer wrote:
>>>> Hi Igor,
>>>>
>>>> I think it might be best to leave the interrupt() call out. I wanted to
>>>> see what would happen if we ever got an InterruptedException, so I
>>>> added the following to the start of Platform.shouldSAAttach():
>>>>
>>>>         try {
>>>>             throw new InterruptedException();
>>>>         } catch (InterruptedException e) {
>>>>             Thread.currentThread().interrupt();
>>>>             throw new RuntimeException(e);
>>>>         }
>>>>
>>>> At the start of the test run, before any tests are actually run, I
>>>> see the following:
>>>>
>>>> failed to get value for vm.hasSAandCanAttach
>>>> java.lang.RuntimeException: java.lang.InterruptedException
>>>>     at jdk.test.lib.Platform.shouldSAAttach(Platform.java:300)
>>>>     at requires.VMProps.vmHasSAandCanAttach(VMProps.java:327)
>>>>     at requires.VMProps$SafeMap.put(VMProps.java:69)
>>>>     at requires.VMProps.call(VMProps.java:101)
>>>>     at requires.VMProps.call(VMProps.java:57)
>>>>     at com.sun.javatest.regtest.agent.GetJDKProperties.run(GetJDKProperties.java:80)
>>>>     at com.sun.javatest.regtest.agent.GetJDKProperties.main(GetJDKProperties.java:54)
>>>> Caused by: java.lang.InterruptedException
>>>>     at jdk.test.lib.Platform.shouldSAAttach(Platform.java:297)
>>>>     ... 6 more
>>>>
>>>> This seems reasonable.
>>>>
>>>> For each test that checks vm.hasSAandCanAttach I also see:
>>>>
>>>> TEST RESULT: Error. Error evaluating expression:
>>>> vm.hasSAandCanAttach: java.lang.RuntimeException:
>>>> java.lang.InterruptedException
>>>>
>>>> This too seems reasonable.
>>>>
>>>> For tests that don't check vm.hasSAandCanAttach, but instead make a
>>>> runtime check that calls Platform.shouldSAAttach(), the test fails
>>>> with:
>>>>
>>>> java.lang.IllegalThreadStateException: process hasn't exited
>>>>     at java.base/java.lang.ProcessImpl.exitValue(ProcessImpl.java:500)
>>>>     at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:380)
>>>>     at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:433)
>>>>     at ClhsdbAttach.main(ClhsdbAttach.java:77)
>>>>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>     at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>>>>     at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
>>>>     at java.base/java.lang.Thread.run(Thread.java:832)
>>>>
>>>> This is a confusing way to fail. The reason it fails this way is
>>>> because stopApp() first calls waitAppTerminate(), which does the
>>>> following:
>>>>
>>>>     public void waitAppTerminate() {
>>>>         // This code is modeled after tail end of ProcessTools.getOutput().
>>>>         try {
>>>>             appProcess.waitFor();
>>>>             outPumperThread.join();
>>>>             errPumperThread.join();
>>>>         } catch (InterruptedException e) {
>>>>             Thread.currentThread().interrupt();
>>>>             // pass
>>>>         }
>>>>     }
>>>>
>>>> I added an e.printStackTrace() call and see the following:
>>>>
>>>> java.lang.InterruptedException
>>>>     at java.base/java.lang.Object.wait(Native Method)
>>>>     at java.base/java.lang.Object.wait(Object.java:321)
>>>>     at java.base/java.lang.ProcessImpl.waitFor(ProcessImpl.java:474)
>>>>     at jdk.test.lib.apps.LingeredApp.waitAppTerminate(LingeredApp.java:239)
>>>>     at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:380)
>>>>     at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:434)
>>>>
>>>> So the earlier call to interrupt() is resulting in
>>>> waitAppTerminate() not actually waiting for exit. This then results
>>>> in stopApp() getting IllegalThreadStateException when calling
>>>> Process.exitValue().
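
The effect described above, where a re-asserted interrupt flag makes a later blocking call fail at once, can be reproduced in isolation. The following is only a minimal sketch: the class and method names are hypothetical, and Thread.sleep() stands in for the Process.waitFor() call seen in the stack trace.

```java
// Minimal sketch, not code from the webrev. Names are hypothetical.
class InterruptFlagDemo {

    /**
     * Sets the interrupt flag (as a catch block calling
     * Thread.currentThread().interrupt() would) and then enters a
     * blocking call. Returns true if the blocking call threw
     * InterruptedException instead of waiting.
     */
    static boolean blockingCallFailsAtOnce() {
        Thread.currentThread().interrupt();   // flag is now pending
        try {
            Thread.sleep(5_000);              // stand-in for Process.waitFor()
            return false;                     // reached only if the sleep completed
        } catch (InterruptedException e) {
            // Throwing InterruptedException clears the flag again.
            return !Thread.currentThread().isInterrupted();
        }
    }

    public static void main(String[] args) {
        System.out.println("blocking call failed at once: " + blockingCallFailsAtOnce());
    }
}
```

Because the flag stays pending until some blocking call consumes it, the failure surfaces wherever the thread next blocks, which may be far from the code that set the flag.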
>>>>
>>>> If I comment out the call to interrupt() in
>>>> Platform.shouldSAAttach(), I think the failure stack trace is much
>>>> better:
>>>>
>>>> java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: java.lang.InterruptedException
>>>>     at ClhsdbAttach.main(ClhsdbAttach.java:75)
>>>>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>     at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>     at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>     at java.base/java.lang.reflect.Method.invoke(Method.java:564)
>>>>     at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127)
>>>>     at java.base/java.lang.Thread.run(Thread.java:832)
>>>> Caused by: java.lang.RuntimeException: java.lang.InterruptedException
>>>>     at jdk.test.lib.Platform.shouldSAAttach(Platform.java:300)
>>>>     at ClhsdbLauncher.run(ClhsdbLauncher.java:199)
>>>>     at ClhsdbAttach.main(ClhsdbAttach.java:71)
>>>>     ... 6 more
>>>> Caused by: java.lang.InterruptedException
>>>>     at jdk.test.lib.Platform.shouldSAAttach(Platform.java:297)
>>>>     ... 8 more
>>>>
>>>> There's still a minor issue with rethrowing the RuntimeException
>>>> encapsulated inside another RuntimeException. That's the fault of the
>>>> test, which is catching all Exceptions and encapsulating them in a
>>>> RuntimeException, even if the Exception itself is already a
>>>> RuntimeException. It should have a catch clause for
>>>> RuntimeException, and just rethrow it without encapsulating it. All
>>>> the Clhsdb tests seem to do this, so that's about 20 places to fix.
>>>> Probably not worth doing unless some other cleanup is being done at
>>>> the same time.
>>>> >>>> Chris >>>> >>>> On 2/11/20 10:30 PM, Igor Ignatyev wrote: >>>>> I'd say yes, it's better to still call Thread::interrupt. >>>>> >>>>> -- Igor >>>>> >>>>>> On Feb 11, 2020, at 10:19 PM, Chris Plummer >>>>>> > wrote: >>>>>> >>>>>> Ok. Should I still call interrupt()? >>>>>> >>>>>> Chris >>>>>> >>>>>> On 2/11/20 10:07 PM, Igor Ignatyev wrote: >>>>>>> Hi Chris, >>>>>>> >>>>>>> that's a common practice for any kind of library-ish code, if >>>>>>> there are no explicit check of interrupt status, it will be >>>>>>> checked a by next operation which might be interrupted. in this >>>>>>> particular case, I agree rethrowing it as an unchecked exception >>>>>>> might be a good alternative. >>>>>>> >>>>>>> -- Igor >>>>>>> >>>>>>>> On Feb 11, 2020, at 10:03 PM, Chris Plummer >>>>>>>> > >>>>>>>> wrote: >>>>>>>> >>>>>>>> Hi Igor, >>>>>>>> >>>>>>>> I guess I fail to see the benefit of this. Who is going to >>>>>>>> check the interrupt status of this thread and do something >>>>>>>> meaningful with it? It seems we would want to immediately >>>>>>>> propagate the failure by throwing a RuntimeException. This will >>>>>>>> work well when called from a test since this is a common way to >>>>>>>> fail a test. The other use of this code is by >>>>>>>> VMProps.vmHasSAandCanAttach(). It looks like if a >>>>>>>> RuntimeException is thrown the right thing will happen when >>>>>>>> SafeMap.put() catches the exception (it catches all Throwables). >>>>>>>> >>>>>>>> Chris >>>>>>>> >>>>>>>> On 2/11/20 7:12 PM, Igor Ignatev wrote: >>>>>>>>> rather like this : >>>>>>>>> >>>>>>>>>> } catch (InterruptedException e) { >>>>>>>>>> ?Thread.currentThread().interrupt(); >>>>>>>>>> ? ?return false; // assume not signed >>>>>>>>>> } >>>>>>>>> >>>>>>>>> ? Igor >>>>>>>>> >>>>>>>>>> On Feb 11, 2020, at 6:15 PM, Chris Plummer >>>>>>>>>> > >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> ? >>>>>>>>>> Like this? >>>>>>>>>> >>>>>>>>>> ??????? 
} catch (InterruptedException e) { >>>>>>>>>> Thread.currentThread().interrupt(); >>>>>>>>>> ??????????? throw new RuntimeException(e); >>>>>>>>>> ??????? } >>>>>>>>>> >>>>>>>>>> Chris >>>>>>>>>> >>>>>>>>>> On 2/11/20 2:23 PM, Igor Ignatyev wrote: >>>>>>>>>>> no, I meant to call Thread.currentThread().interrupt(), >>>>>>>>>>> calling that will restore interrupted state of the thread, >>>>>>>>>>> so an user of Platform class will be able to response to it >>>>>>>>>>> appropriately, w/ your current code, the fact that the >>>>>>>>>>> thread was interrupted will be missed, and in most cases it >>>>>>>>>>> is not right thing to do. >>>>>>>>>>> >>>>>>>>>>> -- Igor >>>>>>>>>>> >>>>>>>>>>>> On Feb 11, 2020, at 2:02 PM, Chris Plummer >>>>>>>>>>>> >>>>>>>>>>> > wrote: >>>>>>>>>>>> >>>>>>>>>>>> Hi Igor, >>>>>>>>>>>> >>>>>>>>>>>> I'm not sure what you mean by restore the interrupt state. >>>>>>>>>>>> Do you mean loop back to the waitFor() call? >>>>>>>>>>>> >>>>>>>>>>>> thanks, >>>>>>>>>>>> >>>>>>>>>>>> Chris >>>>>>>>>>>> >>>>>>>>>>>> On 2/11/20 1:55 PM, Igor Ignatyev wrote: >>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>> >>>>>>>>>>>>> I don't insist on (3), so I'm fine if you don't want to >>>>>>>>>>>>> change that part. one thing I'd change though is to >>>>>>>>>>>>> restore thread interrupted state at L#266 of Platform.java >>>>>>>>>>>>> (no need to publish new webrev) >>>>>>>>>>>>> >>>>>>>>>>>>> Thanks, >>>>>>>>>>>>> -- Igor >>>>>>>>>>>>> >>>>>>>>>>>>>> On Feb 11, 2020, at 1:49 PM, Chris Plummer >>>>>>>>>>>>>> >>>>>>>>>>>>> > wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Igor, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Here's an updated webrev: >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.01/index.html >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I rebased to JDK 15 and made all the changes you >>>>>>>>>>>>>> suggested except for (3). I did not think it is necessary >>>>>>>>>>>>>> since the code is only executed on OSX. 
However, if you >>>>>>>>>>>>>> still feel allowing flexibility in the path separator is >>>>>>>>>>>>>> important, I can add that change too. >>>>>>>>>>>>>> >>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Chris >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 2/10/20 1:34 PM, Igor Ignatyev wrote: >>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> in general it all looks good, I have a few comments >>>>>>>>>>>>>>> (most of them are editorial): >>>>>>>>>>>>>>> in?Platform.java: >>>>>>>>>>>>>>> 1. you have doubled spaced at line#238 (b/w boolean >>>>>>>>>>>>>>> and?isSignedOSX) >>>>>>>>>>>>>>> 2. as?FileNotFoundException is?IOException, there is no >>>>>>>>>>>>>>> need to declare the former in the signature of?isSignedOSX >>>>>>>>>>>>>>> 3. it's better to pass jdkPath, "bin" and "java" as >>>>>>>>>>>>>>> separate arguments to Path.get, so the code won't depend >>>>>>>>>>>>>>> on file separator >>>>>>>>>>>>>>> 4. you are waiting for codesign to finish w/o reading >>>>>>>>>>>>>>> its cout / cerr, which might lead to a deadlock >>>>>>>>>>>>>>> (if?codesign will exhaust IO buffer before exiting), so >>>>>>>>>>>>>>> you need to either create two separate threads to read >>>>>>>>>>>>>>> cout and cerr or ?redirect these streams them to files >>>>>>>>>>>>>>> and read these files afterwards or just ignore cout/cerr >>>>>>>>>>>>>>> by using Redirect.DISCARD. I'd personally recommend the >>>>>>>>>>>>>>> latter as the result of codesign can be reliably deduced >>>>>>>>>>>>>>> from its exitcode (0 - signed, 1 - verification failed, >>>>>>>>>>>>>>> 2 - wrong arguments, 3 - not all requirements from R are >>>>>>>>>>>>>>> satisfied) and using cout/cerr is somewhat fragile as >>>>>>>>>>>>>>> there is no guarantee output format won't be changed. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> the rest looks good to me. 
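
Review comment 4 above (discard the child's output and rely on the exit code) might look roughly like the sketch below. This is an illustration under assumptions, not the code from the webrev: the class and method names are hypothetical, and the demo runs the current java launcher rather than codesign so that it works on any platform.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

// Minimal sketch, not code from the webrev. Names are hypothetical.
class DiscardOutputDemo {

    /**
     * Runs a command with stdout and stderr discarded, so the child can
     * never block on a full pipe buffer, and returns its exit code.
     */
    static int exitCodeDiscardingOutput(List<String> command) {
        ProcessBuilder pb = new ProcessBuilder(command)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .redirectError(ProcessBuilder.Redirect.DISCARD);
        try {
            return pb.start().waitFor();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            // Treat interruption as a failure mode.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Demo only: run "java -version" using the launcher of this JVM.
        String java = ProcessHandle.current().info().command().orElse("java");
        System.out.println(exitCodeDiscardingOutput(List.of(java, "-version")));
    }
}
```

For the signed-binary check, the command would presumably be a codesign verification of the JDK's java launcher (for example `codesign -v` followed by the launcher path, which is an assumption here), with the result interpreted using the exit codes listed in the review.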
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- Igor >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Feb 10, 2020, at 11:48 AM, Chris Plummer >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Ping #2. It's not that hard of a review. Most of it is >>>>>>>>>>>>>>>> the new Platform.isSignedOSX() method, which is well >>>>>>>>>>>>>>>> commented and pretty straight froward. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2/4/20 5:04 PM, Chris Plummer wrote: >>>>>>>>>>>>>>>>> Ping! >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> And I decided to push to 15 instead of 14. Will >>>>>>>>>>>>>>>>> backport to 14 eventually. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 1/30/20 10:20 PM, Chris Plummer wrote: >>>>>>>>>>>>>>>>>> Yes, you are correct: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00 >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On 1/30/20 10:13 PM, Igor Ignatyev wrote: >>>>>>>>>>>>>>>>>>> Hi Chris, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00? >>>>>>>>>>>>>>>>>>> ?seems >>>>>>>>>>>>>>>>>>> to be a webrev from another issue, should it have >>>>>>>>>>>>>>>>>>> been?http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00/? >>>>>>>>>>>>>>>>>>> ?? 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -- Igor >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Jan 30, 2020, at 10:10 PM, Chris Plummer >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hello, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Please review the following fix for some SA tests >>>>>>>>>>>>>>>>>>>> that are failing on Mac OS X 10.14.5 and later: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196 >>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The issue is that SA can't attach to a signed >>>>>>>>>>>>>>>>>>>> binary starting with 10.14.5. There is no >>>>>>>>>>>>>>>>>>>> workaround for this, so these tests are being >>>>>>>>>>>>>>>>>>>> disabled when it is detected that the binary is >>>>>>>>>>>>>>>>>>>> signed and we are running on 10.14 or later (I >>>>>>>>>>>>>>>>>>>> chose all 10.14 releases to simplify the check). >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Some background may help explain the fix. In order >>>>>>>>>>>>>>>>>>>> for SA to attach to a live process (not a core >>>>>>>>>>>>>>>>>>>> file) on OSX, either the attaching process (ie. the >>>>>>>>>>>>>>>>>>>> test) has to be run as root, or sudo needs to be >>>>>>>>>>>>>>>>>>>> supported. However, the only tests that make the >>>>>>>>>>>>>>>>>>>> sudo check are the 20 or so that use >>>>>>>>>>>>>>>>>>>> ClhsdbLauncher. The rest all rely on "@requires >>>>>>>>>>>>>>>>>>>> vm.hasSAandCanAttach" to filter out tests that use >>>>>>>>>>>>>>>>>>>> SA attach. vm.hasSAandCanAttach only checks if the >>>>>>>>>>>>>>>>>>>> test is being run as root. Thus all our >>>>>>>>>>>>>>>>>>>> non-ClhsdbLauncher tests that SA attach to a live >>>>>>>>>>>>>>>>>>>> process are currently not run unless they are run >>>>>>>>>>>>>>>>>>>> as root. 
8238268 [1] has been filed to address >>>>>>>>>>>>>>>>>>>> this, making it so all the tests will attempt to >>>>>>>>>>>>>>>>>>>> use sudo if not run as root. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Because of the difference in how ClhsdbLauncher >>>>>>>>>>>>>>>>>>>> tests and "@requires? vm.hasSAandCanAttach" tests >>>>>>>>>>>>>>>>>>>> check to see if they are runnable, this fix needs >>>>>>>>>>>>>>>>>>>> to address both types of checks. The common code >>>>>>>>>>>>>>>>>>>> for both these cases is Platform.shouldSAAttach(), >>>>>>>>>>>>>>>>>>>> which on OSX basically equates to check to see if >>>>>>>>>>>>>>>>>>>> we are running as root. I changed it to also return >>>>>>>>>>>>>>>>>>>> false if running on signed binary with 10.14 or >>>>>>>>>>>>>>>>>>>> later. However, this confused the ClhsdbLauncher >>>>>>>>>>>>>>>>>>>> use of Platform.shouldSAAttach() somewhat, since it >>>>>>>>>>>>>>>>>>>> assumed a false result only happens because you are >>>>>>>>>>>>>>>>>>>> not running as root (in which case it would then >>>>>>>>>>>>>>>>>>>> check if sudo will work). So ClhsdbLauncher now has >>>>>>>>>>>>>>>>>>>> double check that the false result was not because >>>>>>>>>>>>>>>>>>>> of running a signed binary. If it is signed, it >>>>>>>>>>>>>>>>>>>> won't do the sudo check. This will get cleaned up >>>>>>>>>>>>>>>>>>>> with 8238268 [1], which will move the sudo check >>>>>>>>>>>>>>>>>>>> into Platform.shouldSAAttach(). 
>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> thanks, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Chris >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238268 >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >> >> From igor.ignatyev at oracle.com Wed Feb 12 23:51:11 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Wed, 12 Feb 2020 15:51:11 -0800 Subject: RFR: 8238196: tests that use SA Attach should not be allowed to run against signed binaries on Mac OS X 10.14.5 and later In-Reply-To: <251c4034-ca59-b78b-f330-952bfea8b47e@oracle.com> References: <16754f9b-61a3-15d6-7cb6-8e9895662657@oracle.com> <979CFEC7-538E-46EE-988F-1935AFB8AEFA@oracle.com> <5bddcc55-5340-bea4-8cbb-1338c121d6d4@oracle.com> <86C876E3-6005-4ADE-9224-DEF6A2908F26@oracle.com> <79c65fd4-166e-98de-9327-ede4e9a20421@oracle.com> <3a21aa8f-8081-da59-cb10-87e79f81e2e3@oracle.com> <251c4034-ca59-b78b-f330-952bfea8b47e@oracle.com> Message-ID: <18EC84DF-BA8C-4C09-B803-8E39B21DE4D0@oracle.com> yes, I'm fine w/ that. -- Igor > On Feb 12, 2020, at 3:48 PM, Chris Plummer wrote: > > On 2/12/20 2:15 PM, David Holmes wrote: >> On 13/02/2020 5:02 am, Chris Plummer wrote: >>> Hi David, >>> >>>> What you have below is a mix of #2 and #3 - you convert to a generic exception but also re-assert the interrupt state. That's a little unusual. >>> That what I also thought which is why I was suggesting not doing the interrupt() call and only throwing the RuntimeException. I agree that doing both does not make sense, and in general doing (3) does not make sense if the caller is not setup to properly check the interrupt state. >> >> From a library writer perspective you should have zero knowledge of the caller and doing (3) makes perfect sense. Remember that you will only re-assert the interrupt() if you get the InterruptedException in the first place, which means that some other code already issued the initial interrupt(). 
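
The three ways of handling InterruptedException that David lists in this thread can be sketched side by side. This is a minimal illustration with hypothetical names, where Thread.sleep() stands in for any blocking call such as Process.waitFor().

```java
// Minimal sketch of the three choices. Names are hypothetical.
class InterruptStrategies {

    // (1) Pass the buck: declare the exception and let the caller decide.
    static void propagate(long millis) throws InterruptedException {
        Thread.sleep(millis);
    }

    // (2) Convert to a general unchecked failure; interruption becomes
    //     one possible cause of that failure.
    static void convert(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    // (3) Complete normally, but re-assert the interrupt state so that a
    //     caller checking for interruption still sees it.
    //     Returns false if the wait was cut short.
    static boolean reassert(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();
        System.out.println("completed: " + reassert(1_000));
        // Thread.interrupted() reports the flag and clears it.
        System.out.println("flag still set: " + Thread.interrupted());
    }
}
```

Combining the body of (2) with the interrupt() call from (3) gives the mix discussed in the thread: the caller gets an exception and the flag is also left pending for the next blocking call.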
>> >> Whether this code shoud be treated as general purpose library code is another matter. >> >> Personally, in this case I think I'd use (2) and make interruption a failure mode. > Ok. We're in agreement here. Igor, are you ok with this? > > try { > ... > } catch (InterruptedException e) { > throw new RuntimeException(e); > } > > thanks, > > Chris >> >> David >> >>> Chris >>> >>> >>> On 2/12/20 5:58 AM, David Holmes wrote: >>>> Hi Chris, >>>> >>>> I think you are overthinking this. :) >>>> >>>> What you have observed is that the code that actually uses this method does not utilise interrupts, or expect them, so if you artifically inject one in this library method then you see things failing in unexpected ways. That also means that if the thread was interrupted by some other piece of logic then it would also fail in unexpected ways. That doesn't negate your choice to re-assert the interrupt state. >>>> >>>> From a library writing perspective if you have a method that performs a blocking call that can throw InterruptedException then you generally have three choices: >>>> >>>> 1. Throw InterruptedException yourself and pass the buck to your callers. >>>> 2. Convert the InterruptedException to a more general failure exception - typically an unchecked RuntimeException - for which interruption is but one possible cause; or >>>> 3. Catch the InterruptedException and allow the method to complete normally (i.e. not by throwing an exception) but re-assert the interrupt state so that a caller checking for interruption will still see that it occurred. >>>> >>>> What you have below is a mix of #2 and #3 - you convert to a generic exception but also re-assert the interrupt state. That's a little unusual. >>>> >>>> David >>>> >>>> >>>> On 12/02/2020 6:16 pm, Chris Plummer wrote: >>>>> Hi Igor, >>>>> >>>>> I think it might be best to the interrupt() call out. 
I wanted to see what would happen if we ever got an InterruptedException, so I added the following to the start of Platform.shouldSAAttach(): >>>>> >>>>> try { >>>>> throw new InterruptedException(); >>>>> } catch (InterruptedException e) { >>>>> Thread.currentThread().interrupt(); >>>>> throw new RuntimeException(e); >>>>> } >>>>> >>>>> At the start of the test run, before any tests are actually run, I see the following: >>>>> >>>>> failed to get value for vm.hasSAandCanAttach >>>>> java.lang.RuntimeException: java.lang.InterruptedException >>>>> at jdk.test.lib.Platform.shouldSAAttach(Platform.java:300) >>>>> at requires.VMProps.vmHasSAandCanAttach(VMProps.java:327) >>>>> at requires.VMProps$SafeMap.put(VMProps.java:69) >>>>> at requires.VMProps.call(VMProps.java:101) >>>>> at requires.VMProps.call(VMProps.java:57) >>>>> at com.sun.javatest.regtest.agent.GetJDKProperties.run(GetJDKProperties.java:80) >>>>> at com.sun.javatest.regtest.agent.GetJDKProperties.main(GetJDKProperties.java:54) >>>>> Caused by: java.lang.InterruptedException >>>>> at jdk.test.lib.Platform.shouldSAAttach(Platform.java:297) >>>>> ... 6 more >>>>> >>>>> This seems reasonable. >>>>> >>>>> For each test that checks vm.hasSAandCanAttach I also see. >>>>> >>>>> TEST RESULT: Error. Error evaluating expression: vm.hasSAandCanAttach: java.lang.RuntimeException: java.lang.InterruptedException >>>>> >>>>> This too seems reasonable. 
>>>>> >>>>> For tests that don't check vm.hasSAandCanAttach, but instead make a runtime check that calls Platform.shouldSAAttach(), the test fails with: >>>>> >>>>> java.lang.IllegalThreadStateException: process hasn't exited >>>>> at java.base/java.lang.ProcessImpl.exitValue(ProcessImpl.java:500) >>>>> at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:380) >>>>> at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:433) >>>>> at ClhsdbAttach.main(ClhsdbAttach.java:77) >>>>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >>>>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >>>>> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >>>>> at java.base/java.lang.reflect.Method.invoke(Method.java:564) >>>>> at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) >>>>> at java.base/java.lang.Thread.run(Thread.java:832) >>>>> >>>>> This is a confusing way to fail. The reason it fails this way is because stopApp() first calls waitAppTerminiate(), which does the following: >>>>> >>>>> public void waitAppTerminate() { >>>>> // This code is modeled after tail end of ProcessTools.getOutput(). 
>>>>> try { >>>>> appProcess.waitFor(); >>>>> outPumperThread.join(); >>>>> errPumperThread.join(); >>>>> } catch (InterruptedException e) { >>>>> Thread.currentThread().interrupt(); >>>>> // pass >>>>> } >>>>> } >>>>> >>>>> I added an e.printStackTrace() call and see the following: >>>>> >>>>> java.lang.InterruptedException >>>>> at java.base/java.lang.Object.wait(Native Method) >>>>> at java.base/java.lang.Object.wait(Object.java:321) >>>>> at java.base/java.lang.ProcessImpl.waitFor(ProcessImpl.java:474) >>>>> at jdk.test.lib.apps.LingeredApp.waitAppTerminate(LingeredApp.java:239) >>>>> at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:380) >>>>> at jdk.test.lib.apps.LingeredApp.stopApp(LingeredApp.java:434) >>>>> >>>>> So the earlier call to interrupt() is resulting in waitAppTerminate() not actually waiting for exit. This then results in stopApp() getting IllegalThreadStateException when calling Process.exitValue(). >>>>> >>>>> If I comment out the call to interrupt() in Platform.shouldSAAttach(), I think the failure stack trace is much better: >>>>> >>>>> java.lang.RuntimeException: Test ERROR java.lang.RuntimeException: java.lang.InterruptedException >>>>> at ClhsdbAttach.main(ClhsdbAttach.java:75) >>>>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >>>>> at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >>>>> at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >>>>> at java.base/java.lang.reflect.Method.invoke(Method.java:564) >>>>> at com.sun.javatest.regtest.agent.MainWrapper$MainThread.run(MainWrapper.java:127) >>>>> at java.base/java.lang.Thread.run(Thread.java:832) >>>>> Caused by: java.lang.RuntimeException: java.lang.InterruptedException >>>>> at jdk.test.lib.Platform.shouldSAAttach(Platform.java:300) >>>>> at ClhsdbLauncher.run(ClhsdbLauncher.java:199) >>>>> at ClhsdbAttach.main(ClhsdbAttach.java:71) 
>>>>>     ... 6 more
>>>>> Caused by: java.lang.InterruptedException
>>>>>     at jdk.test.lib.Platform.shouldSAAttach(Platform.java:297)
>>>>>     ... 8 more
>>>>>
>>>>> There's still a minor issue with rethrowing the RuntimeException encapsulated inside another RuntimeException. That's the fault of the test, which catches all Exceptions and encapsulates them in a RuntimeException, even if the Exception itself is already a RuntimeException. It should have a catch clause for RuntimeException and just rethrow it without encapsulating it. All the Clhsdb tests seem to do this, so that's about 20 places to fix. Probably not worth doing unless some other cleanup is being done at the same time.
>>>>>
>>>>> Chris
>>>>>
>>>>> On 2/11/20 10:30 PM, Igor Ignatyev wrote:
>>>>>> I'd say yes, it's better to still call Thread::interrupt.
>>>>>>
>>>>>> -- Igor
>>>>>>
>>>>>>> On Feb 11, 2020, at 10:19 PM, Chris Plummer wrote:
>>>>>>>
>>>>>>> Ok. Should I still call interrupt()?
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> On 2/11/20 10:07 PM, Igor Ignatyev wrote:
>>>>>>>> Hi Chris,
>>>>>>>>
>>>>>>>> that's a common practice for any kind of library-ish code: if there is no explicit check of the interrupt status, it will be checked by the next operation that might be interrupted. In this particular case, I agree rethrowing it as an unchecked exception might be a good alternative.
>>>>>>>>
>>>>>>>> -- Igor
>>>>>>>>
>>>>>>>>> On Feb 11, 2020, at 10:03 PM, Chris Plummer wrote:
>>>>>>>>>
>>>>>>>>> Hi Igor,
>>>>>>>>>
>>>>>>>>> I guess I fail to see the benefit of this. Who is going to check the interrupt status of this thread and do something meaningful with it? It seems we would want to immediately propagate the failure by throwing a RuntimeException. This will work well when called from a test since this is a common way to fail a test. The other use of this code is by VMProps.vmHasSAandCanAttach().
>>>>>>>>> It looks like if a RuntimeException is thrown the right thing will happen when SafeMap.put() catches the exception (it catches all Throwables).
>>>>>>>>>
>>>>>>>>> Chris
>>>>>>>>>
>>>>>>>>> On 2/11/20 7:12 PM, Igor Ignatev wrote:
>>>>>>>>>> rather like this:
>>>>>>>>>>
>>>>>>>>>>> } catch (InterruptedException e) {
>>>>>>>>>>>     Thread.currentThread().interrupt();
>>>>>>>>>>>     return false; // assume not signed
>>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> -- Igor
>>>>>>>>>>
>>>>>>>>>>> On Feb 11, 2020, at 6:15 PM, Chris Plummer wrote:
>>>>>>>>>>>
>>>>>>>>>>> Like this?
>>>>>>>>>>>
>>>>>>>>>>> } catch (InterruptedException e) {
>>>>>>>>>>>     Thread.currentThread().interrupt();
>>>>>>>>>>>     throw new RuntimeException(e);
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> Chris
>>>>>>>>>>>
>>>>>>>>>>> On 2/11/20 2:23 PM, Igor Ignatyev wrote:
>>>>>>>>>>>> no, I meant to call Thread.currentThread().interrupt(). Calling that will restore the interrupted state of the thread, so a user of the Platform class will be able to respond to it appropriately. With your current code, the fact that the thread was interrupted will be missed, and in most cases that is not the right thing to do.
>>>>>>>>>>>>
>>>>>>>>>>>> -- Igor
>>>>>>>>>>>>
>>>>>>>>>>>>> On Feb 11, 2020, at 2:02 PM, Chris Plummer wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Igor,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not sure what you mean by restore the interrupt state. Do you mean loop back to the waitFor() call?
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 2/11/20 1:55 PM, Igor Ignatyev wrote:
>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't insist on (3), so I'm fine if you don't want to change that part.
>>>>>>>>>>>>>> One thing I'd change though is to restore the thread's interrupted state at L#266 of Platform.java (no need to publish a new webrev).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> -- Igor
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Feb 11, 2020, at 1:49 PM, Chris Plummer wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Igor,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Here's an updated webrev:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.01/index.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I rebased to JDK 15 and made all the changes you suggested except for (3). I did not think it is necessary since the code is only executed on OSX. However, if you still feel allowing flexibility in the path separator is important, I can add that change too.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2/10/20 1:34 PM, Igor Ignatyev wrote:
>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> in general it all looks good, I have a few comments (most of them are editorial):
>>>>>>>>>>>>>>>> in Platform.java:
>>>>>>>>>>>>>>>> 1. you have a double space at line #238 (between boolean and isSignedOSX)
>>>>>>>>>>>>>>>> 2. as FileNotFoundException is an IOException, there is no need to declare the former in the signature of isSignedOSX
>>>>>>>>>>>>>>>> 3. it's better to pass jdkPath, "bin" and "java" as separate arguments to Paths.get, so the code won't depend on the file separator
>>>>>>>>>>>>>>>> 4. you are waiting for codesign to finish w/o reading its cout / cerr, which might lead to a deadlock (if codesign exhausts the IO buffer before exiting), so you need to either create two separate threads to read cout and cerr, or redirect these streams to files and read those files afterwards, or just ignore cout/cerr by using Redirect.DISCARD.
>>>>>>>>>>>>>>>>    I'd personally recommend the latter, as the result of codesign can be reliably deduced from its exit code (0 - signed, 1 - verification failed, 2 - wrong arguments, 3 - not all requirements from R are satisfied), and using cout/cerr is somewhat fragile as there is no guarantee the output format won't change.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> the rest looks good to me.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -- Igor
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Feb 10, 2020, at 11:48 AM, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Ping #2. It's not that hard of a review. Most of it is the new Platform.isSignedOSX() method, which is well commented and pretty straightforward.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 2/4/20 5:04 PM, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>> Ping!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> And I decided to push to 15 instead of 14. Will backport to 14 eventually.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 1/30/20 10:20 PM, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>> Yes, you are correct:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196
>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On 1/30/20 10:13 PM, Igor Ignatyev wrote:
>>>>>>>>>>>>>>>>>>>> Hi Chris,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00 seems to be a webrev from another issue; should it have been http://cr.openjdk.java.net/~cjplummer/8238196/webrev.00/ ?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -- Igor
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Jan 30, 2020, at 10:10 PM, Chris Plummer wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please review the following fix for some SA tests that are failing on Mac OS X 10.14.5 and later:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8238196
>>>>>>>>>>>>>>>>>>>>> http://cr.openjdk.java.net/~cjplummer/8236913/webrev.00
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The issue is that SA can't attach to a signed binary starting with 10.14.5. There is no workaround for this, so these tests are being disabled when it is detected that the binary is signed and we are running on 10.14 or later (I chose all 10.14 releases to simplify the check).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Some background may help explain the fix. In order for SA to attach to a live process (not a core file) on OSX, either the attaching process (i.e. the test) has to be run as root, or sudo needs to be supported. However, the only tests that make the sudo check are the 20 or so that use ClhsdbLauncher. The rest all rely on "@requires vm.hasSAandCanAttach" to filter out tests that use SA attach. vm.hasSAandCanAttach only checks if the test is being run as root. Thus all our non-ClhsdbLauncher tests that SA attach to a live process are currently not run unless they are run as root. 8238268 [1] has been filed to address this, making it so all the tests will attempt to use sudo if not run as root.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Because of the difference in how ClhsdbLauncher tests and "@requires vm.hasSAandCanAttach" tests check to see if they are runnable, this fix needs to address both types of checks. The common code for both these cases is Platform.shouldSAAttach(), which on OSX basically equates to a check to see if we are running as root.
>>>>>>>>>>>>>>>>>>>>> I changed it to also return false if running on a signed binary with 10.14 or later. However, this confused the ClhsdbLauncher use of Platform.shouldSAAttach() somewhat, since it assumed a false result only happens because you are not running as root (in which case it would then check if sudo will work). So ClhsdbLauncher now has to double check that the false result was not because of running a signed binary. If it is signed, it won't do the sudo check. This will get cleaned up with 8238268 [1], which will move the sudo check into Platform.shouldSAAttach().
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Chris
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> [1] https://bugs.openjdk.java.net/browse/JDK-8238268

From serguei.spitsyn at oracle.com Thu Feb 13 00:53:46 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Wed, 12 Feb 2020 16:53:46 -0800
Subject: RFR: JDK-8238710: LingeredApp doesn't log stdout/stderr if exits with non-zero code
In-Reply-To:
References: <0d5df909-bb4b-00ce-b2b9-babdb5311bb6@oracle.com> <24fec2c9-fcab-522a-fbf0-bab35a86e611@oracle.com>
Message-ID: <4ddbdd71-1642-4f41-b9b0-97cd24bab3b7@oracle.com>

Hi Alex,

LGTM++

Thanks,
Serguei

On 2/12/20 3:45 PM, Chris Plummer wrote:
> Ok. LGTM.
>
> Chris
>
> On 2/12/20 1:58 PM, Alex Menkov wrote:
>> Hi Chris,
>>
>> thanks for the review.
>> finishApp is also called from the startApp(String... cmd) method,
>> and appProcess may not be initialized there.
>> In that case finishApp would throw an NPE (calling appProcess.exitValue()).
>>
>> --alex
>>
>> On 02/12/2020 13:53, Chris Plummer wrote:
>>> Hi Alex,
>>>
>>> Thanks for doing this. Not having output from a spawned process that
>>> failed is an issue with more than just LingeredApp tests. This is a
>>> good start in getting those fixed.
>>>
>>> I'm a little unclear on one part of your fix.
>>> Why did you move the
>>> "appProcess != null" check into finishApp()? You already make that check
>>> in stopApp(). If anything it looks like that check should have been
>>> there before your changes, but is no longer needed after your changes.
>>>
>>> thanks,
>>>
>>> Chris
>>>
>>> On 2/12/20 1:28 PM, Alex Menkov wrote:
>>>> Hi all,
>>>>
>>>> Please review small fix for
>>>> https://bugs.openjdk.java.net/browse/JDK-8238710
>>>> webrev:
>>>> http://cr.openjdk.java.net/~amenkov/jdk15/LingeredApp_log_error/webrev/
>>>>
>>>>
>>>> --alex
>>>
>

From david.holmes at oracle.com Thu Feb 13 04:27:55 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 13 Feb 2020 14:27:55 +1000
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
In-Reply-To:
References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <1f2c938f-f9ce-7fb7-7169-94daaf3542a4@oracle.com>
Message-ID:

Hi Ralf,

On 12/02/2020 6:41 pm, Schmelter, Ralf wrote:
> Hi David,
>
>> I see little point in subclassing NonJavaThread (via NamedThread) but
>> then overriding pre_run() and post_run() so that you don't do anything
>> that NonJavaThread is supposed to do regarding the NJT iterator
>> capabilities.
>
> The problem is that the post_run() method of NamedThread calls Thread::clear_thread_current(), which then makes it impossible to delete the thread, at least in a debug build, since the code in ~Thread calls os::free_thread(), which calls Thread::current()->osthread() in an assert, which obviously will crash.

If the lifecycle of your new NonJavaThread does not fit the existing NonJavaThreads then yes you will need to override pre_run() and post_run(), but you shouldn't just delete all the NJT iteration support - at least it isn't obvious to me that it is valid to do so.

> Originally I tried not to use my own threads at all and instead use the WorkGang from CollectedHeap::get_safepoint_workers().
> But this ultimately failed because I'm not allowed to iterate the heap in a worker thread on Shenandoah. Additionally ParallelGC did not implement get_safepoint_workers(), but that should not have been a problem.

That raises the question for me of exactly what your new NJT worker thread will touch in the VM, because that will determine where it needs to fit in the Thread hierarchy and what actions it needs to perform in pre_run() and post_run(). I'm unclear what state the VM will be in when this heap dump is performed and these worker threads are doing the compression.

Thanks,
David

> Maybe it is better to try to get this to work (e.g. if I could specify a foreground task when calling run_task(), the problem could be avoided by doing the iteration in the foreground task). But I'm not sure how changes in this area are seen.
>
>> For your monitor operations, you should use a MonitorLocker and then
>> call ml->wait() which will do the right thing with respect to "no
>> safepoint checks" without you needing to specify it directly.
>
> Thanks, will do.
>
> Best regards,
> Ralf
>
> -----Original Message-----
> From: David Holmes
> Sent: Tuesday, 11 February 2020 08:44
> To: Schmelter, Ralf; Yasumasa Suenaga; OpenJDK Serviceability
> Cc: yasuenag at gmail.com
> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
>
> Hi again Ralf, :)
>
> A few more comments after taking a closer look at the thread code.
>
> On the surface it seems to me this is a case where it would be okay to
> introduce a subclass of Thread that is not JavaThread nor NonJavaThread.
> I see little point in subclassing NonJavaThread (via NamedThread) but
> then overriding pre_run() and post_run() so that you don't do anything
> that NonJavaThread is supposed to do regarding the NJT iterator
> capabilities. But we currently expect all threads to fit into one
> category or another, so this is problematic. :( I think disabling the
> NJT functionality is also problematic.
> So not sure what to suggest yet.
>
> BTW you extended NamedThread but you never actually set a name AFAICS.
>
> For your monitor operations, you should use a MonitorLocker and then
> call ml->wait() which will do the right thing with respect to "no
> safepoint checks" without you needing to specify it directly.
>
> Cheers,
> David
>

From Roger.Riggs at oracle.com Thu Feb 13 15:52:33 2020
From: Roger.Riggs at oracle.com (Roger Riggs)
Date: Thu, 13 Feb 2020 10:52:33 -0500
Subject: RFR 8232622: Technical debt in BadAttributeValueExpException
Message-ID:

Please review a minor cleanup to remove code long since unnecessary.
The type of the BadAttributeValueExpException argument is String and if it is not a string in the serialized stream, a suitable replacement is created.

Issue: https://bugs.openjdk.java.net/browse/JDK-8232622

Patch:

diff --git a/src/java.management/share/classes/javax/management/BadAttributeValueExpException.java b/src/java.management/share/classes/javax/management/BadAttributeValueExpException.java
--- a/src/java.management/share/classes/javax/management/BadAttributeValueExpException.java
+++ b/src/java.management/share/classes/javax/management/BadAttributeValueExpException.java
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 1999, 2019, Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 1999, 2020, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it
@@ -48,7 +48,7 @@ public class BadAttributeValueExpExcepti
      * for example, the string value can be the return of {@code attribute.toString()}
      */
     @SuppressWarnings("serial") // See handling in constructor and readObject
-    private Object val;
+    private String val;
 
     /**
      * Constructs a BadAttributeValueExpException using the specified Object to
@@ -72,19 +72,8 @@ public class BadAttributeValueExpExcepti
         ObjectInputStream.GetField gf = ois.readFields();
         Object valObj = gf.get("val", null);
 
-        if (valObj == null) {
-            val = null;
-        } else if (valObj instanceof String) {
-            val= valObj;
-        } else if (System.getSecurityManager() == null
-                || valObj instanceof Long
-                || valObj instanceof Integer
-                || valObj instanceof Float
-                || valObj instanceof Double
-                || valObj instanceof Byte
-                || valObj instanceof Short
-                || valObj instanceof Boolean) {
-            val = valObj.toString();
+        if (valObj instanceof String || valObj == null) {
+            val = (String)valObj;
         } else { // the serialized object is from a version without JDK-8019292 fix
             val = System.identityHashCode(valObj) + "@" + valObj.getClass().getName();
         }

Thanks, Roger

From daniel.fuchs at oracle.com Thu Feb 13 17:38:34 2020
From: daniel.fuchs at oracle.com (Daniel Fuchs)
Date: Thu, 13 Feb 2020 17:38:34 +0000
Subject: RFR 8232622: Technical debt in BadAttributeValueExpException
In-Reply-To:
References:
Message-ID: <0737e0cc-9202-bff4-4dcf-ae99e508c442@oracle.com>

Hi Roger,

I think you will need to preserve these cases:

On 13/02/2020 15:52, Roger Riggs wrote:
> -                || valObj instanceof Long
> -                || valObj instanceof Integer
> -                || valObj instanceof Float
> -                || valObj instanceof Double
> -                || valObj instanceof Byte
> -                || valObj instanceof Short
> -                || valObj instanceof Boolean) {

They could legitimately be transmitted by an older unpatched JVM.
We don't want to use System.identityHashCode(valObj) + "@" + valObj.getClass().getName(); for these.
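
To make the difference under discussion concrete, here is a minimal sketch (not the JDK source) that puts the two coercion rules from the patch side by side. The class name ValCoercionSketch and its method names are hypothetical, and the security-manager check is modeled as a plain boolean parameter; that is an assumption made only to keep the sketch self-contained.

```java
// Illustrative sketch only: ValCoercionSketch and its method names are
// hypothetical. The branches mirror the readObject() logic shown in the patch.
class ValCoercionSketch {

    // Pre-patch rule. The securityManagerPresent flag is an assumption that
    // stands in for the System.getSecurityManager() check in the original code.
    static String legacyCoerce(Object valObj, boolean securityManagerPresent) {
        if (valObj == null) {
            return null;
        } else if (valObj instanceof String) {
            return (String) valObj;
        } else if (!securityManagerPresent
                || valObj instanceof Long
                || valObj instanceof Integer
                || valObj instanceof Float
                || valObj instanceof Double
                || valObj instanceof Byte
                || valObj instanceof Short
                || valObj instanceof Boolean) {
            return valObj.toString();
        } else {
            return identityString(valObj);
        }
    }

    // Post-patch rule: anything that is neither a String nor null is replaced
    // by the identity-based string.
    static String patchedCoerce(Object valObj) {
        if (valObj instanceof String || valObj == null) {
            return (String) valObj;
        }
        return identityString(valObj);
    }

    // Same replacement format as the else branch in the patch.
    private static String identityString(Object o) {
        return System.identityHashCode(o) + "@" + o.getClass().getName();
    }

    public static void main(String[] args) {
        // A boxed value such as an older, unpatched peer might have serialized.
        Object fromOlderJvm = Integer.valueOf(42);
        System.out.println(legacyCoerce(fromOlderJvm, true));
        System.out.println(patchedCoerce(fromOlderJvm));
    }
}
```

For a boxed Integer, the first method returns the value's toString() form while the second returns the identity-hash form ending in "@java.lang.Integer", which is the behavioral change being questioned here.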

best regards,

-- daniel

From patricio.chilano.mateo at oracle.com Thu Feb 13 17:47:08 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Thu, 13 Feb 2020 14:47:08 -0300
Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
In-Reply-To:
References:
Message-ID: <3c59b9f9-ec38-18c9-8f24-e1186a08a04a@oracle.com>

Hi Richard,

I'm only commenting on the handshake changes.
I see that operation VM_EnterInterpOnlyMode can be called inside operation VM_SetFramePop which also allows nested operations. Here is a comment in VM_SetFramePop definition:

// Nested operation must be allowed for the VM_EnterInterpOnlyMode that is
// called from the JvmtiEventControllerPrivate::recompute_thread_enabled.

So if we change VM_EnterInterpOnlyMode to be a handshake, then now we could have a handshake inside a safepoint operation. The issue I see there is that at the end of the handshake the polling page of the target thread could be disarmed. So if the target thread happens to be in a blocked state just transiently and wakes up then it will not stop for the ongoing safepoint. Maybe I can file an RFE to assert that the polling page is armed at the beginning of disarm_safepoint().

I think one option could be to remove SafepointMechanism::disarm_if_needed() in HandshakeState::clear_handshake() and let each JavaThread disarm itself for the handshake case.

Alternatively I think you could do something similar to what we do in Deoptimization::deoptimize_all_marked():

  EnterInterpOnlyModeClosure hs;
  if (SafepointSynchronize::is_at_safepoint()) {
    hs.do_thread(state->get_thread());
  } else {
    Handshake::execute(&hs, state->get_thread());
  }
(you could pass "EnterInterpOnlyModeClosure" directly to the HandshakeClosure() constructor)

I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is always called in a nested operation or just sometimes.

Thanks,
Patricio

On 2/12/20 7:23 AM, Reingruber, Richard wrote:
> // Repost including hotspot runtime and gc lists.
> // Dean Long suggested to do so, because the enhancement replaces a vm operation
> // with a handshake.
> // Original thread: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030359.html
>
> Hi,
>
> could I please get reviews for this small enhancement in hotspot's jvmti implementation:
>
> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585
>
> The change avoids making all compiled methods on stack not_entrant when switching a java thread to
> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack.
>
> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations.
>
> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms.
>
> Thanks, Richard.
>
> See also my question if anyone knows a reason for making the compiled methods not_entrant:
> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html

From Roger.Riggs at oracle.com Thu Feb 13 18:18:34 2020
From: Roger.Riggs at oracle.com (Roger Riggs)
Date: Thu, 13 Feb 2020 13:18:34 -0500
Subject: RFR 8232622: Technical debt in BadAttributeValueExpException
In-Reply-To: <0737e0cc-9202-bff4-4dcf-ae99e508c442@oracle.com>
References: <0737e0cc-9202-bff4-4dcf-ae99e508c442@oracle.com>
Message-ID:

Hi Daniel,

That's part of the technical debt that has to go.

The change to do the conversion in the constructor has been there since JDK8.
And the value of the argument is informative.

There isn't too much overhead in keeping them but they are just noise at this point.

Thanks, Roger

On 2/13/20 12:38 PM, Daniel Fuchs wrote:
> Hi Roger,
>
> I think you will need to preserve these cases:
>
> On 13/02/2020 15:52, Roger Riggs wrote:
>> -                || valObj instanceof Long
>> -                || valObj instanceof Integer
>> -                || valObj instanceof Float
>> -                || valObj instanceof Double
>> -                || valObj instanceof Byte
>> -                || valObj instanceof Short
>> -                || valObj instanceof Boolean) {
>
> They could legitimately be transmitted by an older unpatched JVM.
> we don't want to use System.identityHashCode(valObj) + "@" +
> valObj.getClass().getName(); for these.
>
> best regards,
>
> -- daniel

From daniel.fuchs at oracle.com Thu Feb 13 19:25:42 2020
From: daniel.fuchs at oracle.com (Daniel Fuchs)
Date: Thu, 13 Feb 2020 19:25:42 +0000
Subject: RFR 8232622: Technical debt in BadAttributeValueExpException
In-Reply-To:
References: <0737e0cc-9202-bff4-4dcf-ae99e508c442@oracle.com>
Message-ID:

Hi Roger,

OK - I can accept that then.

best regards,

-- daniel

On 13/02/2020 18:18, Roger Riggs wrote:
> Hi Daniel,
>
> That's part of the technical debt that has to go.
>
> The change to do the conversion in the constructor has been there since
> JDK8.
> And the value of the argument is informative.
>
> There isn't too much overhead in keeping them but they are just noise at
> this point.
>
> Thanks, Roger

From fairoz.matte at oracle.com Fri Feb 14 05:13:04 2020
From: fairoz.matte at oracle.com (Fairoz Matte)
Date: Thu, 13 Feb 2020 21:13:04 -0800 (PST)
Subject: RFR (S) 8239055: Wrong implementation of VMState.hasListener
Message-ID: <988ac692-444a-4025-99c5-421d4c554f3b@default>

Hi,

Please review a tiny change to correct the VMState.hasListener implementation.

JBS: https://bugs.openjdk.java.net/browse/JDK-8239055
Webrev: http://cr.openjdk.java.net/~fmatte/8239055/webrev.00/

Thanks,
Fairoz

From richard.reingruber at sap.com Fri Feb 14 12:58:41 2020
From: richard.reingruber at sap.com (Reingruber, Richard)
Date: Fri, 14 Feb 2020 12:58:41 +0000
Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
In-Reply-To: <3c59b9f9-ec38-18c9-8f24-e1186a08a04a@oracle.com>
References: <3c59b9f9-ec38-18c9-8f24-e1186a08a04a@oracle.com>
Message-ID:

Hi Patricio,

thanks for having a look.

> I'm only commenting on the handshake changes.
> I see that operation VM_EnterInterpOnlyMode can be called inside
> operation VM_SetFramePop which also allows nested operations. Here is a
> comment in VM_SetFramePop definition:
>
> // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is
> // called from the JvmtiEventControllerPrivate::recompute_thread_enabled.
>
> So if we change VM_EnterInterpOnlyMode to be a handshake, then now we
> could have a handshake inside a safepoint operation. The issue I see
> there is that at the end of the handshake the polling page of the target
> thread could be disarmed. So if the target thread happens to be in a
> blocked state just transiently and wakes up then it will not stop for
> the ongoing safepoint. Maybe I can file an RFE to assert that the
> polling page is armed at the beginning of disarm_safepoint().

I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a handshake cannot be nested in a vm operation. Maybe it should be asserted in the Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation?

> Alternatively I think you could do something similar to what we do in
> Deoptimization::deoptimize_all_marked():
>
>   EnterInterpOnlyModeClosure hs;
>   if (SafepointSynchronize::is_at_safepoint()) {
>     hs.do_thread(state->get_thread());
>   } else {
>     Handshake::execute(&hs, state->get_thread());
>   }
> (you could pass "EnterInterpOnlyModeClosure"
> directly to the
> HandshakeClosure() constructor)

Maybe this could be used also in the Handshake::execute() methods as general solution?

> I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is
> always called in a nested operation or just sometimes.

At least one execution path without vm operation exists:

JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void
JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong
JvmtiEventControllerPrivate::recompute_enabled() : void
JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches)
JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void
JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError
jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError

I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a handshake, but to avoid making the compiled methods on stack not_entrant.... unless I'm further encouraged to do it with a handshake :)

Thanks again,
Richard.

-----Original Message-----
From: Patricio Chilano
Sent: Thursday, 13 February 2020 18:47
To: Reingruber, Richard; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net
Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant

Hi Richard,

I'm only commenting on the handshake changes.
I see that operation VM_EnterInterpOnlyMode can be called inside operation VM_SetFramePop which also allows nested operations. Here is a comment in VM_SetFramePop definition:

// Nested operation must be allowed for the VM_EnterInterpOnlyMode that is
// called from the JvmtiEventControllerPrivate::recompute_thread_enabled.

So if we change VM_EnterInterpOnlyMode to be a handshake, then now we could have a handshake inside a safepoint operation. The issue I see there is that at the end of the handshake the polling page of the target thread could be disarmed. So if the target thread happens to be in a blocked state just transiently and wakes up then it will not stop for the ongoing safepoint. Maybe I can file an RFE to assert that the polling page is armed at the beginning of disarm_safepoint().

I think one option could be to remove SafepointMechanism::disarm_if_needed() in HandshakeState::clear_handshake() and let each JavaThread disarm itself for the handshake case.

Alternatively I think you could do something similar to what we do in Deoptimization::deoptimize_all_marked():

  EnterInterpOnlyModeClosure hs;
  if (SafepointSynchronize::is_at_safepoint()) {
    hs.do_thread(state->get_thread());
  } else {
    Handshake::execute(&hs, state->get_thread());
  }
(you could pass "EnterInterpOnlyModeClosure" directly to the HandshakeClosure() constructor)

I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is always called in a nested operation or just sometimes.

Thanks,
Patricio

On 2/12/20 7:23 AM, Reingruber, Richard wrote:
> // Repost including hotspot runtime and gc lists.
> // Dean Long suggested to do so, because the enhancement replaces a vm operation
> // with a handshake.
> // Original thread: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030359.html
>
> Hi,
>
> could I please get reviews for this small enhancement in hotspot's jvmti implementation:
>
> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/
> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585
>
> The change avoids making all compiled methods on stack not_entrant when switching a java thread to
> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack.
>
> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations.
>
> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms.
>
> Thanks, Richard.
>
> See also my question if anyone knows a reason for making the compiled methods not_entrant:
> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html

From patricio.chilano.mateo at oracle.com Fri Feb 14 14:53:52 2020
From: patricio.chilano.mateo at oracle.com (Patricio Chilano)
Date: Fri, 14 Feb 2020 11:53:52 -0300
Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant
In-Reply-To:
References: <3c59b9f9-ec38-18c9-8f24-e1186a08a04a@oracle.com>
Message-ID: <410eed04-e2ef-0f4f-1c56-19e6734a10f6@oracle.com>

Hi Richard,

On 2/14/20 9:58 AM, Reingruber, Richard wrote:
> Hi Patricio,
>
> thanks for having a look.
>
> > I'm only commenting on the handshake changes.
> > I see that operation VM_EnterInterpOnlyMode can be called inside
> > operation VM_SetFramePop which also allows nested operations. Here is a
> > comment in VM_SetFramePop definition:
> >
> > // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is
> > // called from the JvmtiEventControllerPrivate::recompute_thread_enabled.
> >
> > So if we change VM_EnterInterpOnlyMode to be a handshake, then now we
> > could have a handshake inside a safepoint operation. The issue I see
> > there is that at the end of the handshake the polling page of the target
> > thread could be disarmed. So if the target thread happens to be in a
> > blocked state just transiently and wakes up then it will not stop for
> > the ongoing safepoint. Maybe I can file an RFE to assert that the
> > polling page is armed at the beginning of disarm_safepoint().
>
> I'm really glad you noticed the problematic nesting.
This seems to be a general issue: currently a > handshake cannot be nested in a vm operation. Maybe it should be asserted in the > Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation? > > > Alternatively I think you could do something similar to what we do in > > Deoptimization::deoptimize_all_marked(): > > > > EnterInterpOnlyModeClosure hs; > > if (SafepointSynchronize::is_at_safepoint()) { > > hs.do_thread(state->get_thread()); > > } else { > > Handshake::execute(&hs, state->get_thread()); > > } > > (you could pass ?EnterInterpOnlyModeClosure? directly to the > > HandshakeClosure() constructor) > > Maybe this could be used also in the Handshake::execute() methods as general solution? Right, we could also do that. Avoiding to clear the polling page in HandshakeState::clear_handshake() should be enough to fix this issue and execute a handshake inside a safepoint, but adding that "if" statement in Hanshake::execute() sounds good to avoid all the extra code that we go through when executing a handshake. I filed 8239084 to make that change. > > I don?t know JVMTI code so I?m not sure if VM_EnterInterpOnlyMode is > > always called in a nested operation or just sometimes. > > At least one execution path without vm operation exists: > > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong > JvmtiEventControllerPrivate::recompute_enabled() : void > JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches) > JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void > JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError > jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError > > I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a > handshake, but to avoid making the compiled methods on stack not_entrant.... 
unless I'm further > encouraged to do it with a handshake :) Ah! I think you can still do it with a handshake with the Deoptimization::deoptimize_all_marked() like solution. I can change the if-else statement with just the Handshake::execute() call in 8239084. But up to you.? : ) Thanks, Patricio > Thanks again, > Richard. > > -----Original Message----- > From: Patricio Chilano > Sent: Donnerstag, 13. Februar 2020 18:47 > To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net > Subject: Re: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant > > Hi Richard, > > I?m only commenting on the handshake changes. > I see that operation VM_EnterInterpOnlyMode can be called inside > operation VM_SetFramePop which also allows nested operations. Here is a > comment in VM_SetFramePop definition: > > // Nested operation must be allowed for the VM_EnterInterpOnlyMode that is > // called from the JvmtiEventControllerPrivate::recompute_thread_enabled. > > So if we change VM_EnterInterpOnlyMode to be a handshake, then now we > could have a handshake inside a safepoint operation. The issue I see > there is that at the end of the handshake the polling page of the target > thread could be disarmed. So if the target thread happens to be in a > blocked state just transiently and wakes up then it will not stop for > the ongoing safepoint. Maybe I can file an RFE to assert that the > polling page is armed at the beginning of disarm_safepoint(). > > I think one option could be to remove > SafepointMechanism::disarm_if_needed() in > HandshakeState::clear_handshake() and let each JavaThread disarm itself > for the handshake case. 
> > Alternatively I think you could do something similar to what we do in > Deoptimization::deoptimize_all_marked(): > > ? EnterInterpOnlyModeClosure hs; > ? if (SafepointSynchronize::is_at_safepoint()) { > ??? hs.do_thread(state->get_thread()); > ? } else { > ??? Handshake::execute(&hs, state->get_thread()); > ? } > (you could pass ?EnterInterpOnlyModeClosure? directly to the > HandshakeClosure() constructor) > > I don?t know JVMTI code so I?m not sure if VM_EnterInterpOnlyMode is > always called in a nested operation or just sometimes. > > Thanks, > Patricio > > On 2/12/20 7:23 AM, Reingruber, Richard wrote: >> // Repost including hotspot runtime and gc lists. >> // Dean Long suggested to do so, because the enhancement replaces a vm operation >> // with a handshake. >> // Original thread: http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-February/030359.html >> >> Hi, >> >> could I please get reviews for this small enhancement in hotspot's jvmti implementation: >> >> Webrev: http://cr.openjdk.java.net/~rrich/webrevs/8238585/webrev.0/ >> Bug: https://bugs.openjdk.java.net/browse/JDK-8238585 >> >> The change avoids making all compiled methods on stack not_entrant when switching a java thread to >> interpreter only execution for jvmti purposes. It is sufficient to deoptimize the compiled frames on stack. >> >> Additionally a handshake is used instead of a vm operation to walk the stack and do the deoptimizations. >> >> Testing: JCK and JTREG tests, also in Xcomp mode with fastdebug and release builds on all platforms. >> >> Thanks, Richard. 
>> >> See also my question if anyone knows a reason for making the compiled methods not_entrant: >> http://mail.openjdk.java.net/pipermail/serviceability-dev/2020-January/030339.html From richard.reingruber at sap.com Fri Feb 14 18:47:20 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Fri, 14 Feb 2020 18:47:20 +0000 Subject: RFR(S) 8238585: Use handshake for JvmtiEventControllerPrivate::enter_interp_only_mode() and don't make compiled methods on stack not_entrant In-Reply-To: <410eed04-e2ef-0f4f-1c56-19e6734a10f6@oracle.com> References: <3c59b9f9-ec38-18c9-8f24-e1186a08a04a@oracle.com> <410eed04-e2ef-0f4f-1c56-19e6734a10f6@oracle.com> Message-ID: Hi Patricio, > > I'm really glad you noticed the problematic nesting. This seems to be a general issue: currently a > > handshake cannot be nested in a vm operation. Maybe it should be asserted in the > > Handshake::execute() methods that they are not called by the vm thread evaluating a vm operation? > > > > > Alternatively I think you could do something similar to what we do in > > > Deoptimization::deoptimize_all_marked(): > > > > > > EnterInterpOnlyModeClosure hs; > > > if (SafepointSynchronize::is_at_safepoint()) { > > > hs.do_thread(state->get_thread()); > > > } else { > > > Handshake::execute(&hs, state->get_thread()); > > > } > > > (you could pass ?EnterInterpOnlyModeClosure? directly to the > > > HandshakeClosure() constructor) > > > > Maybe this could be used also in the Handshake::execute() methods as general solution? > Right, we could also do that. Avoiding to clear the polling page in > HandshakeState::clear_handshake() should be enough to fix this issue and > execute a handshake inside a safepoint, but adding that "if" statement > in Hanshake::execute() sounds good to avoid all the extra code that we > go through when executing a handshake. I filed 8239084 to make that change. Thanks for taking care of this and creating the RFE. 
>
> > > I don't know JVMTI code so I'm not sure if VM_EnterInterpOnlyMode is
> > > always called in a nested operation or just sometimes.
> >
> > At least one execution path without vm operation exists:
> >
> > JvmtiEventControllerPrivate::enter_interp_only_mode(JvmtiThreadState *) : void
> > JvmtiEventControllerPrivate::recompute_thread_enabled(JvmtiThreadState *) : jlong
> > JvmtiEventControllerPrivate::recompute_enabled() : void
> > JvmtiEventControllerPrivate::change_field_watch(jvmtiEvent, bool) : void (2 matches)
> > JvmtiEventController::change_field_watch(jvmtiEvent, bool) : void
> > JvmtiEnv::SetFieldAccessWatch(fieldDescriptor *) : jvmtiError
> > jvmti_SetFieldAccessWatch(jvmtiEnv *, jclass, jfieldID) : jvmtiError
> >
> > I tend to revert back to VM_EnterInterpOnlyMode as it wasn't my main intent to replace it with a
> > handshake, but to avoid making the compiled methods on stack not_entrant.... unless I'm further
> > encouraged to do it with a handshake :)
> Ah! I think you can still do it with a handshake with the
> Deoptimization::deoptimize_all_marked() like solution. I can change the
> if-else statement with just the Handshake::execute() call in 8239084.
> But up to you. : )

Well, I think that's enough encouragement :)
I'll wait for 8239084 and try then again. (no urgency and all)

Thanks,
Richard.

From mandy.chung at oracle.com  Sat Feb 15 02:16:58 2020
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 14 Feb 2020 18:16:58 -0800
Subject: [PING] RFR: 8231111: Cgroups v2: Rework Metrics in java.base so as to recognize unified hierarchy
In-Reply-To: <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com>
References: <75fc377f8d5ca76b7dac02f55db640cbdd305633.camel@redhat.com>
 <4bf65380bc26cd3bf684d7994b33e66bcb87927b.camel@redhat.com>
 <6CACCC0D-7F5A-42A3-83F1-746497940CCA@oracle.com>
 <29544339574e34c4c25cbec0314c26f35e8d1a99.camel@redhat.com>
 <4CE7C7F6-ABFA-4263-98B2-32BBD5013A3C@oracle.com>
 <10b5e83bfb7e618e5f5906c8e707057ff8680785.camel@redhat.com>
 <4a304e2ce72a53859b4e9cc8b21db404a260b531.camel@redhat.com>
 <97e7ff2b-f2cc-a666-afb8-521c0f5c37e7@oracle.com>
 <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com>
Message-ID: <964cf134-afe3-5eb3-9990-095e503bc5ad@oracle.com>

Hi Severin,

On 2/11/20 10:04 AM, Severin Gehwolf wrote:
> Updated webrev:
> Full:
http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/webrev/
> incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/incremental/webrev/

Thanks for updating this. This patch looks okay in general, though I still
think it is quite hard to tell whether the implementation matches the spec
in the unavailable & unlimited cases. It'd be good if this can be cleaned
up in the future. Thanks for the new test.

src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java

  92             logger.log(Level.WARNING, "Mixed cgroupv1 and cgroupv2 not supported. Metrics disabled.");

There isn't a clear guideline on what logging level to use. I would opt for
the API not emitting any message, as this method returns null, or it should
throw an exception if this should be reported instead. So debug or trace
level seems to be suitable for this message.

 130             return new CgroupMetrics(new CgroupV2Subsystem(unified));
 131         } else {
 132             return new CgroupV1Metrics(CgroupV1Subsystem.getInstance());

CgroupV2Subsystem is instantiated with a constructor. Do you expect
multiple instances of CgroupV2Subsystem (as opposed to CgroupV1Subsystem,
which is a singleton)?

src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java

 391         String match = "hierarchical_memory_limit";
 392         retval = CgroupV1SubsystemController.getLongValueMatchingLine(memory,
 393                                                                       "memory.stat",
 394                                                                       match);

Nit: the match variable is not needed. A similar pattern occurs in lines 451-453.

src/java.base/share/classes/jdk/internal/platform/MetricsCgroupV1.java

This is cgroup v1 specific. I still think it should be moved to
linux/classes rather than share/classes. The tests should only run on linux.

src/java.base/linux/classes/jdk/internal/platform/CgroupV1Metrics.java
   copyright header is missing

   You can define a private MetricsCgroupV1 subsystem field in this class to
   avoid the casting:

  15         return ((MetricsCgroupV1)subsystem).getMemoryMaxUsage();

So the CgroupMetrics::subsystem field can stay private if desired.

Side note: CgroupV1Metrics is the implementation of MetricsCgroupV1. The
naming is a bit strange. CgroupV1Metrics sounds better for the interface,
and the implementation class could be CgroupV1MetricsImpl; that would help
a reader understand their relationship. I leave it for you to decide.

src/java.base/share/classes/sun/launcher/LauncherHelper.java

+    private static final long LONG_RETVAL_NOT_SUPPORTED = -2;

This is specific to printSystemMetrics. I suggest moving this to
printSystemMetrics as a local variable. Then formatCpuVal and
formatLimitString can take an additional "unavailable" parameter so that
these methods will print "N/A" if limit == unavailable.

+    private static String formatBoolean(Boolean value, String prefix) {

This is no longer needed. Please remove it.

test/lib/jdk/test/lib/containers/cgroup/MetricsTester.java
   The Oracle copyright is taken out and the copyright is also changed from
   GPL to GPL+CP.

   The Red Hat copyright can be added to the top of the file immediately
   before the "DO NOT ALTER or REMOVE" line like [1].

[1] http://hg.openjdk.java.net/jdk/jdk/file/tip/test/hotspot/jtreg/compiler/onSpinWait/TestOnSpinWaitEnableDisable.java

test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java
   TestDockerMemoryMetrics requires docker.support. I think any test
   depending on MetricsMemoryTester.java is filtered when running on a
   platform other than linux. Please verify that.

test/lib/jdk/test/lib/containers/cgroup/CPUSetsReader.java

  58         try(Stream<String> stream = Files.lines(Paths.get(path))) {

Nit: space between "try" and "(" is missing.

test/jdk/jdk/internal/platform/cgroup/TestCgroupSubsystemController.java
   I expect the copyright header to be placed at the top of the file before
   all imports.
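
For the LauncherHelper comment above, a minimal sketch of what the suggested shape might look like: the "not supported" sentinel lives only in the method that prints the metrics and is handed to the formatters as an extra "unavailable" argument. Only the names formatLimitString and formatCpuVal and the value -2 come from the review; the class name, renderSystemMetrics, the method bodies, the "negative means unlimited" convention and the output strings are assumptions for illustration and are not the actual LauncherHelper code.

```java
public class MetricsFormatSketch {

    // Hypothetical formatter: 'unavailable' is the caller's sentinel for
    // "metric not supported". Any other negative value is treated as
    // "unlimited" (an assumption of this sketch).
    static String formatLimitString(long limit, String prefix, long unavailable) {
        if (limit == unavailable) {
            return prefix + "N/A";
        }
        if (limit >= 0) {
            return prefix + limit;
        }
        return prefix + "Unlimited";
    }

    // Same idea for CPU values; the "us" suffix is an assumed unit.
    static String formatCpuVal(long cpuVal, String prefix, long unavailable) {
        if (cpuVal == unavailable) {
            return prefix + "N/A";
        }
        if (cpuVal >= 0) {
            return prefix + cpuVal + "us";
        }
        return prefix + "Unlimited";
    }

    // Stand-in for the printing method: the sentinel is a local variable
    // here instead of a class-level constant.
    static String renderSystemMetrics(long cpuPeriod, long memoryLimit) {
        final long unavailable = -2;
        StringBuilder sb = new StringBuilder();
        sb.append(formatCpuVal(cpuPeriod, "CPU Period: ", unavailable)).append('\n');
        sb.append(formatLimitString(memoryLimit, "Memory Limit: ", unavailable)).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        // CPU period known, memory limit not supported by the platform.
        System.out.print(renderSystemMetrics(100000, -2));
        // CPU period not supported, memory unlimited.
        System.out.print(renderSystemMetrics(-2, -1));
    }
}
```

Passing the sentinel explicitly keeps the formatters free of any class-level constant, which seems to be the point of the review comment; whether the real code should distinguish "unlimited" from "unavailable" this way is left open here.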

thanks
Mandy

From suenaga at oss.nttdata.com  Mon Feb 17 04:07:03 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 17 Feb 2020 13:07:03 +0900
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To: <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com>
 <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com>
 <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com>
 <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
Message-ID:

PING: Could you review it?

>> JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>> webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/

This change has been already reviewed by Serguei. I need one more reviewer to push.

Thanks,

Yasumasa

From zgu at redhat.com  Mon Feb 17 14:51:39 2020
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 17 Feb 2020 09:51:39 -0500
Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops
In-Reply-To: <8ea20a15-bdb9-27b0-c306-154f539a3674@oracle.com>
References: <8ea20a15-bdb9-27b0-c306-154f539a3674@oracle.com>
Message-ID: <3a5d60fd-04d5-da96-3d79-242d43fdec79@redhat.com>

Hi Stefan,

Thanks for the review and suggestions, updated accordingly:

http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/

>
> ---
> Previously, the calls to 'mark' and 'visited' were inlineable, but now
> every GC has to take a virtual call when marking the objects. My guess
> is that this code is slow anyway, and that it doesn't matter too much,
> but did you measure the effect of that change with, for example, G1?
>

I did rough measurement, timing
vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test.
If you know any tests/benchmarks I should measure, please let me know.

Thanks,

-Zhengyu

> Thanks,
> StefanK
>
>> Test:
>>   hotspot_gc
>>   vmTestbase_nsk_jdi
vmTestbase_nsk_jvmti >> >> Thanks, >> >> -Zhengyu >> >> > From linzang at tencent.com Tue Feb 18 04:15:43 2020 From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=) Date: Tue, 18 Feb 2020 04:15:43 +0000 Subject: RFR: add parallel heap inspection support for jmap histo(G1) Message-ID: <11bca96c0e7745f5b2558cc49b42b996@tencent.com> Dear All, May I ask your help to review the following changes: webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ bug: https://bugs.openjdk.java.net/browse/JDK-8215624 related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 This patch enables parallel heap inspection of G1 for jmap histo. My simple test shows it can speed up jmap -histo by 2x with parallelThreadNum set to 2 for a heap at ~500M on a 4-core platform. ________________________________ BRs, Lin From david.holmes at oracle.com Tue Feb 18 05:23:46 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 18 Feb 2020 15:23:46 +1000 Subject: RFR: add parallel heap inspection support for jmap histo(G1) In-Reply-To: <11bca96c0e7745f5b2558cc49b42b996@tencent.com> References: <11bca96c0e7745f5b2558cc49b42b996@tencent.com> Message-ID: Hi Lin, Adding in hotspot-gc-dev as they need to see how this interacts with GC worker threads, and whether it needs to be extended beyond G1. I happened to spot one nit when browsing: src/hotspot/share/gc/shared/collectedHeap.hpp + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, + BoolObjectClosure* filter, + size_t* missed_count, + size_t thread_num) { + return NULL; s/NULL/false/ Cheers, David On 18/02/2020 2:15 pm, linzang(臧琳) wrote: > Dear All, > May I ask your help to review the following changes: > webrev: > http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ > bug: https://bugs.openjdk.java.net/browse/JDK-8215624 > related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 >
This patch enables parallel heap inspection of G1 for jmap histo. > My simple test shows it can speed up jmap -histo by 2x with > parallelThreadNum set to 2 for a heap at ~500M on a 4-core platform. > > ------------------------------------------------------------------------ > BRs, > Lin From linzang at tencent.com Tue Feb 18 06:29:38 2020 From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=) Date: Tue, 18 Feb 2020 06:29:38 +0000 Subject: RFR: add parallel heap inspection support for jmap histo(G1)(Internet mail) References: <11bca96c0e7745f5b2558cc49b42b996@tencent.com>, Message-ID: Dear David, Thanks a lot! I have updated the refined code to http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. Maybe we can first use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. Thanks, -------------- Lin >Hi Lin, > >Adding in hotspot-gc-dev as they need to see how this interacts with GC >worker threads, and whether it needs to be extended beyond G1. > >I happened to spot one nit when browsing: > >src/hotspot/share/gc/shared/collectedHeap.hpp > >+ virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, >+ BoolObjectClosure* filter, >+ size_t* missed_count, >+ size_t thread_num) { >+ return NULL; > >s/NULL/false/ > >Cheers, >David > >On 18/02/2020 2:15 pm, linzang(臧琳) wrote: >> Dear All, >> May I ask your help to review the following changes: >> webrev: >> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ >> bug: https://bugs.openjdk.java.net/browse/JDK-8215624 >>
related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 >> This patch enables parallel heap inspection of G1 for jmap histo. >> My simple test shows it can speed up jmap -histo by 2x with >> parallelThreadNum set to 2 for a heap at ~500M on a 4-core platform. >> >> ------------------------------------------------------------------------ >> BRs, >> Lin > From sgehwolf at redhat.com Tue Feb 18 12:50:02 2020 From: sgehwolf at redhat.com (Severin Gehwolf) Date: Tue, 18 Feb 2020 13:50:02 +0100 Subject: [PING] RFR: 8231111: Cgroups v2: Rework Metrics in java.base so as to recognize unified hierarchy In-Reply-To: <964cf134-afe3-5eb3-9990-095e503bc5ad@oracle.com> References: <75fc377f8d5ca76b7dac02f55db640cbdd305633.camel@redhat.com> <4bf65380bc26cd3bf684d7994b33e66bcb87927b.camel@redhat.com> <6CACCC0D-7F5A-42A3-83F1-746497940CCA@oracle.com> <29544339574e34c4c25cbec0314c26f35e8d1a99.camel@redhat.com> <4CE7C7F6-ABFA-4263-98B2-32BBD5013A3C@oracle.com> <10b5e83bfb7e618e5f5906c8e707057ff8680785.camel@redhat.com> <4a304e2ce72a53859b4e9cc8b21db404a260b531.camel@redhat.com> <97e7ff2b-f2cc-a666-afb8-521c0f5c37e7@oracle.com> <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com> <964cf134-afe3-5eb3-9990-095e503bc5ad@oracle.com> Message-ID: Hi Mandy, Thanks again for the review! Updated webrev: incremental (only review changes): http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/11/incremental/webrev/ full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/11/webrev/ More below. On Fri, 2020-02-14 at 18:16 -0800, Mandy Chung wrote: > Hi Severin, > > On 2/11/20 10:04 AM, Severin Gehwolf wrote: > > Updated webrev: > > Full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/webrev/ > > incremental: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/10/incremental/webrev/ > > Thanks for updating this.
This patch looks okay in general while I > still think it is quite hard to tell whether the implementation > matches the spec in the unavailable & unlimited cases. It'd be good > if this can be cleaned up in the future. Thanks for the new test. > > src/java.base/linux/classes/jdk/internal/platform/CgroupSubsystemFactory.java > 92 logger.log(Level.WARNING, "Mixed cgroupv1 and cgroupv2 not supported. Metrics disabled."); > > > There isn't a clear guideline on what logging level to use. I would > opt for the API not emitting any message, as this method returns null; or > it should throw an exception if this should be reported instead. > So debug or trace level seems to be suitable for this message. Fixed. > > 130 return new CgroupMetrics(new CgroupV2Subsystem(unified)); > 131 } else { > 132 return new CgroupV1Metrics(CgroupV1Subsystem.getInstance()); > > > CgroupV2Subsystem is instantiated with a constructor. Do you expect > multiple instances of CgroupV2Subsystem (as opposed to > CgroupV1Subsystem, which is a singleton)? I've made CgroupV2Subsystem a singleton too now. > src/java.base/linux/classes/jdk/internal/platform/cgroupv1/CgroupV1Subsystem.java > 391 String match = "hierarchical_memory_limit"; > 392 retval = CgroupV1SubsystemController.getLongValueMatchingLine(memory, > 393 "memory.stat", > 394 match); > > > Nit: the match variable is not needed. Similar pattern occurs in line > 451-453. Actually, line 391 matches "hierarchical_memory_limit", line 450 matches "hierarchical_memsw_limit". So the match variable *is* needed. They match different items in file "memory.stat". > > src/java.base/share/classes/jdk/internal/platform/MetricsCgroupV1.java > > This is cgroup v1 specific. I still think it should be moved to > linux/classes rather than share/classes. The tests should only run > on linux. OK. Done. > src/java.base/linux/classes/jdk/internal/platform/CgroupV1Metrics.java > copyright header is missing Fixed.
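[Editor's sketch, not part of the original message] The exchange above about reading two different keys from "memory.stat" can be illustrated with a minimal, self-contained Java sketch. The class name MemoryStatSketch and the helper valueForKey are hypothetical and exist only for this illustration; this is not the JDK's getLongValueMatchingLine implementation, and the sample lines are made-up values, assuming the usual "key value" layout of the file.

```java
import java.util.List;
import java.util.OptionalLong;

// Hypothetical illustration only; not the JDK's CgroupV1SubsystemController code.
class MemoryStatSketch {

    // Returns the numeric value from the first "key value" line whose key
    // equals the requested one, or an empty result if no such line exists
    // or the value is not a valid long.
    static OptionalLong valueForKey(List<String> lines, String key) {
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            if (parts.length == 2 && parts[0].equals(key)) {
                try {
                    return OptionalLong.of(Long.parseLong(parts[1]));
                } catch (NumberFormatException e) {
                    return OptionalLong.empty();
                }
            }
        }
        return OptionalLong.empty();
    }

    public static void main(String[] args) {
        // Made-up sample content in the "key value" layout assumed above.
        List<String> sample = List.of(
            "cache 4096",
            "hierarchical_memory_limit 1073741824",
            "hierarchical_memsw_limit 2147483648");

        // The same helper serves both lookups; only the key differs.
        System.out.println(valueForKey(sample, "hierarchical_memory_limit"));
        System.out.println(valueForKey(sample, "hierarchical_memsw_limit"));
    }
}
```

Matching on the whole key rather than a prefix keeps the two limits apart, which is the distinction the reply above draws between the two call sites.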
> You can define a private MetricsCgroupV1 subsystem field in this > class to avoid the casting: > 15 return ((MetricsCgroupV1)subsystem).getMemoryMaxUsage(); > > So CgroupMetrics::subsystem field can stay as private if desired. Sure (at the cost of an additional field). Changed as suggested. > Side note: CgroupV1Metrics is the implementation of > MetricsCgroupV1. The naming is a bit strange. CgroupV1Metrics > sounds better for the interface and the implementation class could be > CgroupV1MetricsImpl and it'd help a reader to understand their > relationship. I leave it for you to decide. I wasn't happy with that myself. Changed to CgroupV1Metrics{,Impl} as suggested. > src/java.base/share/classes/sun/launcher/LauncherHelper.java > > + private static final long LONG_RETVAL_NOT_SUPPORTED = -2; > This is specific to printSystemMetrics. I suggest moving this to > printSystemMetrics as a local variable. Then formatCpuVal and > formatLimitString can take an additional "unavailable" parameter so > that these methods will print "N/A" if limit == unavailable. Fixed. > + private static String formatBoolean(Boolean value, String > prefix) { > > > This is no longer needed. Please remove it. Right. Removed. > test/lib/jdk/test/lib/containers/cgroup/MetricsTester.java > The Oracle copyright is taken out and the copyright is also changed from GPL to GPL+CP. > The Red Hat copyright can be added to the top of the file immediately before "DO NOT ALTER or REMOVE" line like [1]. > > [1] http://hg.openjdk.java.net/jdk/jdk/file/tip/test/hotspot/jtreg/compiler/onSpinWait/TestOnSpinWaitEnableDisable.java Hmm, old MetricsTester got renamed with this patch to MetricsTesterCgroupV1. MetricsTesterCgroupV1 still has the old copyright. The version you've looked at is the common part and instantiates tester for cgroup v1 or cgroup v2 as required.
Aside: Not sure why old MetricsTester (or new MetricsTesterCgroupV1) is GPL (over GPL+CP). Either way, I've changed license to GPL over GPL+CP for the new test classes with Red Hat copyright. > test/jdk/jdk/internal/platform/docker/MetricsMemoryTester.java > TestDockerMemoryMetrics requires docker.support. I think any > test depending on MetricsMemoryTester.java is filtered when running > on platform other than linux. Please verify that. MetricsMemoryTester is only used by TestDockerMemoryMetrics, which has @requires docker.support. FWIW, this hasn't changed with this patch. > test/lib/jdk/test/lib/containers/cgroup/CPUSetsReader.java > 58 try(Stream<String> stream = Files.lines(Paths.get(path))) { > > Nit: space between "try" and "(" is missing. Fixed. > test/jdk/jdk/internal/platform/cgroup/TestCgroupSubsystemController.java > I expect the copyright header be placed at the top of the file > before all imports. Fixed. Thanks, Severin From mandy.chung at oracle.com Tue Feb 18 19:00:58 2020 From: mandy.chung at oracle.com (Mandy Chung) Date: Tue, 18 Feb 2020 11:00:58 -0800 Subject: [PING] RFR: 8231111: Cgroups v2: Rework Metrics in java.base so as to recognize unified hierarchy In-Reply-To: References: <75fc377f8d5ca76b7dac02f55db640cbdd305633.camel@redhat.com> <4bf65380bc26cd3bf684d7994b33e66bcb87927b.camel@redhat.com> <6CACCC0D-7F5A-42A3-83F1-746497940CCA@oracle.com> <29544339574e34c4c25cbec0314c26f35e8d1a99.camel@redhat.com> <4CE7C7F6-ABFA-4263-98B2-32BBD5013A3C@oracle.com> <10b5e83bfb7e618e5f5906c8e707057ff8680785.camel@redhat.com> <4a304e2ce72a53859b4e9cc8b21db404a260b531.camel@redhat.com> <97e7ff2b-f2cc-a666-afb8-521c0f5c37e7@oracle.com> <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com> <964cf134-afe3-5eb3-9990-095e503bc5ad@oracle.com> Message-ID: <3dcc3e0b-af61-0f52-2ac0-1ce529365a2d@oracle.com> On 2/18/20 4:50 AM, Severin Gehwolf wrote: > Hi Mandy, > > Thanks again for the review!
> > Updated webrev: > incremental (only review changes): http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/11/incremental/webrev/ > full: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/11/webrev/ This looks good. I only skimmed on the tests and not reviewed in details (I assume Bob has reviewed them). All new cgroup-specific and metrics implementation classes are now linux-specific classes which is good. > More below. > > >> test/lib/jdk/test/lib/containers/cgroup/MetricsTester.java >> The Oracle copyright is taken out and the copyright is also changed from GPL to GPL+CP. >> The Red Hat copyright can be added to the top of the file immediately before "DO NOT ALTER or REMOVE" line like [1]. >> >> [1] http://hg.openjdk.java.net/jdk/jdk/file/tip/test/hotspot/jtreg/compiler/onSpinWait/TestOnSpinWaitEnableDisable.java > Hmm, old MetricsTester got renamed with this patch to > MetricsTesterCgroupV1. MetricsTesterCgroupV1 still has the old > copyright. The version you've looked at is the common part and > instantiates tester for cgroup v1 or cgroup v2 as required. Thanks for clarifying. I now see that MetricsTester.java is a new file in this patch but the webrev shows as an existing file. > Aside: Not sure why old MetricsTester (or new MetricsTesterCgroupV1) is > GPL (over GPL+CP). > Either way, I've changed license to GPL over GPL+CP for the new test > classes with Red Hat copyright. I skimmed through the copyright header and license text. Looks fine to me. Mandy
From bob.vandette at oracle.com Tue Feb 18 19:34:15 2020 From: bob.vandette at oracle.com (Bob Vandette) Date: Tue, 18 Feb 2020 14:34:15 -0500 Subject: [PING] RFR: 8231111: Cgroups v2: Rework Metrics in java.base so as to recognize unified hierarchy In-Reply-To: <3dcc3e0b-af61-0f52-2ac0-1ce529365a2d@oracle.com> References: <75fc377f8d5ca76b7dac02f55db640cbdd305633.camel@redhat.com> <4bf65380bc26cd3bf684d7994b33e66bcb87927b.camel@redhat.com> <6CACCC0D-7F5A-42A3-83F1-746497940CCA@oracle.com> <29544339574e34c4c25cbec0314c26f35e8d1a99.camel@redhat.com> <4CE7C7F6-ABFA-4263-98B2-32BBD5013A3C@oracle.com> <10b5e83bfb7e618e5f5906c8e707057ff8680785.camel@redhat.com> <4a304e2ce72a53859b4e9cc8b21db404a260b531.camel@redhat.com> <97e7ff2b-f2cc-a666-afb8-521c0f5c37e7@oracle.com> <9af8d61496860692c305b2f5d55e8b0938562ccb.camel@redhat.com> <964cf134-afe3-5eb3-9990-095e503bc5ad@oracle.com> <3dcc3e0b-af61-0f52-2ac0-1ce529365a2d@oracle.com> Message-ID: <9FFD90E3-76B8-430F-B54B-60AF383026C1@oracle.com> > On Feb 18, 2020, at 2:00 PM, Mandy Chung wrote: > > > > On 2/18/20 4:50 AM, Severin Gehwolf wrote: >> Hi Mandy, >> >> Thanks again for the review! >> >> Updated webrev: >> incremental (only review changes): >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/11/incremental/webrev/ >> >> full: >> http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8231111/11/webrev/ > > This looks good. I only skimmed on the tests and not reviewed in details (I assume Bob has reviewed them). Yes, I checked the tests and they look fine. Bob. > > All new cgroup-specific and metrics implementation classes are now linux-specific classes which is good. > >> More below. >> >> >> >>> test/lib/jdk/test/lib/containers/cgroup/MetricsTester.java >>> The Oracle copyright is taken out and the copyright is also changed from GPL to GPL+CP. >>> The Red Hat copyright can be added to the top of the file immediately before "DO NOT ALTER or REMOVE" line like [1].
>>> >>> [1] >>> http://hg.openjdk.java.net/jdk/jdk/file/tip/test/hotspot/jtreg/compiler/onSpinWait/TestOnSpinWaitEnableDisable.java >> Hmm, old MetricsTester got renamed with this patch to >> MetricsTesterCgroupV1. MetricsTesterCgroupV1 still has the old >> copyright. The version you've looked at is the common part and >> instantiates tester for cgroup v1 or cgroup v2 as required. >> > > Thanks for clarifying. I now see that MetricsTester.java is a new file in this patch but the webrev shows as an existing file. > >> Aside: Not sure why old MetricsTester (or new MetricsTesterCgroupV1) is >> GPL (over GPL+CP). >> Either way, I've changed license to GPL over GPL+CP for the new test >> classes with Red Hat copyright. >> > > I skimmed through the copyright header and license text. Looks fine to me. > > Mandy > From serguei.spitsyn at oracle.com Tue Feb 18 20:24:55 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 18 Feb 2020 12:24:55 -0800 Subject: RFR 8232622: Technical debt in BadAttributeValueExpException In-Reply-To: References: Message-ID: <5981d9a8-0da9-5618-c359-42239ff05c06@oracle.com> Hi Roger, It looks good to me. Thanks, Serguei On 2/13/20 7:52 AM, Roger Riggs wrote: > Please review a minor cleanup to remove code long since unnecessary. > The type of the BadAttributeValueExpException argument is String and > if it is not a string in the serialized stream, a suitable replacement > is created. 
> > Issue: https://bugs.openjdk.java.net/browse/JDK-8232622 > > Patch: > > diff --git > a/src/java.management/share/classes/javax/management/BadAttributeValueExpException.java > b/src/java.management/share/classes/javax/management/BadAttributeValueExpException.java > > --- > a/src/java.management/share/classes/javax/management/BadAttributeValueExpException.java > +++ > b/src/java.management/share/classes/javax/management/BadAttributeValueExpException.java > @@ -1,5 +1,5 @@ > /* > - * Copyright (c) 1999, 2019, Oracle and/or its affiliates. All rights > reserved. > + * Copyright (c) 1999, 2020, Oracle and/or its affiliates. All rights > reserved. > * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. > * > * This code is free software; you can redistribute it and/or modify it > @@ -48,7 +48,7 @@ public class BadAttributeValueExpExcepti > * for example, the string value can be the return of {@code > attribute.toString()} > */ > @SuppressWarnings("serial") // See handling in constructor and > readObject > - private Object val; > + private String val; > > /** > * Constructs a BadAttributeValueExpException using the specified > Object to > @@ -72,19 +72,8 @@ public class BadAttributeValueExpExcepti > ObjectInputStream.GetField gf = ois.readFields(); > Object valObj = gf.get("val", null); > > - if (valObj == null) { > - val = null; > - } else if (valObj instanceof String) { > - val= valObj; > - } else if (System.getSecurityManager() == null > - || valObj instanceof Long > - || valObj instanceof Integer > - || valObj instanceof Float > - || valObj instanceof Double > - || valObj instanceof Byte > - || valObj instanceof Short > - || valObj instanceof Boolean) { > - val = valObj.toString(); > +
if (valObj instanceof String || valObj == null) { > + val = (String)valObj; > } else { // the serialized object is from a version without > JDK-8019292 fix > val = System.identityHashCode(valObj) + "@" + > valObj.getClass().getName(); > } > > > Thanks, Roger From serguei.spitsyn at oracle.com Tue Feb 18 20:26:42 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 18 Feb 2020 12:26:42 -0800 Subject: RFR (S) 8239055: Wrong implementation of VMState.hasListener In-Reply-To: <988ac692-444a-4025-99c5-421d4c554f3b@default> References: <988ac692-444a-4025-99c5-421d4c554f3b@default> Message-ID: <874b118b-c3c5-ce16-c62a-0d96fd505f56@oracle.com> Hi Fairoz, Looks good. Thanks, Serguei On 2/13/20 9:13 PM, Fairoz Matte wrote: > > Hi, > > Please review a tiny change to correct the VMState.hasListener > implementation. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8239055 > > Webrev: http://cr.openjdk.java.net/~fmatte/8239055/webrev.00/ > > Thanks, > > Fairoz > From serguei.spitsyn at oracle.com Tue Feb 18 20:53:38 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 18 Feb 2020 12:53:38 -0800 Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops In-Reply-To: <3a5d60fd-04d5-da96-3d79-242d43fdec79@redhat.com> References: <8ea20a15-bdb9-27b0-c306-154f539a3674@oracle.com> <3a5d60fd-04d5-da96-3d79-242d43fdec79@redhat.com> Message-ID: <723567f2-a61b-302f-0853-32de05cb562d@oracle.com> Hi Zhengyu, It looks okay to me. The testing you do looks enough for verification. But I'm not sure about performance testing though.
Thanks, Serguei On 2/17/20 6:51 AM, Zhengyu Gu wrote: > Hi Stefan, > > Thanks for the review and suggestions, updated accordingly: > > http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/ > >> >> --- >> Previously, the calls to 'mark' and 'visited' were inlineable, but >> now every GC has to take a virtual call when marking the objects. My >> guess is that this code is slow anyway, and that it doesn't matter >> too much, but did you measure the effect of that change with, for >> example, G1? >> > I did a rough measurement, timing the > vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test. > > If you know any tests/benchmarks I should measure, please let me know. > > Thanks, > > -Zhengyu > > >> Thanks, >> StefanK >> >>> Test: >>> hotspot_gc >>> vmTestbase_nsk_jdi >>> vmTestbase_nsk_jvmti >>> >>> Thanks, >>> >>> -Zhengyu >>> >>> >> > From serguei.spitsyn at oracle.com Tue Feb 18 20:59:10 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 18 Feb 2020 12:59:10 -0800 Subject: RFR: add parallel heap inspection support for jmap histo(G1)(Internet mail) In-Reply-To: References: <11bca96c0e7745f5b2558cc49b42b996@tencent.com> Message-ID: Hi Lin, Could you, please, re-post your RFR with the right enhancement number in the message subject? It will be more trackable this way. Thanks, Serguei On 2/17/20 10:29 PM, linzang(臧琳) wrote: > Dear David, > Thanks a lot! > I have updated the refined code to http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. > IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. > Maybe we can first use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap.
> > Thanks, > -------------- > Lin >> Hi Lin, >> >> Adding in hotspot-gc-dev as they need to see how this interacts with GC >> worker threads, and whether it needs to be extended beyond G1. >> >> I happened to spot one nit when browsing: >> >> src/hotspot/share/gc/shared/collectedHeap.hpp >> >> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, >> + BoolObjectClosure* filter, >> + size_t* missed_count, >> + size_t thread_num) { >> + return NULL; >> >> s/NULL/false/ >> >> Cheers, >> David >> >> On 18/02/2020 2:15 pm, linzang(臧琳) wrote: >>> Dear All, >>> May I ask your help to review the following changes: >>> webrev: >>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ >>> bug: https://bugs.openjdk.java.net/browse/JDK-8215624 >>> related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 >>> This patch enables parallel heap inspection of G1 for jmap histo. >>> My simple test shows it can speed up jmap -histo by 2x with >>> parallelThreadNum set to 2 for a heap at ~500M on a 4-core platform. >>> >>> ------------------------------------------------------------------------ >>> BRs, >>> Lin > > From zgu at redhat.com Tue Feb 18 21:18:38 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 18 Feb 2020 16:18:38 -0500 Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops In-Reply-To: <723567f2-a61b-302f-0853-32de05cb562d@oracle.com> References: <8ea20a15-bdb9-27b0-c306-154f539a3674@oracle.com> <3a5d60fd-04d5-da96-3d79-242d43fdec79@redhat.com> <723567f2-a61b-302f-0853-32de05cb562d@oracle.com> Message-ID: Hi Serguei, On 2/18/20 3:53 PM, serguei.spitsyn at oracle.com wrote: > Hi Zhengyu, > > It looks okay to me. > The testing you do looks enough for verification. > But I'm not sure about performance testing though. Thanks for reviewing.
I asked my colleagues whether they knew of any benchmarks for the JVMTI heap walk; the answer was 'no'. As Stefan mentioned, this is a slow piece of code, so I doubt there are any benchmarks for it. I would appreciate it if any performance people can chip in. Thanks, -Zhengyu > > Thanks, > Serguei > > > On 2/17/20 6:51 AM, Zhengyu Gu wrote: >> Hi Stefan, >> >> Thanks for the review and suggestions, updated accordingly: >> >> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/ >> >>> >>> --- >>> Previously, the calls to 'mark' and 'visited' were inlineable, but >>> now every GC has to take a virtual call when marking the objects. My >>> guess is that this code is slow anyway, and that it doesn't matter >>> too much, but did you measure the effect of that change with, for >>> example, G1? >>> >> I did a rough measurement, timing the >> vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test. >> >> If you know any tests/benchmarks I should measure, please let me know. >> >> Thanks, >> >> -Zhengyu >> >> >>> Thanks, >>> StefanK >>> >>>> Test: >>>> hotspot_gc >>>> vmTestbase_nsk_jdi >>>> vmTestbase_nsk_jvmti >>>> >>>> Thanks, >>>> >>>> -Zhengyu >>>> >>>> >>> >> > From ioi.lam at oracle.com Wed Feb 19 00:15:55 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 18 Feb 2020 16:15:55 -0800 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> Message-ID: Hi Ralf, We are usually pretty picky about adding new features into the JVM. This seems to be an edge case (where your environment has more RAM than disk).
I think it would be better to handle this outside of the JVM (using a named pipe and an external program such as the parallel gzip "pigz") to limit the maintenance overhead of the JVM. This would also have the benefit that you can do it with almost no local storage -- you can read from the named pipe, optionally compress the data, and send that over the network. Thanks - Ioi On 2/18/20 1:11 PM, serguei.spitsyn at oracle.com wrote: > Hi Ralf, > > Thank you for the explanation. > I'm still not sure this justifies the complexity introduced with this > fix. > But it would be nice to collect more opinions on this. > > On 2/12/20 9:02 AM, Laurence Cable wrote: >> >> >> On 2/12/20 4:17 AM, Schmelter, Ralf wrote: >>> Hi Serguei, >>> >>> the use case is being able to get a heap dump from big Java servers. >>> These usually run on machines with a lot of memory and CPUs, but not >>> much disk space (which they don't need apart from some trace files >>> and the server code itself). And if we can get the customer to mount >>> some NFS file system on the machine, it is usually slow. So writing >>> only a third or fourth of the data is a big win. >>> >>> Doing the compression outside the VM would either depend on the >>> hprof file written first (so we would still need the disk space) or >>> have another channel to dump the data (e.g. via socket). >> or named pipe >>> But this would add complexity too and needs an external program. >> agreed >>> >>> I've compiled 2 release versions on Windows with and without my >>> change. The change adds 14.5k to the server.dll (which is 10.4 MB). >>> Not sure if this is considered acceptable. > > This is for the HotSpot Runtime team to decide if it is acceptable or > not. > So, I've added the hotspot-runtime-dev to the list. > >> but what is the performance impact of this? >>> >>> Best regards, >>> Ralf >>> >>> -----Original Message----- >>> From: serguei.spitsyn at oracle.com >>> Sent: Tuesday, 11
February 2020 20:49 >>> To: Schmelter, Ralf ; Yasumasa Suenaga >>> ; OpenJDK Serviceability >>> >>> Cc: yasuenag at gmail.com >>> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped >>> heap dump >>> >>> Ralf, >>> >>> I see this feature adds a lot of code. In fact, I'm not sure, it is >>> worth to add this kind of complexity (including new compressing >>> threads) >>> into the VM implementation. What is a real use case behind it? Could >>> this compressing be done separately from VM implementation? >>> >>> Thanks, >>> Serguei >> > From larry.cable at oracle.com Wed Feb 19 00:40:57 2020 From: larry.cable at oracle.com (Laurence Cable) Date: Tue, 18 Feb 2020 16:40:57 -0800 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> Message-ID: On 2/18/20 4:15 PM, Ioi Lam wrote: > Hi Ralf, > > We are usually pretty picky about adding new features into the JVM. > This seems to be an edge case (where your environment has more RAM > than disk). I think it would be better to handle this outside of the > JVM (using a named pipe and an external program such as the parallel > gzip "pigz") to limit the maintenance overhead of the JVM. > > This would also have the benefit that you can do it with almost no > local storage -- you can read from the named pipe, optionally compress > the data, and send that over the network. nice solution! > > Thanks > - Ioi > > On 2/18/20 1:11 PM, serguei.spitsyn at oracle.com wrote: >> Hi Ralf, >> >> Thank you for the explanation. >> I'm still not sure this justifies the complexity introduced with this >> fix. >> But it would be nice to collect more opinions on this.
>> >> On 2/12/20 9:02 AM, Laurence Cable wrote: >>> >>> >>> On 2/12/20 4:17 AM, Schmelter, Ralf wrote: >>>> Hi Serguei, >>>> >>>> the use case is being able to get a heap dump from big Java >>>> servers. These usually run on machines with a lot of memory and >>>> CPUs, but not much disk space (which they don't need apart from >>>> some trace files and the server code itself). And if we can get the >>>> customer to mount some NFS file system on the machine, it is >>>> usually slow. So writing only a third or fourth of the data is a big >>>> win. >>>> >>>> Doing the compression outside the VM would either depend on the >>>> hprof file written first (so we would still need the disk space) or >>>> have another channel to dump the data (e.g. via socket). >>> or named pipe >>>> But this would add complexity too and needs an external program. >>> agreed >>>> >>>> I've compiled 2 release versions on Windows with and without my >>>> change. The change adds 14.5k to the server.dll (which is 10.4 MB). >>>> Not sure if this is considered acceptable. >> >> This is for the HotSpot Runtime team to decide if it is acceptable or >> not. >> So, I've added the hotspot-runtime-dev to the list. >> >>> but what is the performance impact of this? >>>> >>>> Best regards, >>>> Ralf >>>> >>>> -----Original Message----- >>>> From: serguei.spitsyn at oracle.com >>>> Sent: Tuesday, 11 February 2020 20:49 >>>> To: Schmelter, Ralf ; Yasumasa Suenaga >>>> ; OpenJDK Serviceability >>>> >>>> Cc: yasuenag at gmail.com >>>> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped >>>> heap dump >>>> >>>> Ralf, >>>> >>>> I see this feature adds a lot of code. In fact, I'm not sure, it is >>>> worth to add this kind of complexity (including new compressing >>>> threads) >>>> into the VM implementation. What is a real use case behind it? Could >>>> this compressing be done separately from VM implementation?
>>>> >>>> Thanks, >>>> Serguei >>> >> > From linzang at tencent.com Wed Feb 19 01:34:40 2020 From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=) Date: Wed, 19 Feb 2020 01:34:40 +0000 Subject: RFR: JDK-8215264 add parallel heap inspection support for jmap histo(G1)(Internet mail) References: <11bca96c0e7745f5b2558cc49b42b996@tencent.com>, , , Message-ID: <7e215dc97a584554b3e854d8801dc256@tencent.com> Re-post this RFR with enhancement number to make it trackable. webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/ bug: https://bugs.openjdk.java.net/browse/JDK-8215624 CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 Thanks! -------------- Lin >Hi Lin, > >Could you, please, re-post your RFR with the right enhancement number in >the message subject? >It will be more trackable this way. > >Thanks, >Serguei > > >On 2/17/20 10:29 PM, linzang(臧琳) wrote: >> Dear David, >> Thanks a lot! >> I have updated the refined code to http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. >> Maybe we can first use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. >> >> Thanks, >> -------------- >> Lin >>> Hi Lin, >>> >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC >>> worker threads, and whether it needs to be extended beyond G1. >>> >>> I happened to spot one nit when browsing: >>> >>> src/hotspot/share/gc/shared/collectedHeap.hpp >>> >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, >>> + BoolObjectClosure* filter, >>> + size_t* missed_count, >>> +
size_t thread_num) { >>> + return NULL; >>> >>> s/NULL/false/ >>> >>> Cheers, >>> David >>> >>> On 18/02/2020 2:15 pm, linzang(臧琳) wrote: >>>> Dear All, >>>> May I ask your help to review the following changes: >>>> webrev: >>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8215624 >>>> related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 >>>> This patch enables parallel heap inspection of G1 for jmap histo. >>>> My simple test showed that it can speed up jmap -histo by 2x with >>>> parallelThreadNum set to 2, for a heap of ~500M on a 4-core platform. >>>> >>>> ------------------------------------------------------------------------ >>>> BRs, >>>> Lin >> > > From linzang at tencent.com Wed Feb 19 01:38:31 2020 From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=) Date: Wed, 19 Feb 2020 01:38:31 +0000 Subject: RFR: JDK-8215264 add parallel heap inspection support for jmap histo(G1)(Internet mail) References: <11bca96c0e7745f5b2558cc49b42b996@tencent.com>, , , , <7e215dc97a584554b3e854d8801dc256@tencent.com> Message-ID: So sorry, the number in this title is wrong. Please ignore it! Sorry about making this mistake. I will re-post with the correct number. -------------- Lin >Re-posting this RFR with the enhancement number to make it trackable. >webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/ >bug: https://bugs.openjdk.java.net/browse/JDK-8215624 >CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 > >Thanks! >-------------- >Lin >>Hi Lin, >> >>Could you, please, re-post your RFR with the right enhancement number in >>the message subject? >>It will be more trackable this way. >> >>Thanks, >>Serguei >> >> >>On 2/17/20 10:29 PM, linzang(臧琳) wrote: >>> Dear David, >>> Thanks a lot! >>> I have updated the refined code at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. >>>
IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. >>> Maybe we can first use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. >>> >>> Thanks, >>> -------------- >>> Lin >>>> Hi Lin, >>>> >>>> Adding in hotspot-gc-dev as they need to see how this interacts with GC >>>> worker threads, and whether it needs to be extended beyond G1. >>>> >>>> I happened to spot one nit when browsing: >>>> >>>> src/hotspot/share/gc/shared/collectedHeap.hpp >>>> >>>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, >>>> + BoolObjectClosure* filter, >>>> + size_t* missed_count, >>>> + size_t thread_num) { >>>> + return NULL; >>>> >>>> s/NULL/false/ >>>> >>>> Cheers, >>>> David >>>> >>>> On 18/02/2020 2:15 pm, linzang(臧琳) wrote: >>>>> Dear All, >>>>> May I ask your help to review the following changes: >>>>> webrev: >>>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ >>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8215624 >>>>> related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 >>>>> This patch enables parallel heap inspection of G1 for jmap histo. >>>>> My simple test showed that it can speed up jmap -histo by 2x with >>>>> parallelThreadNum set to 2, for a heap of ~500M on a 4-core platform.
>>>>> >>>>> ------------------------------------------------------------------------ >>>>> BRs, >>>>> Lin >>> > >> From linzang at tencent.com Wed Feb 19 01:40:34 2020 From: linzang at tencent.com (=?utf-8?B?bGluemFuZyjoh6fnkLMp?=) Date: Wed, 19 Feb 2020 01:40:34 +0000 Subject: RFR: JDK-8215624 add parallel heap inspection support for jmap histo(G1)(Internet mail) References: <11bca96c0e7745f5b2558cc49b42b996@tencent.com>, , , Message-ID: Re-posting this RFR with the correct enhancement number to make it trackable. Please ignore the previous wrong post. Sorry for the trouble. webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/ bug: https://bugs.openjdk.java.net/browse/JDK-8215624 CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 -------------- Lin >Hi Lin, > >Could you, please, re-post your RFR with the right enhancement number in >the message subject? >It will be more trackable this way. > >Thanks, >Serguei > > >On 2/17/20 10:29 PM, linzang(臧琳) wrote: >> Dear David, >> Thanks a lot! >> I have updated the refined code at http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/. >> IMHO the parallel heap inspection can be extended to all kinds of heap as long as the heap layout can support parallel iteration. >> Maybe we can first use this webrev to discuss how to implement it, because I am not sure my current implementation is an appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds of heap. >> >> Thanks, >> -------------- >> Lin >>> Hi Lin, >>> >>> Adding in hotspot-gc-dev as they need to see how this interacts with GC >>> worker threads, and whether it needs to be extended beyond G1. >>> >>> I happened to spot one nit when browsing: >>> >>> src/hotspot/share/gc/shared/collectedHeap.hpp >>> >>> + virtual bool run_par_heap_inspect_task(KlassInfoTable* cit, >>> +
BoolObjectClosure* filter, >>> + size_t* missed_count, >>> + size_t thread_num) { >>> + return NULL; >>> >>> s/NULL/false/ >>> >>> Cheers, >>> David >>> >>> On 18/02/2020 2:15 pm, linzang(臧琳) wrote: >>>> Dear All, >>>> May I ask your help to review the following changes: >>>> webrev: >>>> http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/ >>>> bug: https://bugs.openjdk.java.net/browse/JDK-8215624 >>>> related CSR: https://bugs.openjdk.java.net/browse/JDK-8239290 >>>> This patch enables parallel heap inspection of G1 for jmap histo. >>>> My simple test showed that it can speed up jmap -histo by 2x with >>>> parallelThreadNum set to 2, for a heap of ~500M on a 4-core platform. >>>> >>>> ------------------------------------------------------------------------ >>>> BRs, >>>> Lin >> > > From fairoz.matte at oracle.com Wed Feb 19 04:11:13 2020 From: fairoz.matte at oracle.com (Fairoz Matte) Date: Tue, 18 Feb 2020 20:11:13 -0800 (PST) Subject: RFR (S) 8239055: Wrong implementation of VMState.hasListener In-Reply-To: <874b118b-c3c5-ce16-c62a-0d96fd505f56@oracle.com> References: <988ac692-444a-4025-99c5-421d4c554f3b@default> <874b118b-c3c5-ce16-c62a-0d96fd505f56@oracle.com> Message-ID: Hi Serguei, Thanks for the review. Thanks, Fairoz From: Serguei Spitsyn Sent: Wednesday, February 19, 2020 1:57 AM To: Fairoz Matte ; serviceability-dev at openjdk.java.net Subject: Re: RFR (S) 8239055: Wrong implementation of VMState.hasListener Hi Fairoz, Looks good. Thanks, Serguei On 2/13/20 9:13 PM, Fairoz Matte wrote: Hi, Please review a tiny change to correct the VMState.hasListener implementation. JBS: https://bugs.openjdk.java.net/browse/JDK-8239055 Webrev: http://cr.openjdk.java.net/~fmatte/8239055/webrev.00/ Thanks, Fairoz
From ralf.schmelter at sap.com Wed Feb 19 12:30:39 2020 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Wed, 19 Feb 2020 12:30:39 +0000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <1f2c938f-f9ce-7fb7-7169-94daaf3542a4@oracle.com> Message-ID: Hi David, > If the lifecycle of your new NonJavaThread does not fit the existing > NonJavaThreads then yes you will need to override pre_run() and > post_run(), but you shouldn't just delete all the NJT iteration support > - at least it isn't obvious to me that it is valid to do so. The only thing I want to do is to be able to delete the thread after it has finished. I first thought I could copy the JfrThreadSampler thread, which deletes itself in the post_run() method. But it turned out that the run() method of this thread never stops, so the post_run() method is never called. And when I implement the same post_run() method, I hit asserts in the debug build when calling the destructor, since os::free_thread() asserts using Thread::current(), which does not work since the post_run() method of NonJavaThread already wiped out the thread local storage. So it seems to me now that deleting a non-Java thread is currently not supported and nobody does it. But the code itself and the comments in Thread::call_run() seem to indicate that it should work. So maybe it's just as simple as adjusting the asserts in os::free_thread(). > That begs the question for me exactly what it is that your new NJT > worker thread will touch in the VM because that will determine where it > needs to fit in the Thread hierarchy and what actions it needs to > perform in pre_run() and post_run(). I'm unclear what state the VM will > be in when this heap dump is performed and these worker threads are > doing the compression. Currently the worker threads only use the Monitor from the VM.
Otherwise they just compress the buffers and write them to the file using the os interface. The heap dump itself is performed in a VM operation. The worker threads don't access objects, classes or other 'VM objects'. Best regards, Ralf -----Original Message----- From: David Holmes Sent: Thursday, 13 February 2020 05:28 To: Schmelter, Ralf ; Yasumasa Suenaga ; OpenJDK Serviceability Cc: yasuenag at gmail.com Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump Hi Ralf, On 12/02/2020 6:41 pm, Schmelter, Ralf wrote: > Hi David, > >> I see little point in subclassing NonJavaThread (via NamedThread) but >> then overriding pre_run() and post_run() so that you don't do anything >> that NonJavaThread is supposed to do regarding the NJT iterator >> capabilities. > > The problem is the post_run() method of NamedThread calls Thread::clear_thread_current(), which then makes it impossible to delete the thread at least in a debug build, since the code in ~Thread calls os::free_thread() which calls Thread::current()->osthread() in an assert, which obviously will crash. If the lifecycle of your new NonJavaThread does not fit the existing NonJavaThreads then yes you will need to override pre_run() and post_run(), but you shouldn't just delete all the NJT iteration support - at least it isn't obvious to me that it is valid to do so. > Originally I tried not to use my own threads at all and instead use the WorkGang from CollectedHeap::get_safepoint_workers(). But this ultimately failed because I'm not allowed to iterate the heap in a worker thread on Shenandoah. Additionally ParallelGC did not implement get_safepoint_workers(), but that should not have been a problem. That begs the question for me exactly what it is that your new NJT worker thread will touch in the VM because that will determine where it needs to fit in the Thread hierarchy and what actions it needs to perform in pre_run() and post_run().
I'm unclear what state the VM will be in when this heap dump is performed and these worker threads are doing the compression. Thanks, David > Maybe it is better to try to get this to work (e.g. if I could specify a foreground task when calling run_task(), the problem could be avoided by doing the iteration in the foreground task). But I'm not sure how changes in this area are seen. > >> For your monitor operations, you should use a MonitorLocker and then >> call ml->wait() which will do the right thing with respect to "no >> safepoint checks" without you needing to specify it directly. > > Thanks, will do. > > Best regards, > Ralf From chiroito107 at gmail.com Wed Feb 19 13:36:23 2020 From: chiroito107 at gmail.com (Chihiro Ito) Date: Wed, 19 Feb 2020 22:36:23 +0900 Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows Message-ID: Hi, Could you review this tiny fix, please? This problem affects not only paths on Windows, but also Linux and URLs using ":". Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.00/ JBS : https://bugs.openjdk.java.net/browse/JDK-8222489 Regards, Chihiro -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthias.baesken at sap.com Wed Feb 19 14:21:55 2020 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Wed, 19 Feb 2020 14:21:55 +0000 Subject: RFR [XS]: 8239462: jdk.hotspot.agent misses some ReleaseStringUTFChars calls in case of early returns Message-ID: Hello, please review this small change. ReleaseStringUTFChars calls are missing in a few places in the native jdk.hotspot.agent code. This happens in case of early returns. Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8239462 http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.0/ Thanks, Matthias -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ralf.schmelter at sap.com Wed Feb 19 15:24:10 2020 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Wed, 19 Feb 2020 15:24:10 +0000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> Message-ID: Hi Ioi, > This seems to be an edge case (where your environment has more > RAM than disk) I would not say it's an edge case. Especially in a cloud environment, your container does not need much free disk space, since the data is stored in a database and logging goes to stdout. > I think it would be better to handle this outside of the JVM > (using a named pipe and an external program such as the parallel gzip > "pigz") to limit the maintenance overhead of the JVM. But then you would have to implement writing the heap dump to a named pipe (and not only on Unix, but on Windows too). And you would still want to do the writing in background threads, so most of the code would stay. You need something like netcat on Windows. And it doesn't cover writing a heap dump on OOM via the VM flag. And you should compress the hprof file in a specific way, since it will make it much faster to randomly access the gzipped hprof file directly. Note that I think it is a good idea to be able to write the dump to a non-file destination. But removing the compression will not save much code and will make the handling messier. Best regards, Ralf -----Original Message----- From: Ioi Lam Sent: Wednesday, 19 February 2020 01:16
To: serguei.spitsyn at oracle.com; Schmelter, Ralf ; hotspot-runtime-dev at openjdk.java.net runtime Cc: Laurence Cable ; serviceability-dev at openjdk.java.net Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump Hi Ralf, We are usually pretty picky about adding new features into the JVM. This seems to be an edge case (where your environment has more RAM than disk). I think it would be better to handle this outside of the JVM (using a named pipe and an external program such as the parallel gzip "pigz") to limit the maintenance overhead of the JVM. This would also have the benefit that you can do it with almost no local storage -- you can read from the named pipe, optionally compress the data, and send that over the network. Thanks - Ioi From ioi.lam at oracle.com Wed Feb 19 17:40:04 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 19 Feb 2020 09:40:04 -0800 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> Message-ID: On 2/19/20 7:24 AM, Schmelter, Ralf wrote: > Hi Ioi, > >> This seems to be an edge case (where your environment has more >> RAM than disk) > I would not say it's an edge case. Especially in a cloud environment, your container does not need much free disk space, since the data is stored in a database and logging goes to stdout. > >> I think it would be better to handle this outside of the JVM >> (using a named pipe and an external program such as the parallel gzip >> "pigz") to limit the maintenance overhead of the JVM. > But then you would have to implement writing the heap dump to a named pipe (and not only on Unix, but on Windows too).
And you would still want to do the writing in background threads, so most of the code would stay. You need something like netcat on Windows. And it doesn't cover writing a heap dump on OOM via the VM flag. > > And you should compress the hprof file in a specific way, since it will make it much faster to randomly access the gzipped hprof file directly. > > Note that I think it is a good idea to be able to write the dump to a non-file destination. But removing the compression will not save much code and will make the handling messier. I was thinking of doing something like this: $ mkfifo /tmp/pipe $ cat /tmp/pipe | gzip -c - > /tmp/zipped & $ jcmd $PID GC.heap_dump filename=/tmp/pipe You can replace the "> /tmp/zipped" part with a program that reads from stdin and sends it over the network. I tried the above with a recent JDK build (with your changes in JDK-8234510: Remove file seeking requirement for writing a heap dump), but it doesn't seem to work, probably because we need to change this code a little bit http://hg.openjdk.java.net/jdk/jdk/file/7ef41e83066b/src/hotspot/share/services/heapDumper.cpp#l465 DumpWriter::DumpWriter(const char* path) : _fd(-1), _bytes_written(0), _pos(0), _in_dump_segment(false), _error(NULL) { ... _fd = os::create_binary_file(path, false); // don't replace existing file <<< I also saw a post saying that the JVM can write to named pipes on Windows: https://stackoverflow.com/questions/634564/how-to-open-a-windows-named-pipe-from-java There's no built-in mkfifo command on Windows, but the above link points to a .NET example that creates a named pipe and uses that to communicate with the JVM. I don't know whether this will be a better solution than your proposed changes, but I think it should be explored as a possible alternative. It does seem to require a little work to get your whole data collection system working, but it also seems more flexible and extensible.
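For illustration only, the reader side of such a pipe (the program that would take the place of the gzip step above) could be sketched in Java roughly as follows. The class name PipeGzip and its helper methods are hypothetical and appear in no webrev; only the java.util.zip stream classes are real API. This is a sketch under those assumptions, not a proposal for the actual fix.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of the JDK or of any webrev.
public class PipeGzip {

    // Copies all bytes from 'in' to 'out' using a fixed-size buffer.
    private static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[64 * 1024];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    // Reads 'in' to the end and writes a gzip stream to 'out' (this closes 'out').
    static void compress(InputStream in, OutputStream out) {
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            copy(in, gz);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience wrapper: gzip an in-memory byte array.
    static byte[] gzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        compress(new ByteArrayInputStream(data), out);
        return out.toByteArray();
    }

    // Inverse of gzip(), used here only to check the round trip.
    static byte[] gunzip(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(data))) {
            copy(gz, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Intended use: stdin is the named pipe, stdout is the compressed result.
        compress(System.in, System.out);
    }
}
```

With the paths from the commands above, this would be run as "java PipeGzip < /tmp/pipe > /tmp/zipped &" before issuing the jcmd; writing to a socket instead of stdout would follow the same pattern. Whether the JVM side can actually write to the pipe is the open question discussed in this thread.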
Thanks - Ioi > > Best regards, > Ralf > > > -----Original Message----- > From: Ioi Lam > Sent: Wednesday, 19 February 2020 01:16 > To: serguei.spitsyn at oracle.com; Schmelter, Ralf ; hotspot-runtime-dev at openjdk.java.net runtime > Cc: Laurence Cable ; serviceability-dev at openjdk.java.net > Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump > > Hi Ralf, > > We are usually pretty picky about adding new features into the JVM. This > seems to be an edge case (where your environment has more RAM than > disk). I think it would be better to handle this outside of the JVM > (using a named pipe and an external program such as the parallel gzip > "pigz") to limit the maintenance overhead of the JVM. > > This would also have the benefit that you can do it with almost no local > storage -- you can read from the named pipe, optionally compress the > data, and send that over the network. > > Thanks > - Ioi From larry.cable at oracle.com Wed Feb 19 17:43:48 2020 From: larry.cable at oracle.com (Laurence Cable) Date: Wed, 19 Feb 2020 09:43:48 -0800 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> Message-ID: <14b7e075-55b8-f1c5-1b6d-94f8c23cb18b@oracle.com> You probably need the named pipe name to be unique "per" dump... s/pipe/$PID/ - Larry On 2/19/20 9:40 AM, Ioi Lam wrote: > > > On 2/19/20 7:24 AM, Schmelter, Ralf wrote: >> Hi Ioi, >> >>> This seems to be an edge case (where your environment has more >>> RAM than disk) >> I would not say it's an edge case. Especially in a cloud environment, >> your container does not need much free disk space, since the data is >> stored in a database and logging goes to stdout.
>> >>> I think it would be better to handle this outside of the JVM >>> (using a named pipe and an external program such as the parallel gzip >>> "pigz") to limit the maintenance overhead of the JVM. >> But then you would have to implement writing the heap dump to a named >> pipe (and not only on Unix, but on Windows too). And you would still >> want to do the writing in background threads, so most of the code >> would stay. You need something like netcat on Windows. And it doesn't >> cover writing a heap dump on OOM via the VM flag. >> >> And you should compress the hprof file in a specific way, since it >> will make it much faster to randomly access the gzipped hprof file >> directly. >> >> Note that I think it is a good idea to be able to write the dump to >> a non-file destination. But removing the compression will not save much >> code and will make the handling messier. > > I was thinking of doing something like this: > > $ mkfifo /tmp/pipe > $ cat /tmp/pipe | gzip -c - > /tmp/zipped & > $ jcmd $PID GC.heap_dump filename=/tmp/pipe > > You can replace the "> /tmp/zipped" part with a program that reads > from stdin and sends it over the network. > > I tried the above with a recent JDK build (with your changes in > JDK-8234510: Remove file seeking requirement for writing a heap dump), > but it doesn't seem to work, probably because we need to change this > code a little bit > > http://hg.openjdk.java.net/jdk/jdk/file/7ef41e83066b/src/hotspot/share/services/heapDumper.cpp#l465 > > > DumpWriter::DumpWriter(const char* path) : _fd(-1), _bytes_written(0), > _pos(0), > _in_dump_segment(false), _error(NULL) { > ... > _fd = os::create_binary_file(path, false);
// don't replace > existing file <<< > > I also saw a post saying that the JVM can write to named pipes on > Windows: > > https://stackoverflow.com/questions/634564/how-to-open-a-windows-named-pipe-from-java > > > There's no built-in mkfifo command on Windows, but the above link > points to a .NET example that creates a named pipe and uses that to > communicate with the JVM. > > I don't know whether this will be a better solution than your proposed > changes, but I think it should be explored as a possible alternative. > It does seem to require a little work to get your whole data > collection system working, but it also seems more flexible and > extensible. > > Thanks > - Ioi > > >> >> Best regards, >> Ralf >> >> >> -----Original Message----- >> From: Ioi Lam >> Sent: Wednesday, 19 February 2020 01:16 >> To: serguei.spitsyn at oracle.com; Schmelter, Ralf >> ; hotspot-runtime-dev at openjdk.java.net >> runtime >> Cc: Laurence Cable ; >> serviceability-dev at openjdk.java.net >> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped >> heap dump >> >> Hi Ralf, >> >> We are usually pretty picky about adding new features into the JVM. This >> seems to be an edge case (where your environment has more RAM than >> disk). I think it would be better to handle this outside of the JVM >> (using a named pipe and an external program such as the parallel gzip >> "pigz") to limit the maintenance overhead of the JVM. >> >> This would also have the benefit that you can do it with almost no local >> storage -- you can read from the named pipe, optionally compress the >> data, and send that over the network. >> >> Thanks >> - Ioi > From tprintezis at twitter.com Wed Feb 19 18:22:35 2020 From: tprintezis at twitter.com (Tony Printezis) Date: Wed, 19 Feb 2020 10:22:35 -0800 Subject: SEGV in EdgeUtils::field_name_symbol(Edge const&) Message-ID: Hi, (Is this the right mailing list for this?)
I've been looking at a SEGV in EdgeUtils::field_name_symbol(Edge const&) that we have been seeing in our nightly testing when running jdk/jfr/jcmd/TestJcmdDump.java. I can reproduce it using Graal and Parallel GC (CMS also) on Linux with our 11 release, as well as OpenJDK 11u, 12, 13, and 14. The culprit seems to be this method: static const InstanceKlass* field_type(const StoredEdge& edge) { assert(!edge.is_root() || !EdgeUtils::is_array_element(edge), "invariant"); return (const InstanceKlass*)edge.reference_owner_klass(); } In fact, edge.reference_owner_klass()->is_instance_klass() == false, as the class here seems to be an object array class (I've seen [Ljava.lang.Class; and [Ljava.lang.Enum;). Is this a known issue? I'm not familiar with this code. Should field_name_symbol() return NULL in this case? Thanks, Tony Tony Printezis | @TonyPrintezis | tprintezis at twitter.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From tprintezis at twitter.com Wed Feb 19 18:33:29 2020 From: tprintezis at twitter.com (Tony Printezis) Date: Wed, 19 Feb 2020 10:33:29 -0800 Subject: SEGV in EdgeUtils::field_name_symbol(Edge const&) In-Reply-To: References: Message-ID: FWIW, this is the stack trace when the crash happens: EdgeUtils::field_name_symbol(Edge const&) ObjectSampleWriter::write(StoredEdge const*) ObjectSampleWriter::operator()(StoredEdge&) ObjectSampleCheckpoint::write(ObjectSampler*, EdgeStore*, bool, Thread*) EventEmitter::write_events(ObjectSampler*, EdgeStore*, bool) PathToGcRootsOperation::doit() VM_Operation::evaluate() VMThread::evaluate_operation(VM_Operation*) VMThread::loop() VMThread::run() Tony Printezis | @TonyPrintezis | tprintezis at twitter.com On February 19, 2020 at 1:22:35 PM, Tony Printezis (tprintezis at twitter.com) wrote: Hi, (Is this the right mailing list for this?)
I've been looking at a SEGV in EdgeUtils::field_name_symbol(Edge const&) that we have been seeing in our nightly testing when running jdk/jfr/jcmd/TestJcmdDump.java. I can reproduce it using Graal and Parallel GC (CMS also) on Linux with our 11 release, as well as OpenJDK 11u, 12, 13, and 14. The culprit seems to be this method: static const InstanceKlass* field_type(const StoredEdge& edge) { assert(!edge.is_root() || !EdgeUtils::is_array_element(edge), "invariant"); return (const InstanceKlass*)edge.reference_owner_klass(); } In fact, edge.reference_owner_klass()->is_instance_klass() == false, as the class here seems to be an object array class (I've seen [Ljava.lang.Class; and [Ljava.lang.Enum;). Is this a known issue? I'm not familiar with this code. Should field_name_symbol() return NULL in this case? Thanks, Tony Tony Printezis | @TonyPrintezis | tprintezis at twitter.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.daugherty at oracle.com Wed Feb 19 19:52:28 2020 From: daniel.daugherty at oracle.com (Daniel D. Daugherty) Date: Wed, 19 Feb 2020 14:52:28 -0500 Subject: SEGV in EdgeUtils::field_name_symbol(Edge const&) In-Reply-To: References: Message-ID: <2644c4a0-d449-6538-71a6-7b149df43ab2@oracle.com> Hi Tony! Thanks for filing: JDK-8239497 SEGV in EdgeUtils::field_name_symbol(Edge const&) https://bugs.openjdk.java.net/browse/JDK-8239497 I've added hotspot-jfr-dev at ... alias to this thread, but the JFR folks usually lurk on the Serviceability alias also.
Dan On 2/19/20 1:33 PM, Tony Printezis wrote: > FWIW, this is the stack trace when the crash happens: > > EdgeUtils::field_name_symbol(Edge const&) > ObjectSampleWriter::write(StoredEdge const*) > ObjectSampleWriter::operator()(StoredEdge&) > ObjectSampleCheckpoint::write(ObjectSampler*, EdgeStore*, bool, Thread*) > EventEmitter::write_events(ObjectSampler*, EdgeStore*, bool) > PathToGcRootsOperation::doit() > VM_Operation::evaluate() > VMThread::evaluate_operation(VM_Operation*) > VMThread::loop() > VMThread::run() > > > Tony Printezis | @TonyPrintezis | tprintezis at twitter.com > > > > On February 19, 2020 at 1:22:35 PM, Tony Printezis > (tprintezis at twitter.com ) wrote: > >> Hi, >> >> (Is this the right mailing list for this?) >> >> I've been looking at a SEGV in EdgeUtils::field_name_symbol(Edge >> const&) that we have been seeing in our nightly testing when running >> jdk/jfr/jcmd/TestJcmdDump.java. I can reproduce it using Graal and >> Parallel GC (CMS also) on Linux with our 11 release, as well as >> OpenJDK 11u, 12, 13, and 14. >> >> The culprit seems to be this method: >> >> static const InstanceKlass* field_type(const StoredEdge& edge) { >> assert(!edge.is_root() || !EdgeUtils::is_array_element(edge), >> "invariant"); >> return (const InstanceKlass*)edge.reference_owner_klass(); >> } >> >> In fact, edge.reference_owner_klass()->is_instance_klass() == false, >> as the class here seems to be an object array class (I've seen >> [Ljava.lang.Class; and [Ljava.lang.Enum;). >> >> Is this a known issue? I'm not familiar with this code. Should >> field_name_symbol() return NULL in this case? >> >> Thanks, >> >> Tony >> >> >> Tony Printezis | @TonyPrintezis | tprintezis at twitter.com >> >> -------------- next part -------------- An HTML attachment was scrubbed...
URL: From christoph.langer at sap.com Wed Feb 19 21:20:19 2020 From: christoph.langer at sap.com (Langer, Christoph) Date: Wed, 19 Feb 2020 21:20:19 +0000 Subject: RFR [XS]: 8239462: jdk.hotspot.agent misses some ReleaseStringUTFChars calls in case of early returns In-Reply-To: References: Message-ID: Hi Matthias, I think this is good. In line 1187 of src/jdk.hotspot.agent/solaris/native/libsaproc/saproc.cpp, there's a space missing after the if. I guess you should insert it before pushing. Cheers Christoph From: serviceability-dev On Behalf Of Baesken, Matthias Sent: Wednesday, 19 February 2020 15:22 To: 'hotspot-dev at openjdk.java.net' Cc: serviceability-dev at openjdk.java.net Subject: [CAUTION] RFR [XS]: 8239462: jdk.hotspot.agent misses some ReleaseStringUTFChars calls in case of early returns Hello, please review this small change. ReleaseStringUTFChars calls are missing in a few places in the native jdk.hotspot.agent code. This happens in case of early returns. Bug/webrev : https://bugs.openjdk.java.net/browse/JDK-8239462 http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.0/ Thanks, Matthias -------------- next part -------------- An HTML attachment was scrubbed... URL: From suenaga at oss.nttdata.com Wed Feb 19 23:59:24 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 20 Feb 2020 08:59:24 +0900 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> Message-ID: Hi, Generally I agree with Ioi, but I think this is not a problem only for gzipped heap dumps. For example, Compiler.codelist and Compiler.CodeHeap_Analytics might produce large text output. In addition, some users want to redirect the result from jcmd to another command or a log collector.
So I think it would be better if jcmd provided a stdout redirect option for all subcommands. E.g. $ jcmd GC.heap_dump -stdout | gzip -c - > heapdump.hprof.gz Thanks, Yasumasa On 2020/02/20 2:40, Ioi Lam wrote: > > > On 2/19/20 7:24 AM, Schmelter, Ralf wrote: >> Hi Ioi, >> >>> This seems to be an edge case (where your environment has more >>> RAM than disk) >> I would not say it's an edge case. Especially in a cloud environment, your container does not need much free disk space, since the data is stored in a database and logging goes to stdout. >> >>> I think it would be better to handle this outside of the JVM >>> (using a named pipe and an external program such as the parallel gzip >>> "pigz") to limit the maintenance overhead of the JVM. >> But then you would have to implement writing the heap dump to a named pipe (and not only on Unix, but on Windows too). And you would still want to do the writing in background threads, so most of the code would stay. You need something like netcat on Windows. And it doesn't cover writing a heap dump on OOM via the VM flag. >> >> And you should compress the hprof file in a specific way, since it will make it much faster to randomly access the gzipped hprof file directly. >> >> Note that I think it is a good idea to be able to write the dump to a non-file destination. But removing the compression will not save much code and will make the handling messier. > > I was thinking of doing something like this: > > $ mkfifo /tmp/pipe > $ cat /tmp/pipe | gzip -c - > /tmp/zipped & > $ jcmd $PID GC.heap_dump filename=/tmp/pipe > > You can replace the "> /tmp/zipped" part with a program that reads from stdin and sends it over the network.
> > I tried the above with a recent JDK build (with your changes in JDK-8234510: Remove file seeking requirement for writing a heap dump), but it doesn't seem to work, probably because we need to change this code a little bit > > http://hg.openjdk.java.net/jdk/jdk/file/7ef41e83066b/src/hotspot/share/services/heapDumper.cpp#l465 > > DumpWriter::DumpWriter(const char* path) : _fd(-1), _bytes_written(0), _pos(0), > ?????????????????????????????????????????? _in_dump_segment(false), _error(NULL) { > ??? ... > ??? _fd = os::create_binary_file(path, false);??? // don't replace existing file <<< > > I also saw a post saying that the JVM can write to named pipes on Windows: > > https://stackoverflow.com/questions/634564/how-to-open-a-windows-named-pipe-from-java > > There's no built-in mkfifo command on Windows, but the above link points to a .NET example that creates a named pipe and uses that to communicate with the JVM. > > I don't know whether this will be a better solution than your proposed changes, but I think it should be explored as a possible alternative. It does seem to require a little work to get your whole data collection system working, but it also seems more flexible and extensible. > > Thanks > - Ioi > > >> >> Best regards, >> Ralf >> >> >> -----Original Message----- >> From: Ioi Lam >> Sent: Mittwoch, 19. Februar 2020 01:16 >> To: serguei.spitsyn at oracle.com; Schmelter, Ralf ; hotspot-runtime-dev at openjdk.java.net runtime >> Cc: Laurence Cable ; serviceability-dev at openjdk.java.net >> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump >> >> Hi Ralf, >> >> We are usually pretty picky about adding new features into the JVM. This >> seems to be an edge case (where your environment has more RAM than >> disk). I think it would be better to handle this outside of the JVM >> (using a named pipe and and external program such as the parallel gzip >> "pigz") to limit the maintenance overhead of the JVM. 
>> >> This would also have the benefit that you can do it with almost no local >> storage -- you can read from the named pipe, optionally compress the >> data, and send that over the network. >> >> Thanks >> - Ioi > From alexey.menkov at oracle.com Thu Feb 20 00:16:23 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 19 Feb 2020 16:16:23 -0800 Subject: RFR [XS]: 8239462: jdk.hotspot.agent misses some ReleaseStringUTFChars calls in case of early returns In-Reply-To: References: Message-ID: <6ed0d004-3e8c-4b27-f583-06dbacf45173@oracle.com> Looks like src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.cpp has similar issues. It would be nice to fix them as well. --alex On 02/19/2020 06:21, Baesken, Matthias wrote: > Hello, please review this small change . > We miss at a few places ReleaseStringUTFChars calls in the native > jdk.hotspot.agent coding. > This happens in case of early returns . > > > Bug/webrev : > > https://bugs.openjdk.java.net/browse/JDK-8239462 > > http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.0/ > > > Thanks, Matthias > From suenaga at oss.nttdata.com Thu Feb 20 00:34:20 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 20 Feb 2020 09:34:20 +0900 Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows In-Reply-To: References: Message-ID: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> Hi Chihiro, I think this problem is caused by spec of `Properties::store(Writer)`. `Properties::store(OutputStream)` says that the output format is as same as `store(Writer)` [1]. `Properties::store(Writer)` says that `#`, `!`, `=`, `:` are written with a preceding backslash [2]. So I think we should not use `Properties::store` to serialize properties. 
Thanks, Yasumasa [1] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.OutputStream,java.lang.String) [2] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.Writer,java.lang.String) On 2020/02/19 22:36, Chihiro Ito wrote: > Hi, > > Could you review this tiny fix, please? > > This problem affected not the only path on Windows, but also Linux and URLs using ":". > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.00/ > JBS : https://bugs.openjdk.java.net/browse/JDK-8222489 > > Regards, > Chihiro From ioi.lam at oracle.com Thu Feb 20 01:45:06 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Wed, 19 Feb 2020 17:45:06 -0800 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> Message-ID: <80b25fee-21c4-b99a-34d0-6cf0da06b52c@oracle.com> I like this proposal. It's simple, easily extensible and should work on all platforms. Thanks - Ioi On 2/19/20 3:59 PM, Yasumasa Suenaga wrote: > Hi, > > Generally I agree with Ioi, but I think it is not a problem only for > gzipped heap dump. > > For example, Compiler.codelist and Compiler.CodeHeap_Analytics might > be large text. > In addition, some users want to redirect the result from jcmd to other > command or log collector. > > So I think it would be better if jcmd provides stdout redurect option > to all subocmmands. E.g. > > ? 
$ jcmd GC.heap_dump -stdout | gzip -c - > heapdump.hprof.gz > > > Thanks, > > Yasumasa > > > On 2020/02/20 2:40, Ioi Lam wrote: >> >> >> On 2/19/20 7:24 AM, Schmelter, Ralf wrote: >>> Hi Ioi, >>> >>>> This seems to be an edge case (where your environment has more >>>> RAM than disk) >>> I would not say it's an edge case. Especially in a cloud >>> environment, your container does not need much free diskspace, since >>> the data is stored in a database and logging goes to stdout. >>> >>>> I think it would be better to handle this outside of the JVM >>>> (using a named pipe and and external program such as the parallel gzip >>>> "pigz") to limit the maintenance overhead of the JVM. >>> But then you would have to implement writing the heap dump to a >>> named pipe (and not only on Unix, but on Windows too). And you would >>> still want to do the writing in background threads, so most of the >>> code would stay. You need something like netcat on Windows. And it >>> doesn't cover writing a heap dump on OOM via the VM flag. >>> >>> And you should to compress the hprof file in a specific way, since >>> it will make it much faster to random access the gzipped hprof file >>> directly. >>> >>> Note that I think it is a good idea to be able to write the dump to >>> non-file destination. But removing the compression will not save >>> much code and will make the handling messier. >> >> I was thinking of doing something like this: >> >> $ mkfifo /tmp/pipe >> $ cat /tmp/pipe | gzip -c - > /tmp/zipped & >> $ jcmd $PID GC.heap_dump filename=/tmp/pipe >> >> You can replace the "> /tmp/zipped" part with a program that reads >> from stdin and send it over the network. 
>> >> I tried the above with a recent JDK build (with your changes in >> JDK-8234510: Remove file seeking requirement for writing a heap >> dump), but it doesn't seem to work, probably because we need to >> change this code a little bit >> >> http://hg.openjdk.java.net/jdk/jdk/file/7ef41e83066b/src/hotspot/share/services/heapDumper.cpp#l465 >> >> >> DumpWriter::DumpWriter(const char* path) : _fd(-1), >> _bytes_written(0), _pos(0), >> _in_dump_segment(false), _error(NULL) { >> ???? ... >> ???? _fd = os::create_binary_file(path, false);??? // don't replace >> existing file <<< >> >> I also saw a post saying that the JVM can write to named pipes on >> Windows: >> >> https://urldefense.com/v3/__https://stackoverflow.com/questions/634564/how-to-open-a-windows-named-pipe-from-java__;!!GqivPVa7Brio!NzwD3eTX5oDe2WDGidQjXgiDXpQ7SdnRdyo4D9qxHI46dPcXb5PVzrxZ4UNiUw$ >> >> There's no built-in mkfifo command on Windows, but the above link >> points to a .NET example that creates a named pipe and uses that to >> communicate with the JVM. >> >> I don't know whether this will be a better solution than your >> proposed changes, but I think it should be explored as a possible >> alternative. It does seem to require a little work to get your whole >> data collection system working, but it also seems more flexible and >> extensible. >> >> Thanks >> - Ioi >> >> >>> >>> Best regards, >>> Ralf >>> >>> >>> -----Original Message----- >>> From: Ioi Lam >>> Sent: Mittwoch, 19. Februar 2020 01:16 >>> To: serguei.spitsyn at oracle.com; Schmelter, Ralf >>> ; hotspot-runtime-dev at openjdk.java.net >>> runtime >>> Cc: Laurence Cable ; >>> serviceability-dev at openjdk.java.net >>> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped >>> heap dump >>> >>> Hi Ralf, >>> >>> We are usually pretty picky about adding new features into the JVM. >>> This >>> seems to be an edge case (where your environment has more RAM than >>> disk). 
I think it would be better to handle this outside of the JVM >>> (using a named pipe and and external program such as the parallel gzip >>> "pigz") to limit the maintenance overhead of the JVM. >>> >>> This would also have the benefit that you can do it with almost no >>> local >>> storage -- you can read from the named pipe, optionally compress the >>> data, and send that over the network. >>> >>> Thanks >>> - Ioi >> From serguei.spitsyn at oracle.com Thu Feb 20 01:57:19 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Wed, 19 Feb 2020 17:57:19 -0800 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: <80b25fee-21c4-b99a-34d0-6cf0da06b52c@oracle.com> References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> <80b25fee-21c4-b99a-34d0-6cf0da06b52c@oracle.com> Message-ID: <210ec9f2-477d-ea89-85d1-aa7b66a6dd31@oracle.com> I also like this proposal. Thanks, Serguei On 2/19/20 5:45 PM, Ioi Lam wrote: > I like this proposal. It's simple, easily extensible and should work > on all platforms. > > Thanks > - Ioi > > On 2/19/20 3:59 PM, Yasumasa Suenaga wrote: >> Hi, >> >> Generally I agree with Ioi, but I think it is not a problem only for >> gzipped heap dump. >> >> For example, Compiler.codelist and Compiler.CodeHeap_Analytics might >> be large text. >> In addition, some users want to redirect the result from jcmd to >> other command or log collector. >> >> So I think it would be better if jcmd provides stdout redurect option >> to all subocmmands. E.g. >> >> ? 
$ jcmd GC.heap_dump -stdout | gzip -c - > heapdump.hprof.gz >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2020/02/20 2:40, Ioi Lam wrote: >>> >>> >>> On 2/19/20 7:24 AM, Schmelter, Ralf wrote: >>>> Hi Ioi, >>>> >>>>> This seems to be an edge case (where your environment has more >>>>> RAM than disk) >>>> I would not say it's an edge case. Especially in a cloud >>>> environment, your container does not need much free diskspace, >>>> since the data is stored in a database and logging goes to stdout. >>>> >>>>> I think it would be better to handle this outside of the JVM >>>>> (using a named pipe and and external program such as the parallel >>>>> gzip >>>>> "pigz") to limit the maintenance overhead of the JVM. >>>> But then you would have to implement writing the heap dump to a >>>> named pipe (and not only on Unix, but on Windows too). And you >>>> would still want to do the writing in background threads, so most >>>> of the code would stay. You need something like netcat on Windows. >>>> And it doesn't cover writing a heap dump on OOM via the VM flag. >>>> >>>> And you should to compress the hprof file in a specific way, since >>>> it will make it much faster to random access the gzipped hprof file >>>> directly. >>>> >>>> Note that I think it is a good idea to be able to write the dump to >>>> non-file destination. But removing the compression will not save >>>> much code and will make the handling messier. >>> >>> I was thinking of doing something like this: >>> >>> $ mkfifo /tmp/pipe >>> $ cat /tmp/pipe | gzip -c - > /tmp/zipped & >>> $ jcmd $PID GC.heap_dump filename=/tmp/pipe >>> >>> You can replace the "> /tmp/zipped" part with a program that reads >>> from stdin and send it over the network. 
>>> >>> I tried the above with a recent JDK build (with your changes in >>> JDK-8234510: Remove file seeking requirement for writing a heap >>> dump), but it doesn't seem to work, probably because we need to >>> change this code a little bit >>> >>> http://hg.openjdk.java.net/jdk/jdk/file/7ef41e83066b/src/hotspot/share/services/heapDumper.cpp#l465 >>> >>> >>> DumpWriter::DumpWriter(const char* path) : _fd(-1), >>> _bytes_written(0), _pos(0), >>> _in_dump_segment(false), _error(NULL) { >>> ???? ... >>> ???? _fd = os::create_binary_file(path, false);??? // don't replace >>> existing file <<< >>> >>> I also saw a post saying that the JVM can write to named pipes on >>> Windows: >>> >>> https://urldefense.com/v3/__https://stackoverflow.com/questions/634564/how-to-open-a-windows-named-pipe-from-java__;!!GqivPVa7Brio!NzwD3eTX5oDe2WDGidQjXgiDXpQ7SdnRdyo4D9qxHI46dPcXb5PVzrxZ4UNiUw$ >>> >>> There's no built-in mkfifo command on Windows, but the above link >>> points to a .NET example that creates a named pipe and uses that to >>> communicate with the JVM. >>> >>> I don't know whether this will be a better solution than your >>> proposed changes, but I think it should be explored as a possible >>> alternative. It does seem to require a little work to get your whole >>> data collection system working, but it also seems more flexible and >>> extensible. >>> >>> Thanks >>> - Ioi >>> >>> >>>> >>>> Best regards, >>>> Ralf >>>> >>>> >>>> -----Original Message----- >>>> From: Ioi Lam >>>> Sent: Mittwoch, 19. Februar 2020 01:16 >>>> To: serguei.spitsyn at oracle.com; Schmelter, Ralf >>>> ; hotspot-runtime-dev at openjdk.java.net >>>> runtime >>>> Cc: Laurence Cable ; >>>> serviceability-dev at openjdk.java.net >>>> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped >>>> heap dump >>>> >>>> Hi Ralf, >>>> >>>> We are usually pretty picky about adding new features into the JVM. >>>> This >>>> seems to be an edge case (where your environment has more RAM than >>>> disk). 
I think it would be better to handle this outside of the JVM >>>> (using a named pipe and and external program such as the parallel gzip >>>> "pigz") to limit the maintenance overhead of the JVM. >>>> >>>> This would also have the benefit that you can do it with almost no >>>> local >>>> storage -- you can read from the named pipe, optionally compress the >>>> data, and send that over the network. >>>> >>>> Thanks >>>> - Ioi >>> > From suenaga at oss.nttdata.com Thu Feb 20 02:04:40 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 20 Feb 2020 11:04:40 +0900 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: <210ec9f2-477d-ea89-85d1-aa7b66a6dd31@oracle.com> References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> <80b25fee-21c4-b99a-34d0-6cf0da06b52c@oracle.com> <210ec9f2-477d-ea89-85d1-aa7b66a6dd31@oracle.com> Message-ID: Heap Dump has already been written serially since JDK-8234510 at a glance. So I think we can change `jcmd` frontend and DCmd in HotSpot to implement `-stdout` option. But it might not be out of 8237354 change. Thanks, Yasumasa On 2020/02/20 10:57, serguei.spitsyn at oracle.com wrote: > I also like this proposal. > > Thanks, > Serguei > > > On 2/19/20 5:45 PM, Ioi Lam wrote: >> I like this proposal. It's simple, easily extensible and should work on all platforms. >> >> Thanks >> - Ioi >> >> On 2/19/20 3:59 PM, Yasumasa Suenaga wrote: >>> Hi, >>> >>> Generally I agree with Ioi, but I think it is not a problem only for gzipped heap dump. >>> >>> For example, Compiler.codelist and Compiler.CodeHeap_Analytics might be large text. >>> In addition, some users want to redirect the result from jcmd to other command or log collector. 
>>> >>> So I think it would be better if jcmd provides stdout redurect option to all subocmmands. E.g. >>> >>> ? $ jcmd GC.heap_dump -stdout | gzip -c - > heapdump.hprof.gz >>> >>> >>> Thanks, >>> >>> Yasumasa >>> >>> >>> On 2020/02/20 2:40, Ioi Lam wrote: >>>> >>>> >>>> On 2/19/20 7:24 AM, Schmelter, Ralf wrote: >>>>> Hi Ioi, >>>>> >>>>>> This seems to be an edge case (where your environment has more >>>>>> RAM than disk) >>>>> I would not say it's an edge case. Especially in a cloud environment, your container does not need much free diskspace, since the data is stored in a database and logging goes to stdout. >>>>> >>>>>> I think it would be better to handle this outside of the JVM >>>>>> (using a named pipe and and external program such as the parallel gzip >>>>>> "pigz") to limit the maintenance overhead of the JVM. >>>>> But then you would have to implement writing the heap dump to a named pipe (and not only on Unix, but on Windows too). And you would still want to do the writing in background threads, so most of the code would stay. You need something like netcat on Windows. And it doesn't cover writing a heap dump on OOM via the VM flag. >>>>> >>>>> And you should to compress the hprof file in a specific way, since it will make it much faster to random access the gzipped hprof file directly. >>>>> >>>>> Note that I think it is a good idea to be able to write the dump to non-file destination. But removing the compression will not save much code and will make the handling messier. >>>> >>>> I was thinking of doing something like this: >>>> >>>> $ mkfifo /tmp/pipe >>>> $ cat /tmp/pipe | gzip -c - > /tmp/zipped & >>>> $ jcmd $PID GC.heap_dump filename=/tmp/pipe >>>> >>>> You can replace the "> /tmp/zipped" part with a program that reads from stdin and send it over the network. 
>>>> >>>> I tried the above with a recent JDK build (with your changes in JDK-8234510: Remove file seeking requirement for writing a heap dump), but it doesn't seem to work, probably because we need to change this code a little bit >>>> >>>> http://hg.openjdk.java.net/jdk/jdk/file/7ef41e83066b/src/hotspot/share/services/heapDumper.cpp#l465 >>>> >>>> DumpWriter::DumpWriter(const char* path) : _fd(-1), _bytes_written(0), _pos(0), >>>> _in_dump_segment(false), _error(NULL) { >>>> ???? ... >>>> ???? _fd = os::create_binary_file(path, false);??? // don't replace existing file <<< >>>> >>>> I also saw a post saying that the JVM can write to named pipes on Windows: >>>> >>>> https://urldefense.com/v3/__https://stackoverflow.com/questions/634564/how-to-open-a-windows-named-pipe-from-java__;!!GqivPVa7Brio!NzwD3eTX5oDe2WDGidQjXgiDXpQ7SdnRdyo4D9qxHI46dPcXb5PVzrxZ4UNiUw$ >>>> There's no built-in mkfifo command on Windows, but the above link points to a .NET example that creates a named pipe and uses that to communicate with the JVM. >>>> >>>> I don't know whether this will be a better solution than your proposed changes, but I think it should be explored as a possible alternative. It does seem to require a little work to get your whole data collection system working, but it also seems more flexible and extensible. >>>> >>>> Thanks >>>> - Ioi >>>> >>>> >>>>> >>>>> Best regards, >>>>> Ralf >>>>> >>>>> >>>>> -----Original Message----- >>>>> From: Ioi Lam >>>>> Sent: Mittwoch, 19. Februar 2020 01:16 >>>>> To: serguei.spitsyn at oracle.com; Schmelter, Ralf ; hotspot-runtime-dev at openjdk.java.net runtime >>>>> Cc: Laurence Cable ; serviceability-dev at openjdk.java.net >>>>> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump >>>>> >>>>> Hi Ralf, >>>>> >>>>> We are usually pretty picky about adding new features into the JVM. This >>>>> seems to be an edge case (where your environment has more RAM than >>>>> disk). 
I think it would be better to handle this outside of the JVM >>>>> (using a named pipe and and external program such as the parallel gzip >>>>> "pigz") to limit the maintenance overhead of the JVM. >>>>> >>>>> This would also have the benefit that you can do it with almost no local >>>>> storage -- you can read from the named pipe, optionally compress the >>>>> data, and send that over the network. >>>>> >>>>> Thanks >>>>> - Ioi >>>> >> > From larry.cable at oracle.com Thu Feb 20 02:04:18 2020 From: larry.cable at oracle.com (Laurence Cable) Date: Thu, 20 Feb 2020 02:04:18 +0000 (UTC) Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: <80b25fee-21c4-b99a-34d0-6cf0da06b52c@oracle.com> References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> <80b25fee-21c4-b99a-34d0-6cf0da06b52c@oracle.com> Message-ID: On 2/19/20 5:45 PM, Ioi Lam wrote: > I like this proposal. It's simple, easily extensible and should work > on all platforms. +1 *and* it also is in the spirit/philosophy or *IX cmds ... > > Thanks > - Ioi > > On 2/19/20 3:59 PM, Yasumasa Suenaga wrote: >> Hi, >> >> Generally I agree with Ioi, but I think it is not a problem only for >> gzipped heap dump. >> >> For example, Compiler.codelist and Compiler.CodeHeap_Analytics might >> be large text. >> In addition, some users want to redirect the result from jcmd to >> other command or log collector. >> >> So I think it would be better if jcmd provides stdout redurect option >> to all subocmmands. E.g. 
>> >> $ jcmd GC.heap_dump -stdout | gzip -c - > heapdump.hprof.gz >> >> >> Thanks, >> >> Yasumasa >> >> >> On 2020/02/20 2:40, Ioi Lam wrote: >>> >>> >>> On 2/19/20 7:24 AM, Schmelter, Ralf wrote: >>>> Hi Ioi, >>>> >>>>> This seems to be an edge case (where your environment has more >>>>> RAM than disk) >>>> I would not say it's an edge case. Especially in a cloud >>>> environment, your container does not need much free diskspace, >>>> since the data is stored in a database and logging goes to stdout. >>>> >>>>> I think it would be better to handle this outside of the JVM >>>>> (using a named pipe and and external program such as the parallel >>>>> gzip >>>>> "pigz") to limit the maintenance overhead of the JVM. >>>> But then you would have to implement writing the heap dump to a >>>> named pipe (and not only on Unix, but on Windows too). And you >>>> would still want to do the writing in background threads, so most >>>> of the code would stay. You need something like netcat on Windows. >>>> And it doesn't cover writing a heap dump on OOM via the VM flag. >>>> >>>> And you should to compress the hprof file in a specific way, since >>>> it will make it much faster to random access the gzipped hprof file >>>> directly. >>>> >>>> Note that I think it is a good idea to be able to write the dump to >>>> non-file destination. But removing the compression will not save >>>> much code and will make the handling messier. >>> >>> I was thinking of doing something like this: >>> >>> $ mkfifo /tmp/pipe >>> $ cat /tmp/pipe | gzip -c - > /tmp/zipped & >>> $ jcmd $PID GC.heap_dump filename=/tmp/pipe >>> >>> You can replace the "> /tmp/zipped" part with a program that reads >>> from stdin and send it over the network. 
>>> >>> I tried the above with a recent JDK build (with your changes in >>> JDK-8234510: Remove file seeking requirement for writing a heap >>> dump), but it doesn't seem to work, probably because we need to >>> change this code a little bit >>> >>> http://hg.openjdk.java.net/jdk/jdk/file/7ef41e83066b/src/hotspot/share/services/heapDumper.cpp#l465 >>> >>> >>> DumpWriter::DumpWriter(const char* path) : _fd(-1), >>> _bytes_written(0), _pos(0), >>> _in_dump_segment(false), _error(NULL) { >>> ... >>> _fd = os::create_binary_file(path, false); // don't replace >>> existing file <<< >>> >>> I also saw a post saying that the JVM can write to named pipes on >>> Windows: >>> >>> https://urldefense.com/v3/__https://stackoverflow.com/questions/634564/how-to-open-a-windows-named-pipe-from-java__;!!GqivPVa7Brio!NzwD3eTX5oDe2WDGidQjXgiDXpQ7SdnRdyo4D9qxHI46dPcXb5PVzrxZ4UNiUw$ >>> >>> There's no built-in mkfifo command on Windows, but the above link >>> points to a .NET example that creates a named pipe and uses that to >>> communicate with the JVM. >>> >>> I don't know whether this will be a better solution than your >>> proposed changes, but I think it should be explored as a possible >>> alternative. It does seem to require a little work to get your whole >>> data collection system working, but it also seems more flexible and >>> extensible. >>> >>> Thanks >>> - Ioi >>> >>> >>>> >>>> Best regards, >>>> Ralf >>>> >>>> >>>> -----Original Message----- >>>> From: Ioi Lam >>>> Sent: Mittwoch, 19. Februar 2020 01:16 >>>> To: serguei.spitsyn at oracle.com; Schmelter, Ralf >>>> ; hotspot-runtime-dev at openjdk.java.net >>>> runtime >>>> Cc: Laurence Cable ; >>>> serviceability-dev at openjdk.java.net >>>> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped >>>> heap dump >>>> >>>> Hi Ralf, >>>> >>>> We are usually pretty picky about adding new features into the JVM. >>>> This >>>> seems to be an edge case (where your environment has more RAM than >>>> disk). 
I think it would be better to handle this outside of the JVM >>>> (using a named pipe and and external program such as the parallel gzip >>>> "pigz") to limit the maintenance overhead of the JVM. >>>> >>>> This would also have the benefit that you can do it with almost no >>>> local >>>> storage -- you can read from the named pipe, optionally compress the >>>> data, and send that over the network. >>>> >>>> Thanks >>>> - Ioi >>> > From poonam.bajaj at oracle.com Thu Feb 20 02:22:34 2020 From: poonam.bajaj at oracle.com (Poonam Parhar) Date: Wed, 19 Feb 2020 18:22:34 -0800 Subject: RFR (S) 8239055: Wrong implementation of VMState.hasListener In-Reply-To: <874b118b-c3c5-ce16-c62a-0d96fd505f56@oracle.com> References: <988ac692-444a-4025-99c5-421d4c554f3b@default> <874b118b-c3c5-ce16-c62a-0d96fd505f56@oracle.com> Message-ID: Hello Fairoz, The change looks good to me as well. Thanks, Poonam On 2/18/20 12:26 PM, serguei.spitsyn at oracle.com wrote: > Hi Fairoz, > > Looks good. > > Thanks, > Serguei > > > On 2/13/20 9:13 PM, Fairoz Matte wrote: >> >> Hi, >> >> Please review a tiny change to correct the VMState.hasListener >> implementation. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8239055 >> >> Webrev: http://cr.openjdk.java.net/~fmatte/8239055/webrev.00/ >> >> Thanks, >> >> Fairoz >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthias.baesken at sap.com Thu Feb 20 08:53:41 2020 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 20 Feb 2020 08:53:41 +0000 Subject: RFR [XS]: 8239462: jdk.hotspot.agent misses some ReleaseStringUTFChars calls in case of early returns In-Reply-To: <6ed0d004-3e8c-4b27-f583-06dbacf45173@oracle.com> References: <6ed0d004-3e8c-4b27-f583-06dbacf45173@oracle.com> Message-ID: Hi Alex / Christoph, thanks for the reviews. 
New webrev: http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.1/

- includes LinuxDebuggerLocal.cpp
- adds the blank Christoph wanted to have

A question (hopefully not a stupid one): GetStringUTFChars success is handled in different ways in the code.

1. At most places, it is handled by checking for NULL, like this:

    const char *s = (*env)->GetStringUTFChars(env, p, NULL);
    if (s == NULL) {
      // handle failure
    }

2. At some places, success / failure is not handled at all.

3. Here (e.g. LinuxDebuggerLocal.cpp) the success / failure check is done by

    if (env->ExceptionOccurred()) { ... }

Which one is the best / right way to do it (most likely not 2.)?

Best regards, Matthias

>
> Looks like
> src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.cpp
> has similar issues. It would be nice to fix them as well.
>
> --alex
>
> On 02/19/2020 06:21, Baesken, Matthias wrote:
> > Hello, please review this small change.
> > A few places in the native jdk.hotspot.agent code are missing
> > ReleaseStringUTFChars calls.
> > This happens in case of early returns.
> >
> > Bug/webrev:
> >
> > https://bugs.openjdk.java.net/browse/JDK-8239462
> >
> > http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.0/
> >
> > Thanks, Matthias
> >

From chiroito107 at gmail.com Thu Feb 20 11:20:43 2020
From: chiroito107 at gmail.com (Chihiro Ito)
Date: Thu, 20 Feb 2020 20:20:43 +0900
Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
In-Reply-To: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com>
References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com>
Message-ID:

Hi Yasumasa,

Thank you for your quick review.

I modified the code so that it no longer uses Properties::store.

Could you review this again, please?

Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.01/

Regards,
Chihiro

On Thu, 20 Feb 2020 at 9:34, Yasumasa Suenaga wrote:

> Hi Chihiro,
>
> I think this problem is caused by the specification of `Properties::store(Writer)`.
> > `Properties::store(OutputStream)` says that the output format is as same > as `store(Writer)` [1]. > `Properties::store(Writer)` says that `#`, `!`, `=`, `:` are written with > a preceding backslash [2]. > > So I think we should not use `Properties::store` to serialize properties. > > > Thanks, > > Yasumasa > > > [1] > https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.OutputStream,java.lang.String) > [2] > https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.Writer,java.lang.String) > > > On 2020/02/19 22:36, Chihiro Ito wrote: > > Hi, > > > > Could you review this tiny fix, please? > > > > This problem affected not the only path on Windows, but also Linux and > URLs using ":". > > > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.00/ > > JBS : https://bugs.openjdk.java.net/browse/JDK-8222489 > > > > Regards, > > Chihiro > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.schmelter at sap.com Thu Feb 20 13:21:11 2020 From: ralf.schmelter at sap.com (Schmelter, Ralf) Date: Thu, 20 Feb 2020 13:21:11 +0000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> Message-ID: Hi Yasumasa, I think it would be great if we could redirect larger chunks data to jcmd. But you have to differentiate between binary data (for the heap dump) and text data (for the e.g. codelist). Currently jcmd assumes all bytes to be UTF-8 encoded, converts them to Unicode and then uses the platform encoding to write characters. This is not suitable for binary data. 
And of course you cannot use the bufferedStream to get the output to jcmd. You would have to implement an outputStream which can write directly to the AttachListener connection.

But even with this change, I would still like the gzip compression to be done in the VM. Let me try to list all the advantages I see for doing this:

1. It is by far the easiest to use. You just have to specify -gz for the jcmd. While your command line (jcmd .... | gzip -c > file) is easy enough, it assumes you have gzip (not available by default on Windows) and it would be painfully slow (roughly 10x slower or more), since it is not parallel. You could use pigz, but it is not as ubiquitous as gzip. I know it is sometimes hard to imagine this could be a problem for anyone, but it is. It is easy to tell a customer to execute jcmd GC.heap_dump -gz test.hprof.gz. Add additional requirements, especially external programs, and your chance of success diminishes fast.

2. The -XX:+HeapDumpOnOutOfMemoryError, -XX:+HeapDumpBeforeFullGC and -XX:+HeapDumpAfterFullGC options can easily create gzipped heap dumps directly when the compression is in the VM. And especially if you create more than one dump (with the before/after GC flags), compression is very useful. The same holds if you want to support compressed heap dumps in the HotSpotDiagnosticMXBean: just add a flag and/or a compression level.

3. The created gz file is not the simple gz file you would get from plain gzip. It is created in a way that makes it possible to treat it like a random access file without decompressing it.

Currently, for example, the Eclipse Memory Analyzer (MAT) has the option to directly open a gzipped hprof file and use it without decompression. For the initial parsing, they can just read the file sequentially, so this is not too slow. But when accessing the values of objects or arrays, they have to seek to specific positions in the gzipped hprof file.
This is currently implemented by having a Java implementation of an InflaterInputStream which is capable of completely copying its state. This copy is then used to start decompressing at the specific offset for which it was created. As you can imagine, the state of the inflater is not small (MAT assumes about 64 kB; at least 32 kB is needed for the dictionary), so it limits the number of starting positions you can use for large files. But it works for all kinds of gzip compressed streams.

The gzip implementation used to write the heap dump in the VM creates many small gzip compressed chunks. At the start of each chunk you can create a fresh GZIPInputStream without having to store any internal state. You only need to remember the physical offset and the logical offset (so 2 long values) for each chunk. If you then want to read data at a specific logical offset, you binary search the nearest preceding chunk and create a GZIPInputStream reading from the physical offset of that chunk. So on average you have to decompress about half a chunk to get to the data you need.

If you look in the webrev, you can see http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.0/test/lib/jdk/test/lib/hprof/parser/GzipRandomAccess.java.html. This implements the needed logic to treat the gzipped hprof file as a random access file. I have used it to add support for gzipped files in the jhat library (which is only used in tests). In jhat, for example, the resolution of references is done via random access. And the file also contains all the functionality MAT would need.

You can generate a more or less equivalent file if you use pigz with the --independent option. But to make it easier to detect that the gzip file is chunked (without decompressing it first), I've added a comment marking it as a hprof file with a given chunk size.
This would be missing from the pigz file, but pigz instead adds 9 bytes when --independent is specified (00 00 ff ff 00 00 00 ff ff), so you could detect it too.

To summarize, the gzipped hprof file created by the VM makes it much easier for tools to access it efficiently at random positions. You can do something equivalent with pigz, but not with gzip.

And getting support for this type of gzipped hprof file in the heap dump tools will be much easier if this is the format OpenJDK produces, since it will then be widespread.

Best regards,
Ralf

-----Original Message-----
From: Yasumasa Suenaga
Sent: Thursday, 20 February 2020 00:59
To: Ioi Lam ; Schmelter, Ralf ; serguei.spitsyn at oracle.com; hotspot-runtime-dev at openjdk.java.net runtime
Cc: serviceability-dev at openjdk.java.net
Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump

Hi,

Generally I agree with Ioi, but I think the problem is not limited to gzipped heap dumps.

For example, the output of Compiler.codelist and Compiler.CodeHeap_Analytics might be large text.
In addition, some users want to redirect the result from jcmd to another command or a log collector.

So I think it would be better if jcmd provided a stdout redirect option for all subcommands. E.g.

$ jcmd GC.heap_dump -stdout | gzip -c - > heapdump.hprof.gz


Thanks,

Yasumasa

From suenaga at oss.nttdata.com Thu Feb 20 13:39:07 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 20 Feb 2020 22:39:07 +0900
Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
In-Reply-To:
References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com>
Message-ID: <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com>

Hi Chihiro,

On 2020/02/20 20:20, Chihiro Ito wrote:
> Hi Yasumasa,
>
> Thank you for your quick review.
>
> I modified the code without Properties::store.
>
> Could you review this again, please?
>
> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.01/

- Your change shows "\n" as "\\n".
Is it ok? Currently "\n" is shown as is.

- Your change uses Character::codePointAt to convert a char to an int value. According to the Javadoc, the result is different if the char is in the surrogate range.

- The description of serializePropertiesToByteArray() says the return value is encoded in ISO 8859-1, but that does not seem to be the case because the logic depends on the spec of Properties::store. Is it ok?

- The test case is not stable because system properties might differ from those in your environment. I suggest setting system properties for the test explicitly. E.g.
    -Dnormal=normal_val -D"space space=blank blank" -Dnonascii=????? -Dopenjdk_url=http://openjdk.java.net/ -Dbackslash="\\"

  * I also recommend checking "\n" in the test via `line.separator`. I think it is a stable property.

I'm not convinced that we should comply with the comment that mentions ISO 8859-1. If it is important, we can use a CharsetEncoder from ISO_8859_1 as below:

  http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-encoder/

OTOH, if we can keep the current behavior, we can implement it more simply as below: (It's similar to yours.)

  http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-props-style/


Thanks,

Yasumasa


> Regards,
> Chihiro
>
>
> On Thu, 20 Feb 2020 at 9:34, Yasumasa Suenaga wrote:
>
>     Hi Chihiro,
>
>     I think this problem is caused by the spec of `Properties::store(Writer)`.
>
>     `Properties::store(OutputStream)` says that the output format is the same as that of `store(Writer)` [1].
>     `Properties::store(Writer)` says that `#`, `!`, `=`, `:` are written with a preceding backslash [2].
>
>     So I think we should not use `Properties::store` to serialize properties.
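
The escaping described in the quoted message can be reproduced with a small standalone sketch. It uses only `java.util.Properties`; the class name, method name and property values below are made up for illustration and are not taken from the webrev.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative only: shows what Properties.store(Writer, String) writes
// for values that contain ':' and '\'.
final class PropertiesStoreEscapingDemo {

    // Serialize a single property with Properties.store and return the
    // text that was written (a date comment line plus the key=value line).
    static String storeToString(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The ':' in the URL is written with a preceding backslash.
        System.out.print(storeToString("openjdk_url", "http://openjdk.java.net/"));
        // In a Windows-style path the ':' is escaped and each '\' is doubled.
        System.out.print(storeToString("win_path", "C:\\Users\\duke"));
    }
}
```

The stored form is meant to be read back with `Properties.load`, which undoes the escaping; shown verbatim to a user, it no longer looks like a usable path or URL.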
> > > Thanks, > > Yasumasa > > > [1] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.OutputStream,java.lang.String) > [2] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.Writer,java.lang.String) > > > On 2020/02/19 22:36, Chihiro Ito wrote: > > Hi, > > > > Could you review this tiny fix, please? > > > > This problem affected not the only path on Windows, but also Linux and URLs using ":". > > > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.00/ > > JBS : https://bugs.openjdk.java.net/browse/JDK-8222489 > > > > Regards, > > Chihiro > From suenaga at oss.nttdata.com Thu Feb 20 14:51:35 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Thu, 20 Feb 2020 23:51:35 +0900 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> Message-ID: <10957ddd-c51a-f614-a663-508cfd3de7a6@oss.nttdata.com> Hi Ralf, On 2020/02/20 22:21, Schmelter, Ralf wrote: > Hi Yasumasa, > > I think it would be great if we could redirect larger chunks data to jcmd. > > But you have to differentiate between binary data (for the heap dump) and text data (for the e.g. codelist). > > Currently jcmd assumes all bytes to be UTF-8 encoded, converts them to Unicode and then uses the platform encoding to write characters. This is not suitable for binary data. > > And of course you cannot use the bufferedStream to get the output to jcmd. You would have to implement an outputStream which can directly write to the AttachListener connection. I've understood it, but I think we can implement new class which extends outputStream or bufferedStream. 
On the jcmd side, we can switch the method used to handle binary or text data. On the HotSpot side, we can switch the stream class to use based on parameter(s) from the frontend (jcmd).


> But even with this change, I would still like the gzip compression to be done in the VM. Let me try to list all the advantages I see for doing this:
>
> 1. It is by far the easiest to use. You just have to specify -gz for the jcmd. While your command line (jcmd .... | gzip -c > file) is easy enough, it assumes you have gzip (not by default on Windows) and it would be painfully slow (~ 10 x and more), since it is not parallel. You could use pigz, but it is not as ubiquitous as gzip. I know it is sometimes hard to imagine this could be a problem for anyone, but it is.
>
> It is easy to tell a customer to execute jcmd GC.heap_dump -gz test.hprof.gz. Adding additional requirements, especially if it is external programs, and your chance of success diminish fast.

As a troubleshooter, I agree with you about ease of use and ease of instruction for customers.
But we can address your concern if we provide command examples or a shell script to collect data.

On modern Windows, tar (which of course includes the -z option) is available. We can compress the heap dump with it.

  https://techcommunity.microsoft.com/t5/containers/tar-and-curl-come-to-windows/ba-p/382409


> 2. The -XX:HeapDumpOnOutOfMemoryError, -XX: HeapDumpBeforeFullGC and -XX: HeapDumpAfterFullGC options can easily create gzipped heap dumps directly when the compression is in the VM. And especially if you create more than one dump (with the before/after gc flags), compression is very useful. Or if you want to support compressed heap dumps it in the HotSpotDiagnosticMXBean. Just add a flag and/or compression level.

Do you have experience with HeapDumpBeforeFullGC and/or HeapDumpAfterFullGC? I guess they are not used in production environments.

I recommend that my customers use -XX:HeapDumpOnOutOfMemoryError, but we can also use -XX:OnOutOfMemoryError.
If disk is enough to dump, we can invoke `gz` via -XX:OnOutOfMemoryError. It calls after HeapDumpOnOutOfMemoryError. > 3. The created gz-file is not a simple gz-file you would get when simply using gzip. > > It is created in a way that makes it possible to treat it like a random access file without decompressing it. > > Currently for example the Eclipse Memory Analyzer (MAT) has the option to directly open a gzipped hprof file and use it without decompression. And for the initial parsing, they can just read the file sequentially, so this is not too slow. > > But when accessing the values of objects or arrays, they have to seek to specific positions in the gzipped hprof file. This is currently implemented by having a Java implementation of a InflaterInputStream which is capable to completely copy its state. This copy is then used to start decompressing at the specific offset for which is was created. As you can imagine, the state of the inflater is not small (MAT assumes about 64Kb, 32kB is needed at least for the dictionary), so it limits the number of starting positions you can use for large files. But it works for all kinds of gzip compressed streams. > > The gzip implementation used to write the heap dump in the VM creates many small gzip compressed chunks. At the start of each chunk you can create a fresh GZIPInputStream without having to store any internal state. You only need to remember the physical offset and the logical offset (so 2 long values) for each chunk. If you then want to read data at a specific logical offset, you binary search the nearest preceding chunk and create a GZIPInputStream reading from the physical offset of that chunk. So on average you have to decompress about half a chunk to get to the data you need. > > If you look in the in webrev, you can see http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.0/test/lib/jdk/test/lib/hprof/parser/GzipRandomAccess.java.html. 
This implements the needed logic to treat the gzipped hprof file as a random access file. I have used it to add support for gzipped files in the jhat library (which is only used in tests). In jhat hat for example, the resolution of references is done via random access. And the file also contains all the functionality MAT would need. I've used MAT for analyzing heap dump, and I usually check various objects in it. AFAIK heap dump is heap snapshot. So we need to traverse it entirely, isn't it? If so, we need to decompress heap dump entirely in actually. > You can generate a more or less equivalent file if you use pigz with the --independent option. But to make it easier to detect that the gzip file is chunked (without decompressing it first), I've added a comment marking it as a hprof file with a given chunk size. This would be missing from the pigz file, but they instead adding 9 bytes when --independent is specified (00 00 ff ff 00 00 00 ff ff), so you could detect it too. Is it in spec of gzip? I'm not familiar of gzip, but I concern if it is specialized for something. > To summarize, the gzipped hprof file created by the VM makes it much easier for tools to access them efficiently at random positions. You can do something equivalent with pigz, but not with gzip. > > And getting support for this type of gzipped hprof file by the heap dump tools will be much easier, if this is the format the openjdk produces, so it will be widespread. I think it is a balance between implementation/maintenance cost of your change and ease of use/disk space reduction. In case of Linux, we can redirect archiver/compressor with /proc/sys/kernel/core_pattern. IMHO it is nature if heap dump handles as same as memory dump. Thanks, Yasumasa > Best regards, > Ralf > > -----Original Message----- > From: Yasumasa Suenaga > Sent: Donnerstag, 20. 
Februar 2020 00:59 > To: Ioi Lam ; Schmelter, Ralf ; serguei.spitsyn at oracle.com; hotspot-runtime-dev at openjdk.java.net runtime > Cc: serviceability-dev at openjdk.java.net > Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump > > Hi, > > Generally I agree with Ioi, but I think it is not a problem only for gzipped heap dump. > > For example, Compiler.codelist and Compiler.CodeHeap_Analytics might be large text. > In addition, some users want to redirect the result from jcmd to other command or log collector. > > So I think it would be better if jcmd provides stdout redurect option to all subocmmands. E.g. > > $ jcmd GC.heap_dump -stdout | gzip -c - > heapdump.hprof.gz > > > Thanks, > > Yasumasa > From alexey.menkov at oracle.com Thu Feb 20 20:07:55 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Thu, 20 Feb 2020 12:07:55 -0800 Subject: RFR [XS]: 8239462: jdk.hotspot.agent misses some ReleaseStringUTFChars calls in case of early returns In-Reply-To: References: <6ed0d004-3e8c-4b27-f583-06dbacf45173@oracle.com> Message-ID: Hi Matthias, Looks good in general, but I think it makes sense to fix #2 cases (at least I see them in LinuxDebuggerLocal). If GetStringUTFChars fails, the code will crash. Also I see GetStringUTFChars(str, JNI_FALSE). This look bad as well - 2nd arg is a pointer, so it should be NULL or nullptr. As for #1 and #3 - AFAIU they are both right ways. If GetStringUTFChars fails, it throws OOM and return NULL. And one more thing to consider. LinuxDebuggerLocal_attach0 function looks terrible - 7 ReleaseStringUTFChars calls for 2 GetStringUTFChars. Maybe it make sense to introduce simple wrapper like AutoJavaString in src/jdk.hotspot.agent/windows/native/libsaproc/sawindbg.cpp It would make the code simpler and less error prone. --alex On 02/20/2020 00:53, Baesken, Matthias wrote: > Hi Alex / Christoph, thanks for the reviews. 
> > New webrev : > > http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.1/ > > - includes LinuxDebuggerLocal.cpp > - adds a blank Christoph wanted to have > > > A question (hopefully not a stupid one ?? ): > At most places in the coding, GetStringUTFChars success is > 1. handled by checking NULL , like this : > > const char *s = (*env)->GetStringUTFChars(env, p, NULL); > if (s == NULL) { > // handle failure > } > > 2.At some places , success / failure is not handled at all . > > 3.Here (e.g. LinuxDebuggerLocal.cpp) success / failure check is done by > > if (env->ExceptionOccurred()) { ... } > > Which one is the best / right way to do it (most likely not 2.) ? > > > Best regards, Matthias > > > >> >> Looks like >> src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.cpp >> has similar issues. It would be nice to fix them as well. >> >> --alex >> >> On 02/19/2020 06:21, Baesken, Matthias wrote: >>> Hello, please review this small change . >>> We miss at a few places ReleaseStringUTFChars calls in the native >>> jdk.hotspot.agent coding. >>> This happens in case of early returns . >>> >>> >>> Bug/webrev : >>> >>> https://bugs.openjdk.java.net/browse/JDK-8239462 >>> >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.0/ >>> >>> >>> Thanks, Matthias >>> From matthias.baesken at sap.com Fri Feb 21 08:09:26 2020 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 21 Feb 2020 08:09:26 +0000 Subject: RFR [XS]: 8239462: jdk.hotspot.agent misses some ReleaseStringUTFChars calls in case of early returns In-Reply-To: References: <6ed0d004-3e8c-4b27-f583-06dbacf45173@oracle.com> Message-ID: > Also I see GetStringUTFChars(str, JNI_FALSE). This look bad as well - > 2nd arg is a pointer, so it should be NULL or nullptr. Hi looks like there is another one here, do you think these JNI_FALSE params would really cause trouble ? Not even the compiler warns here ... 
src/java.desktop/unix/native/libawt_xawt/xawt/XlibWrapper.c:824: cname = (char *) (*env)->GetStringUTFChars(env, jstr, JNI_FALSE); Best regards, Matthias > Hi Matthias, > > Looks good in general, but I think it makes sense to fix #2 cases (at > least I see them in LinuxDebuggerLocal). If GetStringUTFChars fails, the > code will crash. > Also I see GetStringUTFChars(str, JNI_FALSE). This look bad as well - > 2nd arg is a pointer, so it should be NULL or nullptr. > > As for #1 and #3 - AFAIU they are both right ways. > If GetStringUTFChars fails, it throws OOM and return NULL. > > And one more thing to consider. > LinuxDebuggerLocal_attach0 function looks terrible - 7 > ReleaseStringUTFChars calls for 2 GetStringUTFChars. > Maybe it make sense to introduce simple wrapper like AutoJavaString in > src/jdk.hotspot.agent/windows/native/libsaproc/sawindbg.cpp > It would make the code simpler and less error prone. > > --alex > > On 02/20/2020 00:53, Baesken, Matthias wrote: > > Hi Alex / Christoph, thanks for the reviews. > > > > New webrev : > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.1/ > > > > - includes LinuxDebuggerLocal.cpp > > - adds a blank Christoph wanted to have > > > > > > A question (hopefully not a stupid one ?? ): > > At most places in the coding, GetStringUTFChars success is > > 1. handled by checking NULL , like this : > > > > const char *s = (*env)->GetStringUTFChars(env, p, NULL); > > if (s == NULL) { > > // handle failure > > } > > > > 2.At some places , success / failure is not handled at all . > > > > 3.Here (e.g. LinuxDebuggerLocal.cpp) success / failure check is done by > > > > if (env->ExceptionOccurred()) { ... } > > > > Which one is the best / right way to do it (most likely not 2.) ? > > > > > > Best regards, Matthias > > > > > > > >> > >> Looks like > >> src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.cpp > >> has similar issues. It would be nice to fix them as well. 
> >> > >> --alex > >> > >> On 02/19/2020 06:21, Baesken, Matthias wrote: > >>> Hello, please review this small change . > >>> We miss at a few places ReleaseStringUTFChars calls in the native > >>> jdk.hotspot.agent coding. > >>> This happens in case of early returns . > >>> > >>> > >>> Bug/webrev : > >>> > >>> https://bugs.openjdk.java.net/browse/JDK-8239462 > >>> > >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.0/ > >>> > >>> > >>> Thanks, Matthias > >>> From matthias.baesken at sap.com Fri Feb 21 08:32:35 2020 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 21 Feb 2020 08:32:35 +0000 Subject: RFR [XS]: 8239462: jdk.hotspot.agent misses some ReleaseStringUTFChars calls in case of early returns In-Reply-To: References: <6ed0d004-3e8c-4b27-f583-06dbacf45173@oracle.com> Message-ID: Hi Alex , new webrev : http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.2/ Best Regards, Matthias > > Hi Matthias, > > Looks good in general, but I think it makes sense to fix #2 cases (at > least I see them in LinuxDebuggerLocal). If GetStringUTFChars fails, the > code will crash. > Also I see GetStringUTFChars(str, JNI_FALSE). This look bad as well - > 2nd arg is a pointer, so it should be NULL or nullptr. > > As for #1 and #3 - AFAIU they are both right ways. > If GetStringUTFChars fails, it throws OOM and return NULL. > > And one more thing to consider. > LinuxDebuggerLocal_attach0 function looks terrible - 7 > ReleaseStringUTFChars calls for 2 GetStringUTFChars. > Maybe it make sense to introduce simple wrapper like AutoJavaString in > src/jdk.hotspot.agent/windows/native/libsaproc/sawindbg.cpp > It would make the code simpler and less error prone. 
> > --alex > From stefan.karlsson at oracle.com Fri Feb 21 10:23:44 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 21 Feb 2020 11:23:44 +0100 Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops In-Reply-To: <3a5d60fd-04d5-da96-3d79-242d43fdec79@redhat.com> References: <8ea20a15-bdb9-27b0-c306-154f539a3674@oracle.com> <3a5d60fd-04d5-da96-3d79-242d43fdec79@redhat.com> Message-ID: <901d7307-cf36-0367-e09c-ff47c76bbc25@oracle.com> Hi Zhengyu, On 2020-02-17 15:51, Zhengyu Gu wrote: > Hi Stefan, > > Thanks for the review and suggestions, updated accordingly: > > http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/ Thanks for moving the code. I think this looks good. If you're up for it, I have a couple of style change suggestions: 1) ObjectMarker uses two verbs to describe the same thing: "mark" and "visit". I propose that we only use "mark" in ObjectMarker and leave the usage of "visited" to the Jvmti code. 2) Some updates to odd whitespaces 3) Using forward declarations in Shenandoah code. I've bundled those changes into webrevs: https://cr.openjdk.java.net/~stefank/8238633/webrev.01.delta https://cr.openjdk.java.net/~stefank/8238633/webrev.01 Regarding performance testing, the HeapWalkTests you used seems to use a very small heap. I think it would be good to redo the measurements on a larger heap. Could you take the HeapWalkTest and add a few GBs of small, linked objects? Thank, StefanK > >> >> --- >> Previously, the calls to 'mark' and 'visited' were inlineable, but >> now every GC has to take a virtual call when marking the objects. My >> guess is that this code is slow anyway, and that it doesn't matter >> too much, but did you measure the effect of that change with, for >> example, G1? >> > I did rough measurement, timing > vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test. > > If you know any tests/benchmarks I should measure, please let me know. 
> > Thanks, > > -Zhengyu > > >> Thanks, >> StefanK >> >>> Test: >>> ?? hotspot_gc >>> ?? vmTestbase_nsk_jdi >>> ?? vmTestbase_nsk_jvmti >>> >>> Thanks, >>> >>> -Zhengyu >>> >>> >> > From coleen.phillimore at oracle.com Fri Feb 21 13:01:07 2020 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Fri, 21 Feb 2020 08:01:07 -0500 Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops In-Reply-To: References: <8ea20a15-bdb9-27b0-c306-154f539a3674@oracle.com> <3a5d60fd-04d5-da96-3d79-242d43fdec79@redhat.com> <901d7307-cf36-0367-e09c-ff47c76bbc25@oracle.com> Message-ID: Adding serviceability-dev back. Coleen On 2/21/20 7:59 AM, coleen.phillimore at oracle.com wrote: > > Hi, I had a quick look at this, minus the shenandoah code. > > http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/src/hotspot/share/gc/shared/objectMarker.hpp.html > > > I think this file could have forward declarations of GrowableArray and > I didn't see a need for the markWord.hpp include. > > This change on the whole looks good to me. > > Coleen > > On 2/21/20 5:23 AM, Stefan Karlsson wrote: >> Hi Zhengyu, >> >> On 2020-02-17 15:51, Zhengyu Gu wrote: >>> Hi Stefan, >>> >>> Thanks for the review and suggestions, updated accordingly: >>> >>> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/ >> >> Thanks for moving the code. I think this looks good. >> >> If you're up for it, I have a couple of style change suggestions: >> >> 1) ObjectMarker uses two verbs to describe the same thing: "mark" and >> "visit". I propose that we only use "mark" in ObjectMarker and leave >> the usage of "visited" to the Jvmti code. >> >> 2) Some updates to odd whitespaces >> >> 3) Using forward declarations in Shenandoah code. 
>> >> I've bundled those changes into webrevs: >> >> https://cr.openjdk.java.net/~stefank/8238633/webrev.01.delta >> https://cr.openjdk.java.net/~stefank/8238633/webrev.01 >> >> Regarding performance testing, the HeapWalkTests you used seems to >> use a very small heap. I think it would be good to redo the >> measurements on a larger heap. Could you take the HeapWalkTest and >> add a few GBs of small, linked objects? >> >> Thank, >> StefanK >>> >>>> >>>> --- >>>> Previously, the calls to 'mark' and 'visited' were inlineable, but >>>> now every GC has to take a virtual call when marking the objects. >>>> My guess is that this code is slow anyway, and that it doesn't >>>> matter too much, but did you measure the effect of that change >>>> with, for example, G1? >>>> >>> I did rough measurement, timing >>> vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test. >>> >>> If you know any tests/benchmarks I should measure, please let me know. >>> >>> Thanks, >>> >>> -Zhengyu >>> >>> >>>> Thanks, >>>> StefanK >>>> >>>>> Test: >>>>> ?? hotspot_gc >>>>> ?? vmTestbase_nsk_jdi >>>>> ?? vmTestbase_nsk_jvmti >>>>> >>>>> Thanks, >>>>> >>>>> -Zhengyu >>>>> >>>>> >>>> >>> >> > From richard.reingruber at sap.com Fri Feb 21 14:08:29 2020 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Fri, 21 Feb 2020 14:08:29 +0000 Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents In-Reply-To: References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com> Message-ID: Ping :) Richard. -----Original Message----- From: hotspot-compiler-dev On Behalf Of Reingruber, Richard Sent: Dienstag, 4. 
Februar 2020 09:59 To: David Holmes ; Vladimir Kozlov (vladimir.kozlov at oracle.com) ; Robbin Ehn ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents Hi, I have prepared webrev.4 that incorporates feedback from webrev.3 (thanks!) Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ Incremental: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/ I was not able to eliminate the additional suspend flag now. I'll take care of this as soon as the existing suspend-resume-mechanism is reworked. Testing: Nightly tests @SAP: JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance Suite, SAP specific tests with fastdebug and release builds on all platforms Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel for 24h Thanks, Richard. More details on the changes: * Hide DeoptimizeObjectsALotThread from external view. * Changed EscapeBarrier_lock to be a _safepoint_check_never lock. It used to be _safepoint_check_sometimes, which will be eliminated sooner or later. I added explicit thread state changes with ThreadBlockInVM to code paths where we can wait() on EscapeBarrier_lock to become safepoint safe. * Use handshake EscapeBarrierSuspendHandshake to suspend target threads instead of vm operation VM_ThreadSuspendAllForObjDeopt. * Removed uses of Threads_lock. When adding a new thread we suspend it iff EA optimizations are being reverted. In the previous version we were waiting on Threads_lock while EA optimizations were reverted. See EscapeBarrier::thread_added(). * Made tests require Xmixed compilation mode. * Made tests agnostic regarding tiered compilation. I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or disabled. 
* Exercising EATests.java as well with stress test options DeoptimizeObjectsALot* Due to the non-deterministic deoptimizations some tests need to be skipped. We do this to prevent bit-rot of the stress test code. * Executing EATests.java as well with graal if available. Driver for this is EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide all the new debug info (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp). And graal does not yet support the JVMTI operations force early return and pop frame. * Removed tracing from new jdi tests in EATests.java. Too much trace output before the debugging connection is established can cause deadlock because output buffers fill up. (See https://bugs.openjdk.java.net/browse/JDK-8173304) * Many copyright year changes and smaller clean-up changes of testing code (trailing white-space and the like). -----Original Message----- From: David Holmes Sent: Donnerstag, 19. Dezember 2019 03:12 To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com) Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents Hi Richard, I think my issue is with the way EliminateNestedLocks works so I'm going to look into that more deeply. Thanks for the explanations. David On 18/12/2019 12:47 am, Reingruber, Richard wrote: > Hi David, > > > > > Some further queries/concerns: > > > > > > > > src/hotspot/share/runtime/objectMonitor.cpp > > > > > > > > Can you please explain the changes to ObjectMonitor::wait: > > > > > > > > ! _recursions = save // restore the old recursion count > > > > ! + jt->get_and_reset_relock_count_after_wait(); // > > > > increased by the deferred relock count > > > > > > > > what is the "deferred relock count"? 
I gather it relates to > > > > > > > > "The code was extended to be able to deoptimize objects of a > > > frame that > > > > is not the top frame and to let another thread than the owning > > > thread do > > > > it." > > > > > > Yes, these relate. Currently EA based optimizations are reverted, when a compiled frame is > > > replaced with corresponding interpreter frames. Part of this is relocking objects with eliminated > > > locking. New with the enhancement is that we do this also just before object references are > > > acquired through JVMTI. In this case we deoptimize also the owning compiled frame C and we > > > register deoptimized objects as deferred updates. When control returns to C it gets deoptimized, > > > we notice that objects are already deoptimized (reallocated and relocked), so we don't do it again > > > (relocking twice would be incorrect of course). Deferred updates are copied into the new > > > interpreter frames. > > > > > > Problem: relocking is not possible if the target thread T is waiting on the monitor that needs to > > > be relocked. This happens only with non-local objects with EliminateNestedLocks. Instead relocking > > > is deferred until T owns the monitor again. This is what the piece of code above does. > > > > Sorry I need some more detail here. How can you wait() on an object > > monitor if the object allocation and/or locking was optimised away? And > > what is a "non-local object" in this context? Isn't EA restricted to > > thread-confined objects? > > "Non-local object" is an object that escapes its thread. The issue I'm addressing with the changes > in ObjectMonitor::wait are almost unrelated to EA. They are caused by EliminateNestedLocks, where C2 > eliminates recursive locking of an already owned lock. The lock owning object exists on the heap, it > is locked and you can call wait() on it. > > EliminateLocks is the C2 option that controls lock elimination based on EA. 
Both optimizations have > in common that objects with eliminated locking need to be relocked when deoptimizing a frame, > i.e. when replacing a compiled frame with equivalent interpreter > frames. Deoptimization::relock_objects does that job for /all/ eliminated locks in scope. /All/ can > be a mix of eliminated nested locks and locks of not-escaping objects. > > New with the enhancement: I call relock_objects earlier, just before objects pontentially > escape. But then later when the owning compiled frame gets deoptimized, I must not do it again: > > See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp: > > 373 if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && EliminateLocks)) > 374 && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) { > 375 bool unused; > 376 eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, unused); > 377 } > > Now when calling relock_objects early it is quiet possible that I have to relock an object the > target thread currently waits for. Obviously I cannot relock in this case, instead I chose to > introduce relock_count_after_wait to JavaThread. > > > Is it just that some of the locking gets optimized away e.g. > > > > synchronised(obj) { > > synchronised(obj) { > > synchronised(obj) { > > obj.wait(); > > } > > } > > } > > > > If this is reduced to a form as-if it were a single lock of the monitor > > (due to EA) and the wait() triggers a JVM TI event which leads to the > > escape of "obj" then we need to reconstruct the true lock state, and so > > when the wait() internally unblocks and reacquires the monitor it has to > > set the true recursion count to 3, not the 1 that it appeared to be when > > wait() was initially called. Is that the scenario? > > Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event > triggered by wait. 
> > Add > > LocalObject l1 = new LocalObject(); > > in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in > question. > > Note that relocking/reallocating is transactional. If it is done, then it is done for /all/ objects in scope, and it is > done at most once. It wouldn't be quite so easy to split this into relocking of nested/EA-based > eliminated locks. > > > If so I find this truly awful. Anyone using wait() in a realistic form > > requires a notification and so the object cannot be thread confined. In > > It is not thread confined. > > > which case I would strongly argue that upon hitting the wait() the deopt > > should occur unconditionally and so the lock state is correct before we > > wait and so we don't need to mess with the recursion count internally > > when we reacquire the monitor. > > > > > > > > > which I don't like the sound of at all when it comes to ObjectMonitor > > > > state. So I'd like to understand in detail exactly what is going on here > > > > and why. This is a very intrusive change that seems to badly break > > > > encapsulation and impacts future changes to ObjectMonitor that are under > > > > investigation. > > > > > > I would not regard this as breaking encapsulation. Certainly not badly. > > > > > > I've added a property relock_count_after_wait to JavaThread. The property is well > > > encapsulated. Future ObjectMonitor implementations have to deal with recursion too. They are free > > > in choosing a way to do that as long as that property is taken into account. This is hardly a > > > limitation. > > > > I do think this badly breaks encapsulation as you have to add a callout > > from the guts of the ObjectMonitor code to reach into the thread to get > > this lock count adjustment. I understand why you have had to do this but > > I would much rather see a change to the EA optimisation strategy so that > > this is not needed.
> > > > > Note also that the property is a straight forward extension of the existing concept of deferred > > > local updates. It is embedded into the structure holding them. So not even the footprint of a > > > JavaThread is enlarged if no deferred updates are generated. > > > > [...] > > > > > > > > I'm actually duplicating the existing external suspend mechanism, because a thread can be > > > suspended at most once. And hey, and don't like that either! But it seems not unlikely that the > > > duplicate can be removed together with the original and the new type of handshakes that will be > > > used for thread suspend can be used for object deoptimization too. See today's discussion in > > > JDK-8227745 [2]. > > > > I hope that discussion bears some fruit, at the moment it seems not to > > be possible to use handshakes here. :( > > > > The external suspend mechanism is a royal pain in the proverbial that we > > have to carefully live with. The idea that we're duplicating that for > > use in another fringe area of functionality does not thrill me at all. > > > > To be clear, I understand the problem that exists and that you wish to > > solve, but for the runtime parts I balk at the complexity cost of > > solving it. > > I know it's complex, but by far no rocket science. > > Also I find it hard to imagine another fix for JDK-8233915 besides changing the JVM TI specification. > > Thanks, Richard. > > -----Original Message----- > From: David Holmes > Sent: Dienstag, 17. Dezember 2019 08:03 > To: Reingruber, Richard ; serviceability-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com) > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents > > > > David > > On 17/12/2019 4:57 pm, David Holmes wrote: >> Hi Richard, >> >> On 14/12/2019 5:01 am, Reingruber, Richard wrote: >>> Hi David, >>> >>> ?? 
> Some further queries/concerns: >>> > >>> > src/hotspot/share/runtime/objectMonitor.cpp >>> > >>> > Can you please explain the changes to ObjectMonitor::wait: >>> > >>> > ! _recursions = save // restore the old recursion count >>> > ! + jt->get_and_reset_relock_count_after_wait(); // >>> > increased by the deferred relock count >>> > >>> > what is the "deferred relock count"? I gather it relates to >>> > >>> > "The code was extended to be able to deoptimize objects of a >>> frame that >>> > is not the top frame and to let another thread than the owning >>> thread do >>> > it." >>> >>> Yes, these relate. Currently EA based optimizations are reverted, when >>> a compiled frame is replaced >>> with corresponding interpreter frames. Part of this is relocking >>> objects with eliminated >>> locking. New with the enhancement is that we do this also just before >>> object references are acquired >>> through JVMTI. In this case we deoptimize also the owning compiled >>> frame C and we register >>> deoptimized objects as deferred updates. When control returns to C it >>> gets deoptimized, we notice >>> that objects are already deoptimized (reallocated and relocked), so we >>> don't do it again (relocking >>> twice would be incorrect of course). Deferred updates are copied into >>> the new interpreter frames. >>> >>> Problem: relocking is not possible if the target thread T is waiting >>> on the monitor that needs to be >>> relocked. This happens only with non-local objects with >>> EliminateNestedLocks. Instead relocking is >>> deferred until T owns the monitor again. This is what the piece of >>> code above does. >> >> Sorry I need some more detail here. How can you wait() on an object >> monitor if the object allocation and/or locking was optimised away? And >> what is a "non-local object" in this context? Isn't EA restricted to >> thread-confined objects?
>> >> Is it just that some of the locking gets optimized away e.g. >> >> synchronised(obj) { >> ? synchronised(obj) { >> ??? synchronised(obj) { >> ????? obj.wait(); >> ??? } >> ? } >> } >> >> If this is reduced to a form as-if it were a single lock of the monitor >> (due to EA) and the wait() triggers a JVM TI event which leads to the >> escape of "obj" then we need to reconstruct the true lock state, and so >> when the wait() internally unblocks and reacquires the monitor it has to >> set the true recursion count to 3, not the 1 that it appeared to be when >> wait() was initially called. Is that the scenario? >> >> If so I find this truly awful. Anyone using wait() in a realistic form >> requires a notification and so the object cannot be thread confined. In >> which case I would strongly argue that upon hitting the wait() the deopt >> should occur unconditionally and so the lock state is correct before we >> wait and so we don't need to mess with the recursion count internally >> when we reacquire the monitor. >> >>> >>> ?? > which I don't like the sound of at all when it comes to >>> ObjectMonitor >>> ?? > state. So I'd like to understand in detail exactly what is going >>> on here >>> ?? > and why.? This is a very intrusive change that seems to badly break >>> ?? > encapsulation and impacts future changes to ObjectMonitor that >>> are under >>> ?? > investigation. >>> >>> I would not regard this as breaking encapsulation. Certainly not badly. >>> >>> I've added a property relock_count_after_wait to JavaThread. The >>> property is well >>> encapsulated. Future ObjectMonitor implementations have to deal with >>> recursion too. They are free in >>> choosing a way to do that as long as that property is taken into >>> account. This is hardly a >>> limitation. >> >> I do think this badly breaks encapsulation as you have to add a callout >> from the guts of the ObjectMonitor code to reach into the thread to get >> this lock count adjustment. 
I understand why you have had to do this but >> I would much rather see a change to the EA optimisation strategy so that >> this is not needed. >> >>> Note also that the property is a straight forward extension of the >>> existing concept of deferred >>> local updates. It is embedded into the structure holding them. So not >>> even the footprint of a >>> JavaThread is enlarged if no deferred updates are generated. >>> >>> ?? > --- >>> ?? > >>> ?? > src/hotspot/share/runtime/thread.cpp >>> ?? > >>> ?? > Can you please explain why >>> JavaThread::wait_for_object_deoptimization >>> ?? > has to be handcrafted in this way rather than using proper >>> transitions. >>> ?? > >>> >>> I wrote wait_for_object_deoptimization taking >>> JavaThread::java_suspend_self_with_safepoint_check >>> as template. So in short: for the same reasons :) >>> >>> Threads reach both methods as part of thread state transitions, >>> therefore special handling is >>> required to change thread state on top of ongoing transitions. >>> >>> ?? > We got rid of "deopt suspend" some time ago and it is disturbing >>> to see >>> ?? > it being added back (effectively). This seems like it may be >>> something >>> ?? > that handshakes could be used for. >>> >>> Deopt suspend used to be something rather different with a similar >>> name[1]. It is not being added back. >> >> I stand corrected. Despite comments in the code to the contrary >> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of >> cleanup in this area 13 years ago :) >> >>> >>> I'm actually duplicating the existing external suspend mechanism, >>> because a thread can be suspended >>> at most once. And hey, and don't like that either! But it seems not >>> unlikely that the duplicate can >>> be removed together with the original and the new type of handshakes >>> that will be used for >>> thread suspend can be used for object deoptimization too. See today's >>> discussion in JDK-8227745 [2]. 
>> >> I hope that discussion bears some fruit, at the moment it seems not to >> be possible to use handshakes here. :( >> >> The external suspend mechanism is a royal pain in the proverbial that we >> have to carefully live with. The idea that we're duplicating that for >> use in another fringe area of functionality does not thrill me at all. >> >> To be clear, I understand the problem that exists and that you wish to >> solve, but for the runtime parts I balk at the complexity cost of >> solving it. >> >> Thanks, >> David >> ----- >> >>> Thanks, Richard. >>> >>> [1] Deopt suspend was something like an async. handshake for >>> architectures with register windows, >>> ???? where patching the return pc for deoptimization of a compiled >>> frame was racy if the owner thread >>> ???? was in native code. Instead a "deopt" suspend flag was set on >>> which the thread patched its own >>> ???? frame upon return from native. So no thread was suspended. It got >>> its name only from the name of >>> ???? the flags. >>> >>> [2] Discussion about using handshakes to sync. with the target thread: >>> >>> https://bugs.openjdk.java.net/browse/JDK-8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14306727 >>> >>> >>> -----Original Message----- >>> From: David Holmes >>> Sent: Freitag, 13. Dezember 2019 00:56 >>> To: Reingruber, Richard ; >>> serviceability-dev at openjdk.java.net; >>> hotspot-compiler-dev at openjdk.java.net; >>> hotspot-runtime-dev at openjdk.java.net >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>> Performance in the Presence of JVMTI Agents >>> >>> Hi Richard, >>> >>> Some further queries/concerns: >>> >>> src/hotspot/share/runtime/objectMonitor.cpp >>> >>> Can you please explain the changes to ObjectMonitor::wait: >>> >>> !?? _recursions = save????? // restore the old recursion count >>> !???????????????? 
+ jt->get_and_reset_relock_count_after_wait(); // >>> increased by the deferred relock count >>> >>> what is the "deferred relock count"? I gather it relates to >>> >>> "The code was extended to be able to deoptimize objects of a frame that >>> is not the top frame and to let another thread than the owning thread do >>> it." >>> >>> which I don't like the sound of at all when it comes to ObjectMonitor >>> state. So I'd like to understand in detail exactly what is going on here >>> and why.? This is a very intrusive change that seems to badly break >>> encapsulation and impacts future changes to ObjectMonitor that are under >>> investigation. >>> >>> --- >>> >>> src/hotspot/share/runtime/thread.cpp >>> >>> Can you please explain why JavaThread::wait_for_object_deoptimization >>> has to be handcrafted in this way rather than using proper transitions. >>> >>> We got rid of "deopt suspend" some time ago and it is disturbing to see >>> it being added back (effectively). This seems like it may be something >>> that handshakes could be used for. >>> >>> Thanks, >>> David >>> ----- >>> >>> On 12/12/2019 7:02 am, David Holmes wrote: >>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote: >>>>> Hi David, >>>>> >>>>> ??? > Most of the details here are in areas I can comment on in detail, >>>>> but I >>>>> ??? > did take an initial general look at things. >>>>> >>>>> Thanks for taking the time! >>>> >>>> Apologies the above should read: >>>> >>>> "Most of the details here are in areas I *can't* comment on in detail >>>> ..." >>>> >>>> David >>>> >>>>> ??? > The only thing that jumped out at me is that I think the >>>>> ??? > DeoptimizeObjectsALotThread should be a hidden thread. >>>>> ??? > >>>>> ??? > +? bool is_hidden_from_external_view() const { return true; } >>>>> >>>>> Yes, it should. Will add the method like above. >>>>> >>>>> ??? > Also I don't see any testing of the DeoptimizeObjectsALotThread. >>>>> Without >>>>> ??? > active testing this will just bit-rot. 
>>>>> >>>>> DeoptimizeObjectsALot is meant for stress testing with a larger >>>>> workload. I will add a minimal test >>>>> to keep it fresh. >>>>> >>>>> ??? > Also on the tests I don't understand your @requires clause: >>>>> ??? > >>>>> ??? >?? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>>> ??? > (vm.opt.TieredCompilation != true)) >>>>> ??? > >>>>> ??? > This seems to require that TieredCompilation is disabled, but >>>>> tiered is >>>>> ??? > our normal mode of operation. ?? >>>>> ??? > >>>>> >>>>> I removed the clause. I guess I wanted to target the tests towards the >>>>> code they are supposed to >>>>> test, and it's easier to analyze failures w/o tiered compilation and >>>>> with just one compiler thread. >>>>> >>>>> Additionally I will make use of >>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. >>>>> >>>>> Thanks, >>>>> Richard. >>>>> >>>>> -----Original Message----- >>>>> From: David Holmes >>>>> Sent: Mittwoch, 11. Dezember 2019 08:03 >>>>> To: Reingruber, Richard ; >>>>> serviceability-dev at openjdk.java.net; >>>>> hotspot-compiler-dev at openjdk.java.net; >>>>> hotspot-runtime-dev at openjdk.java.net >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better >>>>> Performance in the Presence of JVMTI Agents >>>>> >>>>> Hi Richard, >>>>> >>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: >>>>>> Hi, >>>>>> >>>>>> I would like to get reviews please for >>>>>> >>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ >>>>>> >>>>>> Corresponding RFE: >>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 >>>>>> >>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 >>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] >>>>>> >>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without >>>>>> issues (thanks!). In addition the >>>>>> change is being tested at SAP since I posted the first RFR some >>>>>> months ago. 
>>>>>> >>>>>> The intention of this enhancement is to benefit performance wise from >>>>>> escape analysis even if JVMTI >>>>>> agents request capabilities that allow them to access local variable >>>>>> values. E.g. if you start-up >>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then >>>>>> escape analysis is disabled right >>>>>> from the beginning, well before a debugger attaches -- if ever one >>>>>> should do so. With the >>>>>> enhancement, escape analysis will remain enabled until and after a >>>>>> debugger attaches. EA based >>>>>> optimizations are reverted just before an agent acquires the >>>>>> reference to an object. In the JBS item >>>>>> you'll find more details. >>>>> >>>>> Most of the details here are in areas I can comment on in detail, but I >>>>> did take an initial general look at things. >>>>> >>>>> The only thing that jumped out at me is that I think the >>>>> DeoptimizeObjectsALotThread should be a hidden thread. >>>>> >>>>> +? bool is_hidden_from_external_view() const { return true; } >>>>> >>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. >>>>> Without >>>>> active testing this will just bit-rot. >>>>> >>>>> Also on the tests I don't understand your @requires clause: >>>>> >>>>> ??? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & >>>>> (vm.opt.TieredCompilation != true)) >>>>> >>>>> This seems to require that TieredCompilation is disabled, but tiered is >>>>> our normal mode of operation. ?? >>>>> >>>>> Thanks, >>>>> David >>>>> >>>>>> Thanks, >>>>>> Richard. 
>>>>>> >>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745 >>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patch >>>>>> >>>>>> >>>>>> From chiroito107 at gmail.com Fri Feb 21 15:44:38 2020 From: chiroito107 at gmail.com (Chihiro Ito) Date: Sat, 22 Feb 2020 00:44:38 +0900 Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows In-Reply-To: <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> Message-ID: Hi Yasumasa, Thank you for your advice. I decided not to use regular expressions, because the number of backslashes is confusing. I stopped using codePointAt() and used CharsetEncoder to work with ISO 8859-1. I added some environment variables to the test. However, environment variables that contain multibyte characters or spaces are not included because jtreg does not support them. Could you review this again, please? Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.02/ Regards, Chihiro On Thu, Feb 20, 2020 at 22:39, Yasumasa Suenaga wrote: > Hi Chihiro, > > On 2020/02/20 20:20, Chihiro Ito wrote: > > Hi Yasumasa, > > > > Thank you for your quick review. > > > > I modified the code without Properties::store. > > > > Could you review this again, please? > > > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.01/ > > - Your change shows "\n" as "\\n". Is it ok? Currently "\n" would be > shown as-is. > - Your change uses Character::codePointAt to convert char to int value. > According to Javadoc, it would be a different value if a char is in > the surrogate range. > - Description of serializePropertiesToByteArray() says the return value > is encoded in ISO 8859-1, > but it does not seem to be so because the logic depends on the spec > of Properties::store. Is it ok? > - Test case is not stable because system properties might be > different from your environment.
> I suggest you set system properties for testing explicitly. E.g. > -Dnormal=normal_val -D"space space=blank blank" -Dnonascii=????? > -Dopenjdk_url=http://openjdk.java.net/ -Dbackslash="\\" > * Also I recommend you check "\n" in the test from > `line.separator`. I think it is a stable property. > > I'm not convinced that we should comply with the comment that mentions > ISO 8859-1. > If it is important, we can use CharsetEncoder from ISO_8859_1 as below: > > http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-encoder/ > > OTOH, if we can keep the current behavior, we can implement it more simply as below: > (It's similar to yours.) > > http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-props-style/ > > > Thanks, > > Yasumasa > > > > Regards, > > Chihiro > > > > > > On Thu, Feb 20, 2020 at 9:34, Yasumasa Suenaga <suenaga at oss.nttdata.com> wrote: > > > > Hi Chihiro, > > > > I think this problem is caused by the spec of > `Properties::store(Writer)`. > > > > `Properties::store(OutputStream)` says that the output format is the > same as `store(Writer)` [1]. > > `Properties::store(Writer)` says that `#`, `!`, `=`, `:` are written > with a preceding backslash [2]. > > > > So I think we should not use `Properties::store` to serialize > properties. > > > > > > Thanks, > > > > Yasumasa > > > > > > [1] > https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.OutputStream,java.lang.String) > > [2] > https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.Writer,java.lang.String) > > > > > > On 2020/02/19 22:36, Chihiro Ito wrote: > > > Hi, > > > > > > Could you review this tiny fix, please? > > > > > > This problem affects not only paths on Windows, but also > Linux and URLs using ":".
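The escaping behind this bug report can be reproduced with a few lines of plain Java. The sketch below is not from the webrev; the class and method names are made up for illustration. It only uses java.util.Properties.store(Writer, String) to show how ':' and '\' in values come out backslash-escaped, which is what makes paths and URLs awkward to copy from the serialized output.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Properties;

public class PropertiesStoreEscapeDemo {

    /**
     * Serializes one key/value pair with Properties.store(Writer, String)
     * and returns the entry line (hypothetical helper for illustration).
     */
    public static String storedLine(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        StringWriter out = new StringWriter();
        try {
            // With null comments, only a "#<date>" comment line precedes the entries.
            props.store(out, null);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        for (String line : out.toString().split("\\R")) {
            if (!line.startsWith("#")) {
                return line;                      // first non-comment line is our entry
            }
        }
        throw new IllegalStateException("no entry written");
    }

    public static void main(String[] args) {
        // The ':' of the URL scheme is written with a preceding backslash.
        System.out.println(storedLine("openjdk_url", "http://openjdk.java.net/"));
        // The drive colon is escaped and every backslash is doubled.
        System.out.println(storedLine("win_path", "C:\\Users\\test"));
    }
}
```

Reading the output back with Properties.load undoes the escaping, so the format round-trips; the problem discussed here only arises when the escaped text is shown to a person or consumed by a tool that does not expect it.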
> > > > > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.00/ > > > JBS : https://bugs.openjdk.java.net/browse/JDK-8222489 > > > > > > Regards, > > > Chihiro > > > From christoph.langer at sap.com Fri Feb 21 16:35:51 2020 From: christoph.langer at sap.com (Langer, Christoph) Date: Fri, 21 Feb 2020 16:35:51 +0000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <2ba69264-b5bc-b9a1-d726-6665e56e5cd8@oss.nttdata.com> <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> Message-ID: Hi all, let me share my thoughts after going through this mail thread and interrogating Ralf quite a bit about the feature. First of all, I very much value the discussion and the points brought up here. When deciding about the introduction of an enhancement or a new feature, it's always wise to thoroughly discuss it and weigh the benefits against the maintenance cost incurred. However, in this case I'm at a point where I would really like to see this going in. Let me elaborate on this. In the mail cited below, I think Ralf enumerates all the benefits quite comprehensively. With the gzip feature built into the heapdumper, we'll get the option to easily have the VM dump its heap in a space-saving format in the same time as (or even a bit quicker than) we currently can get fully exploded hprofs. There's no need for additional configuration steps and arrangements, just a simple additional option in the existing jcmd. And with the slightly updated dump format, tool builders will get options to improve handling of compressed heap dumps.
Speaking as somebody who has to do customer support once in a while, I can't tell you how valuable it is to be able to give the customer simple instructions that just work when it comes to directing them to provide diagnosis data. And that's clearly a point here. Also, given the loads of different deployment scenarios of JVM applications, e.g. cloud, containers, monolith servers... it's really good to have simple options. On the other hand, it's true that the change introduces a bit of additional complexity. But, without looking into the new code in all details, I think the amount is acceptable. Most of the code really only touches a distinct module for dumping the heap (heapdumper.cpp): some additional 600 lines of code (the file already had 2000 before). But the code actually is not messing too deep with hotspot internals, so it should be quite maintainable. The rest of the code is a few lines about enhancing the dcmd and some additional access points into zlib. Furthermore, it brings a bit of testing code, but that is a good thing. So, this should really be acceptable - given that Ralf is around to support this once it's checked in and there's also the rest of the SAP team which will be able to help out here. The ideas collected in this thread that go beyond this change, e.g. the possibility to dump the heap out to the network, the option to get heapdumps out to the jcmd and also the potential enhancements to the -XX:HeapDumpBeforeFullGC, -XX:HeapDumpAfterFullGC and -XX:HeapDumpOnOutOfMemoryError are partly orthogonal and are probably worth pursuing on their own. So I really think we should allow this enhancement in and start focusing on a good code review. Best regards Christoph > -----Original Message----- > From: hotspot-runtime-dev bounces at openjdk.java.net> On Behalf Of Schmelter, Ralf > Sent: Donnerstag, 20.
Februar 2020 14:21 > To: Yasumasa Suenaga ; Ioi Lam > ; serguei.spitsyn at oracle.com; hotspot-runtime- > dev at openjdk.java.net runtime > Cc: serviceability-dev at openjdk.java.net > Subject: [CAUTION] RE: RFR(L) 8237354: Add option to jcmd to write a > gzipped heap dump > > Hi Yasumasa, > > I think it would be great if we could redirect larger chunks data to jcmd. > > But you have to differentiate between binary data (for the heap dump) and > text data (for the e.g. codelist). > > Currently jcmd assumes all bytes to be UTF-8 encoded, converts them to > Unicode and then uses the platform encoding to write characters. This is not > suitable for binary data. > > And of course you cannot use the bufferedStream to get the output to jcmd. > You would have to implement an outputStream which can directly write to > the AttachListener connection. > > > But even with this change, I would still like the gzip compression to be done > in the VM. Let me try to list all the advantages I see for doing this: > > 1. It is by far the easiest to use. You just have to specify -gz for the jcmd. > While your command line (jcmd .... | gzip -c > file) is easy enough, it assumes > you have gzip (not by default on Windows) and it would be painfully slow (~ > 10 x and more), since it is not parallel. You could use pigz, but it is not as > ubiquitous as gzip. I know it is sometimes hard to image this could be a > problem for anyone, but it is. > > It is easy to tell a customer to execute jcmd GC.heap_dump -gz > test.hprof.gz. Adding additional requirements, especially if it is external > programs, and your chance of success diminish fast. > > > 2. The -XX:HeapDumpOnOutOfMemoryError, -XX: HeapDumpBeforeFullGC > and -XX: HeapDumpAfterFullGC options can easily create gzipped heap > dumps directly when the compression is in the VM. And especially if you > create more than one dump (with the before/after gc flags), compression is > very useful. 
Or if you want to support compressed heap dumps it in the > HotSpotDiagnosticMXBean. Just add a flag and/or compression level. > > > 3. The created gz-file is not a simple gz-file you would get when simply using > gzip. > > It is created in a way that makes it possible to treat it like a random access file > without decompressing it. > > Currently for example the Eclipse Memory Analyzer (MAT) has the option to > directly open a gzipped hprof file and use it without decompression. And for > the initial parsing, they can just read the file sequentially, so this is not too > slow. > > But when accessing the values of objects or arrays, they have to seek to > specific positions in the gzipped hprof file. This is currently implemented by > having a Java implementation of a InflaterInputStream which is capable to > completely copy its state. This copy is then used to start decompressing at > the specific offset for which is was created. As you can imagine, the state of > the inflater is not small (MAT assumes about 64Kb, 32kB is needed at least for > the dictionary), so it limits the number of starting positions you can use for > large files. But it works for all kinds of gzip compressed streams. > > The gzip implementation used to write the heap dump in the VM creates > many small gzip compressed chunks. At the start of each chunk you can > create a fresh GZIPInputStream without having to store any internal state. > You only need to remember the physical offset and the logical offset (so 2 > long values) for each chunk. If you then want to read data at a specific logical > offset, you binary search the nearest preceding chunk and create a > GZIPInputStream reading from the physical offset of that chunk. So on > average you have to decompress about half a chunk to get to the data you > need. > > If you look in the in webrev, you can see > http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.0/test/lib > /jdk/test/lib/hprof/parser/GzipRandomAccess.java.html. 
This implements > the needed logic to treat the gzipped hprof file as a random access file. I have > used it to add support for gzipped files in the jhat library (which is only used > in tests). In jhat, for example, the resolution of references is done via > random access. And the file also contains all the functionality MAT would > need. > > You can generate a more or less equivalent file if you use pigz with the > --independent option. But to make it easier to detect that the gzip file is > chunked (without decompressing it first), I've added a comment marking it as > a hprof file with a given chunk size. This would be missing from the pigz file, > but pigz instead adds 9 bytes when --independent is specified (00 00 ff ff > 00 00 00 ff ff), so you could detect it too. > > To summarize, the gzipped hprof file created by the VM makes it much > easier for tools to access it efficiently at random positions. You can do > something equivalent with pigz, but not with gzip. > > And getting support for this type of gzipped hprof file by the heap dump > tools will be much easier, if this is the format the openjdk produces, so it will > be widespread. > > Best regards, > Ralf > > -----Original Message----- > From: Yasumasa Suenaga > Sent: Thursday, 20 February 2020 00:59 > To: Ioi Lam ; Schmelter, Ralf > ; serguei.spitsyn at oracle.com; hotspot-runtime- > dev at openjdk.java.net runtime > Cc: serviceability-dev at openjdk.java.net > Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap > dump > > Hi, > > Generally I agree with Ioi, but I think it is not a problem only for gzipped heap > dump. > > For example, Compiler.codelist and Compiler.CodeHeap_Analytics might be > large text. > In addition, some users want to redirect the result from jcmd to other > command or log collector. > > So I think it would be better if jcmd provides a stdout redirect option to all > subcommands. E.g.
> > $ jcmd GC.heap_dump -stdout | gzip -c - > heapdump.hprof.gz > > > Thanks, > > Yasumasa From alexey.menkov at oracle.com Fri Feb 21 18:51:15 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Fri, 21 Feb 2020 10:51:15 -0800 Subject: RFR [XS]: 8239462: jdk.hotspot.agent misses some ReleaseStringUTFChars calls in case of early returns In-Reply-To: References: <6ed0d004-3e8c-4b27-f583-06dbacf45173@oracle.com> Message-ID: <425e259a-e905-c80e-8db9-fa39458aaf6b@oracle.com> On 02/21/2020 00:09, Baesken, Matthias wrote: >> Also I see GetStringUTFChars(str, JNI_FALSE). This looks bad as well - >> 2nd arg is a pointer, so it should be NULL or nullptr. > > Hi, looks like there is another one here, do you think these JNI_FALSE params would really cause trouble? Not even the compiler warns here ... > > src/java.desktop/unix/native/libawt_xawt/xawt/XlibWrapper.c:824: cname = (char *) (*env)->GetStringUTFChars(env, jstr, JNI_FALSE); This doesn't cause trouble only because both NULL and JNI_FALSE are #defines (and both are 0), but specifying JNI_FALSE as a pointer value is confusing. --alex > > Best regards, Matthias > > > >> Hi Matthias, >> >> Looks good in general, but I think it makes sense to fix #2 cases (at >> least I see them in LinuxDebuggerLocal). If GetStringUTFChars fails, the >> code will crash. >> Also I see GetStringUTFChars(str, JNI_FALSE). This looks bad as well - >> 2nd arg is a pointer, so it should be NULL or nullptr. >> >> As for #1 and #3 - AFAIU they are both right ways. >> If GetStringUTFChars fails, it throws OOM and returns NULL. >> >> And one more thing to consider. >> LinuxDebuggerLocal_attach0 function looks terrible - 7 >> ReleaseStringUTFChars calls for 2 GetStringUTFChars. >> Maybe it makes sense to introduce a simple wrapper like AutoJavaString in >> src/jdk.hotspot.agent/windows/native/libsaproc/sawindbg.cpp >> It would make the code simpler and less error prone.
>>
>> --alex
>>
>> On 02/20/2020 00:53, Baesken, Matthias wrote:
>>> Hi Alex / Christoph, thanks for the reviews.
>>>
>>> New webrev:
>>>
>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.1/
>>>
>>> - includes LinuxDebuggerLocal.cpp
>>> - adds a blank Christoph wanted to have
>>>
>>>
>>> A question (hopefully not a stupid one):
>>> At most places in the code, GetStringUTFChars success is
>>> 1. handled by checking for NULL, like this:
>>>
>>>     const char *s = (*env)->GetStringUTFChars(env, p, NULL);
>>>     if (s == NULL) {
>>>         // handle failure
>>>     }
>>>
>>> 2. At some places, success / failure is not handled at all.
>>>
>>> 3. Here (e.g. LinuxDebuggerLocal.cpp) the success / failure check is done by
>>>
>>>     if (env->ExceptionOccurred()) { ... }
>>>
>>> Which one is the best / right way to do it (most likely not 2.)?
>>>
>>>
>>> Best regards, Matthias
>>>
>>>
>>>
>>>>
>>>> Looks like
>>>> src/jdk.hotspot.agent/linux/native/libsaproc/LinuxDebuggerLocal.cpp
>>>> has similar issues. It would be nice to fix them as well.
>>>>
>>>> --alex
>>>>
>>>> On 02/19/2020 06:21, Baesken, Matthias wrote:
>>>>> Hello, please review this small change.
>>>>> We miss ReleaseStringUTFChars calls at a few places in the native
>>>>> jdk.hotspot.agent code.
>>>>> This happens in case of early returns.
>>>>>
>>>>>
>>>>> Bug/webrev:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8239462
>>>>>
>>>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.0/
>>>>>
>>>>>
>>>>> Thanks, Matthias
>>>>>

From kim.barrett at oracle.com Fri Feb 21 20:47:53 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 21 Feb 2020 15:47:53 -0500
Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops
In-Reply-To:
References:
Message-ID: <1149B0D9-CA32-4491-A2D9-8459EB90D8AB@oracle.com>

> On Feb 7, 2020, at 10:53 AM, Zhengyu Gu wrote:
>
> Hi,
>
> I would like to propose this change that allows GC to provide ObjectMarker during JVMTI heap walk.
>
> Currently, JVMTI heap walk uses oop markword's 'marked' pattern to indicate 'visited' oop.
>
> Unfortunately, it conflicts with Shenandoah, which uses the pattern to indicate 'forwarding'. When JVMTI heap walk occurs in some of Shenandoah's concurrent phases (e.g. concurrent evacuation or concurrent reference updating phases), it can result in a corrupted heap, as it tries to resolve a real oop header as a forwarding pointer.
>
> This patch allows GC to provide ObjectMarker for JVMTI to track 'visited' oop, and uses the current implementation as default, so that it has no impact on GCs other than Shenandoah, which provides its own implementation.

(Not a review.)

I think the jfr leak profiler has the same problem. It too uses the markWord's 'marked' pattern
to indicate an oop it has visited, since JDK-8234173.

From zgu at redhat.com Fri Feb 21 21:31:39 2020
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 21 Feb 2020 16:31:39 -0500
Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops
In-Reply-To: <1149B0D9-CA32-4491-A2D9-8459EB90D8AB@oracle.com>
References: <1149B0D9-CA32-4491-A2D9-8459EB90D8AB@oracle.com>
Message-ID:

Hi Kim,

On 2/21/20 3:47 PM, Kim Barrett wrote:
>> On Feb 7, 2020, at 10:53 AM, Zhengyu Gu wrote:
>>
>> Hi,
>>
>> I would like to propose this change that allows GC to provide ObjectMarker during JVMTI heap walk.
>>
>> Currently, JVMTI heap walk uses oop markword's 'marked' pattern to indicate 'visited' oop.
>>
>> Unfortunately, it conflicts with Shenandoah, which uses the pattern to indicate 'forwarding'. When JVMTI heap walk occurs in some of Shenandoah's concurrent phases (e.g. concurrent evacuation or concurrent reference updating phases), it can result in a corrupted heap, as it tries to resolve a real oop header as a forwarding pointer.
>>
>> This patch allows GC to provide ObjectMarker for JVMTI to track 'visited' oop, and uses the current implementation as default, so that it has no impact on GCs other than Shenandoah, which provides its own implementation.
>
> (Not a review.)
>
> I think the jfr leak profiler has the same problem. It too uses the markWord's 'marked' pattern
> to indicate an oop it has visited, since JDK-8234173.

Thanks for pointing it out, I will deal with it next, sigh!

-Zhengyu

>

From alexey.menkov at oracle.com Fri Feb 21 21:47:41 2020
From: alexey.menkov at oracle.com (Alex Menkov)
Date: Fri, 21 Feb 2020 13:47:41 -0800
Subject: RFR [XS]: 8239462: jdk.hotspot.agent misses some ReleaseStringUTFChars calls in case of early returns
In-Reply-To:
References: <6ed0d004-3e8c-4b27-f583-06dbacf45173@oracle.com>
Message-ID:

IMO the solution with goto makes it even worse.
If you don't want to introduce the wrapper, could you please restore the changes in LinuxDebuggerLocal_attach0 from webrev.1

--alex

On 02/21/2020 00:32, Baesken, Matthias wrote:
> Hi Alex,
>
> new webrev:
>
> http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.2/
>
> Best Regards, Matthias
>
>
>>
>> Hi Matthias,
>>
>> Looks good in general, but I think it makes sense to fix the #2 cases (at
>> least I see them in LinuxDebuggerLocal). If GetStringUTFChars fails, the
>> code will crash.
>> Also I see GetStringUTFChars(str, JNI_FALSE). This looks bad as well -
>> 2nd arg is a pointer, so it should be NULL or nullptr.
>>
>> As for #1 and #3 - AFAIU they are both right ways.
>> If GetStringUTFChars fails, it throws OOM and returns NULL.
>>
>> And one more thing to consider.
>> The LinuxDebuggerLocal_attach0 function looks terrible - 7
>> ReleaseStringUTFChars calls for 2 GetStringUTFChars.
>> Maybe it makes sense to introduce a simple wrapper like AutoJavaString in
>> src/jdk.hotspot.agent/windows/native/libsaproc/sawindbg.cpp
>> It would make the code simpler and less error prone.
>>
>> --alex
>>
>

From ioi.lam at oracle.com Sat Feb 22 01:19:14 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Fri, 21 Feb 2020 17:19:14 -0800
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
In-Reply-To:
References: <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com>
Message-ID: <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com>

Ralf and Christoph,

I agree that making it easy for the user is important, so a dependency on an external program like pigz will be a hassle.

How about implementing the compression in a Java program? Will something like this be too much of a hassle?

    jcmd $PID GC.dump -stdout | java -jar HeapDumpZipper.jar > heap.gz

This way, we can implement the exact compression algorithm as Ralf described, without making it part of the VM. Writing it in Java would probably be easier to maintain.

If it makes sense, we can include the Java code as part of the JDK, so there's no need to ship a separate JAR file to the user.

    jcmd $PID GC.dump -stdout | java jdk.internal.heapdump.Zipper > heap.gz

Thanks
- Ioi

On 2/21/20 8:35 AM, Langer, Christoph wrote:
> Hi all,
>
> let me share my thoughts after going through this mail thread and interrogating Ralf quite a bit about the feature.
>
> First of all, I very much value the discussion and the points brought up here. When deciding about the introduction of an enhancement or a new feature, it's always wise to thoroughly discuss it and weigh the benefits against the maintenance cost incurred. However, in this case I'm at a point where I would really like to see this going in. Let me elaborate on this.
>
> In the mail cited below, I think Ralf enumerates all the benefits quite comprehensively.
> With the gzip feature built into the heapdumper, we'll get the option to easily have the VM dump its heap in a space saving format in the same time as (or even a bit quicker than) we can currently get fully exploded hprofs. There's no need for additional configuration steps and arrangements, just a simple additional option in the existing jcmd. And with the slightly updated dump format, tool builders will get options to improve handling of compressed heap dumps.
>
> Speaking as somebody who has to do customer support once in a while, I can't tell you how valuable it is to be able to give the customer simple instructions that just work when it comes to directing them to provide diagnosis data. And that's clearly a point here. Also, given the loads of different deployment scenarios of JVM applications, e.g. cloud, containers, monolith servers... it's really good to have simple options.
>
> On the other hand, it's true that the change introduces a bit of additional complexity. But, without looking into the new code in all details, I think the amount is acceptable. Most of the code really only touches a distinct module for dumping the heap (heapdumper.cpp). Some additional 600 lines of code (the file already had 2000 before). But the code actually is not messing too deep with hotspot internals, so it should be quite maintainable. The rest of the code is a few lines about enhancing the dcmd and some additional access points into zlib. Furthermore, it brings a bit of testing code, but that is a good thing. So, this should really be acceptable - given that Ralf is around to support this once it's checked in and there's also the rest of the SAP team which will be able to help out here.
>
> The ideas collected in this thread that go beyond this change, e.g.
> the possibility to dump the heap out to the network, the option to get heapdumps out to the jcmd and also the potential enhancements to the -XX:HeapDumpBeforeFullGC, -XX:HeapDumpAfterFullGC and -XX:HeapDumpOnOutOfMemoryError are partly orthogonal and are probably worth pursuing on their own.
>
> So I really think we should allow this enhancement in and start focusing on a good code review.
>
> Best regards
> Christoph
>
>> -----Original Message-----
>> From: hotspot-runtime-dev bounces at openjdk.java.net On Behalf Of Schmelter, Ralf
>> Sent: Thursday, February 20, 2020 14:21
>> To: Yasumasa Suenaga; Ioi Lam; serguei.spitsyn at oracle.com;
>> hotspot-runtime-dev at openjdk.java.net runtime
>> Cc: serviceability-dev at openjdk.java.net
>> Subject: [CAUTION] RE: RFR(L) 8237354: Add option to jcmd to write a
>> gzipped heap dump
>>
>> Hi Yasumasa,
>>
>> I think it would be great if we could redirect larger chunks of data to jcmd.
>>
>> But you have to differentiate between binary data (for the heap dump) and
>> text data (e.g. for the codelist).
>>
>> Currently jcmd assumes all bytes to be UTF-8 encoded, converts them to
>> Unicode and then uses the platform encoding to write characters. This is not
>> suitable for binary data.
>>
>> And of course you cannot use the bufferedStream to get the output to jcmd.
>> You would have to implement an outputStream which can directly write to
>> the AttachListener connection.
>>
>>
>> But even with this change, I would still like the gzip compression to be done
>> in the VM. Let me try to list all the advantages I see for doing this:
>>
>> 1. It is by far the easiest to use. You just have to specify -gz for the jcmd.
>> While your command line (jcmd .... | gzip -c > file) is easy enough, it assumes
>> you have gzip (not by default on Windows) and it would be painfully slow
>> (~10x and more), since it is not parallel. You could use pigz, but it is not as
>> ubiquitous as gzip.
>> I know it is sometimes hard to imagine this could be a
>> problem for anyone, but it is.
>>
>> It is easy to tell a customer to execute jcmd GC.heap_dump -gz
>> test.hprof.gz. Add additional requirements, especially external
>> programs, and your chances of success diminish fast.
>>
>>
>> 2. The -XX:HeapDumpOnOutOfMemoryError, -XX:HeapDumpBeforeFullGC
>> and -XX:HeapDumpAfterFullGC options can easily create gzipped heap
>> dumps directly when the compression is in the VM. And especially if you
>> create more than one dump (with the before/after gc flags), compression is
>> very useful. Or if you want to support compressed heap dumps in the
>> HotSpotDiagnosticMXBean. Just add a flag and/or compression level.
>>
>>
>> 3. The created gz-file is not a simple gz-file you would get when simply using
>> gzip.
>>
>> It is created in a way that makes it possible to treat it like a random access file
>> without decompressing it.
>>
>> Currently for example the Eclipse Memory Analyzer (MAT) has the option to
>> directly open a gzipped hprof file and use it without decompression. And for
>> the initial parsing, they can just read the file sequentially, so this is not too
>> slow.
>>
>> But when accessing the values of objects or arrays, they have to seek to
>> specific positions in the gzipped hprof file. This is currently implemented by
>> having a Java implementation of an InflaterInputStream which is capable of
>> completely copying its state. This copy is then used to start decompressing at
>> the specific offset for which it was created. As you can imagine, the state of
>> the inflater is not small (MAT assumes about 64 kB; at least 32 kB is needed for
>> the dictionary), so it limits the number of starting positions you can use for
>> large files. But it works for all kinds of gzip compressed streams.
>>
>> The gzip implementation used to write the heap dump in the VM creates
>> many small gzip compressed chunks.
At the start of each chunk you can >> create a fresh GZIPInputStream without having to store any internal state. >> You only need to remember the physical offset and the logical offset (so 2 >> long values) for each chunk. If you then want to read data at a specific logical >> offset, you binary search the nearest preceding chunk and create a >> GZIPInputStream reading from the physical offset of that chunk. So on >> average you have to decompress about half a chunk to get to the data you >> need. >> >> If you look in the in webrev, you can see >> http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.0/test/lib >> /jdk/test/lib/hprof/parser/GzipRandomAccess.java.html. This implements >> the needed logic to treat the gzipped hprof file as a random access file. I have >> used it to add support for gzipped files in the jhat library (which is only used >> in tests). In jhat hat for example, the resolution of references is done via >> random access. And the file also contains all the functionality MAT would >> need. >> >> You can generate a more or less equivalent file if you use pigz with the -- >> independent option. But to make it easier to detect that the gzip file is >> chunked (without decompressing it first), I've added a comment marking it as >> a hprof file with a given chunk size. This would be missing from the pigz file, >> but they instead adding 9 bytes when --independent is specified (00 00 ff ff >> 00 00 00 ff ff), so you could detect it too. >> >> To summarize, the gzipped hprof file created by the VM makes it much >> easier for tools to access them efficiently at random positions. You can do >> something equivalent with pigz, but not with gzip. >> >> And getting support for this type of gzipped hprof file by the heap dump >> tools will be much easier, if this is the format the openjdk produces, so it will >> be widespread. >> >> Best regards, >> Ralf >> >> -----Original Message----- >> From: Yasumasa Suenaga >> Sent: Donnerstag, 20. 
Februar 2020 00:59 >> To: Ioi Lam ; Schmelter, Ralf >> ; serguei.spitsyn at oracle.com; hotspot-runtime- >> dev at openjdk.java.net runtime >> Cc: serviceability-dev at openjdk.java.net >> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap >> dump >> >> Hi, >> >> Generally I agree with Ioi, but I think it is not a problem only for gzipped heap >> dump. >> >> For example, Compiler.codelist and Compiler.CodeHeap_Analytics might be >> large text. >> In addition, some users want to redirect the result from jcmd to other >> command or log collector. >> >> So I think it would be better if jcmd provides stdout redurect option to all >> subocmmands. E.g. >> >> $ jcmd GC.heap_dump -stdout | gzip -c - > heapdump.hprof.gz >> >> >> Thanks, >> >> Yasumasa From suenaga at oss.nttdata.com Sat Feb 22 03:32:09 2020 From: suenaga at oss.nttdata.com (Yasumasa Suenaga) Date: Sat, 22 Feb 2020 12:32:09 +0900 Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows In-Reply-To: References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> Message-ID: <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> Hi Chihiro, Thank you for updating the webrev. - You use BufferedWriter to create the output, however I think it would be more simply if you use PrintWriter. - Your change would work incorrectly when system property contains mixture of ascii and non-ascii. You can see it with "-Dmixture=a?i". It would be converted to "a\u0061\u3042", it should be "a\u3042i". - Currently key value which contains space char, it would be escaped, but your change does not do so. You can see it with "-D"space space=blank blank"". - You should not use String::trim to create String from ByteBuffer because property value might be contain blank in its tail. You might use ByteBuffer::slice or part of ByteBuffer::array for it. - Did you try to use escaped chars in jtreg testcase? 
    I guess you can set multibyte chars (e.g. CJK chars) with "\u".
    In case of a mixture of Japanese (Hiragana) and ASCII chars, you can embed "-Dmixture=a\u3042i" in the testcase. (I'm not sure about that...)

  - In the test case, I recommend you evaluate the entire line.
    For example, if you want to check line.separator, you should evaluate as below:
      output.shouldContain("line.separator=\\n");


Thanks,

Yasumasa


On 2020/02/22 0:44, Chihiro Ito wrote:
> Hi Yasumasa,
>
> Thank you for your advice.
>
> I decided not to use regular expressions because the number of \ is confusing.
> I stopped using codePointAt() and used CharsetEncoder to work with ISO 8859-1.
> I added some environment variables to the test. However, environment variables that contain multibyte chars or spaces are not included because jtreg does not support them.
>
> Could you review this again, please?
>
> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.02/
>
> Regards,
> Chihiro
>
> 2020-02-20 (Thu) 22:39 Yasumasa Suenaga:
>
>     Hi Chihiro,
>
>     On 2020/02/20 20:20, Chihiro Ito wrote:
>      > Hi Yasumasa,
>      >
>      > Thank you for your quick review.
>      >
>      > I modified the code without Properties::store.
>      >
>      > Could you review this again, please?
>      >
>      > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.01/
>
>        - Your change shows "\n" as "\\n". Is it ok? Currently "\n" would be shown as is.
>        - Your change uses Character::codePointAt to convert char to int value.
>          According to Javadoc, it would be a different value if a char is in the surrogate range.
>        - Description of serializePropertiesToByteArray() says the return value is encoded in ISO 8859-1,
>          but it does not seem to be so because the logic depends on the spec of Properties::store. Is it ok?
>        - The test case is not stable because system properties might be different from your environment.
>          I suggest you set system properties for testing explicitly. E.g.
>            -Dnormal=normal_val -D"space space=blank blank" -Dnonascii=????? -Dopenjdk_url=http://openjdk.java.net/ -Dbackslash="\\"
>          * Also I recommend you check "\n" in the test from `line.separator`. I think it is a stable property.
>
>     I'm not convinced whether we should comply with the comment which says ISO 8859-1.
>     If it is important, we can use CharsetEncoder from ISO_8859_1 as below:
>
>     http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-encoder/
>
>     OTOH if we can keep the current behavior, we can implement it more simply as below:
>     (It's similar to yours.)
>
>     http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-props-style/
>
>
>     Thanks,
>
>     Yasumasa
>
>
>      > Regards,
>      > Chihiro
>      >
>      >
>      > 2020-02-20 (Thu) 9:34 Yasumasa Suenaga:
>      >
>      >     Hi Chihiro,
>      >
>      >     I think this problem is caused by the spec of `Properties::store(Writer)`.
>      >
>      >     `Properties::store(OutputStream)` says that the output format is the same as `store(Writer)` [1].
>      >     `Properties::store(Writer)` says that `#`, `!`, `=`, `:` are written with a preceding backslash [2].
>      >
>      >     So I think we should not use `Properties::store` to serialize properties.
>      >
>      >
>      >     Thanks,
>      >
>      >     Yasumasa
>      >
>      >
>      >     [1] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.OutputStream,java.lang.String)
>      >     [2] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.Writer,java.lang.String)
>      >
>      >
>      >     On 2020/02/19 22:36, Chihiro Ito wrote:
>      >      > Hi,
>      >      >
>      >      > Could you review this tiny fix, please?
>      >      >
>      >      > This problem affected not only paths on Windows, but also Linux and URLs using ":".
>      >      >
>      >      > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.00/
>      >      > JBS : https://bugs.openjdk.java.net/browse/JDK-8222489
>      >      >
>      >      > Regards,
> Chihiro > > > From chiroito107 at gmail.com Sat Feb 22 10:23:26 2020 From: chiroito107 at gmail.com (Chihiro Ito) Date: Sat, 22 Feb 2020 19:23:26 +0900 Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows In-Reply-To: <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> Message-ID: Hi Yasumasa, The line separator is not modified because it depends on the environment, but the others have been modified. Could you review this again? Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.03/ Regards, Chihiro 2020?2?22?(?) 12:32 Yasumasa Suenaga : > Hi Chihiro, > > Thank you for updating the webrev. > > > - You use BufferedWriter to create the output, however I think it would > be more simply if you use PrintWriter. > > - Your change would work incorrectly when system property contains > mixture of ascii and non-ascii. > You can see it with "-Dmixture=a?i". It would be converted to > "a\u0061\u3042", it should be "a\u3042i". > > - Currently key value which contains space char, it would be escaped, > but your change does not do so. > You can see it with "-D"space space=blank blank"". > > - You should not use String::trim to create String from ByteBuffer > because property value might be contain blank in its tail. > You might use ByteBuffer::slice or part of ByteBuffer::array for it. > > - Did you try to use escaped chars in jtreg testcase? I guess you can > set multibytes chars (e.g. CJK chars) with "\u". > In case of mixture of Japanese (Hiragana) and ASCII chars, you can > embed "-Dmixture=a\u3042i" to testcase. (I'm not sure that...) > > - In test case, I recommend you to evaluate entire of line. 
>     For example, if you want to check line.separator, you should evaluate
>     as below:
>       output.shouldContain("line.separator=\\n");
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/02/22 0:44, Chihiro Ito wrote:
> > Hi Yasumasa,
> >
> > Thank you for your advice.
> >
> > I decided not to use regular expressions because the number of \ is
> > confusing.
> > I stopped using codePointAt() and used CharsetEncoder to work with
> > ISO 8859-1.
> > I added some environment variables to the test. However, environment
> > variables that contain multibyte chars or spaces are not included because
> > jtreg does not support them.
> >
> > Could you review this again, please?
> >
> > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.02/
> >
> > Regards,
> > Chihiro
> >
> > 2020-02-20 (Thu) 22:39 Yasumasa Suenaga:
> >
> >     Hi Chihiro,
> >
> >     On 2020/02/20 20:20, Chihiro Ito wrote:
> >      > Hi Yasumasa,
> >      >
> >      > Thank you for your quick review.
> >      >
> >      > I modified the code without Properties::store.
> >      >
> >      > Could you review this again, please?
> >      >
> >      > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.01/
> >
> >        - Your change shows "\n" as "\\n". Is it ok? Currently "\n"
> >          would be shown as is.
> >        - Your change uses Character::codePointAt to convert char to int
> >          value.
> >          According to Javadoc, it would be a different value if a char is
> >          in the surrogate range.
> >        - Description of serializePropertiesToByteArray() says the
> >          return value is encoded in ISO 8859-1,
> >          but it does not seem to be so because the logic depends on
> >          the spec of Properties::store. Is it ok?
> >        - The test case is not stable because system properties might be
> >          different from your environment.
> >          I suggest you set system properties for testing explicitly.
> >          E.g.
> >            -Dnormal=normal_val -D"space space=blank blank"
> >            -Dnonascii=?????
> >            -Dopenjdk_url=http://openjdk.java.net/ -Dbackslash="\\"
> >          * Also I recommend you check "\n" in the test from
> >            `line.separator`. I think it is a stable property.
> >
> >     I'm not convinced whether we should comply with the comment which
> >     says ISO 8859-1.
> >     If it is important, we can use CharsetEncoder from ISO_8859_1 as
> >     below:
> >
> >     http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-encoder/
> >
> >     OTOH if we can keep the current behavior, we can implement it more
> >     simply as below:
> >     (It's similar to yours.)
> >
> >     http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-props-style/
> >
> >
> >     Thanks,
> >
> >     Yasumasa
> >
> >
> >      > Regards,
> >      > Chihiro
> >      >
> >      >
> >      > 2020-02-20 (Thu) 9:34 Yasumasa Suenaga:
> >      >
> >      >     Hi Chihiro,
> >      >
> >      >     I think this problem is caused by the spec of
> >      >     `Properties::store(Writer)`.
> >      >
> >      >     `Properties::store(OutputStream)` says that the output format
> >      >     is the same as `store(Writer)` [1].
> >      >     `Properties::store(Writer)` says that `#`, `!`, `=`, `:` are
> >      >     written with a preceding backslash [2].
> >      >
> >      >     So I think we should not use `Properties::store` to serialize
> >      >     properties.
> >      >
> >      >
> >      >     Thanks,
> >      >
> >      >     Yasumasa
> >      >
> >      >
> >      >     [1] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.OutputStream,java.lang.String)
> >      >     [2] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.Writer,java.lang.String)
> >      >
> >      >
> >      >     On 2020/02/19 22:36, Chihiro Ito wrote:
> >      >      > Hi,
> >      >      >
> >      >      > Could you review this tiny fix, please?
> >      >      >
> >      >      > This problem affected not only paths on Windows, but
> >      >      > also Linux and URLs using ":".
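
The effect described in the report above can be reproduced in isolation. The following is a minimal sketch, assuming only that the serialization goes through `Properties::store` as discussed in this thread; the class name `StoreEscapeDemo`, the helper `serialize` and the property name `win_path` are made up for illustration and are not part of any webrev. It shows that `store` writes `:` with a preceding backslash, which is what makes Windows paths and URLs unusable when the output is printed as is.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo class, not taken from the patch under review.
class StoreEscapeDemo {
    // Serializes a single property via Properties::store and returns the
    // text that store() produced (a '#' date comment line plus key=value).
    static String serialize(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null); // null: no user comment line
        } catch (IOException e) {
            // A StringWriter does not fail in practice; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The ':' in the URL is written as "\:" by store().
        System.out.print(serialize("openjdk_url", "http://openjdk.java.net/"));
        // Both the ':' and every backslash of the path get an extra backslash.
        System.out.print(serialize("win_path", "C:\\Users\\test"));
    }
}
```

Reading such output back with `Properties::load` restores the original values, so the escaping only hurts when the text is shown to a user directly, as jcmd does.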
> >      >      >
> >      >      > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.00/
> >      >      > JBS : https://bugs.openjdk.java.net/browse/JDK-8222489
> >      >      >
> >      >      > Regards,
> >      >      > Chihiro
> >      >
> >

From suenaga at oss.nttdata.com Sat Feb 22 12:53:40 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sat, 22 Feb 2020 21:53:40 +0900
Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
In-Reply-To:
References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com>
Message-ID: <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>

Hi Chihiro,

  - My proposal is not enough, so you should refine as below.
    - Exception types in saveConvert() should be limited. Please do not use `throws Exception`.
    - I guess you use a try-catch statement in serializePropertiesToByteArray due to the above checked exception.
      It should throw a runtime exception when an exception occurs.
    - Capacity of byteBuf (charBuf.length() * 5) should be (charBuf.length() * 6)
      because non-8859-1 chars would be "\uxxxx" (6 chars).
      Also please leave a comment for it because a maintainer might not understand the meaning of multiplying by 6 in future.

  - `output.shouldNotContain("C:\\:\\\\");` in testcase is correct?
    I guess you want to check "C\\:\\\\" is not contained.

  - To check '\n', you can use Platform::isWindows as below:
      output.shouldContain(Platform.isWindows() ? "line.separator=\\r\\n" : "line.separator=\\n");


Yasumasa


On 2020/02/22 19:23, Chihiro Ito wrote:
> Hi Yasumasa,
>
> The line separator is not modified because it depends on the environment, but the others have been modified.
>
> Could you review this again?
>
> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.03/
>
> Regards,
> Chihiro
>
> 2020-02-22 (Sat)
12:32 Yasumasa Suenaga >: > > Hi Chihiro, > > Thank you for updating the webrev. > > > ? ?- You use BufferedWriter to create the output, however I think it would be more simply if you use PrintWriter. > > ? ?- Your change would work incorrectly when system property contains mixture of ascii and non-ascii. > ? ? ?You can see it with "-Dmixture=a?i". It would be converted to "a\u0061\u3042", it should be "a\u3042i". > > ? ?- Currently key value which contains space char, it would be escaped, but your change does not do so. > ? ? ?You can see it with "-D"space space=blank blank"". > > ? ?- You should not use String::trim to create String from ByteBuffer because property value might be contain blank in its tail. > ? ? ?You might use ByteBuffer::slice or part of ByteBuffer::array for it. > > ? ?- Did you try to use escaped chars in jtreg testcase? I guess you can set multibytes chars (e.g. CJK chars) with "\u". > ? ? ?In case of mixture of Japanese (Hiragana) and ASCII chars, you can embed "-Dmixture=a\u3042i" to testcase. (I'm not sure that...) > > ? ?- In test case, I recommend you to evaluate entire of line. > ? ? ?For example, if you want to check line.separator, you should evaluate as below: > ? ? ? ?output.shouldContain("line.separator=\\n"); > > > Thanks, > > Yasumasa > > > On 2020/02/22 0:44, Chihiro Ito wrote: > > Hi Yasumasa, > > > > Thank you for your advice. > > > > I decided not to use regular expressions. because of the number of \is confusing. > > I stopped using codePointAt() and used CharsetEncoder to work with ISO 8859 -1. > > I added some environment variables to the test. However, environment variables that contain multi bytes or spaces are not included because jtreg does not support them. > > > > Could you review this again, please? > > > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.02/ > > > > Regards, > > Chihiro > > > > 2020?2?20?(?) 22:39 Yasumasa Suenaga >>: > > > >? ? ?Hi Chihiro, > > > >? ? 
>      >     On 2020/02/20 20:20, Chihiro Ito wrote:
>      >      > Hi Yasumasa,
>      >      >
>      >      > Thank you for your quick review.
>      >      >
>      >      > I modified the code without Properties::store.
>      >      >
>      >      > Could you review this again, please?
>      >      >
>      >      > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.01/
>      >
>      >        - Your change shows "\n" as "\\n". Is it ok? Currently "\n" would be shown as is.
>      >        - Your change uses Character::codePointAt to convert char to int value.
>      >          According to Javadoc, it would be a different value if a char is in the surrogate range.
>      >        - Description of serializePropertiesToByteArray() says the return value is encoded in ISO 8859-1,
>      >          but it does not seem to be so because the logic depends on the spec of Properties::store. Is it ok?
>      >        - The test case is not stable because system properties might be different from your environment.
>      >          I suggest you set system properties for testing explicitly. E.g.
>      >            -Dnormal=normal_val -D"space space=blank blank" -Dnonascii=????? -Dopenjdk_url=http://openjdk.java.net/ -Dbackslash="\\"
>      >          * Also I recommend you check "\n" in the test from `line.separator`. I think it is a stable property.
>      >
>      >     I'm not convinced whether we should comply with the comment which says ISO 8859-1.
>      >     If it is important, we can use CharsetEncoder from ISO_8859_1 as below:
>      >
>      >     http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-encoder/
>      >
>      >     OTOH if we can keep the current behavior, we can implement it more simply as below:
>      >     (It's similar to yours.)
>      >
>      >     http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-props-style/
>      >
>      >
>      >     Thanks,
>      >
>      >     Yasumasa
>      >
>      >
>      >      > Regards,
>      >      > Chihiro
>      >      >
>      >      >
>      >      > 2020-02-20 (Thu) 9:34 Yasumasa Suenaga:
>      >      >
>      >      >     Hi Chihiro,
>      >      >
I think this problem is caused by the spec of `Properties::store(Writer)`.
> >      >
> >      >     `Properties::store(OutputStream)` says that the output format is the same as `store(Writer)` [1].
> >      >     `Properties::store(Writer)` says that `#`, `!`, `=`, `:` are written with a preceding backslash [2].
> >      >
> >      >     So I think we should not use `Properties::store` to serialize properties.
> >      >
> >      >
> >      >     Thanks,
> >      >
> >      >     Yasumasa
> >      >
> >      >
> >      >     [1] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.OutputStream,java.lang.String)
> >      >     [2] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.Writer,java.lang.String)
> >      >
> >      >
> >      >     On 2020/02/19 22:36, Chihiro Ito wrote:
> >      >      > Hi,
> >      >      >
> >      >      > Could you review this tiny fix, please?
> >      >      >
> >      >      > This problem affects not only paths on Windows, but also paths on Linux and URLs using ":".
> >      >      >
> >      >      > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.00/
> >      >      > JBS : https://bugs.openjdk.java.net/browse/JDK-8222489
> >      >      >
> >      >      > Regards,
> >      >      > Chihiro
> >      >
> >
>

From chiroito107 at gmail.com  Sat Feb 22 15:37:08 2020
From: chiroito107 at gmail.com (Chihiro Ito)
Date: Sun, 23 Feb 2020 00:37:08 +0900
Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
In-Reply-To: <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
Message-ID:

Hi Yasumasa,

Thank you for your reviews so many times.
How is this fix?
Could you review this again, please? Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.04/ Regards, Chihiro 2020?2?22?(?) 21:53 Yasumasa Suenaga : > Hi Chihiro, > > > - My proposal is not enough, so you should refine as below. > - Exception types in saveConvert() should be limited. Please do not > use `throws Exception`. > - I guess you use try-catch statement in > serializePropertiesToByteArray due to above checked exception. > It should be throw runtime exception when an exception occurs. > - Capacity of byteBuf (charBuf.length() * 5) should be > (charBuf.length() * 6) > because non 8859-1 chars would be "\uxxxx" (6 chars). > Also please leave comment for it because a maintainer might not > understand the meaning of multiplying 6 in future. > > - `output.shouldNotContain("C:\\:\\\\");` in testcase is correct? > I guess you want to check "C\\:\\\\" is not contained. > > - To check '\n', you can use Platform::isWindows as below: > output.shouldContain(Platform.isWindows() ? "line.separator=\\r\\n" > : "lineseparator=\\n"); > > > Yasumasa > > > On 2020/02/22 19:23, Chihiro Ito wrote: > > Hi Yasumasa, > > > > The line separator is not modified because it depends on the > environment, but the others have been modified. > > > > Could you review this again? > > > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.03/ > > > > Regards, > > Chihiro > > > > 2020?2?22?(?) 12:32 Yasumasa Suenaga suenaga at oss.nttdata.com>>: > > > > Hi Chihiro, > > > > Thank you for updating the webrev. > > > > > > - You use BufferedWriter to create the output, however I think > it would be more simply if you use PrintWriter. > > > > - Your change would work incorrectly when system property > contains mixture of ascii and non-ascii. > > You can see it with "-Dmixture=a?i". It would be converted to > "a\u0061\u3042", it should be "a\u3042i". > > > > - Currently key value which contains space char, it would be > escaped, but your change does not do so. 
> > You can see it with "-D"space space=blank blank"". > > > > - You should not use String::trim to create String from > ByteBuffer because property value might be contain blank in its tail. > > You might use ByteBuffer::slice or part of ByteBuffer::array > for it. > > > > - Did you try to use escaped chars in jtreg testcase? I guess > you can set multibytes chars (e.g. CJK chars) with "\u". > > In case of mixture of Japanese (Hiragana) and ASCII chars, you > can embed "-Dmixture=a\u3042i" to testcase. (I'm not sure that...) > > > > - In test case, I recommend you to evaluate entire of line. > > For example, if you want to check line.separator, you should > evaluate as below: > > output.shouldContain("line.separator=\\n"); > > > > > > Thanks, > > > > Yasumasa > > > > > > On 2020/02/22 0:44, Chihiro Ito wrote: > > > Hi Yasumasa, > > > > > > Thank you for your advice. > > > > > > I decided not to use regular expressions. because of the number > of \is confusing. > > > I stopped using codePointAt() and used CharsetEncoder to work > with ISO 8859 -1. > > > I added some environment variables to the test. However, > environment variables that contain multi bytes or spaces are not included > because jtreg does not support them. > > > > > > Could you review this again, please? > > > > > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.02/ > > > > > > Regards, > > > Chihiro > > > > > > 2020?2?20?(?) 22:39 Yasumasa Suenaga suenaga at oss.nttdata.com>>>: > > > > > > Hi Chihiro, > > > > > > On 2020/02/20 20:20, Chihiro Ito wrote: > > > > Hi Yasumasa, > > > > > > > > Thank you for your quick review. > > > > > > > > I modified the code without Properties::store. > > > > > > > > Could you review this again, please? > > > > > > > > Webrev : > http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.01/ > > > > > > - Your change shows "\n" as "\\n". Is it ok? Currently > "\n" would be shown straightly. 
> > > - Your change uses Character::codePointAt to convert char > to int value. > > > According to Javadoc, it would be different value if a > char is in surrogate range. > > > - Description of serializePropertiesToByteArray() says > the return value is encoded in ISO 8859-1, > > > but it does not seems to be so because the logic > depends on the spec of Properties::store. Is it ok? > > > - Test case does not stable because system properties > might be different from your environment. > > > I suggest you to set system properties for testing > explicitly. E.g. > > > -Dnormal=normal_val -D"space space=blank blank" > -Dnonascii=????? -Dopenjdk_url=http://openjdk.java.net/ -Dbackslash="\\" > > > * Also I recommend you to check "\n" in the test from > `line.separator`. I think it is stable property. > > > > > > I've not convinced whether we should compliant to the comment > which says for ISO 8859-1. > > > If it is important, we can use CharsetEncoder from ISO_8859_1 > as below: > > > > > > > http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-encoder/ > > > > > > OTOH we can keep current behavior, we can implement more > simply as below: > > > (It's similar to yours.) > > > > > > > http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-props-style/ > > > > > > > > > Thanks, > > > > > > Yasumasa > > > > > > > > > > Regards, > > > > Chihiro > > > > > > > > > > > > 2020?2?20?(?) 9:34 Yasumasa Suenaga < > suenaga at oss.nttdata.com suenaga at oss.nttdata.com > suenaga at oss.nttdata.com suenaga at oss.nttdata.com >>>: > > > > > > > > Hi Chihiro, > > > > > > > > I think this problem is caused by spec of > `Properties::store(Writer)`. > > > > > > > > `Properties::store(OutputStream)` says that the output > format is as same as `store(Writer)` [1]. > > > > `Properties::store(Writer)` says that `#`, `!`, `=`, > `:` are written with a preceding backslash [2]. > > > > > > > > So I think we should not use `Properties::store` to > serialize properties. 
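The escaping rule cited above can be reproduced outside the JDK sources with a few lines of Java. This is only a minimal sketch: the class name `StoreEscapeDemo` and the helper `storedLine` are made up for illustration and are not part of any webrev in this thread. It stores a single property through `Properties::store(Writer)` and returns the resulting `key=value` line, which shows why a Windows path or a URL comes out with extra backslashes.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

public class StoreEscapeDemo {
    // Hypothetical helper: returns the key=value line that
    // Properties.store(Writer, String) emits for a single property.
    static String storedLine(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // store() first writes a '#' comment line with the date; skip it.
        for (String line : out.toString().split("\\R")) {
            if (!line.startsWith("#")) {
                return line;
            }
        }
        throw new IllegalStateException("no property line found");
    }

    public static void main(String[] args) {
        // prints java.home=C\:\\jdk
        System.out.println(storedLine("java.home", "C:\\jdk"));
        // prints openjdk_url=http\://openjdk.java.net/
        System.out.println(storedLine("openjdk_url", "http://openjdk.java.net/"));
    }
}
```

Both the `:` and the `\` in the value are escaped, which matches the symptom reported in JDK-8222489.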
> > > >
> > > >
> > > > Thanks,
> > > >
> > > > Yasumasa
> > > >
> > > >
> > > > [1] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.OutputStream,java.lang.String)
> > > > [2] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.Writer,java.lang.String)
> > > >
> > > >
> > > > On 2020/02/19 22:36, Chihiro Ito wrote:
> > > > > Hi,
> > > > >
> > > > > Could you review this tiny fix, please?
> > > > >
> > > > > This problem affects not only paths on Windows, but also paths on Linux and URLs using ":".
> > > > >
> > > > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.00/
> > > > > JBS : https://bugs.openjdk.java.net/browse/JDK-8222489
> > > > >
> > > > > Regards,
> > > > > Chihiro
> > > >
> > >
> >
>

From larry.cable at oracle.com  Sat Feb 22 16:20:40 2020
From: larry.cable at oracle.com (Laurence Cable)
Date: Sat, 22 Feb 2020 08:20:40 -0800
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
In-Reply-To: <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com>
References: <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com>
Message-ID:

On 2/21/20 5:19 PM, Ioi Lam wrote:
> Ralf and Christoph,
>
> I agree that making it easy for the user is important, so dependency on an external program like pigz will be a hassle.
>
> How about implementing the compression in a Java program? Will something like this be too much of a hassle?
>
>     jcmd $PID GC.dump -stdout | java -jar HeapDumpZipper.jar > heap.gz
we could integrate the compression into jcmd itself?
> > This way, we can implement the exact compression algorithm as Ralf > described, without making it part of the VM. Writing it in Java > probably would be easier to maintain. > > If it makes sense, we can include the Java code as part of the JDK, so > there's no need to ship a separate JAR file to the user. > > jcmd $PID GC.dump -stdout | java jdk.internal.heapdump.Zipper > > heap.gz > > Thanks > - Ioi > > On 2/21/20 8:35 AM, Langer, Christoph wrote: >> Hi all, >> >> let me share my thoughts after going through this mail thread and >> interrogating Ralf quite a bit about the feature ??. >> >> First of all, I very much value the discussion and the points brought >> up here. When deciding about the introduction of an enhancement or a >> new feature, it's always wise to thoroughly discuss it and value >> benefits against maintenance cost incurred. However, in this case I'm >> at a point where I would really like to see this going in. Let me >> elaborate on this. >> >> In the mail cited below, I think Ralf enumerates all the benefits >> quite comprehensively. With the gzip feature built into the >> heapdumper, we'll get the option to easily have the VM dump its heap >> in a space saving format in the same time (or even a bit quicker) >> than we currently can get fully exploded hprofs. There's no need for >> additional configuration steps and arrangements, just a simple >> additional option in the existing jcmd. And with the slightly updated >> dump format, tool builders will get options to improve handling of >> compressed heap dumps. >> >> Speaking as somebody who has to do customer support once in a while, >> I can't tell you how valuable it is to be able to give the customer >> simple instructions that just work when it comes to directing them to >> provide diagnosis data. And that's clearly a point here. Also, given >> the loads of different deployment scenarios of JVM applications, e.g. >> cloud, containers, monolith servers... 
it's really good to have >> simple options. >> >> On the other hand, that's true, the change introduces a bit of >> additional complexity. But, without looking into the new code in all >> details, I think the amount is acceptable. Most of the code really >> only touches a distinct module for dumping the heap (heapdumper.cpp). >> Some additional 600 lines of code (the file already had 2000 before). >> But the code actually is not messing too deep with hotspot internals, >> so it should be quite maintainable. The rest of the code is a few >> lines about enhancing the dcmd and some additional access points into >> zlib. Furthermore, it brings a bit of testing code, but that is a >> good thing. So, this should really be acceptable - given that Ralf is >> around to support this once it's checked in and there's also the rest >> of the SAP team which will be able to help out here. >> >> The ideas collected in this thread that go beyond this change, e.g. >> the possibility to dump the heap out to the network, the option to >> get heapdumps out to the jcmd and also the potential enhancements to >> the -XX: HeapDumpBeforeFullGC, -XX: HeapDumpAfterFullGC and >> -XX:HeapDumpOnOutOfMemoryError are partly orthogonal and are probably >> worth pursuing on their own. >> >> So I really think we should allow this enhancement in and start >> focusing on a good code review ??. >> >> Best regards >> Christoph >>> -----Original Message----- >>> From: hotspot-runtime-dev >> bounces at openjdk.java.net> On Behalf Of Schmelter, Ralf >>> Sent: Donnerstag, 20. Februar 2020 14:21 >>> To: Yasumasa Suenaga ; Ioi Lam >>> ; serguei.spitsyn at oracle.com; hotspot-runtime- >>> dev at openjdk.java.net runtime >>> Cc: serviceability-dev at openjdk.java.net >>> Subject: [CAUTION] RE: RFR(L) 8237354: Add option to jcmd to write a >>> gzipped heap dump >>> >>> Hi Yasumasa, >>> >>> I think it would be great if we could redirect larger chunks data to >>> jcmd. 
>>>
>>> But you have to differentiate between binary data (for the heap dump) and
>>> text data (for e.g. the codelist).
>>>
>>> Currently jcmd assumes all bytes to be UTF-8 encoded, converts them to
>>> Unicode and then uses the platform encoding to write characters. This is not
>>> suitable for binary data.
>>>
>>> And of course you cannot use the bufferedStream to get the output to jcmd.
>>> You would have to implement an outputStream which can directly write to
>>> the AttachListener connection.
>>>
>>>
>>> But even with this change, I would still like the gzip compression to be done
>>> in the VM. Let me try to list all the advantages I see for doing this:
>>>
>>> 1. It is by far the easiest to use. You just have to specify -gz for the jcmd.
>>> While your command line (jcmd .... | gzip -c > file) is easy enough, it assumes
>>> you have gzip (not by default on Windows) and it would be painfully slow (~
>>> 10 x and more), since it is not parallel. You could use pigz, but it is not as
>>> ubiquitous as gzip. I know it is sometimes hard to imagine this could be a
>>> problem for anyone, but it is.
>>>
>>> It is easy to tell a customer to execute jcmd GC.heap_dump -gz
>>> test.hprof.gz. Add additional requirements, especially external
>>> programs, and your chance of success diminishes fast.
>>>
>>>
>>> 2. The -XX:+HeapDumpOnOutOfMemoryError, -XX:+HeapDumpBeforeFullGC
>>> and -XX:+HeapDumpAfterFullGC options can easily create gzipped heap
>>> dumps directly when the compression is in the VM. And especially if you
>>> create more than one dump (with the before/after gc flags), compression is
>>> very useful. Or if you want to support compressed heap dumps in the
>>> HotSpotDiagnosticMXBean. Just add a flag and/or compression level.
>>>
>>>
>>> 3. The created gz-file is not a simple gz-file you would get when simply using
>>> gzip.
>>>
>>> It is created in a way that makes it possible to treat it like a random access file
>>> without decompressing it.
>>>
>>> Currently for example the Eclipse Memory Analyzer (MAT) has the option to
>>> directly open a gzipped hprof file and use it without decompression. And for
>>> the initial parsing, they can just read the file sequentially, so this is not too
>>> slow.
>>>
>>> But when accessing the values of objects or arrays, they have to seek to
>>> specific positions in the gzipped hprof file. This is currently implemented by
>>> having a Java implementation of an InflaterInputStream which is capable of
>>> completely copying its state. This copy is then used to start decompressing at
>>> the specific offset for which it was created. As you can imagine, the state of
>>> the inflater is not small (MAT assumes about 64 kB; at least 32 kB is needed for
>>> the dictionary), so it limits the number of starting positions you can use for
>>> large files. But it works for all kinds of gzip compressed streams.
>>>
>>> The gzip implementation used to write the heap dump in the VM creates
>>> many small gzip compressed chunks. At the start of each chunk you can
>>> create a fresh GZIPInputStream without having to store any internal state.
>>> You only need to remember the physical offset and the logical offset (so 2
>>> long values) for each chunk. If you then want to read data at a specific logical
>>> offset, you binary search the nearest preceding chunk and create a
>>> GZIPInputStream reading from the physical offset of that chunk. So on
>>> average you have to decompress about half a chunk to get to the data you
>>> need.
>>>
>>> If you look in the webrev, you can see
>>> http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.0/test/lib/jdk/test/lib/hprof/parser/GzipRandomAccess.java.html
>>> This implements the needed logic to treat the gzipped hprof file as a random access file.
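The chunk index described above (one physical and one logical offset per chunk, then a binary search for the nearest preceding chunk) can be sketched in Java. This is an illustration under stated assumptions, not the code from the webrev or from GzipRandomAccess.java: the class `ChunkedGzipSketch`, its fields and the in-memory byte array are hypothetical, and each chunk is simply written as its own gzip member with `java.util.zip`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ChunkedGzipSketch {
    final byte[] compressed;     // concatenation of independent gzip members
    final long[] logicalStart;   // uncompressed offset at which each chunk starts
    final long[] physicalStart;  // offset of each chunk inside 'compressed'

    // Compresses 'data' in chunks of 'chunkSize' bytes, one gzip member per chunk.
    ChunkedGzipSketch(byte[] data, int chunkSize) {
        int chunks = Math.max(1, (data.length + chunkSize - 1) / chunkSize);
        logicalStart = new long[chunks];
        physicalStart = new long[chunks];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            for (int i = 0; i < chunks; i++) {
                int from = i * chunkSize;
                int len = Math.min(chunkSize, data.length - from);
                logicalStart[i] = from;
                physicalStart[i] = out.size();
                GZIPOutputStream gz = new GZIPOutputStream(out);
                gz.write(data, from, len);
                gz.close(); // writes the trailer; closing a ByteArrayOutputStream is a no-op
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        compressed = out.toByteArray();
    }

    // Reads 'length' uncompressed bytes starting at logical offset 'offset'.
    byte[] read(long offset, int length) {
        int idx = Arrays.binarySearch(logicalStart, offset);
        if (idx < 0) {
            idx = -idx - 2; // insertion point minus one: nearest preceding chunk
        }
        int start = (int) physicalStart[idx];
        int skip = (int) (offset - logicalStart[idx]);
        try (GZIPInputStream in = new GZIPInputStream(
                new ByteArrayInputStream(compressed, start, compressed.length - start))) {
            in.readNBytes(skip);           // decompress and discard up to the wanted offset
            return in.readNBytes(length);  // may continue into the following gzip members
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = new byte[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 31 + i / 7);
        }
        ChunkedGzipSketch sketch = new ChunkedGzipSketch(data, 1024);
        byte[] part = sketch.read(5000, 100);
        System.out.println(Arrays.equals(part, Arrays.copyOfRange(data, 5000, 5100)));
    }
}
```

Only the two offset arrays have to be kept per chunk; no inflater state is stored, which is the difference from the state-copying approach described for MAT.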
I have used it to add support for gzipped files in the jhat library (which is only used
>>> in tests). In jhat for example, the resolution of references is done via
>>> random access. And the file also contains all the functionality MAT would
>>> need.
>>>
>>> You can generate a more or less equivalent file if you use pigz with the
>>> --independent option. But to make it easier to detect that the gzip file is
>>> chunked (without decompressing it first), I've added a comment marking it as
>>> an hprof file with a given chunk size. This would be missing from the pigz file,
>>> but pigz instead adds 9 bytes when --independent is specified (00 00 ff ff
>>> 00 00 00 ff ff), so you could detect it too.
>>>
>>> To summarize, the gzipped hprof file created by the VM makes it much
>>> easier for tools to access them efficiently at random positions. You can do
>>> something equivalent with pigz, but not with gzip.
>>>
>>> And getting support for this type of gzipped hprof file by the heap dump
>>> tools will be much easier, if this is the format the openjdk produces, so it will
>>> be widespread.
>>>
>>> Best regards,
>>> Ralf
>>>
>>> -----Original Message-----
>>> From: Yasumasa Suenaga
>>> Sent: Thursday, 20 February 2020 00:59
>>> To: Ioi Lam; Schmelter, Ralf; serguei.spitsyn at oracle.com; hotspot-runtime-dev at openjdk.java.net runtime
>>> Cc: serviceability-dev at openjdk.java.net
>>> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
>>>
>>> Hi,
>>>
>>> Generally I agree with Ioi, but I think it is not a problem only for gzipped heap
>>> dump.
>>>
>>> For example, Compiler.codelist and Compiler.CodeHeap_Analytics might be
>>> large text.
>>> In addition, some users want to redirect the result from jcmd to other
>>> command or log collector.
>>>
>>> So I think it would be better if jcmd provides a stdout redirect option to all
>>> subcommands. E.g.
>>>
>>> $ jcmd GC.heap_dump -stdout | gzip -c - > heapdump.hprof.gz
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>

From suenaga at oss.nttdata.com  Sun Feb 23 02:10:07 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sun, 23 Feb 2020 11:10:07 +0900
Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
In-Reply-To:
References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
Message-ID:

Hi Chihiro,

Looks good.
Thank you for your updates and patience!

Yasumasa

On 2020/02/23 0:37, Chihiro Ito wrote:
> Hi Yasumasa,
>
> Thank you for your reviews so many times.
> How is this fix?
> Could you review this again, please?
>
> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.04/
>
> Regards,
> Chihiro
>
> On Sat, Feb 22, 2020 at 21:53, Yasumasa Suenaga wrote:
>
> Hi Chihiro,
>
>
>    - My proposal is not enough, so you should refine as below.
>        - Exception types in saveConvert() should be limited. Please do not use `throws Exception`.
>        - I guess you use try-catch statement in serializePropertiesToByteArray due to above checked exception.
>          It should be throw runtime exception when an exception occurs.
>        - Capacity of byteBuf (charBuf.length() * 5) should be (charBuf.length() * 6)
>          because non 8859-1 chars would be "\uxxxx" (6 chars).
>          Also please leave comment for it because a maintainer might not understand the meaning of multiplying 6 in future.
>
>    - `output.shouldNotContain("C:\\:\\\\");` in testcase is correct?
>      I guess you want to check "C\\:\\\\" is not contained.
>
>    - To check '\n', you can use Platform::isWindows as below:
>        output.shouldContain(Platform.isWindows() ?
"line.separator=\\r\\n" : "lineseparator=\\n"); > > > Yasumasa > > > On 2020/02/22 19:23, Chihiro Ito wrote: > > Hi Yasumasa, > > > > The line separator is not modified because it depends on the environment, but the others have been modified. > > > > Could you review this again? > > > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.03/ > > > > Regards, > > Chihiro > > > > 2020?2?22?(?) 12:32 Yasumasa Suenaga >>: > > > >? ? ?Hi Chihiro, > > > >? ? ?Thank you for updating the webrev. > > > > > >? ? ? ? ?- You use BufferedWriter to create the output, however I think it would be more simply if you use PrintWriter. > > > >? ? ? ? ?- Your change would work incorrectly when system property contains mixture of ascii and non-ascii. > >? ? ? ? ? ?You can see it with "-Dmixture=a?i". It would be converted to "a\u0061\u3042", it should be "a\u3042i". > > > >? ? ? ? ?- Currently key value which contains space char, it would be escaped, but your change does not do so. > >? ? ? ? ? ?You can see it with "-D"space space=blank blank"". > > > >? ? ? ? ?- You should not use String::trim to create String from ByteBuffer because property value might be contain blank in its tail. > >? ? ? ? ? ?You might use ByteBuffer::slice or part of ByteBuffer::array for it. > > > >? ? ? ? ?- Did you try to use escaped chars in jtreg testcase? I guess you can set multibytes chars (e.g. CJK chars) with "\u". > >? ? ? ? ? ?In case of mixture of Japanese (Hiragana) and ASCII chars, you can embed "-Dmixture=a\u3042i" to testcase. (I'm not sure that...) > > > >? ? ? ? ?- In test case, I recommend you to evaluate entire of line. > >? ? ? ? ? ?For example, if you want to check line.separator, you should evaluate as below: > >? ? ? ? ? ? ?output.shouldContain("line.separator=\\n"); > > > > > >? ? ?Thanks, > > > >? ? ?Yasumasa > > > > > >? ? ?On 2020/02/22 0:44, Chihiro Ito wrote: > >? ? ? > Hi Yasumasa, > >? ? ? > > >? ? ? > Thank you for your advice. > >? ? ? > > >? ? ? 
> I decided not to use regular expressions. because of the number of \is confusing. > >? ? ? > I stopped using codePointAt() and used CharsetEncoder to work with ISO 8859 -1. > >? ? ? > I added some environment variables to the test. However, environment variables that contain multi bytes or spaces are not included because jtreg does not support them. > >? ? ? > > >? ? ? > Could you review this again, please? > >? ? ? > > >? ? ? > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.02/ > >? ? ? > > >? ? ? > Regards, > >? ? ? > Chihiro > >? ? ? > > >? ? ? > 2020?2?20?(?) 22:39 Yasumasa Suenaga > >>>: > >? ? ? > > >? ? ? >? ? ?Hi Chihiro, > >? ? ? > > >? ? ? >? ? ?On 2020/02/20 20:20, Chihiro Ito wrote: > >? ? ? >? ? ? > Hi Yasumasa, > >? ? ? >? ? ? > > >? ? ? >? ? ? > Thank you for your quick review. > >? ? ? >? ? ? > > >? ? ? >? ? ? > I modified the code without Properties::store. > >? ? ? >? ? ? > > >? ? ? >? ? ? > Could you review this again, please? > >? ? ? >? ? ? > > >? ? ? >? ? ? > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.01/ > >? ? ? > > >? ? ? >? ? ? ? ?- Your change shows "\n" as "\\n". Is it ok? Currently "\n" would be shown straightly. > >? ? ? >? ? ? ? ?- Your change uses Character::codePointAt to convert char to int value. > >? ? ? >? ? ? ? ? ?According to Javadoc, it would be different value if a char is in surrogate range. > >? ? ? >? ? ? ? ?- Description of serializePropertiesToByteArray() says the return value is encoded in ISO 8859-1, > >? ? ? >? ? ? ? ? ?but it does not seems to be so because the logic depends on the spec of Properties::store. Is it ok? > >? ? ? >? ? ? ? ?- Test case does not stable because system properties might be different from your environment. > >? ? ? >? ? ? ? ? ?I suggest you to set system properties for testing explicitly. E.g. > >? ? ? >? ? ? ? ? ? ? ?-Dnormal=normal_val -D"space space=blank blank" -Dnonascii=????? -Dopenjdk_url=http://openjdk.java.net/ -Dbackslash="\\" > >? ? ? >? ? ? ? ? ? 
?* Also I recommend you to check "\n" in the test from `line.separator`. I think it is stable property. > >? ? ? > > >? ? ? >? ? ?I've not convinced whether we should compliant to the comment which says for ISO 8859-1. > >? ? ? >? ? ?If it is important, we can use CharsetEncoder from ISO_8859_1 as below: > >? ? ? > > >? ? ? > http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-encoder/ > >? ? ? > > >? ? ? >? ? ?OTOH we can keep current behavior, we can implement more simply as below: > >? ? ? >? ? ?(It's similar to yours.) > >? ? ? > > >? ? ? > http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-props-style/ > >? ? ? > > >? ? ? > > >? ? ? >? ? ?Thanks, > >? ? ? > > >? ? ? >? ? ?Yasumasa > >? ? ? > > >? ? ? > > >? ? ? >? ? ? > Regards, > >? ? ? >? ? ? > Chihiro > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? > 2020?2?20?(?) 9:34 Yasumasa Suenaga > >> > >>>>: > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?Hi Chihiro, > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?I think this problem is caused by spec of `Properties::store(Writer)`. > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?`Properties::store(OutputStream)` says that the output format is as same as `store(Writer)` [1]. > >? ? ? >? ? ? >? ? ?`Properties::store(Writer)` says that `#`, `!`, `=`, `:` are written with a preceding backslash [2]. > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?So I think we should not use `Properties::store` to serialize properties. > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?Thanks, > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?Yasumasa > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? ?[1] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.OutputStream,java.lang.String) > >? ? ? >? ? ? >? ? ?[2] https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.Writer,java.lang.String) > >? ? ? >? ? ? > > >? ? ? >? ? ? > > >? ? ? >? ? ? >? ? 
On 2020/02/19 22:36, Chihiro Ito wrote:
> >      >      >      > Hi,
> >      >      >      >
> >      >      >      > Could you review this tiny fix, please?
> >      >      >      >
> >      >      >      > This problem affects not only paths on Windows, but also paths on Linux and URLs using ":".
> >      >      >      >
> >      >      >      > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.00/
> >      >      >      > JBS : https://bugs.openjdk.java.net/browse/JDK-8222489
> >      >      >      >
> >      >      >      > Regards,
> >      >      >      > Chihiro
> >      >      >
> >      >
> >
>

From suenaga at oss.nttdata.com  Sun Feb 23 02:13:25 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Sun, 23 Feb 2020 11:13:25 +0900
Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
In-Reply-To:
References: <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com>
Message-ID:

On 2020/02/23 1:20, Laurence Cable wrote:
>
>
> On 2/21/20 5:19 PM, Ioi Lam wrote:
>> Ralf and Christoph,
>>
>> I agree that making it easy for the user is important, so dependency on an external program like pigz will be a hassle.
>>
>> How about implementing the compression in a Java program? Will something like this be too much of a hassle?
>>
>>     jcmd $PID GC.dump -stdout | java -jar HeapDumpZipper.jar > heap.gz
> we could integrate the compression into jcmd itself?

It's reasonable to me.
I think we can improve jcmd and HotSpot for it.

Yasumasa

>> This way, we can implement the exact compression algorithm as Ralf described, without making it part of the VM. Writing it in Java probably would be easier to maintain.
>>
>> If it makes sense, we can include the Java code as part of the JDK, so there's no need to ship a separate JAR file to the user.
>>
>>     
jcmd $PID GC.dump -stdout | java jdk.internal.heapdump.Zipper > heap.gz >> >> Thanks >> - Ioi >> >> On 2/21/20 8:35 AM, Langer, Christoph wrote: >>> Hi all, >>> >>> let me share my thoughts after going through this mail thread and interrogating Ralf quite a bit about the feature ??. >>> >>> First of all, I very much value the discussion and the points brought up here. When deciding about the introduction of an enhancement or a new feature, it's always wise to thoroughly discuss it and value benefits against maintenance cost incurred. However, in this case I'm at a point where I would really like to see this going in. Let me elaborate on this. >>> >>> In the mail cited below, I think Ralf enumerates all the benefits quite comprehensively. With the gzip feature built into the heapdumper, we'll get the option to easily have the VM dump its heap in a space saving format in the same time (or even a bit quicker) than we currently can get fully exploded hprofs. There's no need for additional configuration steps and arrangements, just a simple additional option in the existing jcmd. And with the slightly updated dump format, tool builders will get options to improve handling of compressed heap dumps. >>> >>> Speaking as somebody who has to do customer support once in a while, I can't tell you how valuable it is to be able to give the customer simple instructions that just work when it comes to directing them to provide diagnosis data. And that's clearly a point here. Also, given the loads of different deployment scenarios of JVM applications, e.g. cloud, containers, monolith servers... it's really good to have simple options. >>> >>> On the other hand, that's true, the change introduces a bit of additional complexity. But, without looking into the new code in all details, I think the amount is acceptable. Most of the code really only touches a distinct module for dumping the heap (heapdumper.cpp). Some additional 600 lines of code (the file already had 2000 before). 
But the code actually is not messing too deep with hotspot internals, so it should be quite maintainable. The rest of the code is a few lines about enhancing the dcmd and some additional access points into zlib. Furthermore, it brings a bit of testing code, but that is a good thing. So, this should really be acceptable - given that Ralf is around to support this once it's checked in and there's also the rest of the SAP team which will be able to help out here. >>> >>> The ideas collected in this thread that go beyond this change, e.g. the possibility to dump the heap out to the network, the option to get heapdumps out to the jcmd and also the potential enhancements to the -XX: HeapDumpBeforeFullGC, -XX: HeapDumpAfterFullGC and -XX:HeapDumpOnOutOfMemoryError are partly orthogonal and are probably worth pursuing on their own. >>> >>> So I really think we should allow this enhancement in and start focusing on a good code review ??. >>> >>> Best regards >>> Christoph >>>> -----Original Message----- >>>> From: hotspot-runtime-dev >>> bounces at openjdk.java.net> On Behalf Of Schmelter, Ralf >>>> Sent: Donnerstag, 20. Februar 2020 14:21 >>>> To: Yasumasa Suenaga ; Ioi Lam >>>> ; serguei.spitsyn at oracle.com; hotspot-runtime- >>>> dev at openjdk.java.net runtime >>>> Cc: serviceability-dev at openjdk.java.net >>>> Subject: [CAUTION] RE: RFR(L) 8237354: Add option to jcmd to write a >>>> gzipped heap dump >>>> >>>> Hi Yasumasa, >>>> >>>> I think it would be great if we could redirect larger chunks data to jcmd. >>>> >>>> But you have to differentiate between binary data (for the heap dump) and >>>> text data (for the e.g. codelist). >>>> >>>> Currently jcmd assumes all bytes to be UTF-8 encoded, converts them to >>>> Unicode and then uses the platform encoding to write characters. This is not >>>> suitable for binary data. >>>> >>>> And of course you cannot use the bufferedStream to get the output to jcmd. 
>>>> You would have to implement an outputStream which can directly write to >>>> the AttachListener connection. >>>> >>>> >>>> But even with this change, I would still like the gzip compression to be done >>>> in the VM. Let me try to list all the advantages I see for doing this: >>>> >>>> 1. It is by far the easiest to use. You just have to specify -gz for the jcmd. >>>> While your command line (jcmd .... | gzip -c > file) is easy enough, it assumes >>>> you have gzip (not by default on Windows) and it would be painfully slow (~ >>>> 10 x and more), since it is not parallel. You could use pigz, but it is not as >>>> ubiquitous as gzip. I know it is sometimes hard to image this could be a >>>> problem for anyone, but it is. >>>> >>>> It is easy to tell a customer to execute jcmd GC.heap_dump -gz >>>> test.hprof.gz. Adding additional requirements, especially if it is external >>>> programs, and your chance of success diminish fast. >>>> >>>> >>>> 2. The -XX:HeapDumpOnOutOfMemoryError, -XX: HeapDumpBeforeFullGC >>>> and -XX: HeapDumpAfterFullGC options can easily create gzipped heap >>>> dumps directly when the compression is in the VM. And especially if you >>>> create more than one dump (with the before/after gc flags), compression is >>>> very useful. Or if you want to support compressed heap dumps it in the >>>> HotSpotDiagnosticMXBean. Just add a flag and/or compression level. >>>> >>>> >>>> 3. The created gz-file is not a simple gz-file you would get when simply using >>>> gzip. >>>> >>>> ? It is created in a way that makes it possible to treat it like a random access file >>>> without decompressing it. >>>> >>>> Currently for example the Eclipse Memory Analyzer (MAT) has the option to >>>> directly open a gzipped hprof file and use it without decompression. And for >>>> the initial parsing, they can just read the file sequentially, so this is not too >>>> slow. 
>>>> >>>> But when accessing the values of objects or arrays, they have to seek to >>>> specific positions in the gzipped hprof file. This is currently implemented by >>>> having a Java implementation of a InflaterInputStream which is capable to >>>> completely copy its state. This copy is then used to start decompressing at >>>> the specific offset for which is was created. As you can imagine, the state of >>>> the inflater is not small (MAT assumes about 64Kb, 32kB is needed at least for >>>> the dictionary), so it limits the number of starting positions you can use for >>>> large files. But it works for all kinds of gzip compressed streams. >>>> >>>> The gzip implementation used to write the heap dump in the VM creates >>>> many small gzip compressed chunks. At the start of each chunk you can >>>> create a fresh GZIPInputStream without having to store any internal state. >>>> You only need to remember the physical offset and the logical offset (so 2 >>>> long values) for each chunk. If you then want to read data at a specific logical >>>> offset, you binary search the nearest preceding chunk and create a >>>> GZIPInputStream reading from the physical offset of that chunk. So on >>>> average you have to decompress about half a chunk to get to the data you >>>> need. >>>> >>>> If you look in the in webrev, you can see >>>> http://cr.openjdk.java.net/~rschmelter/webrevs/8237354/webrev.0/test/lib >>>> /jdk/test/lib/hprof/parser/GzipRandomAccess.java.html. This implements >>>> the needed logic to treat the gzipped hprof file as a random access file. I have >>>> used it to add support for gzipped files in the jhat library (which is only used >>>> in tests). In jhat hat for example, the resolution of references is done via >>>> random access. And the file also contains all the functionality MAT would >>>> need. >>>> >>>> You can generate a more or less equivalent file if you use pigz with the -- >>>> independent option. 
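The chunk lookup described above (one physical and one logical offset per chunk, then a binary search for the nearest preceding chunk) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the webrev: the class `ChunkedGzipSketch`, its fields, and the use of in-memory byte arrays instead of a file are all made up for the example; only standard `java.util.zip` classes are used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch: data is written as many independent gzip members,
// and two long values per chunk are enough to seek into it later.
final class ChunkedGzipSketch {
    final byte[] compressed;     // concatenated gzip members
    final long[] logicalStart;   // uncompressed offset at which each chunk starts
    final long[] physicalStart;  // compressed offset at which each chunk starts

    ChunkedGzipSketch(byte[] data, int chunkSize) throws IOException {
        int chunks = Math.max(1, (data.length + chunkSize - 1) / chunkSize);
        logicalStart = new long[chunks];
        physicalStart = new long[chunks];
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < chunks; i++) {
            int from = i * chunkSize;
            int len = Math.min(chunkSize, data.length - from);
            logicalStart[i] = from;
            physicalStart[i] = out.size();
            // Each chunk is a complete gzip member with its own header and trailer.
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                gz.write(data, from, len);
            }
        }
        compressed = out.toByteArray();
    }

    // Reads 'length' uncompressed bytes starting at a valid 'logicalOffset'.
    byte[] read(long logicalOffset, int length) throws IOException {
        int idx = Arrays.binarySearch(logicalStart, logicalOffset);
        if (idx < 0) {
            idx = -idx - 2;  // nearest chunk starting before the offset
        }
        int from = (int) physicalStart[idx];
        // A fresh GZIPInputStream needs no saved inflater state at a chunk start.
        try (GZIPInputStream in = new GZIPInputStream(
                new ByteArrayInputStream(compressed, from, compressed.length - from))) {
            in.readNBytes((int) (logicalOffset - logicalStart[idx]));  // skip inside the chunk
            return in.readNBytes(length);  // may continue into following chunks
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (i * 31);
        }
        ChunkedGzipSketch sketch = new ChunkedGzipSketch(data, 64);
        byte[] part = sketch.read(500, 200);
        System.out.println(Arrays.equals(part, Arrays.copyOfRange(data, 500, 700)));
    }
}
```

With small chunks the index grows but each random read decompresses less; the chunk size used here (64 bytes) is chosen only to keep the example tiny.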
But to make it easier to detect that the gzip file is
>>>> chunked (without decompressing it first), I've added a comment marking it as
>>>> an hprof file with a given chunk size. This would be missing from the pigz file,
>>>> but pigz instead adds 9 bytes when --independent is specified (00 00 ff ff
>>>> 00 00 00 ff ff), so you could detect it too.
>>>>
>>>> To summarize, the gzipped hprof file created by the VM makes it much
>>>> easier for tools to access it efficiently at random positions. You can do
>>>> something equivalent with pigz, but not with gzip.
>>>>
>>>> And getting support for this type of gzipped hprof file by the heap dump
>>>> tools will be much easier if this is the format the OpenJDK produces, so it will
>>>> be widespread.
>>>>
>>>> Best regards,
>>>> Ralf
>>>>
>>>> -----Original Message-----
>>>> From: Yasumasa Suenaga
>>>> Sent: Thursday, February 20, 2020 00:59
>>>> To: Ioi Lam; Schmelter, Ralf; serguei.spitsyn at oracle.com;
>>>> hotspot-runtime-dev at openjdk.java.net runtime
>>>> Cc: serviceability-dev at openjdk.java.net
>>>> Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap
>>>> dump
>>>>
>>>> Hi,
>>>>
>>>> Generally I agree with Ioi, but I think it is not a problem only for gzipped heap
>>>> dumps.
>>>>
>>>> For example, Compiler.codelist and Compiler.CodeHeap_Analytics might be
>>>> large text.
>>>> In addition, some users want to redirect the result from jcmd to another
>>>> command or log collector.
>>>>
>>>> So I think it would be better if jcmd provided a stdout redirect option for all
>>>> subcommands. E.g.
>>>>
>>>>     $ jcmd GC.heap_dump -stdout | gzip -c - > heapdump.hprof.gz
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>
>
From chiroito107 at gmail.com Sun Feb 23 09:26:04 2020
From: chiroito107 at gmail.com (Chihiro Ito)
Date: Sun, 23 Feb 2020 18:26:04 +0900
Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
In-Reply-To:
References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com>
Message-ID:

Hi Yasumasa,

I appreciate that you have reviewed this so many times.

Regards,
Chihiro

On Sun, Feb 23, 2020 at 11:10, Yasumasa Suenaga wrote:
> Hi Chihiro,
>
> Looks good.
> Thank you for your updates and patience!
>
>
> Yasumasa
>
>
> On 2020/02/23 0:37, Chihiro Ito wrote:
> > Hi Yasumasa,
> >
> > Thank you for reviewing so many times.
> > How is this fix?
> > Could you review this again, please?
> >
> > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.04/
> >
> > Regards,
> > Chihiro
> >
> > On Sat, Feb 22, 2020 at 21:53, Yasumasa Suenaga (suenaga at oss.nttdata.com) wrote:
> >
> >     Hi Chihiro,
> >
> >
> >     - My proposal is not enough, so you should refine it as below.
> >         - Exception types in saveConvert() should be limited. Please do not use `throws Exception`.
> >         - I guess you use a try-catch statement in serializePropertiesToByteArray due to the above checked exception.
> >           It should throw a runtime exception when an exception occurs.
> >         - Capacity of byteBuf (charBuf.length() * 5) should be (charBuf.length() * 6)
> >           because non 8859-1 chars would be "\uxxxx" (6 chars).
> >           Also, please leave a comment for it because a maintainer might not understand the meaning of multiplying by 6 in future.
> >
> >     - Is `output.shouldNotContain("C:\\:\\\\");` in the testcase correct?
> >       I guess you want to check "C\\:\\\\" is not contained.
> > > > - To check '\n', you can use Platform::isWindows as below: > > output.shouldContain(Platform.isWindows() ? > "line.separator=\\r\\n" : "lineseparator=\\n"); > > > > > > Yasumasa > > > > > > On 2020/02/22 19:23, Chihiro Ito wrote: > > > Hi Yasumasa, > > > > > > The line separator is not modified because it depends on the > environment, but the others have been modified. > > > > > > Could you review this again? > > > > > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.03/ > > > > > > Regards, > > > Chihiro > > > > > > 2020?2?22?(?) 12:32 Yasumasa Suenaga suenaga at oss.nttdata.com>>>: > > > > > > Hi Chihiro, > > > > > > Thank you for updating the webrev. > > > > > > > > > - You use BufferedWriter to create the output, however I > think it would be more simply if you use PrintWriter. > > > > > > - Your change would work incorrectly when system property > contains mixture of ascii and non-ascii. > > > You can see it with "-Dmixture=a?i". It would be > converted to "a\u0061\u3042", it should be "a\u3042i". > > > > > > - Currently key value which contains space char, it would > be escaped, but your change does not do so. > > > You can see it with "-D"space space=blank blank"". > > > > > > - You should not use String::trim to create String from > ByteBuffer because property value might be contain blank in its tail. > > > You might use ByteBuffer::slice or part of > ByteBuffer::array for it. > > > > > > - Did you try to use escaped chars in jtreg testcase? I > guess you can set multibytes chars (e.g. CJK chars) with "\u". > > > In case of mixture of Japanese (Hiragana) and ASCII > chars, you can embed "-Dmixture=a\u3042i" to testcase. (I'm not sure > that...) > > > > > > - In test case, I recommend you to evaluate entire of > line. 
> > > For example, if you want to check line.separator, you > should evaluate as below: > > > output.shouldContain("line.separator=\\n"); > > > > > > > > > Thanks, > > > > > > Yasumasa > > > > > > > > > On 2020/02/22 0:44, Chihiro Ito wrote: > > > > Hi Yasumasa, > > > > > > > > Thank you for your advice. > > > > > > > > I decided not to use regular expressions. because of the > number of \is confusing. > > > > I stopped using codePointAt() and used CharsetEncoder to > work with ISO 8859 -1. > > > > I added some environment variables to the test. However, > environment variables that contain multi bytes or spaces are not included > because jtreg does not support them. > > > > > > > > Could you review this again, please? > > > > > > > > Webrev : > http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.02/ > > > > > > > > Regards, > > > > Chihiro > > > > > > > > 2020?2?20?(?) 22:39 Yasumasa Suenaga < > suenaga at oss.nttdata.com suenaga at oss.nttdata.com > suenaga at oss.nttdata.com suenaga at oss.nttdata.com >>>: > > > > > > > > Hi Chihiro, > > > > > > > > On 2020/02/20 20:20, Chihiro Ito wrote: > > > > > Hi Yasumasa, > > > > > > > > > > Thank you for your quick review. > > > > > > > > > > I modified the code without Properties::store. > > > > > > > > > > Could you review this again, please? > > > > > > > > > > Webrev : > http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.01/ > > > > > > > > - Your change shows "\n" as "\\n". Is it ok? > Currently "\n" would be shown straightly. > > > > - Your change uses Character::codePointAt to > convert char to int value. > > > > According to Javadoc, it would be different > value if a char is in surrogate range. > > > > - Description of serializePropertiesToByteArray() > says the return value is encoded in ISO 8859-1, > > > > but it does not seems to be so because the logic > depends on the spec of Properties::store. Is it ok? 
> > > > - Test case does not stable because system > properties might be different from your environment. > > > > I suggest you to set system properties for > testing explicitly. E.g. > > > > -Dnormal=normal_val -D"space space=blank > blank" -Dnonascii=????? -Dopenjdk_url=http://openjdk.java.net/ > -Dbackslash="\\" > > > > * Also I recommend you to check "\n" in the > test from `line.separator`. I think it is stable property. > > > > > > > > I've not convinced whether we should compliant to the > comment which says for ISO 8859-1. > > > > If it is important, we can use CharsetEncoder from > ISO_8859_1 as below: > > > > > > > > > http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-encoder/ > > > > > > > > OTOH we can keep current behavior, we can implement > more simply as below: > > > > (It's similar to yours.) > > > > > > > > > http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-props-style/ > > > > > > > > > > > > Thanks, > > > > > > > > Yasumasa > > > > > > > > > > > > > Regards, > > > > > Chihiro > > > > > > > > > > > > > > > 2020?2?20?(?) 9:34 Yasumasa Suenaga < > suenaga at oss.nttdata.com suenaga at oss.nttdata.com > suenaga at oss.nttdata.com suenaga at oss.nttdata.com >> suenaga at oss.nttdata.com suenaga at oss.nttdata.com > suenaga at oss.nttdata.com suenaga at oss.nttdata.com >>>>: > > > > > > > > > > Hi Chihiro, > > > > > > > > > > I think this problem is caused by spec of > `Properties::store(Writer)`. > > > > > > > > > > `Properties::store(OutputStream)` says that the > output format is as same as `store(Writer)` [1]. > > > > > `Properties::store(Writer)` says that `#`, `!`, > `=`, `:` are written with a preceding backslash [2]. > > > > > > > > > > So I think we should not use > `Properties::store` to serialize properties. 
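The escaping behaviour of `Properties::store(Writer)` mentioned above can be reproduced with a small sketch. The class and method names (`StoreEscapingSketch`, `storedLine`) and the sample property values are assumptions made for the illustration; it is not the code under review, it only shows why a Windows path or a URL comes out with extra backslashes.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.util.Properties;

// Hypothetical sketch: store a single property and return the line written for it.
final class StoreEscapingSketch {
    static String storedLine(String key, String value) throws IOException {
        Properties props = new Properties();
        props.setProperty(key, value);
        StringWriter out = new StringWriter();
        props.store(out, null);  // writes a '#' date comment first, then the entries
        for (String line : out.toString().split("\\R")) {
            if (!line.startsWith("#")) {
                return line;     // the only non-comment line is our entry
            }
        }
        throw new IllegalStateException("no entry written");
    }

    public static void main(String[] args) throws IOException {
        // ':' is written with a preceding backslash and '\' is doubled,
        // so the stored text is no longer a usable path or URL.
        System.out.println(storedLine("java.home", "C:\\jdk"));
        System.out.println(storedLine("openjdk_url", "http://openjdk.java.net/"));
    }
}
```

Reading the result back with `Properties::load` would undo the escaping; the problem only appears when the stored text is shown to a user as-is.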
> > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Yasumasa > > > > > > > > > > > > > > > [1] > https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.OutputStream,java.lang.String) > > > > > [2] > https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.Writer,java.lang.String) > > > > > > > > > > > > > > > On 2020/02/19 22:36, Chihiro Ito wrote: > > > > > > Hi, > > > > > > > > > > > > Could you review this tiny fix, please? > > > > > > > > > > > > This problem affected not the only path on > Windows, but also Linux and URLs using ":". > > > > > > > > > > > > Webrev : > http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.00/ > > > > > > JBS : > https://bugs.openjdk.java.net/browse/JDK-8222489 > > > > > > > > > > > > Regards, > > > > > > Chihiro > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniil.x.titov at oracle.com Mon Feb 24 04:21:48 2020 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Sun, 23 Feb 2020 20:21:48 -0800 Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port Message-ID: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. New CSR [3] was created for this change and it needs to be reviewed as well. Man pages for jhsdb will be updated in a separate issue. The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). // delegate to the actual SA debug server. 
        DebugServer.main(newArgArray.toArray(new String[0]));

However, sun.jvm.hotspot.DebugServer doesn't support named options, which makes it hard to add new options to the tool efficiently.
I found it more suitable to start the HotSpot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher
and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as deprecated,
but I would prefer to address that in a separate issue.

Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker
container and connecting to it with the GUI debugger.
Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.

[1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
[2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
[3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831

Thank you,
Daniil

From suenaga at oss.nttdata.com Mon Feb 24 13:45:13 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Mon, 24 Feb 2020 22:45:13 +0900
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port
In-Reply-To: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com>
Message-ID: <1c9d0eaa-77bd-0aa3-9251-c3c693ed2471@oss.nttdata.com>

Hi Daniil,

- SALauncher::buildAttachArgs not only builds the arguments but also checks their consistency.
  Thus you should use buildAttachArgs() in runDEBUGD(). If you do so, runDEBUGD() would become simpler.

- SADebugDTest::testWithPidAndRmiPort would retry until `--rmiport` can be used.
  But you can use the same port number as the RMI registry (1099).
  This is the same relation as between jmxremote.port and jmxremote.rmi.port.
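As background for the point about ports, here is a minimal sketch of exporting an RMI object on an explicitly chosen port, and of using that same port for the registry. It is not the jhsdb implementation: the `Ping` interface, `PingImpl`, the binding name, and the helper methods are hypothetical and exist only for this illustration; the calls used are the standard `java.rmi` API.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

// Hypothetical sketch: registry and remote object share one known port,
// so a firewall or container only has to open that single port.
final class FixedRmiPortSketch {
    public interface Ping extends Remote {
        String ping() throws RemoteException;
    }

    static final class PingImpl implements Ping {
        @Override
        public String ping() {
            return "pong";
        }
    }

    // Picks a currently free port; a real tool would take it from an option.
    static int findFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    static boolean exportOnPort(int port) throws Exception {
        Registry registry = LocateRegistry.createRegistry(port);
        PingImpl impl = new PingImpl();
        try {
            // Passing 0 here instead would let RMI choose an anonymous port.
            Remote stub = UnicastRemoteObject.exportObject(impl, port);
            registry.rebind("ping", stub);
            return registry.lookup("ping") != null;
        } finally {
            // Unexport everything so no RMI threads keep the JVM alive.
            UnicastRemoteObject.unexportObject(impl, true);
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    public static void main(String[] args) throws Exception {
        int port = findFreePort();
        System.out.println("exported on port " + port + ": " + exportOnPort(port));
    }
}
```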
Thanks, Yasumasa On 2020/02/24 13:21, Daniil Titov wrote: > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > New CSR [3] was created for this change and it needs to be reviewed as well. > > Man pages for jhsdb will be updated in a separate issue. > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > // delegate to the actual SA debug server. > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > but I would prefer to address it in a separate issue. > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > container and connecting to it with the GUI debugger. > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. 
>
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>
> Thank you,
> Daniil
>
>

From martin.doerr at sap.com Mon Feb 24 13:51:50 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 24 Feb 2020 13:51:50 +0000
Subject: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element
In-Reply-To:
References:
Message-ID:

Hi,

reposting on serviceability-dev (was core-libs-dev before).

Bug: https://bugs.openjdk.java.net/browse/JDK-8239856
Webrev: http://cr.openjdk.java.net/~mdoerr/8239856_win32_long_double_align/webrev.00/

Thanks for the review, Thomas!

Best regards,
Martin

From: Thomas Stüfe
Sent: Monday, February 24, 2020 14:41
To: Doerr, Martin
Cc: core-libs-dev at openjdk.java.net; Lindenmaier, Goetz; Langer, Christoph
Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element

Oh okay. Then it looks okay to me.

Cheers, Thomas

On Mon, Feb 24, 2020 at 12:56 PM Doerr, Martin wrote:

Hi Thomas,

thanks for the quick review.
ATTRIBUTE_ALIGNED is defined in hotspot. I can't use it for src/jdk.jdwp.agent/share/native/libjdwp/ArrayReferenceImpl.c.
Christoph had already suggested making it available for core libs, too, but I haven't found a good place for it.

Best regards,
Martin

From: Thomas Stüfe
Sent: Monday, February 24, 2020 12:52
To: Doerr, Martin
Cc: core-libs-dev at openjdk.java.net; Lindenmaier, Goetz; Langer, Christoph
Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element

Hi Martin,

maybe use ATTRIBUTE_ALIGNED instead?

Cheers, Thomas

On Mon, Feb 24, 2020 at 12:44 PM Doerr, Martin wrote:

Hi,

we had fixed stack array alignment for Windows 32 bit with JDK-8220348.
However, there are also stack allocated jlong and jdouble used as source for SetLongArrayRegion and SetDoubleArrayRegion with insufficient alignment for this platform.
Here's my proposed fix:
http://cr.openjdk.java.net/~mdoerr/8239856_win32_long_double_align/webrev.00/

Please review.

Best regards,
Martin

From goetz.lindenmaier at sap.com Mon Feb 24 16:39:08 2020
From: goetz.lindenmaier at sap.com (Lindenmaier, Goetz)
Date: Mon, 24 Feb 2020 16:39:08 +0000
Subject: RFR(L) 8227745: Enable Escape Analysis for Better Performance in the Presence of JVMTI Agents
In-Reply-To:
References: <1f8a3c7a-fa0f-b5b2-4a8a-7d3d8dbbe1b5@oracle.com> <4b56a45c-a14c-6f74-2bfd-25deaabe8201@oracle.com> <5271429a-481d-ddb9-99dc-b3f6670fcc0b@oracle.com>
Message-ID:

Hi,

I had a look at the progress of this change. Nothing happened
since Richard posted his update using more handshakes [1].
But we (SAP) would appreciate it a lot if this change could
be successfully reviewed and pushed.

I think there is a basic understanding that this
change is helpful. It fixes a number of issues with JVMTI,
and will deliver the same performance benefits as EA
does in current production mode for debugging scenarios.

This is important for us as we run our VMs prepared
for debugging in production mode.

I understand that Robbin proposed to replace the usage of
_suspend_flag with handshakes. Apparently, async
handshakes are needed to do so. We have been waiting
a while for removal of the _suspend_flag / introduction of
async handshakes [2]. What is the status here?

I think we should no longer wait, but proceed with
this change. We will look into removing the usage of
suspend_flag introduced here once it is possible to implement
it with handshakes.

Also, I think it's a good point in time to push this, as jdk15 is
at the beginning of development.

Best regards,
  Goetz.
[1] http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-February/037984.html [2] http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037533.html > -----Original Message----- > From: hotspot-runtime-dev > On Behalf Of Reingruber, Richard > Sent: Dienstag, 4. Februar 2020 09:59 > To: David Holmes ; Vladimir Kozlov > (vladimir.kozlov at oracle.com) ; Robbin Ehn > ; serviceability-dev at openjdk.java.net; hotspot- > compiler-dev at openjdk.java.net; hotspot-runtime-dev at openjdk.java.net > Subject: [CAUTION] RE: RFR(L) 8227745: Enable Escape Analysis for Better > Performance in the Presence of JVMTI Agents > > Hi, > > I have prepared webrev.4 that incorporates feedback from webrev.3 (thanks!) > > Full: http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4/ > Incremental: > http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.4.inc/ > > I was not able to eliminate the additional suspend flag now. I'll take care of this > as soon as the > existing suspend-resume-mechanism is reworked. > > Testing: > > Nightly tests @SAP: > > JCK and JTREG, also in Xcomp mode, SPECjvm2008, SPECjbb2015, Renaissance > Suite, SAP specific tests > with fastdebug and release builds on all platforms > > Stress testing with DeoptimizeObjectsALot running SPECjvm2008 40x parallel > for 24h > > Thanks, Richard. > > > More details on the changes: > > * Hide DeoptimizeObjectsALotThread from external view. > > * Changed EscapeBarrier_lock to be a _safepoint_check_never lock. > It used to be _safepoint_check_sometimes, which will be eliminated sooner or > later. > I added explicit thread state changes with ThreadBlockInVM to code paths > where we can wait() > on EscapeBarrier_lock to become safepoint safe. > > * Use handshake EscapeBarrierSuspendHandshake to suspend target threads > instead of vm operation > VM_ThreadSuspendAllForObjDeopt. > > * Removed uses of Threads_lock. 
When adding a new thread we suspend it iff > EA optimizations are > being reverted. In the previous version we were waiting on Threads_lock > while EA optimizations > were reverted. See EscapeBarrier::thread_added(). > > * Made tests require Xmixed compilation mode. > > * Made tests agnostic regarding tiered compilation. > I.e. tc isn't disabled anymore, and the tests can be run with tc enabled or > disabled. > > * Exercising EATests.java as well with stress test options > DeoptimizeObjectsALot* > Due to the non-deterministic deoptimizations some tests need to be skipped. > We do this to prevent bit-rot of the stress test code. > > * Executing EATests.java as well with graal if available. Driver for this is > EATestsJVMCI.java. Graal cannot pass all tests, because it does not provide all > the new debug info > (namely not_global_escape_in_scope and arg_escape in scopeDesc.hpp). > And graal does not yet support the JVMTI operations force early return and > pop frame. > > * Removed tracing from new jdi tests in EATests.java. Too much trace output > before the debugging > connection is established can cause deadlock because output buffers fill up. > (See https://bugs.openjdk.java.net/browse/JDK-8173304) > > * Many copyright year changes and smaller clean-up changes of testing code > (trailing white-space and > the like). > > > -----Original Message----- > From: David Holmes > Sent: Donnerstag, 19. Dezember 2019 03:12 > To: Reingruber, Richard ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com) > > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance in > the Presence of JVMTI Agents > > Hi Richard, > > I think my issue is with the way EliminateNestedLocks works so I'm going > to look into that more deeply. > > Thanks for the explanations. 
> > David > > On 18/12/2019 12:47 am, Reingruber, Richard wrote: > > Hi David, > > > > > > > Some further queries/concerns: > > > > > > > > > > src/hotspot/share/runtime/objectMonitor.cpp > > > > > > > > > > Can you please explain the changes to ObjectMonitor::wait: > > > > > > > > > > ! _recursions = save // restore the old recursion count > > > > > ! + jt->get_and_reset_relock_count_after_wait(); // > > > > > increased by the deferred relock count > > > > > > > > > > what is the "deferred relock count"? I gather it relates to > > > > > > > > > > "The code was extended to be able to deoptimize objects of a > > > > frame that > > > > > is not the top frame and to let another thread than the owning > > > > thread do > > > > > it." > > > > > > > > Yes, these relate. Currently EA based optimizations are reverted, when a > compiled frame is > > > > replaced with corresponding interpreter frames. Part of this is relocking > objects with eliminated > > > > locking. New with the enhancement is that we do this also just before > object references are > > > > acquired through JVMTI. In this case we deoptimize also the owning > compiled frame C and we > > > > register deoptimized objects as deferred updates. When control returns > to C it gets deoptimized, > > > > we notice that objects are already deoptimized (reallocated and > relocked), so we don't do it again > > > > (relocking twice would be incorrect of course). Deferred updates are > copied into the new > > > > interpreter frames. > > > > > > > > Problem: relocking is not possible if the target thread T is waiting on the > monitor that needs to > > > > be relocked. This happens only with non-local objects with > EliminateNestedLocks. Instead relocking > > > > is deferred until T owns the monitor again. This is what the piece of > code above does. > > > > > > Sorry I need some more detail here. How can you wait() on an object > > > monitor if the object allocation and/or locking was optimised away? 
And > > > what is a "non-local object" in this context? Isn't EA restricted to > > > thread-confined objects? > > > > "Non-local object" is an object that escapes its thread. The issue I'm > addressing with the changes > > in ObjectMonitor::wait are almost unrelated to EA. They are caused by > EliminateNestedLocks, where C2 > > eliminates recursive locking of an already owned lock. The lock owning object > exists on the heap, it > > is locked and you can call wait() on it. > > > > EliminateLocks is the C2 option that controls lock elimination based on EA. > Both optimizations have > > in common that objects with eliminated locking need to be relocked when > deoptimizing a frame, > > i.e. when replacing a compiled frame with equivalent interpreter > > frames. Deoptimization::relock_objects does that job for /all/ eliminated > locks in scope. /All/ can > > be a mix of eliminated nested locks and locks of not-escaping objects. > > > > New with the enhancement: I call relock_objects earlier, just before objects > pontentially > > escape. But then later when the owning compiled frame gets deoptimized, I > must not do it again: > > > > See call to EscapeBarrier::objs_are_deoptimized in deoptimization.cpp: > > > > 373 if ((jvmci_enabled || ((DoEscapeAnalysis || EliminateNestedLocks) && > EliminateLocks)) > > 374 && !EscapeBarrier::objs_are_deoptimized(thread, deoptee.id())) { > > 375 bool unused; > > 376 eliminate_locks(thread, chunk, realloc_failures, deoptee, exec_mode, > unused); > > 377 } > > > > Now when calling relock_objects early it is quiet possible that I have to relock > an object the > > target thread currently waits for. Obviously I cannot relock in this case, > instead I chose to > > introduce relock_count_after_wait to JavaThread. > > > > > Is it just that some of the locking gets optimized away e.g. 
> > >
> > > synchronized(obj) {
> > >   synchronized(obj) {
> > >     synchronized(obj) {
> > >       obj.wait();
> > >     }
> > >   }
> > > }
> > >
> > > If this is reduced to a form as-if it were a single lock of the monitor
> > > (due to EA) and the wait() triggers a JVM TI event which leads to the
> > > escape of "obj" then we need to reconstruct the true lock state, and so
> > > when the wait() internally unblocks and reacquires the monitor it has to
> > > set the true recursion count to 3, not the 1 that it appeared to be when
> > > wait() was initially called. Is that the scenario?
> >
> > Kind of... except that the locking is not eliminated due to EA and there is no JVM TI event
> > triggered by wait.
> >
> > Add
> >
> >   LocalObject l1 = new LocalObject();
> >
> > in front of the synchronized blocks and assume a JVM TI agent acquires l1. This triggers the code in
> > question.
> >
> > Note that relocking/reallocating is transactional: if it is done, then it is done for /all/ objects in scope, and it is
> > done at most once. It wouldn't be quite so easy to split this into relocking of nested/EA-based
> > eliminated locks.
> >
> > > If so I find this truly awful. Anyone using wait() in a realistic form
> > > requires a notification and so the object cannot be thread confined. In
> >
> > It is not thread confined.
> >
> > > which case I would strongly argue that upon hitting the wait() the deopt
> > > should occur unconditionally and so the lock state is correct before we
> > > wait and so we don't need to mess with the recursion count internally
> > > when we reacquire the monitor.
> > >
> > > >
> > > > > which I don't like the sound of at all when it comes to ObjectMonitor
> > > > > state. So I'd like to understand in detail exactly what is going on here
> > > > > and why. This is a very intrusive change that seems to badly break
> > > > > encapsulation and impacts future changes to ObjectMonitor that are under
> > > > > investigation.
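The recursion count that has to be restored after a wait, as discussed above, cannot be observed on an intrinsic monitor from Java code. As a loose analogy only, the sketch below uses `java.util.concurrent.locks.ReentrantLock`, which exposes its hold count: a thread holding the lock three times releases all holds while waiting and gets all three back afterwards. This is not HotSpot's ObjectMonitor code, and the class and method names are made up for the illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical analogy for "restore the old recursion count after wait()".
final class HoldCountSketch {
    // Returns {hold count before waiting, hold count after waiting}.
    static int[] holdCountsAroundWait() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        Condition condition = lock.newCondition();
        // Three nested acquisitions, like three nested synchronized blocks.
        lock.lock();
        lock.lock();
        lock.lock();
        try {
            int before = lock.getHoldCount();
            // The wait gives up every hold and reacquires all of them before
            // returning, here simply after the timeout elapses.
            condition.await(10, TimeUnit.MILLISECONDS);
            int after = lock.getHoldCount();
            return new int[] {before, after};
        } finally {
            lock.unlock();
            lock.unlock();
            lock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int[] counts = holdCountsAroundWait();
        System.out.println("before=" + counts[0] + " after=" + counts[1]);
    }
}
```

In the scenario from the thread the difference is that the count seen when the wait starts can be lower than the true one, because nested locks were eliminated by the compiler, so the missing recursions have to be added when the monitor is reacquired.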
> > > > > > > > I would not regard this as breaking encapsulation. Certainly not badly. > > > > > > > > I've added a property relock_count_after_wait to JavaThread. The > property is well > > > > encapsulated. Future ObjectMonitor implementations have to deal with > recursion too. They are free > > > > in choosing a way to do that as long as that property is taken into > account. This is hardly a > > > > limitation. > > > > > > I do think this badly breaks encapsulation as you have to add a callout > > > from the guts of the ObjectMonitor code to reach into the thread to get > > > this lock count adjustment. I understand why you have had to do this but > > > I would much rather see a change to the EA optimisation strategy so that > > > this is not needed. > > > > > > > Note also that the property is a straight forward extension of the > existing concept of deferred > > > > local updates. It is embedded into the structure holding them. So not > even the footprint of a > > > > JavaThread is enlarged if no deferred updates are generated. > > > > > > [...] > > > > > > > > > > > I'm actually duplicating the existing external suspend mechanism, > because a thread can be > > > > suspended at most once. And hey, and don't like that either! But it > seems not unlikely that the > > > > duplicate can be removed together with the original and the new type > of handshakes that will be > > > > used for thread suspend can be used for object deoptimization too. See > today's discussion in > > > > JDK-8227745 [2]. > > > > > > I hope that discussion bears some fruit, at the moment it seems not to > > > be possible to use handshakes here. :( > > > > > > The external suspend mechanism is a royal pain in the proverbial that we > > > have to carefully live with. The idea that we're duplicating that for > > > use in another fringe area of functionality does not thrill me at all. 
> > > > > > To be clear, I understand the problem that exists and that you wish to > > > solve, but for the runtime parts I balk at the complexity cost of > > > solving it. > > > > I know it's complex, but by far no rocket science. > > > > Also I find it hard to imagine another fix for JDK-8233915 besides changing > the JVM TI specification. > > > > Thanks, Richard. > > > > -----Original Message----- > > From: David Holmes > > Sent: Dienstag, 17. Dezember 2019 08:03 > > To: Reingruber, Richard ; serviceability- > dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net; hotspot- > runtime-dev at openjdk.java.net; Vladimir Kozlov (vladimir.kozlov at oracle.com) > > > Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better Performance > in the Presence of JVMTI Agents > > > > > > > > David > > > > On 17/12/2019 4:57 pm, David Holmes wrote: > >> Hi Richard, > >> > >> On 14/12/2019 5:01 am, Reingruber, Richard wrote: > >>> Hi David, > >>> > >>> ?? > Some further queries/concerns: > >>> ?? > > >>> ?? > src/hotspot/share/runtime/objectMonitor.cpp > >>> ?? > > >>> ?? > Can you please explain the changes to ObjectMonitor::wait: > >>> ?? > > >>> ?? > !?? _recursions = save????? // restore the old recursion count > >>> ?? > !???????????????? + jt->get_and_reset_relock_count_after_wait(); // > >>> ?? > increased by the deferred relock count > >>> ?? > > >>> ?? > what is the "deferred relock count"? I gather it relates to > >>> ?? > > >>> ?? > "The code was extended to be able to deoptimize objects of a > >>> frame that > >>> ?? > is not the top frame and to let another thread than the owning > >>> thread do > >>> ?? > it." > >>> > >>> Yes, these relate. Currently EA based optimizations are reverted, when > >>> a compiled frame is replaced > >>> with corresponding interpreter frames. Part of this is relocking > >>> objects with eliminated > >>> locking. 
New with the enhancement is that we do this also just before > >>> object references are acquired > >>> through JVMTI. In this case we deoptimize also the owning compiled > >>> frame C and we register > >>> deoptimized objects as deferred updates. When control returns to C it > >>> gets deoptimized, we notice > >>> that objects are already deoptimized (reallocated and relocked), so we > >>> don't do it again (relocking > >>> twice would be incorrect of course). Deferred updates are copied into > >>> the new interpreter frames. > >>> > >>> Problem: relocking is not possible if the target thread T is waiting > >>> on the monitor that needs to be > >>> relocked. This happens only with non-local objects with > >>> EliminateNestedLocks. Instead relocking is > >>> deferred until T owns the monitor again. This is what the piece of > >>> code above does. > >> > >> Sorry I need some more detail here. How can you wait() on an object > >> monitor if the object allocation and/or locking was optimised away? And > >> what is a "non-local object" in this context? Isn't EA restricted to > >> thread-confined objects? > >> > >> Is it just that some of the locking gets optimized away e.g. > >> > >> synchronised(obj) { > >> ? synchronised(obj) { > >> ??? synchronised(obj) { > >> ????? obj.wait(); > >> ??? } > >> ? } > >> } > >> > >> If this is reduced to a form as-if it were a single lock of the monitor > >> (due to EA) and the wait() triggers a JVM TI event which leads to the > >> escape of "obj" then we need to reconstruct the true lock state, and so > >> when the wait() internally unblocks and reacquires the monitor it has to > >> set the true recursion count to 3, not the 1 that it appeared to be when > >> wait() was initially called. Is that the scenario? > >> > >> If so I find this truly awful. Anyone using wait() in a realistic form > >> requires a notification and so the object cannot be thread confined. 
In > >> which case I would strongly argue that upon hitting the wait() the deopt > >> should occur unconditionally and so the lock state is correct before we > >> wait and so we don't need to mess with the recursion count internally > >> when we reacquire the monitor. > >> > >>> > >>> ?? > which I don't like the sound of at all when it comes to > >>> ObjectMonitor > >>> ?? > state. So I'd like to understand in detail exactly what is going > >>> on here > >>> ?? > and why.? This is a very intrusive change that seems to badly break > >>> ?? > encapsulation and impacts future changes to ObjectMonitor that > >>> are under > >>> ?? > investigation. > >>> > >>> I would not regard this as breaking encapsulation. Certainly not badly. > >>> > >>> I've added a property relock_count_after_wait to JavaThread. The > >>> property is well > >>> encapsulated. Future ObjectMonitor implementations have to deal with > >>> recursion too. They are free in > >>> choosing a way to do that as long as that property is taken into > >>> account. This is hardly a > >>> limitation. > >> > >> I do think this badly breaks encapsulation as you have to add a callout > >> from the guts of the ObjectMonitor code to reach into the thread to get > >> this lock count adjustment. I understand why you have had to do this but > >> I would much rather see a change to the EA optimisation strategy so that > >> this is not needed. > >> > >>> Note also that the property is a straight forward extension of the > >>> existing concept of deferred > >>> local updates. It is embedded into the structure holding them. So not > >>> even the footprint of a > >>> JavaThread is enlarged if no deferred updates are generated. > >>> > >>> ?? > --- > >>> ?? > > >>> ?? > src/hotspot/share/runtime/thread.cpp > >>> ?? > > >>> ?? > Can you please explain why > >>> JavaThread::wait_for_object_deoptimization > >>> ?? > has to be handcrafted in this way rather than using proper > >>> transitions. > >>> ?? 
> > >>> > >>> I wrote wait_for_object_deoptimization taking > >>> JavaThread::java_suspend_self_with_safepoint_check > >>> as template. So in short: for the same reasons :) > >>> > >>> Threads reach both methods as part of thread state transitions, > >>> therefore special handling is > >>> required to change thread state on top of ongoing transitions. > >>> > >>> ?? > We got rid of "deopt suspend" some time ago and it is disturbing > >>> to see > >>> ?? > it being added back (effectively). This seems like it may be > >>> something > >>> ?? > that handshakes could be used for. > >>> > >>> Deopt suspend used to be something rather different with a similar > >>> name[1]. It is not being added back. > >> > >> I stand corrected. Despite comments in the code to the contrary > >> deopt_suspend didn't actually cause a self-suspend. I was doing a lot of > >> cleanup in this area 13 years ago :) > >> > >>> > >>> I'm actually duplicating the existing external suspend mechanism, > >>> because a thread can be suspended > >>> at most once. And hey, and don't like that either! But it seems not > >>> unlikely that the duplicate can > >>> be removed together with the original and the new type of handshakes > >>> that will be used for > >>> thread suspend can be used for object deoptimization too. See today's > >>> discussion in JDK-8227745 [2]. > >> > >> I hope that discussion bears some fruit, at the moment it seems not to > >> be possible to use handshakes here. :( > >> > >> The external suspend mechanism is a royal pain in the proverbial that we > >> have to carefully live with. The idea that we're duplicating that for > >> use in another fringe area of functionality does not thrill me at all. > >> > >> To be clear, I understand the problem that exists and that you wish to > >> solve, but for the runtime parts I balk at the complexity cost of > >> solving it. > >> > >> Thanks, > >> David > >> ----- > >> > >>> Thanks, Richard. 
> >>> > >>> [1] Deopt suspend was something like an async. handshake for > >>> architectures with register windows, > >>> ???? where patching the return pc for deoptimization of a compiled > >>> frame was racy if the owner thread > >>> ???? was in native code. Instead a "deopt" suspend flag was set on > >>> which the thread patched its own > >>> ???? frame upon return from native. So no thread was suspended. It got > >>> its name only from the name of > >>> ???? the flags. > >>> > >>> [2] Discussion about using handshakes to sync. with the target thread: > >>> > >>> https://bugs.openjdk.java.net/browse/JDK- > 8227745?focusedCommentId=14306727&page=com.atlassian.jira.plugin.syste > m.issuetabpanels:comment-tabpanel#comment-14306727 > >>> > >>> > >>> -----Original Message----- > >>> From: David Holmes > >>> Sent: Freitag, 13. Dezember 2019 00:56 > >>> To: Reingruber, Richard ; > >>> serviceability-dev at openjdk.java.net; > >>> hotspot-compiler-dev at openjdk.java.net; > >>> hotspot-runtime-dev at openjdk.java.net > >>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > >>> Performance in the Presence of JVMTI Agents > >>> > >>> Hi Richard, > >>> > >>> Some further queries/concerns: > >>> > >>> src/hotspot/share/runtime/objectMonitor.cpp > >>> > >>> Can you please explain the changes to ObjectMonitor::wait: > >>> > >>> !?? _recursions = save????? // restore the old recursion count > >>> !???????????????? + jt->get_and_reset_relock_count_after_wait(); // > >>> increased by the deferred relock count > >>> > >>> what is the "deferred relock count"? I gather it relates to > >>> > >>> "The code was extended to be able to deoptimize objects of a frame that > >>> is not the top frame and to let another thread than the owning thread do > >>> it." > >>> > >>> which I don't like the sound of at all when it comes to ObjectMonitor > >>> state. So I'd like to understand in detail exactly what is going on here > >>> and why.? 
This is a very intrusive change that seems to badly break > >>> encapsulation and impacts future changes to ObjectMonitor that are under > >>> investigation. > >>> > >>> --- > >>> > >>> src/hotspot/share/runtime/thread.cpp > >>> > >>> Can you please explain why JavaThread::wait_for_object_deoptimization > >>> has to be handcrafted in this way rather than using proper transitions. > >>> > >>> We got rid of "deopt suspend" some time ago and it is disturbing to see > >>> it being added back (effectively). This seems like it may be something > >>> that handshakes could be used for. > >>> > >>> Thanks, > >>> David > >>> ----- > >>> > >>> On 12/12/2019 7:02 am, David Holmes wrote: > >>>> On 12/12/2019 1:07 am, Reingruber, Richard wrote: > >>>>> Hi David, > >>>>> > >>>>> ??? > Most of the details here are in areas I can comment on in detail, > >>>>> but I > >>>>> ??? > did take an initial general look at things. > >>>>> > >>>>> Thanks for taking the time! > >>>> > >>>> Apologies the above should read: > >>>> > >>>> "Most of the details here are in areas I *can't* comment on in detail > >>>> ..." > >>>> > >>>> David > >>>> > >>>>> ??? > The only thing that jumped out at me is that I think the > >>>>> ??? > DeoptimizeObjectsALotThread should be a hidden thread. > >>>>> ??? > > >>>>> ??? > +? bool is_hidden_from_external_view() const { return true; } > >>>>> > >>>>> Yes, it should. Will add the method like above. > >>>>> > >>>>> ??? > Also I don't see any testing of the DeoptimizeObjectsALotThread. > >>>>> Without > >>>>> ??? > active testing this will just bit-rot. > >>>>> > >>>>> DeoptimizeObjectsALot is meant for stress testing with a larger > >>>>> workload. I will add a minimal test > >>>>> to keep it fresh. > >>>>> > >>>>> ??? > Also on the tests I don't understand your @requires clause: > >>>>> ??? > > >>>>> ??? >?? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled > & > >>>>> ??? > (vm.opt.TieredCompilation != true)) > >>>>> ??? > > >>>>> ??? 
> This seems to require that TieredCompilation is disabled, but > >>>>> tiered is > >>>>> ??? > our normal mode of operation. ?? > >>>>> ??? > > >>>>> > >>>>> I removed the clause. I guess I wanted to target the tests towards the > >>>>> code they are supposed to > >>>>> test, and it's easier to analyze failures w/o tiered compilation and > >>>>> with just one compiler thread. > >>>>> > >>>>> Additionally I will make use of > >>>>> compiler.whitebox.CompilerWhiteBoxTest.THRESHOLD in the tests. > >>>>> > >>>>> Thanks, > >>>>> Richard. > >>>>> > >>>>> -----Original Message----- > >>>>> From: David Holmes > >>>>> Sent: Mittwoch, 11. Dezember 2019 08:03 > >>>>> To: Reingruber, Richard ; > >>>>> serviceability-dev at openjdk.java.net; > >>>>> hotspot-compiler-dev at openjdk.java.net; > >>>>> hotspot-runtime-dev at openjdk.java.net > >>>>> Subject: Re: RFR(L) 8227745: Enable Escape Analysis for Better > >>>>> Performance in the Presence of JVMTI Agents > >>>>> > >>>>> Hi Richard, > >>>>> > >>>>> On 11/12/2019 7:45 am, Reingruber, Richard wrote: > >>>>>> Hi, > >>>>>> > >>>>>> I would like to get reviews please for > >>>>>> > >>>>>> http://cr.openjdk.java.net/~rrich/webrevs/2019/8227745/webrev.3/ > >>>>>> > >>>>>> Corresponding RFE: > >>>>>> https://bugs.openjdk.java.net/browse/JDK-8227745 > >>>>>> > >>>>>> Fixes also https://bugs.openjdk.java.net/browse/JDK-8233915 > >>>>>> And potentially https://bugs.openjdk.java.net/browse/JDK-8214584 [1] > >>>>>> > >>>>>> Vladimir Kozlov kindly put webrev.3 through tier1-8 testing without > >>>>>> issues (thanks!). In addition the > >>>>>> change is being tested at SAP since I posted the first RFR some > >>>>>> months ago. > >>>>>> > >>>>>> The intention of this enhancement is to benefit performance wise from > >>>>>> escape analysis even if JVMTI > >>>>>> agents request capabilities that allow them to access local variable > >>>>>> values. E.g. 
if you start-up > >>>>>> with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n, then > >>>>>> escape analysis is disabled right > >>>>>> from the beginning, well before a debugger attaches -- if ever one > >>>>>> should do so. With the > >>>>>> enhancement, escape analysis will remain enabled until and after a > >>>>>> debugger attaches. EA based > >>>>>> optimizations are reverted just before an agent acquires the > >>>>>> reference to an object. In the JBS item > >>>>>> you'll find more details. > >>>>> > >>>>> Most of the details here are in areas I can comment on in detail, but I > >>>>> did take an initial general look at things. > >>>>> > >>>>> The only thing that jumped out at me is that I think the > >>>>> DeoptimizeObjectsALotThread should be a hidden thread. > >>>>> > >>>>> +? bool is_hidden_from_external_view() const { return true; } > >>>>> > >>>>> Also I don't see any testing of the DeoptimizeObjectsALotThread. > >>>>> Without > >>>>> active testing this will just bit-rot. > >>>>> > >>>>> Also on the tests I don't understand your @requires clause: > >>>>> > >>>>> ??? @requires ((vm.compMode != "Xcomp") & vm.compiler2.enabled & > >>>>> (vm.opt.TieredCompilation != true)) > >>>>> > >>>>> This seems to require that TieredCompilation is disabled, but tiered is > >>>>> our normal mode of operation. ?? > >>>>> > >>>>> Thanks, > >>>>> David > >>>>> > >>>>>> Thanks, > >>>>>> Richard. 
> >>>>>> > >>>>>> [1] Experimental fix for JDK-8214584 based on JDK-8227745 > >>>>>> > http://cr.openjdk.java.net/~rrich/webrevs/2019/8214584/experiment_v1.patc > h > >>>>>> > >>>>>> > >>>>>> From zgu at redhat.com Mon Feb 24 16:49:18 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 24 Feb 2020 11:49:18 -0500 Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops In-Reply-To: References: <8ea20a15-bdb9-27b0-c306-154f539a3674@oracle.com> <3a5d60fd-04d5-da96-3d79-242d43fdec79@redhat.com> <901d7307-cf36-0367-e09c-ff47c76bbc25@oracle.com> Message-ID: <75082842-c67d-7a60-58b3-c10c67f7646f@redhat.com> Hi all, Updated according to your comments: http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.02/ I modified vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test [1] to walk 300K objects. Without patch: Time: 987431 nsecs Time: 1135390 nsecs Time: 1142519 nsecs Time: 962816 nsecs Time: 1015958 nsecs Avg: 1048822 nsecs With patch: 1105015 nsecs 1142425 nsecs 968057 nsecs 1383838 nsecs 1079885 nsecs Avg: 1135844 nsecs So, it shows about 8% performance hit. Thanks, -Zhengyu [1] http://cr.openjdk.java.net/~zgu/JDK-8238633/test/webrev.00/ On 2/21/20 8:01 AM, coleen.phillimore at oracle.com wrote: > Adding serviceability-dev back. > Coleen > > On 2/21/20 7:59 AM, coleen.phillimore at oracle.com wrote: >> >> Hi, I had a quick look at this, minus the shenandoah code. >> >> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/src/hotspot/share/gc/shared/objectMarker.hpp.html >> >> >> I think this file could have forward declarations of GrowableArray and >> I didn't see a need for the markWord.hpp include. >> >> This change on the whole looks good to me. 
>> >> Coleen >> >> On 2/21/20 5:23 AM, Stefan Karlsson wrote: >>> Hi Zhengyu, >>> >>> On 2020-02-17 15:51, Zhengyu Gu wrote: >>>> Hi Stefan, >>>> >>>> Thanks for the review and suggestions, updated accordingly: >>>> >>>> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/ >>> >>> Thanks for moving the code. I think this looks good. >>> >>> If you're up for it, I have a couple of style change suggestions: >>> >>> 1) ObjectMarker uses two verbs to describe the same thing: "mark" and >>> "visit". I propose that we only use "mark" in ObjectMarker and leave >>> the usage of "visited" to the Jvmti code. >>> >>> 2) Some updates to odd whitespaces >>> >>> 3) Using forward declarations in Shenandoah code. >>> >>> I've bundled those changes into webrevs: >>> >>> https://cr.openjdk.java.net/~stefank/8238633/webrev.01.delta >>> https://cr.openjdk.java.net/~stefank/8238633/webrev.01 >>> >>> Regarding performance testing, the HeapWalkTests you used seems to >>> use a very small heap. I think it would be good to redo the >>> measurements on a larger heap. Could you take the HeapWalkTest and >>> add a few GBs of small, linked objects? >>> >>> Thank, >>> StefanK >>>> >>>>> >>>>> --- >>>>> Previously, the calls to 'mark' and 'visited' were inlineable, but >>>>> now every GC has to take a virtual call when marking the objects. >>>>> My guess is that this code is slow anyway, and that it doesn't >>>>> matter too much, but did you measure the effect of that change >>>>> with, for example, G1? >>>>> >>>> I did rough measurement, timing >>>> vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test. >>>> >>>> If you know any tests/benchmarks I should measure, please let me know. >>>> >>>> Thanks, >>>> >>>> -Zhengyu >>>> >>>> >>>>> Thanks, >>>>> StefanK >>>>> >>>>>> Test: >>>>>> ?? hotspot_gc >>>>>> ?? vmTestbase_nsk_jdi >>>>>> ?? 
vmTestbase_nsk_jvmti
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> -Zhengyu
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

From erik.osterlund at oracle.com Mon Feb 24 17:04:33 2020
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Mon, 24 Feb 2020 18:04:33 +0100
Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops
In-Reply-To: <75082842-c67d-7a60-58b3-c10c67f7646f@redhat.com>
References: <75082842-c67d-7a60-58b3-c10c67f7646f@redhat.com>
Message-ID:

Hi Zhengyu,

Can't your barriers just perform a NULL check on the forwardee instead? forwardee() == NULL never means forwarded, does it? Both JVMTI and JFR just "mark" the markWord, leaving its forwardee == NULL.

That way you can solve the issue in the backend instead, and we don't need to do anything about JFR either. Or did I miss something?

Thanks,
/Erik

> On 24 Feb 2020, at 17:49, Zhengyu Gu wrote:
>
> Hi all,
>
> Updated according to your comments:
> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.02/
>
> I modified vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test [1] to walk 300K objects.
>
> Without patch:
> Time: 987431 nsecs
> Time: 1135390 nsecs
> Time: 1142519 nsecs
> Time: 962816 nsecs
> Time: 1015958 nsecs
>
> Avg: 1048822 nsecs
>
> With patch:
> 1105015 nsecs
> 1142425 nsecs
> 968057 nsecs
> 1383838 nsecs
> 1079885 nsecs
>
> Avg: 1135844 nsecs
>
> So, it shows about 8% performance hit.
>
> Thanks,
>
> -Zhengyu
>
> [1] http://cr.openjdk.java.net/~zgu/JDK-8238633/test/webrev.00/
>
>
>
>
>
>> On 2/21/20 8:01 AM, coleen.phillimore at oracle.com wrote:
>> Adding serviceability-dev back.
>> Coleen
>>> On 2/21/20 7:59 AM, coleen.phillimore at oracle.com wrote:
>>>
>>> Hi, I had a quick look at this, minus the shenandoah code.
>>>
>>> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/src/hotspot/share/gc/shared/objectMarker.hpp.html
>>>
>>> I think this file could have forward declarations of GrowableArray and I didn't see a need for the markWord.hpp include.
>>> >>> This change on the whole looks good to me. >>> >>> Coleen >>> >>> On 2/21/20 5:23 AM, Stefan Karlsson wrote: >>>> Hi Zhengyu, >>>> >>>> On 2020-02-17 15:51, Zhengyu Gu wrote: >>>>> Hi Stefan, >>>>> >>>>> Thanks for the review and suggestions, updated accordingly: >>>>> >>>>> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/ >>>> >>>> Thanks for moving the code. I think this looks good. >>>> >>>> If you're up for it, I have a couple of style change suggestions: >>>> >>>> 1) ObjectMarker uses two verbs to describe the same thing: "mark" and "visit". I propose that we only use "mark" in ObjectMarker and leave the usage of "visited" to the Jvmti code. >>>> >>>> 2) Some updates to odd whitespaces >>>> >>>> 3) Using forward declarations in Shenandoah code. >>>> >>>> I've bundled those changes into webrevs: >>>> >>>> https://cr.openjdk.java.net/~stefank/8238633/webrev.01.delta >>>> https://cr.openjdk.java.net/~stefank/8238633/webrev.01 >>>> >>>> Regarding performance testing, the HeapWalkTests you used seems to use a very small heap. I think it would be good to redo the measurements on a larger heap. Could you take the HeapWalkTest and add a few GBs of small, linked objects? >>>> >>>> Thank, >>>> StefanK >>>>> >>>>>> >>>>>> --- >>>>>> Previously, the calls to 'mark' and 'visited' were inlineable, but now every GC has to take a virtual call when marking the objects. My guess is that this code is slow anyway, and that it doesn't matter too much, but did you measure the effect of that change with, for example, G1? >>>>>> >>>>> I did rough measurement, timing vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test. >>>>> >>>>> If you know any tests/benchmarks I should measure, please let me know. 
>>>>> >>>>> Thanks, >>>>> >>>>> -Zhengyu >>>>> >>>>> >>>>>> Thanks, >>>>>> StefanK >>>>>> >>>>>>> Test: >>>>>>> hotspot_gc >>>>>>> vmTestbase_nsk_jdi >>>>>>> vmTestbase_nsk_jvmti >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> -Zhengyu >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> > From serguei.spitsyn at oracle.com Mon Feb 24 18:36:18 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 24 Feb 2020 10:36:18 -0800 Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows In-Reply-To: References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> Message-ID: <0a2df665-2e08-6139-c131-043a425b4916@oracle.com> An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Mon Feb 24 19:04:06 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 24 Feb 2020 11:04:06 -0800 Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port In-Reply-To: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> Message-ID: Hi Daniil, I've looked at CSR and posted a couple of questions there. It'd be nice if you help to resolve my confusion. :) Thanks, Serguei On 2/23/20 20:21, Daniil Titov wrote: > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > New CSR [3] was created for this change and it needs to be reviewed as well. > > Man pages for jhsdb will be updated in a separate issue. 
> > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > // delegate to the actual SA debug server. > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > but I would prefer to address it in a separate issue. > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > container and connecting to it with the GUI debugger. > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > Thank you, > Daniil > > From serguei.spitsyn at oracle.com Mon Feb 24 19:11:44 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 24 Feb 2020 11:11:44 -0800 Subject: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element In-Reply-To: References: Message-ID: <480add53-59c5-c132-f903-2aa99f121ffd@oracle.com> An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Mon Feb 24 20:51:40 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 24 Feb 2020 12:51:40 -0800 Subject: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element In-Reply-To: References: Message-ID: An HTML attachment was scrubbed... 
URL:

From zgu at redhat.com Mon Feb 24 20:58:55 2020
From: zgu at redhat.com (Zhengyu Gu)
Date: Mon, 24 Feb 2020 15:58:55 -0500
Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops
In-Reply-To:
References: <75082842-c67d-7a60-58b3-c10c67f7646f@redhat.com>
Message-ID:

Hi Erik,

On 2/24/20 12:04 PM, Erik Österlund wrote:
> Hi Zhengyu,
>
> Can't your barriers just perform a NULL check on the forwardee instead? forwardee() == NULL never means forwarded, does it? Both JVMTI and JFR just "mark" the markWord, leaving its forwardee == NULL.
>
> That way you can solve the issue in the backend instead, and we don't need to do anything about JFR either. Or did I miss something?

You are right, this is a much simpler solution. But the concern is that resolve_forward() is the most used barrier, so an additional null check is undesirable.

After an offline chat with my colleagues, we realized that it may be OK. As the JVMTI/JFR heap walk happens at safepoints, we really don't have to add the null check to the regular barrier. Instead, we can force the GC to use a different version of resolve_forward with a null check.

Let me prototype this alternative; I will get back to you soon.

Thanks,

-Zhengyu

>
> Thanks,
> /Erik
>
>> On 24 Feb 2020, at 17:49, Zhengyu Gu wrote:
>>
>> Hi all,
>>
>> Updated according to your comments:
>> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.02/
>>
>> I modified vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test [1] to walk 300K objects.
>>
>> Without patch:
>> Time: 987431 nsecs
>> Time: 1135390 nsecs
>> Time: 1142519 nsecs
>> Time: 962816 nsecs
>> Time: 1015958 nsecs
>>
>> Avg: 1048822 nsecs
>>
>> With patch:
>> 1105015 nsecs
>> 1142425 nsecs
>> 968057 nsecs
>> 1383838 nsecs
>> 1079885 nsecs
>>
>> Avg: 1135844 nsecs
>>
>> So, it shows about 8% performance hit.
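
The quoted averages and the "about 8%" figure can be re-derived from the ten timings. The snippet below is a small sketch of that arithmetic; the class and method names are made up for illustration, and the only inputs are the numbers quoted in the message.

```java
final class HeapWalkTimings {

    // Timings in nanoseconds, copied from the message above.
    static final long[] WITHOUT_PATCH = {987431, 1135390, 1142519, 962816, 1015958};
    static final long[] WITH_PATCH = {1105015, 1142425, 968057, 1383838, 1079885};

    /** Arithmetic mean of the samples. */
    static double mean(long[] samples) {
        long sum = 0;
        for (long s : samples) {
            sum += s;
        }
        return (double) sum / samples.length;
    }

    /** Relative slowdown of the patched runs over the baseline, in percent. */
    static double overheadPercent(long[] baseline, long[] patched) {
        double base = mean(baseline);
        return (mean(patched) - base) / base * 100.0;
    }

    public static void main(String[] args) {
        System.out.printf("without patch: %.0f ns%n", mean(WITHOUT_PATCH));
        System.out.printf("with patch:    %.0f ns%n", mean(WITH_PATCH));
        System.out.printf("overhead:      %.1f%%%n", overheadPercent(WITHOUT_PATCH, WITH_PATCH));
    }
}
```

The means match the quoted averages, and the relative difference comes out at roughly 8.3%. With five runs per configuration and ranges that overlap (962816 to 1142519 ns without the patch, 968057 to 1383838 ns with it), the figure is best read as a rough indication.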
>> >> Thanks, >> >> -Zhengyu >> >> [1] http://cr.openjdk.java.net/~zgu/JDK-8238633/test/webrev.00/ >> >> >> >> >> >>> On 2/21/20 8:01 AM, coleen.phillimore at oracle.com wrote: >>> Adding serviceability-dev back. >>> Coleen >>>> On 2/21/20 7:59 AM, coleen.phillimore at oracle.com wrote: >>>> >>>> Hi, I had a quick look at this, minus the shenandoah code. >>>> >>>> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/src/hotspot/share/gc/shared/objectMarker.hpp.html >>>> >>>> I think this file could have forward declarations of GrowableArray and I didn't see a need for the markWord.hpp include. >>>> >>>> This change on the whole looks good to me. >>>> >>>> Coleen >>>> >>>> On 2/21/20 5:23 AM, Stefan Karlsson wrote: >>>>> Hi Zhengyu, >>>>> >>>>> On 2020-02-17 15:51, Zhengyu Gu wrote: >>>>>> Hi Stefan, >>>>>> >>>>>> Thanks for the review and suggestions, updated accordingly: >>>>>> >>>>>> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/ >>>>> >>>>> Thanks for moving the code. I think this looks good. >>>>> >>>>> If you're up for it, I have a couple of style change suggestions: >>>>> >>>>> 1) ObjectMarker uses two verbs to describe the same thing: "mark" and "visit". I propose that we only use "mark" in ObjectMarker and leave the usage of "visited" to the Jvmti code. >>>>> >>>>> 2) Some updates to odd whitespaces >>>>> >>>>> 3) Using forward declarations in Shenandoah code. >>>>> >>>>> I've bundled those changes into webrevs: >>>>> >>>>> https://cr.openjdk.java.net/~stefank/8238633/webrev.01.delta >>>>> https://cr.openjdk.java.net/~stefank/8238633/webrev.01 >>>>> >>>>> Regarding performance testing, the HeapWalkTests you used seems to use a very small heap. I think it would be good to redo the measurements on a larger heap. Could you take the HeapWalkTest and add a few GBs of small, linked objects? 
>>>>>
>>>>> Thanks,
>>>>> StefanK
>>>>>>
>>>>>>>
>>>>>>> ---
>>>>>>> Previously, the calls to 'mark' and 'visited' were inlineable, but now every GC has to take a virtual call when marking the objects. My guess is that this code is slow anyway, and that it doesn't matter too much, but did you measure the effect of that change with, for example, G1?
>>>>>>>
>>>>>> I did rough measurement, timing vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test.
>>>>>>
>>>>>> If you know any tests/benchmarks I should measure, please let me know.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> -Zhengyu
>>>>>>
>>>>>>
>>>>>>> Thanks,
>>>>>>> StefanK
>>>>>>>
>>>>>>>> Test:
>>>>>>>> hotspot_gc
>>>>>>>> vmTestbase_nsk_jdi
>>>>>>>> vmTestbase_nsk_jvmti
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> -Zhengyu
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>
>

From erik.osterlund at oracle.com Mon Feb 24 21:24:10 2020
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Mon, 24 Feb 2020 22:24:10 +0100
Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops
In-Reply-To:
References:
Message-ID: <940D46CE-22D6-479B-B8EF-65FFA896C4A0@oracle.com>

Hi Zhengyu,

> On 24 Feb 2020, at 21:59, Zhengyu Gu wrote:
>
> Hi Erik,
>
>> On 2/24/20 12:04 PM, Erik Österlund wrote:
>> Hi Zhengyu,
>> Can't your barriers just perform a NULL check on the forwardee instead? forwardee() == NULL never means forwarded, does it? Both JVMTI and JFR just "mark" the markWord, leaving its forwardee == NULL.
>> That way you can solve the issue in the backend instead, and we don't need to do anything about JFR either. Or did I miss something?
>
> You are right, this is a much simpler solution. But the concern is that resolve_forward() is the most used barrier, so an additional null check is undesirable.
>
> After an offline chat with my colleagues, we realized that it may be OK. As the JVMTI/JFR heap walk happens at safepoints, we really don't have to add the null check to the regular barrier.
> Instead, we can force the GC to use a different version of resolve_forward with a null check.
>
> Let me prototype this alternative; I will get back to you soon.

The JFR heap walker does use the shared barriers in the safepoint though. So that optimization sounds like it won't work.

Having said that, the null check will be taken only for runtime code, not when going through the JIT. I would be surprised if this very well predicted NULL check used by runtime code would be noticeable, especially since you are probably going to CAS as well in the same path this is taken (the mark word is "marked").

So perhaps just adding the NULL check in the barrier for the case where the markWord "is_marked" is the sane thing to do, knowing that the other costs taken in the same path will dominate.

Thanks,
/Erik

> Thanks,
>
> -Zhengyu
>
>> Thanks,
>> /Erik
>>>> On 24 Feb 2020, at 17:49, Zhengyu Gu wrote:
>>>
>>> Hi all,
>>>
>>> Updated according to your comments:
>>> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.02/
>>>
>>> I modified vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test [1] to walk 300K objects.
>>>
>>> Without patch:
>>> Time: 987431 nsecs
>>> Time: 1135390 nsecs
>>> Time: 1142519 nsecs
>>> Time: 962816 nsecs
>>> Time: 1015958 nsecs
>>>
>>> Avg: 1048822 nsecs
>>>
>>> With patch:
>>> 1105015 nsecs
>>> 1142425 nsecs
>>> 968057 nsecs
>>> 1383838 nsecs
>>> 1079885 nsecs
>>>
>>> Avg: 1135844 nsecs
>>>
>>> So, it shows about 8% performance hit.
>>>
>>> Thanks,
>>>
>>> -Zhengyu
>>>
>>> [1] http://cr.openjdk.java.net/~zgu/JDK-8238633/test/webrev.00/
>>>
>>>
>>>
>>>
>>>
>>>> On 2/21/20 8:01 AM, coleen.phillimore at oracle.com wrote:
>>>> Adding serviceability-dev back.
>>>> Coleen
>>>>> On 2/21/20 7:59 AM, coleen.phillimore at oracle.com wrote:
>>>>>
>>>>> Hi, I had a quick look at this, minus the shenandoah code.
>>>>> >>>>> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/src/hotspot/share/gc/shared/objectMarker.hpp.html >>>>> >>>>> I think this file could have forward declarations of GrowableArray and I didn't see a need for the markWord.hpp include. >>>>> >>>>> This change on the whole looks good to me. >>>>> >>>>> Coleen >>>>> >>>>> On 2/21/20 5:23 AM, Stefan Karlsson wrote: >>>>>> Hi Zhengyu, >>>>>> >>>>>> On 2020-02-17 15:51, Zhengyu Gu wrote: >>>>>>> Hi Stefan, >>>>>>> >>>>>>> Thanks for the review and suggestions, updated accordingly: >>>>>>> >>>>>>> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/ >>>>>> >>>>>> Thanks for moving the code. I think this looks good. >>>>>> >>>>>> If you're up for it, I have a couple of style change suggestions: >>>>>> >>>>>> 1) ObjectMarker uses two verbs to describe the same thing: "mark" and "visit". I propose that we only use "mark" in ObjectMarker and leave the usage of "visited" to the Jvmti code. >>>>>> >>>>>> 2) Some updates to odd whitespaces >>>>>> >>>>>> 3) Using forward declarations in Shenandoah code. >>>>>> >>>>>> I've bundled those changes into webrevs: >>>>>> >>>>>> https://cr.openjdk.java.net/~stefank/8238633/webrev.01.delta >>>>>> https://cr.openjdk.java.net/~stefank/8238633/webrev.01 >>>>>> >>>>>> Regarding performance testing, the HeapWalkTests you used seems to use a very small heap. I think it would be good to redo the measurements on a larger heap. Could you take the HeapWalkTest and add a few GBs of small, linked objects? >>>>>> >>>>>> Thank, >>>>>> StefanK >>>>>>> >>>>>>>> >>>>>>>> --- >>>>>>>> Previously, the calls to 'mark' and 'visited' were inlineable, but now every GC has to take a virtual call when marking the objects. My guess is that this code is slow anyway, and that it doesn't matter too much, but did you measure the effect of that change with, for example, G1? >>>>>>>> >>>>>>> I did rough measurement, timing vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test. 
>>>>>>> >>>>>>> If you know any tests/benchmarks I should measure, please let me know. >>>>>>> >>>>>>> Thanks, >>>>>>> >>>>>>> -Zhengyu >>>>>>> >>>>>>> >>>>>>>> Thanks, >>>>>>>> StefanK >>>>>>>> >>>>>>>>> Test: >>>>>>>>> hotspot_gc >>>>>>>>> vmTestbase_nsk_jdi >>>>>>>>> vmTestbase_nsk_jvmti >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> >>>>>>>>> -Zhengyu >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>> > From zgu at redhat.com Mon Feb 24 22:38:14 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 24 Feb 2020 17:38:14 -0500 Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops In-Reply-To: <940D46CE-22D6-479B-B8EF-65FFA896C4A0@oracle.com> References: <940D46CE-22D6-479B-B8EF-65FFA896C4A0@oracle.com> Message-ID: <3d8553da-f7a8-fe84-d923-b10c5b47775e@redhat.com> > > The JFR heap walker does use the shared barriers in the safepoint though. So that optimization sounds like it won?t work. Okay. > > Having said that, the null check will be taken only for runtime code, not when going through the JIT. I would be surprised if this very well predicted NULL check used by runtime code would be noticeable, especially since you are probably going to CAS as well in the same path this is taken (the mark word is ?marked?). > > So perhaps just adding the NULL check in the barrier for the case where the markWord ?is_marked? is the sane thing to do, knowing that the other costs taken in the same path will dominate. I have this patch, exactly what you suggested. I will let Aleksey run his numbers. 
diff -r ef1e608a5ecc src/hotspot/share/gc/shenandoah/shenandoahForwarding.inline.hpp
--- a/src/hotspot/share/gc/shenandoah/shenandoahForwarding.inline.hpp  Mon Feb 24 15:03:28 2020 +0100
+++ b/src/hotspot/share/gc/shenandoah/shenandoahForwarding.inline.hpp  Mon Feb 24 12:56:08 2020 -0500
@@ -37,10 +37,12 @@
 inline HeapWord* ShenandoahForwarding::get_forwardee_raw_unchecked(oop obj) {
   markWord mark = obj->mark_raw();
   if (mark.is_marked()) {
-    return (HeapWord*) mark.clear_lock_bits().to_pointer();
-  } else {
-    return cast_from_oop<HeapWord*>(obj);
+    HeapWord* fwdptr = (HeapWord*)mark.clear_lock_bits().to_pointer();
+    if (fwdptr != NULL) {
+      return fwdptr;
+    }
   }
+  return cast_from_oop<HeapWord*>(obj);
 }

Thanks, -Zhengyu > > Thanks, > /Erik > >> Thanks, >> >> -Zhengyu >> >>> Thanks, >>> /Erik >>>>> On 24 Feb 2020, at 17:49, Zhengyu Gu wrote: >>>> >>>> Hi all, >>>> >>>> Updated according to your comments: >>>> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.02/ >>>> >>>> I modified vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test [1] to walk 300K objects. >>>> >>>> Without patch: >>>> Time: 987431 nsecs >>>> Time: 1135390 nsecs >>>> Time: 1142519 nsecs >>>> Time: 962816 nsecs >>>> Time: 1015958 nsecs >>>> >>>> Avg: 1048822 nsecs >>>> >>>> With patch: >>>> 1105015 nsecs >>>> 1142425 nsecs >>>> 968057 nsecs >>>> 1383838 nsecs >>>> 1079885 nsecs >>>> >>>> Avg: 1135844 nsecs >>>> >>>> So, it shows about an 8% performance hit. >>>> >>>> Thanks, >>>> >>>> -Zhengyu >>>> >>>> [1] http://cr.openjdk.java.net/~zgu/JDK-8238633/test/webrev.00/ >>>> >>>> >>>> >>>> >>>> >>>>> On 2/21/20 8:01 AM, coleen.phillimore at oracle.com wrote: >>>>> Adding serviceability-dev back. >>>>> Coleen >>>>>> On 2/21/20 7:59 AM, coleen.phillimore at oracle.com wrote: >>>>>> >>>>>> Hi, I had a quick look at this, minus the shenandoah code.
>>>>>> >>>>>> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/src/hotspot/share/gc/shared/objectMarker.hpp.html >>>>>> >>>>>> I think this file could have forward declarations of GrowableArray and I didn't see a need for the markWord.hpp include. >>>>>> >>>>>> This change on the whole looks good to me. >>>>>> >>>>>> Coleen >>>>>> >>>>>> On 2/21/20 5:23 AM, Stefan Karlsson wrote: >>>>>>> Hi Zhengyu, >>>>>>> >>>>>>> On 2020-02-17 15:51, Zhengyu Gu wrote: >>>>>>>> Hi Stefan, >>>>>>>> >>>>>>>> Thanks for the review and suggestions, updated accordingly: >>>>>>>> >>>>>>>> http://cr.openjdk.java.net/~zgu/JDK-8238633/webrev.01/ >>>>>>> >>>>>>> Thanks for moving the code. I think this looks good. >>>>>>> >>>>>>> If you're up for it, I have a couple of style change suggestions: >>>>>>> >>>>>>> 1) ObjectMarker uses two verbs to describe the same thing: "mark" and "visit". I propose that we only use "mark" in ObjectMarker and leave the usage of "visited" to the Jvmti code. >>>>>>> >>>>>>> 2) Some updates to odd whitespaces >>>>>>> >>>>>>> 3) Using forward declarations in Shenandoah code. >>>>>>> >>>>>>> I've bundled those changes into webrevs: >>>>>>> >>>>>>> https://cr.openjdk.java.net/~stefank/8238633/webrev.01.delta >>>>>>> https://cr.openjdk.java.net/~stefank/8238633/webrev.01 >>>>>>> >>>>>>> Regarding performance testing, the HeapWalkTests you used seems to use a very small heap. I think it would be good to redo the measurements on a larger heap. Could you take the HeapWalkTest and add a few GBs of small, linked objects? >>>>>>> >>>>>>> Thank, >>>>>>> StefanK >>>>>>>> >>>>>>>>> >>>>>>>>> --- >>>>>>>>> Previously, the calls to 'mark' and 'visited' were inlineable, but now every GC has to take a virtual call when marking the objects. My guess is that this code is slow anyway, and that it doesn't matter too much, but did you measure the effect of that change with, for example, G1? 
>>>>>>>>> >>>>>>>> I did rough measurement, timing vmTestbase/nsk/jvmti/unit/heap/HeapWalkTests/TestDescription.java test. >>>>>>>> >>>>>>>> If you know any tests/benchmarks I should measure, please let me know. >>>>>>>> >>>>>>>> Thanks, >>>>>>>> >>>>>>>> -Zhengyu >>>>>>>> >>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> StefanK >>>>>>>>> >>>>>>>>>> Test: >>>>>>>>>> hotspot_gc >>>>>>>>>> vmTestbase_nsk_jdi >>>>>>>>>> vmTestbase_nsk_jvmti >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> >>>>>>>>>> -Zhengyu >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>> >> > From chiroito107 at gmail.com Tue Feb 25 03:44:31 2020 From: chiroito107 at gmail.com (Chihiro Ito) Date: Tue, 25 Feb 2020 12:44:31 +0900 Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows In-Reply-To: <0a2df665-2e08-6139-c131-043a425b4916@oracle.com> References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> <0a2df665-2e08-6139-c131-043a425b4916@oracle.com> Message-ID: Hi Serguei, Thanks for your review and advice. I modified these. Could you review this again, please? Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/ Regards, Chihiro 2020?2?25?(?) 3:36 serguei.spitsyn at oracle.com : > Hi Chihiro, > > Thank you for taking care about this issue! > > It looks good to me. > Just a couple of minor comments. 
> > > http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.04/src/java.base/share/classes/jdk/internal/vm/VMSupport.java.frames.html > > 88 private static String toEscapeSpecialChar(String theString) { > > I'd suggest to use replace parameter name "theString" with "source" as you have below: > > 92 private static String toEscapeSpace(String source) { > 96 private static String toISO88591(String source) throws CharacterCodingException { > > > 98 var byteBuf = ByteBuffer.allocate(charBuf.length() * 6); // 6 is 2 bytes for '\\u' as String and 4 bytes for code point. > > I'd suggest to put the comment before as a separate line. > > > 107 byteBuf.put(String.format("\\u%04X", (int) charBuf.get()).getBytes()); > > Space is not needed after the cast in: (int) charBuf.get(). > > > > http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.04/test/jdk/sun/tools/jcmd/TestVM.java.html > > 2 * Copyright (c) 2020, 2020, Oracle and/or its affiliates. All rights reserved. > > Could you, please, replace: 2020, 2020 => "2020? > I don't think two numbers are needed for the same year. > > > Thanks, > Serguei > > > On 2/22/20 07:37, Chihiro Ito wrote: > > Hi Yasumasa, > > Thank you for your reviews so many times. > How is this fix? > Could you review this again, please? > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.04/ > > Regards, > Chihiro > > 2020?2?22?(?) 21:53 Yasumasa Suenaga : > >> Hi Chihiro, >> >> >> - My proposal is not enough, so you should refine as below. >> - Exception types in saveConvert() should be limited. Please do >> not use `throws Exception`. >> - I guess you use try-catch statement in >> serializePropertiesToByteArray due to above checked exception. >> It should be throw runtime exception when an exception occurs. >> - Capacity of byteBuf (charBuf.length() * 5) should be >> (charBuf.length() * 6) >> because non 8859-1 chars would be "\uxxxx" (6 chars). 
>> Also please leave comment for it because a maintainer might not >> understand the meaning of multiplying 6 in future. >> >> - `output.shouldNotContain("C:\\:\\\\");` in testcase is correct? >> I guess you want to check "C\\:\\\\" is not contained. >> >> - To check '\n', you can use Platform::isWindows as below: >> output.shouldContain(Platform.isWindows() ? >> "line.separator=\\r\\n" : "lineseparator=\\n"); >> >> >> Yasumasa >> >> >> On 2020/02/22 19:23, Chihiro Ito wrote: >> > Hi Yasumasa, >> > >> > The line separator is not modified because it depends on the >> environment, but the others have been modified. >> > >> > Could you review this again? >> > >> > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.03/ >> > >> > Regards, >> > Chihiro >> > >> > 2020?2?22?(?) 12:32 Yasumasa Suenaga > suenaga at oss.nttdata.com>>: >> > >> > Hi Chihiro, >> > >> > Thank you for updating the webrev. >> > >> > >> > - You use BufferedWriter to create the output, however I think >> it would be more simply if you use PrintWriter. >> > >> > - Your change would work incorrectly when system property >> contains mixture of ascii and non-ascii. >> > You can see it with "-Dmixture=a?i". It would be converted to >> "a\u0061\u3042", it should be "a\u3042i". >> > >> > - Currently key value which contains space char, it would be >> escaped, but your change does not do so. >> > You can see it with "-D"space space=blank blank"". >> > >> > - You should not use String::trim to create String from >> ByteBuffer because property value might be contain blank in its tail. >> > You might use ByteBuffer::slice or part of ByteBuffer::array >> for it. >> > >> > - Did you try to use escaped chars in jtreg testcase? I guess >> you can set multibytes chars (e.g. CJK chars) with "\u". >> > In case of mixture of Japanese (Hiragana) and ASCII chars, >> you can embed "-Dmixture=a\u3042i" to testcase. (I'm not sure that...) >> > >> > - In test case, I recommend you to evaluate entire of line. 
>> > For example, if you want to check line.separator, you should >> evaluate as below: >> > output.shouldContain("line.separator=\\n"); >> > >> > >> > Thanks, >> > >> > Yasumasa >> > >> > >> > On 2020/02/22 0:44, Chihiro Ito wrote: >> > > Hi Yasumasa, >> > > >> > > Thank you for your advice. >> > > >> > > I decided not to use regular expressions. because of the number >> of \is confusing. >> > > I stopped using codePointAt() and used CharsetEncoder to work >> with ISO 8859 -1. >> > > I added some environment variables to the test. However, >> environment variables that contain multi bytes or spaces are not included >> because jtreg does not support them. >> > > >> > > Could you review this again, please? >> > > >> > > Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.02/ >> > > >> > > Regards, >> > > Chihiro >> > > >> > > 2020?2?20?(?) 22:39 Yasumasa Suenaga > > suenaga at oss.nttdata.com>>>: >> > > >> > > Hi Chihiro, >> > > >> > > On 2020/02/20 20:20, Chihiro Ito wrote: >> > > > Hi Yasumasa, >> > > > >> > > > Thank you for your quick review. >> > > > >> > > > I modified the code without Properties::store. >> > > > >> > > > Could you review this again, please? >> > > > >> > > > Webrev : >> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.01/ >> > > >> > > - Your change shows "\n" as "\\n". Is it ok? Currently >> "\n" would be shown straightly. >> > > - Your change uses Character::codePointAt to convert >> char to int value. >> > > According to Javadoc, it would be different value if a >> char is in surrogate range. >> > > - Description of serializePropertiesToByteArray() says >> the return value is encoded in ISO 8859-1, >> > > but it does not seems to be so because the logic >> depends on the spec of Properties::store. Is it ok? >> > > - Test case does not stable because system properties >> might be different from your environment. >> > > I suggest you to set system properties for testing >> explicitly. E.g. 
>> > > -Dnormal=normal_val -D"space space=blank blank" >> -Dnonascii=????? -Dopenjdk_url=http://openjdk.java.net/ -Dbackslash="\\" >> > > * Also I recommend you to check "\n" in the test >> from `line.separator`. I think it is stable property. >> > > >> > > I've not convinced whether we should compliant to the >> comment which says for ISO 8859-1. >> > > If it is important, we can use CharsetEncoder from >> ISO_8859_1 as below: >> > > >> > > >> http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-encoder/ >> > > >> > > OTOH we can keep current behavior, we can implement more >> simply as below: >> > > (It's similar to yours.) >> > > >> > > >> http://cr.openjdk.java.net/~ysuenaga/JDK-8222489/proposal-props-style/ >> > > >> > > >> > > Thanks, >> > > >> > > Yasumasa >> > > >> > > >> > > > Regards, >> > > > Chihiro >> > > > >> > > > >> > > > 2020?2?20?(?) 9:34 Yasumasa Suenaga < >> suenaga at oss.nttdata.com > suenaga at oss.nttdata.com > > suenaga at oss.nttdata.com > suenaga at oss.nttdata.com >>>: >> > > > >> > > > Hi Chihiro, >> > > > >> > > > I think this problem is caused by spec of >> `Properties::store(Writer)`. >> > > > >> > > > `Properties::store(OutputStream)` says that the >> output format is as same as `store(Writer)` [1]. >> > > > `Properties::store(Writer)` says that `#`, `!`, `=`, >> `:` are written with a preceding backslash [2]. >> > > > >> > > > So I think we should not use `Properties::store` to >> serialize properties. >> > > > >> > > > >> > > > Thanks, >> > > > >> > > > Yasumasa >> > > > >> > > > >> > > > [1] >> https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.OutputStream,java.lang.String) >> > > > [2] >> https://download.java.net/java/early_access/jdk15/docs/api/java.base/java/util/Properties.html#store(java.io.Writer,java.lang.String) >> > > > >> > > > >> > > > On 2020/02/19 22:36, Chihiro Ito wrote: >> > > > > Hi, >> > > > > >> > > > > Could you review this tiny fix, please? 
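Editorial note: the Properties::store behaviour Yasumasa points to as the root cause can be reproduced in a few lines. The sketch below is only an illustration and is not taken from any of the webrevs. The class name PropertiesEscapeSketch and both helper names are made up for this note; the only JDK calls used are Properties.setProperty, Properties.store(Writer, String) and String.format. The second helper is a hypothetical alternative that leaves ':' alone and only escapes characters outside ISO 8859-1, in the spirit of the "a\u3042i" example discussed in the thread.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

public class PropertiesEscapeSketch {

    /** Stores one property with Properties.store(Writer, String) and returns its "key=value" line. */
    public static String storeLine(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        StringWriter out = new StringWriter();
        try {
            props.store(out, null); // emits a "#<date>" comment line, then the entry
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String line : out.toString().split("\\R")) {
            if (!line.isEmpty() && !line.startsWith("#")) {
                return line;
            }
        }
        throw new IllegalStateException("no property line was written");
    }

    /**
     * Hypothetical alternative (an assumption for this note, not the webrev code):
     * keep ISO 8859-1 characters untouched and write every other char as a
     * backslash-u escape with four hex digits.
     */
    public static String escapeNonLatin1(String source) {
        StringBuilder sb = new StringBuilder(source.length() * 6);
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c <= 0xFF) {
                sb.append(c);
            } else {
                sb.append(String.format("\\u%04X", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // store() puts a backslash in front of ':' and doubles each backslash
        System.out.println(storeLine("path", "C:\\temp\\app"));
        System.out.println(storeLine("openjdk_url", "http://openjdk.java.net/"));
        // the alternative keeps the path readable and only escapes the Hiragana char
        System.out.println(escapeNonLatin1("C:\\temp\\a\u3042i"));
    }
}
```

The factor of 6 in the StringBuilder capacity mirrors the review comment about byteBuf: one escaped char takes six output characters.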
>> > > > > >> > > > > This problem affected not the only path on >> Windows, but also Linux and URLs using ":". >> > > > > >> > > > > Webrev : >> http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.00/ >> > > > > JBS : >> https://bugs.openjdk.java.net/browse/JDK-8222489 >> > > > > >> > > > > Regards, >> > > > > Chihiro >> > > > >> > > >> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Tue Feb 25 05:48:14 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Mon, 24 Feb 2020 21:48:14 -0800 Subject: RFR(XS): 8193237 - SA: ClhsdbLauncher should show the command being executed Message-ID: Hello, Please review the following: https://bugs.openjdk.java.net/browse/JDK-8193237 http://cr.openjdk.java.net/~cjplummer/8193237/webrev.00/ The fix is to issue an "echo on" command before the test commands. The bug gives an example of how this fix improves the test output. thanks, Chris From serguei.spitsyn at oracle.com Tue Feb 25 07:15:20 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Mon, 24 Feb 2020 23:15:20 -0800 Subject: RFR(XS): 8193237 - SA: ClhsdbLauncher should show the command being executed In-Reply-To: References: Message-ID: <633cdb64-fe0f-4dac-60d7-9e7ee405928c@oracle.com> Hi Chris, This looks good to me. I always prefer verbose output in tests. :) Thanks, Serguei On 2/24/20 21:48, Chris Plummer wrote: > Hello, > > Please review the following: > > https://bugs.openjdk.java.net/browse/JDK-8193237 > http://cr.openjdk.java.net/~cjplummer/8193237/webrev.00/ > > The fix is to issue an "echo on" command before the test commands. The > bug gives an example of how this fix improves the test output. 
> > thanks, > > Chris From linzang at tencent.com Tue Feb 25 10:21:10 2020 From: linzang at tencent.com (linzang(臧琳)) Date: Tue, 25 Feb 2020 10:21:10 +0000 Subject: RFR(XS): 8239916 - SA: delete dead code in jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ObjectHeap.java Message-ID: <6ccd3ea6fc974cecb202865c7528912e@tencent.com> Hi, Please review the following change: Bugs: https://bugs.openjdk.java.net/browse/JDK-8239916 webrev: http://cr.openjdk.java.net/~lzang/8239916/webrev/ Thanks, Lin From christoph.langer at sap.com Tue Feb 25 10:21:42 2020 From: christoph.langer at sap.com (Langer, Christoph) Date: Tue, 25 Feb 2020 10:21:42 +0000 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com> References: <29e40cdf-8372-9858-bad8-2c9f81d94bcc@oss.nttdata.com> <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com> Message-ID: Hi Ioi, > Ralf and Christoph, > > I agree that making it easy for the user is important, so dependency on an external program like pgzip will be a hassle. Yes. > How about implementing the compression in a Java program? Will something like this be too much of a hassle? > > jcmd $PID GC.dump -stdout | java -jar HeapDumpZipper.jar > heap.gz > > This way, we can implement the exact compression algorithm as Ralf described, without making it part of the VM. Writing it in Java probably would be easier to maintain. > > If it makes sense, we can include the Java code as part of the JDK, so there's no need to ship a separate JAR file to the user.
> > jcmd $PID GC.dump -stdout | java jdk.internal.heapdump.Zipper > heap.gz Well, we definitely would have to enhance hotspot and jcmd to be able to stream the dump data out. And the Java code doing the compression should also be internalized to jcmd such that piping is not required. It's hard to say whether maintainability will be easier when implementing these parts in Java. However, performance is definitely a thing which has to be considered - I guess doing the zipping in hotspot is better for that. And also the option to support the "heapdump on OOM" scenarios would not be handled here. Best regards Christoph From martin.doerr at sap.com Tue Feb 25 11:22:36 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 25 Feb 2020 11:22:36 +0000 Subject: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element In-Reply-To: References: Message-ID: Hi Chris, according to arraycopy.hpp, "arraycopy operations are implicitly atomic on each array element." This requires 8 Byte alignment for jlong and jdouble. I don't want to give up this property just because Windows 32 bit doesn't align them this way by default. All other supported platforms do it right by default. Best regards, Martin From: Chris Plummer Sent: Monday, 24 February 2020 21:52 To: Doerr, Martin; OpenJDK Serviceability Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element Hi Martin, I'm not so sure I agree with the approach to this fix, nor for the one already done for JDK-8220348. Shouldn't a user be expected to be able to pass a jlong variable to SetLongArrayRegion() without the need for any special platform dependent modifiers added to the declaration of the variable? cheers, Chris On 2/24/20 5:51 AM, Doerr, Martin wrote: Hi, reposting on serviceability-dev (was core-libs-dev before).
Bug: https://bugs.openjdk.java.net/browse/JDK-8239856 Webrev: http://cr.openjdk.java.net/~mdoerr/8239856_win32_long_double_align/webrev.00/ Thanks for the review, Thomas! Best regards, Martin From: Thomas Stüfe Sent: Monday, 24 February 2020 14:41 To: Doerr, Martin Cc: core-libs-dev at openjdk.java.net; Lindenmaier, Goetz; Langer, Christoph Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element Oh okay. Then it looks okay to me. Cheers, Thomas On Mon, Feb 24, 2020 at 12:56 PM Doerr, Martin wrote: Hi Thomas, thanks for the quick review. ATTRIBUTE_ALIGNED is defined in hotspot. I can't use it for src/jdk.jdwp.agent/share/native/libjdwp/ArrayReferenceImpl.c. Christoph had already suggested to make it available for core libs, too, but I haven't found a good place for it. Best regards, Martin From: Thomas Stüfe Sent: Monday, 24 February 2020 12:52 To: Doerr, Martin Cc: core-libs-dev at openjdk.java.net; Lindenmaier, Goetz; Langer, Christoph Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element Hi Martin, maybe use ATTRIBUTE_ALIGNED instead? Cheers, Thomas On Mon, Feb 24, 2020 at 12:44 PM Doerr, Martin wrote: Hi, we had fixed stack array alignment for Windows 32 bit with JDK-8220348. However, there are also stack allocated jlong and jdouble used as source for SetLongArrayRegion and SetDoubleArrayRegion with insufficient alignment for this platform. Here's my proposed fix: http://cr.openjdk.java.net/~mdoerr/8239856_win32_long_double_align/webrev.00/ Please review. Best regards, Martin -------------- next part -------------- An HTML attachment was scrubbed...
URL: From matthias.baesken at sap.com Tue Feb 25 16:20:35 2020 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Tue, 25 Feb 2020 16:20:35 +0000 Subject: RFR [XS]: 8239462: jdk.hotspot.agent misses some ReleaseStringUTFChars calls in case of early returns In-Reply-To: References: <6ed0d004-3e8c-4b27-f583-06dbacf45173@oracle.com> Message-ID: New webrev : http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.3/ Best regards, Matthias > IMO the solution with goto makes it even worse. > If you don't want to introduce the wrapper, could you please restore > changes in LinuxDebuggerLocal_attach0 from webrev.1 > > --alex > > On 02/21/2020 00:32, Baesken, Matthias wrote: > > Hi Alex , > > > > new webrev : > > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.2/ > > > > Best Regards, Matthias > > > > > >> > >> Hi Matthias, > >> > >> Looks good in general, but I think it makes sense to fix #2 cases (at > >> least I see them in LinuxDebuggerLocal). If GetStringUTFChars fails, the > >> code will crash. > >> Also I see GetStringUTFChars(str, JNI_FALSE). This look bad as well - > >> 2nd arg is a pointer, so it should be NULL or nullptr. > >> > >> As for #1 and #3 - AFAIU they are both right ways. > >> If GetStringUTFChars fails, it throws OOM and return NULL. > >> > >> And one more thing to consider. > >> LinuxDebuggerLocal_attach0 function looks terrible - 7 > >> ReleaseStringUTFChars calls for 2 GetStringUTFChars. > >> Maybe it make sense to introduce simple wrapper like AutoJavaString in > >> src/jdk.hotspot.agent/windows/native/libsaproc/sawindbg.cpp > >> It would make the code simpler and less error prone. 
> >> > >> --alex > >> > > From ioi.lam at oracle.com Tue Feb 25 17:03:25 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Tue, 25 Feb 2020 09:03:25 -0800 Subject: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump In-Reply-To: References: <01361a9d-2855-db67-a176-73731fada08f@oracle.com> <0c687e55-ed91-e606-28a7-f9aef745ed8d@oracle.com> <490d58f7-7adc-00aa-b504-0ac284fe7eb5@oracle.com> <0343dfac-61f7-1b1c-ee96-bdee130578ad@oracle.com> Message-ID: Hi Christoph, This sounds fair. I will remove my objection :-) Thanks - Ioi On 2/25/20 2:21 AM, Langer, Christoph wrote: > Hi Ioi, > >> Ralf and Christoph, >> >> I agree that making it easy for the user is important, so dependency on >> an external program like pgzip will be a hassle. > Yes ?? > >> How about implementing the compression in a Java program? Will something >> like this be too much of a hassle? >> >> ??? jcmd $PID GC.dump -stdout | java -jar HeapDumpZipper.jar > heap.gz >> >> This way, we can implement the exact compression algorithm as Ralf >> described, without making it part of the VM. Writing it in Java probably >> would be easier to maintain. >> >> If it makes sense, we can include the Java code as part of the JDK, so >> there's no need to ship a separate JAR file to the user. >> >> jcmd $PID GC.dump -stdout | java jdk.internal.heapdump.Zipper > >> heap.gz > Well, we definitely would have to enhance hotspot and jcmd to be able to stream the dump data out. And the Java code, doing the compression should also be internalized to jcmd such that piping is not required. It's hard to say whether maintainability will be easier when implementing these parts in Java. However, performance is definitely a thing which has to be considered - I guess doing the zipping in hotspot is better for that. And also the option to support the "heapdump on OOM" scenarios would not be handled here. 
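Editorial note: the "compress in a separate Java program" idea from Ioi's mail could look roughly like the sketch below. It is only an illustration under stated assumptions. The class name HeapDumpZipperSketch is made up, and the `GC.dump -stdout` option it would be paired with is the proposal from the thread, not an existing jcmd option (as Christoph notes, hotspot and jcmd would first have to be enhanced to stream the dump). The sketch uses plain single-threaded java.util.zip gzip and does not attempt the compression scheme Ralf described. The gunzip helper exists only so the round trip can be checked.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class HeapDumpZipperSketch {

    /** Copies everything from in to out, gzip-compressing it on the way. */
    public static void gzip(InputStream in, OutputStream out) {
        try {
            GZIPOutputStream gz = new GZIPOutputStream(out, 64 * 1024);
            byte[] buf = new byte[64 * 1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                gz.write(buf, 0, n);
            }
            gz.finish(); // writes the gzip trailer without closing the caller's stream
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Decompresses a gzip byte array; used here only to verify the round trip. */
    public static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Reads the raw dump from stdin and writes the gzipped bytes to stdout.
        gzip(System.in, System.out);
    }
}
```

With an existing dump file it can be tried as `java HeapDumpZipperSketch.java < heap.hprof > heap.hprof.gz` (source launcher, JDK 11 or later). As Christoph points out, a pipe-based approach like this would not cover the "heapdump on OOM" case.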
> > Best regards > Christoph > From chris.plummer at oracle.com Tue Feb 25 17:03:23 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 25 Feb 2020 09:03:23 -0800 Subject: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element In-Reply-To: References: Message-ID: <2d897aff-ebf9-b755-918a-aed28970c4d0@oracle.com> An HTML attachment was scrubbed... URL: From chris.plummer at oracle.com Tue Feb 25 18:02:51 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 25 Feb 2020 10:02:51 -0800 Subject: RFR(XS): 8239916 - SA: delete dead code in jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ObjectHeap.java In-Reply-To: <6ccd3ea6fc974cecb202865c7528912e@tencent.com> References: <6ccd3ea6fc974cecb202865c7528912e@tencent.com> Message-ID: An HTML attachment was scrubbed... URL: From martin.doerr at sap.com Tue Feb 25 18:03:06 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 25 Feb 2020 18:03:06 +0000 Subject: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element In-Reply-To: <480add53-59c5-c132-f903-2aa99f121ffd@oracle.com> References: <480add53-59c5-c132-f903-2aa99f121ffd@oracle.com> Message-ID: Hi Serguei, thanks for the review. Best regards, Martin From: serguei.spitsyn at oracle.com Sent: Montag, 24. Februar 2020 20:12 To: Doerr, Martin ; OpenJDK Serviceability Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element Hi Martin, It looks good to me. Thanks, Serguei On 2/24/20 05:51, Doerr, Martin wrote: Hi, reposting on serviceability-dev (was core-libs-dev before). Bug: https://bugs.openjdk.java.net/browse/JDK-8239856 Webrev: http://cr.openjdk.java.net/~mdoerr/8239856_win32_long_double_align/webrev.00/ Thanks for the review, Thomas! Best regards, Martin From: Thomas St?fe Sent: Montag, 24. 
Februar 2020 14:41 To: Doerr, Martin Cc: core-libs-dev at openjdk.java.net; Lindenmaier, Goetz ; Langer, Christoph Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element Oh okay. Then it looks okay to me. Cheers, Thomas On Mon, Feb 24, 2020 at 12:56 PM Doerr, Martin > wrote: Hi Thomas, thanks for the quick review. ATTRIBUTE_ALIGNED is defined in hotspot. I can?t use it for src/jdk.jdwp.agent/share/native/libjdwp/ArrayReferenceImpl.c. Christoph had already suggested to make it available for core libs, too, but I haven?t found a good place for it. Best regards, Martin From: Thomas St?fe > Sent: Montag, 24. Februar 2020 12:52 To: Doerr, Martin > Cc: core-libs-dev at openjdk.java.net; Lindenmaier, Goetz >; Langer, Christoph > Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element Hi Martin, maybe use ATTRIBUTE_ALIGNED instead? Cheers, Thomas On Mon, Feb 24, 2020 at 12:44 PM Doerr, Martin > wrote: Hi, we had fixed stack array alignment for Windows 32 bit with JDK-8220348. However, there are also stack allocated jlong and jdouble used as source for SetLongArrayRegion and SetDoubleArrayRegion with insufficient alignment for this platform. Here?s my proposed fix: http://cr.openjdk.java.net/~mdoerr/8239856_win32_long_double_align/webrev.00/ Please review. Best regards, Martin -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniil.x.titov at oracle.com Tue Feb 25 18:05:09 2020 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 25 Feb 2020 10:05:09 -0800 Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port In-Reply-To: References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> Message-ID: <1D2E87F2-16B6-490B-BCEC-6718DB2867BE@oracle.com> Hi Serguei, I added my comments there. In brief, I believe that in long term in the serviceability tools we should avoid using the system properties and prefer the command line options instead. 
Thanks, Daniil ?On 2/24/20, 11:04 AM, "serguei.spitsyn at oracle.com" wrote: Hi Daniil, I've looked at CSR and posted a couple of questions there. It'd be nice if you help to resolve my confusion. :) Thanks, Serguei On 2/23/20 20:21, Daniil Titov wrote: > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > New CSR [3] was created for this change and it needs to be reviewed as well. > > Man pages for jhsdb will be updated in a separate issue. > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > // delegate to the actual SA debug server. > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > but I would prefer to address it in a separate issue. > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > container and connecting to it with the GUI debugger. > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. 
>
> [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
> [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
> [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
>
> Thank you,
> Daniil
>

From martin.doerr at sap.com Tue Feb 25 18:20:05 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 25 Feb 2020 18:20:05 +0000
Subject: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element
In-Reply-To: <2d897aff-ebf9-b755-918a-aed28970c4d0@oracle.com>
References: <2d897aff-ebf9-b755-918a-aed28970c4d0@oracle.com>
Message-ID:

Hi Chris,

I know how JNI is meant. However, C/C++ is (almost) never platform independent. Especially when it comes to primitive types.

My change is not particularly beautiful, but I haven't found a more beautiful way to fix it.

Note that SetLongArrayRegion seems to work without the alignment requirement in the product build. However, word tearing could possibly be observed.

It's not possible to guarantee element-wise atomicity without alignment because of processor architecture. That's why I think the assertion makes sense and violations at least in the code which is part of OpenJDK should be fixed IMHO.

I had already asked for alternative fixes when I was working on JDK-8220348 (like force the compiler to 64-bit align 64-bit types on stack), but nobody has found a way to do this.

Best regards,
Martin

From: Chris Plummer
Sent: Dienstag, 25. Februar 2020 18:03
To: Doerr, Martin; OpenJDK Serviceability; hotspot-runtime-dev
Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element

[Adding runtime-dev as this regards the JNI spec]

Hi Martin,

JNI is meant as a means to write code that interfaces with the JVM in a platform independent way. Therefore the declaration of a jlong or a jdouble should not require any extra platform dependent considerations. This also means requirements of an internal JVM API should not impose any extra requirements on the JNI code.
IMHO this should be fixed in hotspot. Maybe fixing it in jni_md.h (if there is a way to force 64-bit alignment) or in the makefiles (force the compiler to 64-bit align) would also be acceptable. thanks, Chris On 2/25/20 3:22 AM, Doerr, Martin wrote: Hi Chris, according to arraycopy.hpp, ?arraycopy operations are implicitly atomic on each array element.? This requires 8 Byte alignment for jlong and jdouble. I don?t want to give up this property just because Windows 32 bit doesn?t align them this way by default. All other supported platforms do it right by default. Best regards, Martin From: Chris Plummer Sent: Montag, 24. Februar 2020 21:52 To: Doerr, Martin ; OpenJDK Serviceability Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element Hi Martin, I'm not so sure I agree with the approach to this fix, nor for the one already done for JDK-8220348. Shouldn't a user be expected to be able to pass a jlong variable to SetLongArrayRegion() without the need for any special platform dependent modifiers added to the declaration of the variable? cheers, Chris On 2/24/20 5:51 AM, Doerr, Martin wrote: Hi, reposting on serviceability-dev (was core-libs-dev before). Bug: https://bugs.openjdk.java.net/browse/JDK-8239856 Webrev: http://cr.openjdk.java.net/~mdoerr/8239856_win32_long_double_align/webrev.00/ Thanks for the review, Thomas! Best regards, Martin From: Thomas St?fe Sent: Montag, 24. Februar 2020 14:41 To: Doerr, Martin Cc: core-libs-dev at openjdk.java.net; Lindenmaier, Goetz ; Langer, Christoph Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element Oh okay. Then it looks okay to me. Cheers, Thomas On Mon, Feb 24, 2020 at 12:56 PM Doerr, Martin > wrote: Hi Thomas, thanks for the quick review. ATTRIBUTE_ALIGNED is defined in hotspot. I can?t use it for src/jdk.jdwp.agent/share/native/libjdwp/ArrayReferenceImpl.c. 
Christoph had already suggested to make it available for core libs, too, but I haven?t found a good place for it. Best regards, Martin From: Thomas St?fe > Sent: Montag, 24. Februar 2020 12:52 To: Doerr, Martin > Cc: core-libs-dev at openjdk.java.net; Lindenmaier, Goetz >; Langer, Christoph > Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element Hi Martin, maybe use ATTRIBUTE_ALIGNED instead? Cheers, Thomas On Mon, Feb 24, 2020 at 12:44 PM Doerr, Martin > wrote: Hi, we had fixed stack array alignment for Windows 32 bit with JDK-8220348. However, there are also stack allocated jlong and jdouble used as source for SetLongArrayRegion and SetDoubleArrayRegion with insufficient alignment for this platform. Here?s my proposed fix: http://cr.openjdk.java.net/~mdoerr/8239856_win32_long_double_align/webrev.00/ Please review. Best regards, Martin -------------- next part -------------- An HTML attachment was scrubbed... URL: From serguei.spitsyn at oracle.com Tue Feb 25 19:07:16 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 25 Feb 2020 11:07:16 -0800 Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port In-Reply-To: <1D2E87F2-16B6-490B-BCEC-6718DB2867BE@oracle.com> References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1D2E87F2-16B6-490B-BCEC-6718DB2867BE@oracle.com> Message-ID: <84fcc90a-07c0-eef2-c0b6-6ec472880c37@oracle.com> Hi Daniil, Thank you for reply. I agree with the approach to avoid using system properties. Then it is better to be consistent. I'd consider adding an RMI registry port option as well. Will look at your comments in the CSR and reply there. Thanks, Serguei On 2/25/20 10:05 AM, Daniil Titov wrote: > Hi Serguei, > > I added my comments there. In brief, I believe that in long term in the serviceability tools we should avoid > using the system properties and prefer the command line options instead. 
> > Thanks, > Daniil > > ?On 2/24/20, 11:04 AM, "serguei.spitsyn at oracle.com" wrote: > > Hi Daniil, > > I've looked at CSR and posted a couple of questions there. > It'd be nice if you help to resolve my confusion. :) > > Thanks, > Serguei > > > On 2/23/20 20:21, Daniil Titov wrote: > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > Man pages for jhsdb will be updated in a separate issue. > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > // delegate to the actual SA debug server. > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > > but I would prefer to address it in a separate issue. > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > container and connecting to it with the GUI debugger. > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. 
> > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > Thank you, > > Daniil > > > > > > > > From alexey.menkov at oracle.com Tue Feb 25 19:30:28 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Tue, 25 Feb 2020 11:30:28 -0800 Subject: RFR [XS]: 8239462: jdk.hotspot.agent misses some ReleaseStringUTFChars calls in case of early returns In-Reply-To: References: <6ed0d004-3e8c-4b27-f583-06dbacf45173@oracle.com> Message-ID: Hi Matthias, LGTM --alex On 02/25/2020 08:20, Baesken, Matthias wrote: > > New webrev : > > > http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.3/ > > > Best regards, Matthias > > >> IMO the solution with goto makes it even worse. >> If you don't want to introduce the wrapper, could you please restore >> changes in LinuxDebuggerLocal_attach0 from webrev.1 >> >> --alex >> >> On 02/21/2020 00:32, Baesken, Matthias wrote: >>> Hi Alex , >>> >>> new webrev : >>> >>> http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.2/ >>> >>> Best Regards, Matthias >>> >>> >>>> >>>> Hi Matthias, >>>> >>>> Looks good in general, but I think it makes sense to fix #2 cases (at >>>> least I see them in LinuxDebuggerLocal). If GetStringUTFChars fails, the >>>> code will crash. >>>> Also I see GetStringUTFChars(str, JNI_FALSE). This look bad as well - >>>> 2nd arg is a pointer, so it should be NULL or nullptr. >>>> >>>> As for #1 and #3 - AFAIU they are both right ways. >>>> If GetStringUTFChars fails, it throws OOM and return NULL. >>>> >>>> And one more thing to consider. >>>> LinuxDebuggerLocal_attach0 function looks terrible - 7 >>>> ReleaseStringUTFChars calls for 2 GetStringUTFChars. >>>> Maybe it make sense to introduce simple wrapper like AutoJavaString in >>>> src/jdk.hotspot.agent/windows/native/libsaproc/sawindbg.cpp >>>> It would make the code simpler and less error prone. 
>>>> >>>> --alex >>>> >>> From daniil.x.titov at oracle.com Tue Feb 25 19:38:48 2020 From: daniil.x.titov at oracle.com (Daniil Titov) Date: Tue, 25 Feb 2020 11:38:48 -0800 Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port In-Reply-To: <84fcc90a-07c0-eef2-c0b6-6ec472880c37@oracle.com> References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1D2E87F2-16B6-490B-BCEC-6718DB2867BE@oracle.com> <84fcc90a-07c0-eef2-c0b6-6ec472880c37@oracle.com> Message-ID: Hi Serguei, I will update the CSR and the fix to include this change. Thank you, Daniil ?On 2/25/20, 11:07 AM, "serguei.spitsyn at oracle.com" wrote: Hi Daniil, Thank you for reply. I agree with the approach to avoid using system properties. Then it is better to be consistent. I'd consider adding an RMI registry port option as well. Will look at your comments in the CSR and reply there. Thanks, Serguei On 2/25/20 10:05 AM, Daniil Titov wrote: > Hi Serguei, > > I added my comments there. In brief, I believe that in long term in the serviceability tools we should avoid > using the system properties and prefer the command line options instead. > > Thanks, > Daniil > > ?On 2/24/20, 11:04 AM, "serguei.spitsyn at oracle.com" wrote: > > Hi Daniil, > > I've looked at CSR and posted a couple of questions there. > It'd be nice if you help to resolve my confusion. :) > > Thanks, > Serguei > > > On 2/23/20 20:21, Daniil Titov wrote: > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > Man pages for jhsdb will be updated in a separate issue. 
> > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > // delegate to the actual SA debug server. > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > > but I would prefer to address it in a separate issue. > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > container and connecting to it with the GUI debugger. > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded. > > > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01 > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751 > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831 > > > > Thank you, > > Daniil > > > > > > > > From chris.plummer at oracle.com Tue Feb 25 19:47:59 2020 From: chris.plummer at oracle.com (Chris Plummer) Date: Tue, 25 Feb 2020 11:47:59 -0800 Subject: RFR(XS): 8239379 - ProblemList serviceability/sa/sadebugd/DebugdConnectTest.java on OSX Message-ID: <371e1018-11f0-f087-a62a-e98b309ee108@oracle.com> Hi, I have to problem list this test due to JDK-8239062 [1], and have to do so before I can push the fix for JDK-8238268 [2], which exposes JDK-8239062 [1]. 
https://bugs.openjdk.java.net/browse/JDK-8239379

diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt
+++ b/test/hotspot/jtreg/ProblemList.txt
@@ -127,6 +127,7 @@
 serviceability/sa/DeadlockDetectionTest.java 8193639 solaris-all
 serviceability/sa/JhsdbThreadInfoTest.java 8193639 solaris-all
 serviceability/sa/jmap-hprof/JMapHProfLargeHeapTest.java 8193639 solaris-all
+serviceability/sa/sadebugd/DebugdConnectTest.java 8239062 macosx-x64
 serviceability/sa/TestClassDump.java 8193639 solaris-all
 serviceability/sa/TestClhsdbJstackLock.java 8193639 solaris-all
 serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris-all

[1] https://bugs.openjdk.java.net/browse/JDK-8239062
[2] https://bugs.openjdk.java.net/browse/JDK-8238268

thanks,

Chris

From serguei.spitsyn at oracle.com Tue Feb 25 20:35:51 2020
From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com)
Date: Tue, 25 Feb 2020 12:35:51 -0800
Subject: RFR: 8196751: Add jhsdb option to specify debug server RMI connector port
In-Reply-To:
References: <6DA9405E-3A97-4164-90B1-805441D1B9A6@oracle.com> <1D2E87F2-16B6-490B-BCEC-6718DB2867BE@oracle.com> <84fcc90a-07c0-eef2-c0b6-6ec472880c37@oracle.com>
Message-ID:

Hi Daniil,

Okay, thanks!

Serguei

On 2/25/20 11:38 AM, Daniil Titov wrote:
> Hi Serguei,
>
> I will update the CSR and the fix to include this change.
>
> Thank you,
> Daniil
>
> On 2/25/20, 11:07 AM, "serguei.spitsyn at oracle.com" wrote:
>
> Hi Daniil,
>
> Thank you for reply.
> I agree with the approach to avoid using system properties.
> Then it is better to be consistent.
> I'd consider adding an RMI registry port option as well.
> Will look at your comments in the CSR and reply there.
>
> Thanks,
> Serguei
>
>
> On 2/25/20 10:05 AM, Daniil Titov wrote:
> > Hi Serguei,
> >
> > I added my comments there.
In brief, I believe that in long term in the serviceability tools we should avoid > > using the system properties and prefer the command line options instead. > > > > Thanks, > > Daniil > > > > ?On 2/24/20, 11:04 AM, "serguei.spitsyn at oracle.com" wrote: > > > > Hi Daniil, > > > > I've looked at CSR and posted a couple of questions there. > > It'd be nice if you help to resolve my confusion. :) > > > > Thanks, > > Serguei > > > > > > On 2/23/20 20:21, Daniil Titov wrote: > > > Please review change that adds a new command line option to jhsdb tool for the debugd mode to specify a RMI connector port. > > > Currently a random port is used that prevents the debug server from being used behind a firewall or in a container. > > > > > > New CSR [3] was created for this change and it needs to be reviewed as well. > > > > > > Man pages for jhsdb will be updated in a separate issue. > > > > > > The current implementation (sun.jvm.hotspot.SALauncher) parses the command line options passed to jhsdb tool, > > > converts them to the ones for the debug server and then delegates the call to sun.jvm.hotspot.DebugServer.main(). > > > > > > // delegate to the actual SA debug server. > > > 367 DebugServer.main(newArgArray.toArray(new String[0])); > > > > > > However, sun.jvm.hotspot.DebugServer doesn't support named options and that prevents from efficiently adding new options to the tool. > > > I found it more suitable to start Hotspot agent directly in SALauncher rather than adding a new option in both sun.jvm.hotspot.SALauncher > > > and sun.jvm.hotspot.DebugServer and delegating the call. With this change I think sun.jvm.hotspot.DebugServer could be marked as a deprecated > > > but I would prefer to address it in a separate issue. > > > > > > Testing: Manual testing with attaching the debug server to the running Java process or to the core file inside a docker > > > container and connecting to it with the GUI debugger. 
> > > Mach5 tier1-tier3 tests (that include serviceability/sa/sadebugd tests) succeeded.
> > >
> > > [1] Webrev: http://cr.openjdk.java.net/~dtitov/8196751/webrev.01
> > > [2] Jira issue: https://bugs.openjdk.java.net/browse/JDK-8196751
> > > [3] CSR: https://bugs.openjdk.java.net/browse/JDK-8239831
> > >
> > > Thank you,
> > > Daniil
> > >

From hohensee at amazon.com Tue Feb 25 21:13:38 2020
From: hohensee at amazon.com (Hohensee, Paul)
Date: Tue, 25 Feb 2020 21:13:38 +0000
Subject: RFR(XS): 8239916 - SA: delete dead code in jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ObjectHeap.java
In-Reply-To:
References: <6ccd3ea6fc974cecb202865c7528912e@tencent.com>
Message-ID:

That's indeed dead code, so lgtm.

Thanks,
Paul

From: serviceability-dev on behalf of Chris Plummer
Date: Tuesday, February 25, 2020 at 10:04 AM
To: "linzang(臧琳)", serviceability-dev, "hotspot-gc-dev at openjdk.java.net"
Subject: Re: RFR(XS): 8239916 - SA: delete dead code in jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ObjectHeap.java

Adding hotspot-gc-dev.

Chris

On 2/25/20 2:21 AM, linzang(臧琳) wrote:

Hi,
    Please review the following change:
    Bugs: https://bugs.openjdk.java.net/browse/JDK-8239916
    webrev: http://cr.openjdk.java.net/~lzang/8239916/webrev/

Thanks,
Lin

From stefan.karlsson at oracle.com Tue Feb 25 21:46:59 2020
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 25 Feb 2020 22:46:59 +0100
Subject: RFR(XS): 8239916 - SA: delete dead code in jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ObjectHeap.java
In-Reply-To:
References: <6ccd3ea6fc974cecb202865c7528912e@tencent.com>
Message-ID: <9a67c326-f693-99ed-0c51-4f6bf96dd9b3@oracle.com>

Looks good. These are leftovers from the CMS removal.

StefanK

On 2020-02-25 19:02, Chris Plummer wrote:
> Adding hotspot-gc-dev.
>
> Chris
>
> On 2/25/20 2:21 AM, linzang(臧琳) wrote:
>> Hi,
>>     Please review the following change:
>>     Bugs: https://bugs.openjdk.java.net/browse/JDK-8239916
>>     webrev: http://cr.openjdk.java.net/~lzang/8239916/webrev/
>>
>> Thanks,
>> Lin
>

From linzang at tencent.com Wed Feb 26 02:47:35 2020
From: linzang at tencent.com (linzang(臧琳))
Date: Wed, 26 Feb 2020 02:47:35 +0000
Subject: RFR(XS): 8239916 - SA: delete dead code in jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ObjectHeap.java(Internet mail)
In-Reply-To: <9a67c326-f693-99ed-0c51-4f6bf96dd9b3@oracle.com>
References: <6ccd3ea6fc974cecb202865c7528912e@tencent.com> <9a67c326-f693-99ed-0c51-4f6bf96dd9b3@oracle.com>
Message-ID: <8C8E0733-3076-49F1-9527-F11A8860661C@tencent.com>

Thanks for reviewing, so can this change be merged now?

BRs,
Lin

> On Feb 26, 2020, at 5:46 AM, Stefan Karlsson wrote:
>
> Looks good. These are leftovers from the CMS removal.
>
> StefanK
>
> On 2020-02-25 19:02, Chris Plummer wrote:
>> Adding hotspot-gc-dev.
>>
>> Chris
>>
>> On 2/25/20 2:21 AM, linzang(臧琳) wrote:
>>> Hi,
>>> Please review the following change:
>>> Bugs: https://bugs.openjdk.java.net/browse/JDK-8239916
>>> webrev: http://cr.openjdk.java.net/~lzang/8239916/webrev/
>>>
>>> Thanks,
>>> Lin
>>
>

From david.holmes at oracle.com Wed Feb 26 02:55:24 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 26 Feb 2020 12:55:24 +1000
Subject: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element
In-Reply-To:
References: <2d897aff-ebf9-b755-918a-aed28970c4d0@oracle.com>
Message-ID: <49810f29-dad4-ac6a-a675-878f7f53fb28@oracle.com>

Hi Martin,

On 26/02/2020 4:20 am, Doerr, Martin wrote:
> Hi Chris,
>
> I know how JNI is meant. However, C/C++ is (almost) never platform
> independent. Especially when it comes to primitive types.

There is potentially a question mark over how the JNI Get/SetArrayRegion methods are implemented, as the spec makes no mention of atomic updates or accesses.
In the absence of any mention I would expect normal atomicity rules for Java datatypes to apply - which means long and double do not have to be atomic. If our implementation offers atomicity as an extra feature that is in itself okay, but if that feature imposes additional constraints on the programmer which are not evident in the specification, that is questionable IMO. If the lack of alignment simply results in potential non-atomic access that would be fine; but if it results in a runtime h/w fault then I would suggest we should not be attempting atomic accesses. IIUC you have to run in a special mode to enable memory alignment checks on x86, so it seems we would potentially just not get atomic accesses. The presence of the assertion to highlight the need for alignment is probably excessive in the case of these JNI APIs, but highly desirable for the low-level atomic copy routines themselves. I'm not concerned that these exceptions can "leak" up to the application code using these JNI API's simply because it only affects debug builds, and is easily remedied (either by changing the code or disabling this assertion). But if our own JDK code can encounter them, then we should modify that code. > > My change is not particularly beautiful, but I haven?t found a more > beautiful way to fix it. > > Note that SetLongArrayRegion seems to work without the alignment > requirement in the product build. However, word tearing could possibly > be observed. > > It's not possible to guarantee element-wise atomicity without alignment > because of processor architecture. That?s why I think the assertion > makes sense and violations at least in the code which is part of OpenJDK > should be fixed IMHO. Is this a windows only change because other compilers force 64-bit alignment of 64-bit types, even in 32-bit environments? 
I don't like seeing this be compiler specific when it is really processor specific and to be safe (and keep it simple) we should ensure 8-byte alignment in all cases it is needed. Cheers, David ----- > I had already asked for alternative fixes when I was working on > JDK-8220348 (like force the compiler to 64-bit align 64-bit types on > stack), but nobody has found a way to do this. > > Best regards, > > Martin > > *From:*Chris Plummer > *Sent:* Dienstag, 25. Februar 2020 18:03 > *To:* Doerr, Martin ; OpenJDK Serviceability > ; hotspot-runtime-dev > > *Subject:* Re: RFR(XS): 8239856: [ntintel] asserts about copying > unaligned array element > > [Adding runtime-dev as this regards the JNI spec] > > Hi Martin, > > JNI is meant as a means to write code that interfaces with the JVM in a > platform independent way. Therefore the declaration of a jlong or a > jdouble should not require any extra platform dependent considerations. > This also means requirements of an internal JVM API should not impose > any extra requirements on the JNI code. IMHO this should be fixed in > hotspot. Maybe fixing it in jni_md.h (if there is a way to force 64-bit > alignment) or in the makefiles (force the compiler to 64-bit align) > would also be acceptable. > > thanks, > > Chris > > On 2/25/20 3:22 AM, Doerr, Martin wrote: > > Hi Chris, > > according to arraycopy.hpp, > > ?arraycopy operations are implicitly atomic on each array element.? > > This requires 8 Byte alignment for jlong and jdouble. > > I don?t want to give up this property just because Windows 32 bit > doesn?t align them this way by default. > > All other supported platforms do it right by default. > > Best regards, > > Martin > > *From:*Chris Plummer > > *Sent:* Montag, 24. 
Februar 2020 21:52 > *To:* Doerr, Martin > ; OpenJDK Serviceability > > > *Subject:* Re: RFR(XS): 8239856: [ntintel] asserts about copying > unaligned array element > > Hi Martin, > > I'm not so sure I agree with the approach to this fix, nor for the > one already done for JDK-8220348. Shouldn't a user be expected to be > able to pass a jlong variable to SetLongArrayRegion() without the > need for any special platform dependent modifiers added to the > declaration of the variable? > > cheers, > > Chris > > On 2/24/20 5:51 AM, Doerr, Martin wrote: > > Hi, > > reposting on serviceability-dev (was core-libs-dev before). > > Bug: > > https://bugs.openjdk.java.net/browse/JDK-8239856 > > Webrev: > > http://cr.openjdk.java.net/~mdoerr/8239856_win32_long_double_align/webrev.00/ > > Thanks for the review, Thomas! > > Best regards, > > Martin > > *From:*Thomas St?fe > > *Sent:* Montag, 24. Februar 2020 14:41 > *To:* Doerr, Martin > > *Cc:* core-libs-dev at openjdk.java.net > ; Lindenmaier, Goetz > ; > Langer, Christoph > > *Subject:* Re: RFR(XS): 8239856: [ntintel] asserts about copying > unaligned array element > > Oh okay. Then it looks okay to me. > > Cheers, Thomas > > On Mon, Feb 24, 2020 at 12:56 PM Doerr, Martin > > wrote: > > Hi Thomas, > > thanks for the quick review. > > ATTRIBUTE_ALIGNED is defined in hotspot. I can?t use it for > src/jdk.jdwp.agent/share/native/libjdwp/ArrayReferenceImpl.c. > > Christoph had already suggested to make it available for > core libs, too, but I haven?t found a good place for it. > > Best regards, > > Martin > > *From:*Thomas St?fe > > *Sent:* Montag, 24. Februar 2020 12:52 > *To:* Doerr, Martin > > *Cc:* core-libs-dev at openjdk.java.net > ; Lindenmaier, Goetz > >; Langer, Christoph > > > *Subject:* Re: RFR(XS): 8239856: [ntintel] asserts about > copying unaligned array element > > Hi Martin, > > maybe use?ATTRIBUTE_ALIGNED instead? 
> > Cheers, Thomas > > On Mon, Feb 24, 2020 at 12:44 PM Doerr, Martin > > wrote: > > Hi, > > we had fixed stack array alignment for Windows 32 bit > with JDK-8220348. > > However, there are also stack allocated jlong and > jdouble used as source for SetLongArrayRegion and > SetDoubleArrayRegion with insufficient alignment for > this platform. > > Here?s my proposed fix: > > http://cr.openjdk.java.net/~mdoerr/8239856_win32_long_double_align/webrev.00/ > > Please review. > > Best regards, > > Martin > From serguei.spitsyn at oracle.com Wed Feb 26 07:38:49 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 25 Feb 2020 23:38:49 -0800 Subject: RFR [XS]: 8239462: jdk.hotspot.agent misses some ReleaseStringUTFChars calls in case of early returns In-Reply-To: References: <6ed0d004-3e8c-4b27-f583-06dbacf45173@oracle.com> Message-ID: <5ae2b1d1-f01b-1112-8400-cd9ea6d6cd14@oracle.com> Hi Matthias, LGTM++ Thanks, Serguei On 2/25/20 11:30, Alex Menkov wrote: > Hi Matthias, > > LGTM > > --alex > > On 02/25/2020 08:20, Baesken, Matthias wrote: >> >> New webrev : >> >> >> http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.3/ >> >> >> Best regards, Matthias >> >> >>> IMO the solution with goto makes it even worse. >>> If you don't want to introduce the wrapper, could you please restore >>> changes in LinuxDebuggerLocal_attach0 from webrev.1 >>> >>> --alex >>> >>> On 02/21/2020 00:32, Baesken, Matthias wrote: >>>> Hi Alex , >>>> >>>> new webrev : >>>> >>>> http://cr.openjdk.java.net/~mbaesken/webrevs/8239462.2/ >>>> >>>> Best Regards, Matthias >>>> >>>> >>>>> >>>>> Hi Matthias, >>>>> >>>>> Looks good in general, but I think it makes sense to fix #2 cases (at >>>>> least I see them in LinuxDebuggerLocal). If GetStringUTFChars >>>>> fails, the >>>>> code will crash. >>>>> Also I see GetStringUTFChars(str, JNI_FALSE). This look bad as well - >>>>> 2nd arg is a pointer, so it should be NULL or nullptr. 
>>>>> >>>>> As for #1 and #3 - AFAIU they are both right ways. >>>>> If GetStringUTFChars fails, it throws OOM and return NULL. >>>>> >>>>> And one more thing to consider. >>>>> LinuxDebuggerLocal_attach0 function looks terrible - 7 >>>>> ReleaseStringUTFChars calls for 2 GetStringUTFChars. >>>>> Maybe it make sense to introduce simple wrapper like >>>>> AutoJavaString in >>>>> src/jdk.hotspot.agent/windows/native/libsaproc/sawindbg.cpp >>>>> It would make the code simpler and less error prone. >>>>> >>>>> --alex >>>>> >>>> From serguei.spitsyn at oracle.com Wed Feb 26 07:45:06 2020 From: serguei.spitsyn at oracle.com (serguei.spitsyn at oracle.com) Date: Tue, 25 Feb 2020 23:45:06 -0800 Subject: RFR(XS): 8239379 - ProblemList serviceability/sa/sadebugd/DebugdConnectTest.java on OSX In-Reply-To: <371e1018-11f0-f087-a62a-e98b309ee108@oracle.com> References: <371e1018-11f0-f087-a62a-e98b309ee108@oracle.com> Message-ID: <6894bae6-3dc8-ed23-a9c5-185c459894a1@oracle.com> Hi Chris, It looks good. Thanks, Serguei On 2/25/20 11:47, Chris Plummer wrote: > Hi, > > I have to problem list this test due to JDK-8239062 [1], and have to > do so before I can push the fix for JDK-8238268 [2], which exposes > JDK-8239062 [1]. 
>
> https://bugs.openjdk.java.net/browse/JDK-8239379
>
> diff --git a/test/hotspot/jtreg/ProblemList.txt b/test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt
> +++ b/test/hotspot/jtreg/ProblemList.txt
> @@ -127,6 +127,7 @@
>  serviceability/sa/DeadlockDetectionTest.java 8193639 solaris-all
>  serviceability/sa/JhsdbThreadInfoTest.java 8193639 solaris-all
>  serviceability/sa/jmap-hprof/JMapHProfLargeHeapTest.java 8193639 solaris-all
> +serviceability/sa/sadebugd/DebugdConnectTest.java 8239062 macosx-x64
>  serviceability/sa/TestClassDump.java 8193639 solaris-all
>  serviceability/sa/TestClhsdbJstackLock.java 8193639 solaris-all
>  serviceability/sa/TestCpoolForInvokeDynamic.java 8193639 solaris-all
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8239062
> [2] https://bugs.openjdk.java.net/browse/JDK-8238268
>
> thanks,
>
> Chris
>

From ralf.schmelter at sap.com Wed Feb 26 09:53:20 2020
From: ralf.schmelter at sap.com (Schmelter, Ralf)
Date: Wed, 26 Feb 2020 09:53:20 +0000
Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
In-Reply-To:
References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
Message-ID:

Hi Chihiro,

I have two remarks:

1. ISO Latin 1 characters which are not ASCII will not work with the code. While the Properties.store() method claims to create an ISO Latin 1 string, it really only creates printable ASCII characters (apart from the comment, but it is ASCII too in this case). See Properties.saveConvert, where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx. This is important, since the bytes of the ByteArrayOutputStream are then sent to jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we only used ASCII characters.
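For readers who want to try the Properties behaviour discussed in this thread, here is a minimal sketch. It assumes only java.util.Properties from the JDK; the class name PropertiesEscapeDemo, its helper methods and the keys "path" and "name" are made up for illustration and are not part of the webrev.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical demo class, not taken from the change under review.
class PropertiesEscapeDemo {

    // Writes one key/value pair with Properties.store(OutputStream, ...)
    // and returns the resulting bytes decoded as ISO Latin 1 text.
    static String store(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Parses text in properties-file syntax and returns the value of key.
    static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    // True if every character of s is in the 7-bit ASCII range.
    static boolean isAscii(String s) {
        return s.chars().allMatch(c -> c < 0x80);
    }

    public static void main(String[] args) {
        String path = "C:\\test\\new";

        // The stored form escapes the colon and doubles each backslash.
        String stored = store("path", path);
        System.out.println(stored);

        // Because of that escaping, load() gives back the original path.
        System.out.println(path.equals(load(stored, "path")));

        // A non-ASCII value (U+00DC) is written using ASCII characters only.
        System.out.println(isAscii(store("name", "\u00DC")));

        // An unescaped path is parsed differently: the backslash sequences
        // before 't' and 'n' are read as a tab and a newline.
        System.out.println(path.equals(load("path=C:\\test\\new", "path")));
    }
}
```

The point of the sketch is only that the escaped form is pure ASCII and survives a store/load round trip, while text that is written without the escapes does not load back to the same value.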
But an ISO Latin 1 character >= 0x80 will break the encoding. Just try using \u00DC in your test.

2. Your change makes it impossible to load the output with properties.load(). The old output could be loaded, since it was a valid properties file. But yours is not. For example, consider the filename C:\test\new. Formerly it would be encoded as:

C\:\\test\\new

And now it is:

C:\test\new

But the properties code would see "\n" as the newline character in your encoding. In fact you cannot differentiate between \n, \t, \f and \r originally being one or two characters.

Best regards,
Ralf

From: serviceability-dev On Behalf Of Chihiro Ito
Sent: Dienstag, 25. Februar 2020 04:45
To: serguei.spitsyn at oracle.com
Cc: serviceability-dev at openjdk.java.net
Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows

Hi Serguei,

Thanks for your review and advice. I modified these. Could you review this again, please?

Webrev: http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/

Regards,
Chihiro

From martin.doerr at sap.com Wed Feb 26 10:16:10 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Wed, 26 Feb 2020 10:16:10 +0000
Subject: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element
In-Reply-To: <49810f29-dad4-ac6a-a675-878f7f53fb28@oracle.com>
References: <2d897aff-ebf9-b755-918a-aed28970c4d0@oracle.com> <49810f29-dad4-ac6a-a675-878f7f53fb28@oracle.com>
Message-ID:

Hi David,

thanks for your detailed input.

> The presence of the assertion to highlight the need for alignment is
> probably excessive in the case of these JNI APIs, but highly desirable
> for the low-level atomic copy routines themselves. I'm not concerned
> that these exceptions can "leak" up to the application code using these
> JNI API's simply because it only affects debug builds, and is easily
> remedied (either by changing the code or disabling this assertion). But
> if our own JDK code can encounter them, then we should modify that code.

This is an excellent explanation why I've proposed this change.

> Is this a windows only change because other compilers force 64-bit
> alignment of 64-bit types, even in 32-bit environments? I don't like
> seeing this be compiler specific when it is really processor specific
> and to be safe (and keep it simple) we should ensure 8-byte alignment in
> all cases it is needed.

It is a Windows 32 bit only problem.
"Without __declspec(align(#)), the compiler generally aligns data on natural boundaries based on the target processor and the size of the data, up to 4-byte boundaries on 32-bit processors, and 8-byte boundaries on 64-bit processors." [1]

GCC supports -malign-double for jlong / jdouble alignment on 32 bit processors [2].

Best regards,
Martin

[1] https://docs.microsoft.com/en-us/cpp/cpp/align-cpp?view=vs-2019
[2] https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

> -----Original Message-----
> From: David Holmes
> Sent: Wednesday, 26 February 2020 03:55
> To: Doerr, Martin; Chris Plummer; OpenJDK Serviceability; hotspot-runtime-dev
> Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element
>
> Hi Martin,
>
> On 26/02/2020 4:20 am, Doerr, Martin wrote:
> > Hi Chris,
> >
> > I know how JNI is meant. However, C/C++ is (almost) never platform
> > independent. Especially when it comes to primitive types.
>
> There is potentially a question mark over how the JNI
> Get/SetArrayRegion methods are implemented, as the spec
> makes no mention of atomic updates or accesses. In the absence of any
> mention I would expect normal atomicity rules for Java datatypes to
> apply - which means long and double do not have to be atomic.
>
> If our implementation offers atomicity as an extra feature that is in
> itself okay, but if that feature imposes additional constraints on the
> programmer which are not evident in the specification, that is
> questionable IMO.
> If the lack of alignment simply results in potential
> non-atomic access that would be fine; but if it results in a runtime h/w
> fault then I would suggest we should not be attempting atomic accesses.
>
> IIUC you have to run in a special mode to enable memory alignment checks
> on x86, so it seems we would potentially just not get atomic accesses.
>
> The presence of the assertion to highlight the need for alignment is
> probably excessive in the case of these JNI APIs, but highly desirable
> for the low-level atomic copy routines themselves. I'm not concerned
> that these exceptions can "leak" up to the application code using these
> JNI API's simply because it only affects debug builds, and is easily
> remedied (either by changing the code or disabling this assertion). But
> if our own JDK code can encounter them, then we should modify that code.
>
> >
> > My change is not particularly beautiful, but I haven't found a more
> > beautiful way to fix it.
> >
> > Note that SetLongArrayRegion seems to work without the alignment
> > requirement in the product build. However, word tearing could possibly
> > be observed.
> >
> > It's not possible to guarantee element-wise atomicity without alignment
> > because of processor architecture. That's why I think the assertion
> > makes sense and violations at least in the code which is part of OpenJDK
> > should be fixed IMHO.
>
> Is this a windows only change because other compilers force 64-bit
> alignment of 64-bit types, even in 32-bit environments? I don't like
> seeing this be compiler specific when it is really processor specific
> and to be safe (and keep it simple) we should ensure 8-byte alignment in
> all cases it is needed.
>
> Cheers,
> David
> -----
>
> > I had already asked for alternative fixes when I was working on
> > JDK-8220348 (like force the compiler to 64-bit align 64-bit types on
> > stack), but nobody has found a way to do this.
> >
> > Best regards,
> >
> > Martin
> >
> > *From:* Chris Plummer
> > *Sent:* Tuesday, 25 February 2020 18:03
> > *To:* Doerr, Martin; OpenJDK Serviceability; hotspot-runtime-dev
> > *Subject:* Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element
> >
> > [Adding runtime-dev as this regards the JNI spec]
> >
> > Hi Martin,
> >
> > JNI is meant as a means to write code that interfaces with the JVM in a
> > platform independent way. Therefore the declaration of a jlong or a
> > jdouble should not require any extra platform dependent considerations.
> > This also means requirements of an internal JVM API should not impose
> > any extra requirements on the JNI code. IMHO this should be fixed in
> > hotspot. Maybe fixing it in jni_md.h (if there is a way to force 64-bit
> > alignment) or in the makefiles (force the compiler to 64-bit align)
> > would also be acceptable.
> >
> > thanks,
> >
> > Chris
> >
> > On 2/25/20 3:22 AM, Doerr, Martin wrote:
> >
> > Hi Chris,
> >
> > according to arraycopy.hpp,
> >
> > "arraycopy operations are implicitly atomic on each array element."
> >
> > This requires 8 Byte alignment for jlong and jdouble.
> >
> > I don't want to give up this property just because Windows 32 bit
> > doesn't align them this way by default.
> >
> > All other supported platforms do it right by default.
> >
> > Best regards,
> >
> > Martin
> >
> > *From:* Chris Plummer
> > *Sent:* Monday, 24 February 2020 21:52
> > *To:* Doerr, Martin; OpenJDK Serviceability
> > *Subject:* Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element
> >
> > Hi Martin,
> >
> > I'm not so sure I agree with the approach to this fix, nor for the
> > one already done for JDK-8220348. Shouldn't a user be expected to be
> > able to pass a jlong variable to SetLongArrayRegion() without the
> > need for any special platform dependent modifiers added to the
> > declaration of the variable?
> >
> > cheers,
> >
> > Chris
> >
> > On 2/24/20 5:51 AM, Doerr, Martin wrote:
> >
> > Hi,
> >
> > reposting on serviceability-dev (was core-libs-dev before).
> >
> > Bug:
> >
> > https://bugs.openjdk.java.net/browse/JDK-8239856
> >
> > Webrev:
> >
> > http://cr.openjdk.java.net/~mdoerr/8239856_win32_long_double_align/webrev.00/
> >
> > Thanks for the review, Thomas!
> >
> > Best regards,
> >
> > Martin
> >
> > *From:* Thomas Stüfe
> > *Sent:* Monday, 24 February 2020 14:41
> > *To:* Doerr, Martin
> > *Cc:* core-libs-dev at openjdk.java.net; Lindenmaier, Goetz; Langer, Christoph
> > *Subject:* Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element
> >
> > Oh okay. Then it looks okay to me.
> >
> > Cheers, Thomas
> >
> > On Mon, Feb 24, 2020 at 12:56 PM Doerr, Martin wrote:
> >
> > Hi Thomas,
> >
> > thanks for the quick review.
> >
> > ATTRIBUTE_ALIGNED is defined in hotspot. I can't use it for
> > src/jdk.jdwp.agent/share/native/libjdwp/ArrayReferenceImpl.c.
> >
> > Christoph had already suggested to make it available for
> > core libs, too, but I haven't found a good place for it.
> >
> > Best regards,
> >
> > Martin
> >
> > *From:* Thomas Stüfe
> > *Sent:* Monday, 24 February 2020 12:52
> > *To:* Doerr, Martin
> > *Cc:* core-libs-dev at openjdk.java.net; Lindenmaier, Goetz; Langer, Christoph
> > *Subject:* Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element
> >
> > Hi Martin,
> >
> > maybe use ATTRIBUTE_ALIGNED instead?
> >
> > Cheers, Thomas
> >
> > On Mon, Feb 24, 2020 at 12:44 PM Doerr, Martin wrote:
> >
> > Hi,
> >
> > we had fixed stack array alignment for Windows 32 bit
> > with JDK-8220348.
> >
> > However, there are also stack allocated jlong and
> > jdouble used as source for SetLongArrayRegion and
> > SetDoubleArrayRegion with insufficient alignment for
> > this platform.
> >
> > Here's my proposed fix:
> >
> > http://cr.openjdk.java.net/~mdoerr/8239856_win32_long_double_align/webrev.00/
> >
> > Please review.
> >
> > Best regards,
> >
> > Martin
> >

From david.holmes at oracle.com Wed Feb 26 13:32:15 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 26 Feb 2020 23:32:15 +1000
Subject: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element
In-Reply-To:
References: <2d897aff-ebf9-b755-918a-aed28970c4d0@oracle.com> <49810f29-dad4-ac6a-a675-878f7f53fb28@oracle.com>
Message-ID: <6429ac7e-ef42-b6c9-fc91-851f5436b912@oracle.com>

On 26/02/2020 8:16 pm, Doerr, Martin wrote:
> Hi David,
>
> thanks for your detailed input.
>
>> The presence of the assertion to highlight the need for alignment is
>> probably excessive in the case of these JNI APIs, but highly desirable
>> for the low-level atomic copy routines themselves. I'm not concerned
>> that these exceptions can "leak" up to the application code using these
>> JNI API's simply because it only affects debug builds, and is easily
>> remedied (either by changing the code or disabling this assertion). But
>> if our own JDK code can encounter them, then we should modify that code.
>
> This is an excellent explanation why I've proposed this change.
>
>
>> Is this a windows only change because other compilers force 64-bit
>> alignment of 64-bit types, even in 32-bit environments? I don't like
>> seeing this be compiler specific when it is really processor specific
>> and to be safe (and keep it simple) we should ensure 8-byte alignment in
>> all cases it is needed.
>
> It is a Windows 32 bit only problem.
> "Without __declspec(align(#)), the compiler generally aligns data on natural boundaries based on the target processor and the size of the data, up to 4-byte boundaries on 32-bit processors, and 8-byte boundaries on 64-bit processors." [1]
>
> GCC supports -malign-double for jlong / jdouble alignment on 32 bit processors [2].

But we aren't using that in the build AFAICS and it isn't the default on x86_32.

David
-----

> Best regards,
> Martin
>
>
> [1] https://docs.microsoft.com/en-us/cpp/cpp/align-cpp?view=vs-2019
> [2] https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

From martin.doerr at sap.com Wed Feb 26 16:39:03 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Wed, 26 Feb 2020 16:39:03 +0000
Subject: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element
In-Reply-To: <6429ac7e-ef42-b6c9-fc91-851f5436b912@oracle.com>
References: <2d897aff-ebf9-b755-918a-aed28970c4d0@oracle.com> <49810f29-dad4-ac6a-a675-878f7f53fb28@oracle.com> <6429ac7e-ef42-b6c9-fc91-851f5436b912@oracle.com>
Message-ID:

Hi David,

> > GCC supports -malign-double for jlong / jdouble alignment on 32 bit processors [2].
>
> But we aren't using that in the build AFAICS and it isn't the default on
> x86_32.

We have stopped building Linux 32 bit since JDK8, so I can't tell.
At least -malign-double should be a valid workaround which we don't have for Windows 32 bit.

Best regards,
Martin

> -----Original Message-----
> From: David Holmes
> Sent: Wednesday, 26 February 2020 14:32
> To: Doerr, Martin; Chris Plummer; OpenJDK Serviceability; hotspot-runtime-dev
> Subject: Re: RFR(XS): 8239856: [ntintel] asserts about copying unaligned array element
>
> On 26/02/2020 8:16 pm, Doerr, Martin wrote:
> > Hi David,
> >
> > thanks for your detailed input.
> >
> >> The presence of the assertion to highlight the need for alignment is
> >> probably excessive in the case of these JNI APIs, but highly desirable
> >> for the low-level atomic copy routines themselves. I'm not concerned
> >> that these exceptions can "leak" up to the application code using these
> >> JNI API's simply because it only affects debug builds, and is easily
> >> remedied (either by changing the code or disabling this assertion). But
> >> if our own JDK code can encounter them, then we should modify that code.
> >
> > This is an excellent explanation why I've proposed this change.
> >
> >
> >> Is this a windows only change because other compilers force 64-bit
> >> alignment of 64-bit types, even in 32-bit environments? I don't like
> >> seeing this be compiler specific when it is really processor specific
> >> and to be safe (and keep it simple) we should ensure 8-byte alignment in
> >> all cases it is needed.
> >
> > It is a Windows 32 bit only problem.
> > "Without __declspec(align(#)), the compiler generally aligns data on natural boundaries based on the target processor and the size of the data, up to 4-byte boundaries on 32-bit processors, and 8-byte boundaries on 64-bit processors." [1]
> >
> > GCC supports -malign-double for jlong / jdouble alignment on 32 bit processors [2].
>
> But we aren't using that in the build AFAICS and it isn't the default on
> x86_32.
>
> David
> -----
>
> > Best regards,
> > Martin
> >
> >
> > [1] https://docs.microsoft.com/en-us/cpp/cpp/align-cpp?view=vs-2019
> > [2] https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html

From zgu at redhat.com Wed Feb 26 22:16:40 2020
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 26 Feb 2020 17:16:40 -0500
Subject: [15] RFR 8238633: JVMTI heap walk should consult GC for marking oops
In-Reply-To: <3d8553da-f7a8-fe84-d923-b10c5b47775e@redhat.com>
References: <940D46CE-22D6-479B-B8EF-65FFA896C4A0@oracle.com> <3d8553da-f7a8-fe84-d923-b10c5b47775e@redhat.com>
Message-ID: <701adbd0-d66d-1002-d2e3-77ea02847613@redhat.com>

Hi,

>> So perhaps just adding the NULL check in the barrier for the case
>> where the markWord "is_marked" is the sane thing to do, knowing that
>> the other costs taken in the same path will dominate.
>
> I have this patch, exactly what you suggested. I will let Aleksey run
> his numbers.

Aleksey is happy with the solution, as we are able to split compiler barrier and runtime barrier.

Therefore, I would like to withdraw this CR.

Thanks all of you!

-Zhengyu

From daniel.daugherty at oracle.com Wed Feb 26 22:49:05 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 26 Feb 2020 17:49:05 -0500
Subject: RFR(T): 8240132: ProblemList com/sun/jdi/InvokeHangTest.java
Message-ID: <0a7730fa-ead7-8d2e-fe24-c66a53210a04@oracle.com>

Greetings,

I'm trying to reduce the noise in the jdk/jdk CI. I'm ProblemListing
com/sun/jdi/InvokeHangTest.java on Linux due to this bug:

    JDK-8218463 com/sun/jdi/InvokeHangTest.java fail "java.lang.Exception: InvokeHangTest: failed; bkpts = 64 "
    https://bugs.openjdk.java.net/browse/JDK-8218463

I'm using the following subtask to do the ProblemListing:

    JDK-8240132 ProblemList com/sun/jdi/InvokeHangTest.java
    https://bugs.openjdk.java.net/browse/JDK-8240132

Here's the context diff:

$ hg diff
diff -r f67951f722a4 test/jdk/ProblemList.txt
--- a/test/jdk/ProblemList.txt    Wed Feb 26 21:24:02 2020 +0100
+++ b/test/jdk/ProblemList.txt    Wed Feb 26 17:45:04 2020 -0500
@@ -915,6 +915,8 @@
 
 com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all
 
+com/sun/jdi/InvokeHangTest.java 8218463 linux-all
+
 ############################################################################
 
 # jdk_time

Thanks, in advance, for any comments, questions or suggestions.

Dan

From mikael.vidstedt at oracle.com Wed Feb 26 22:52:28 2020
From: mikael.vidstedt at oracle.com (Mikael Vidstedt)
Date: Wed, 26 Feb 2020 14:52:28 -0800
Subject: RFR(T): 8240132: ProblemList com/sun/jdi/InvokeHangTest.java
In-Reply-To: <0a7730fa-ead7-8d2e-fe24-c66a53210a04@oracle.com>
References: <0a7730fa-ead7-8d2e-fe24-c66a53210a04@oracle.com>
Message-ID: <0DFC541B-F54C-4887-B5FE-84556A47D70E@oracle.com>

Looks good, thanks for doing it!

Cheers,
Mikael

> On Feb 26, 2020, at 2:49 PM, Daniel D. Daugherty wrote:
>
> Greetings,
>
> I'm trying to reduce the noise in the jdk/jdk CI.
> I'm ProblemListing
> com/sun/jdi/InvokeHangTest.java on Linux due to this bug:
>
>     JDK-8218463 com/sun/jdi/InvokeHangTest.java fail "java.lang.Exception: InvokeHangTest: failed; bkpts = 64 "
>     https://bugs.openjdk.java.net/browse/JDK-8218463
>
> I'm using the following subtask to do the ProblemListing:
>
>     JDK-8240132 ProblemList com/sun/jdi/InvokeHangTest.java
>     https://bugs.openjdk.java.net/browse/JDK-8240132
>
> Here's the context diff:
>
> $ hg diff
> diff -r f67951f722a4 test/jdk/ProblemList.txt
> --- a/test/jdk/ProblemList.txt    Wed Feb 26 21:24:02 2020 +0100
> +++ b/test/jdk/ProblemList.txt    Wed Feb 26 17:45:04 2020 -0500
> @@ -915,6 +915,8 @@
>
>  com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all
>
> +com/sun/jdi/InvokeHangTest.java 8218463 linux-all
> +
>  ############################################################################
>
>  # jdk_time
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan

From daniel.daugherty at oracle.com Wed Feb 26 23:00:36 2020
From: daniel.daugherty at oracle.com (Daniel D. Daugherty)
Date: Wed, 26 Feb 2020 18:00:36 -0500
Subject: RFR(T): 8240132: ProblemList com/sun/jdi/InvokeHangTest.java
In-Reply-To: <0DFC541B-F54C-4887-B5FE-84556A47D70E@oracle.com>
References: <0a7730fa-ead7-8d2e-fe24-c66a53210a04@oracle.com> <0DFC541B-F54C-4887-B5FE-84556A47D70E@oracle.com>
Message-ID:

Thanks for the fast review!

Dan

On 2/26/20 5:52 PM, Mikael Vidstedt wrote:
> Looks good, thanks for doing it!
>
> Cheers,
> Mikael
>
>> On Feb 26, 2020, at 2:49 PM, Daniel D. Daugherty wrote:
>>
>> Greetings,
>>
>> I'm trying to reduce the noise in the jdk/jdk CI.
I'm ProblemListing >> com/sun/jdi/InvokeHangTest.java on Linux due to this bug: >> >> JDK-8218463 com/sun/jdi/InvokeHangTest.java fail "java.lang.Exception: InvokeHangTest: failed; bkpts = 64 " >> https://bugs.openjdk.java.net/browse/JDK-8218463 >> >> I'm using the following subtask to do the ProblemListing: >> >> JDK-8240132 ProblemList com/sun/jdi/InvokeHangTest.java >> https://bugs.openjdk.java.net/browse/JDK-8240132 >> >> Here's the context diff: >> >> $ hg diff >> diff -r f67951f722a4 test/jdk/ProblemList.txt >> --- a/test/jdk/ProblemList.txt Wed Feb 26 21:24:02 2020 +0100 >> +++ b/test/jdk/ProblemList.txt Wed Feb 26 17:45:04 2020 -0500 >> @@ -915,6 +915,8 @@ >> >> com/sun/jdi/NashornPopFrameTest.java 8225620 generic-all >> >> +com/sun/jdi/InvokeHangTest.java 8218463 linux-all >> + >> ############################################################################ >> >> # jdk_time >> >> Thanks, in advance, for any comments, questions or suggestions. >> >> Dan From alexey.menkov at oracle.com Thu Feb 27 01:02:55 2020 From: alexey.menkov at oracle.com (Alex Menkov) Date: Wed, 26 Feb 2020 17:02:55 -0800 Subject: RFR(XS): 8193237 - SA: ClhsdbLauncher should show the command being executed In-Reply-To: <633cdb64-fe0f-4dac-60d7-9e7ee405928c@oracle.com> References: <633cdb64-fe0f-4dac-60d7-9e7ee405928c@oracle.com> Message-ID: <8836b0e0-d0e8-5ca1-cd34-bdc4481d13f7@oracle.com> +1 --alex On 02/24/2020 23:15, serguei.spitsyn at oracle.com wrote: > Hi Chris, > > This looks good to me. > I always prefer verbose output in tests. :) > > Thanks, > Serguei > > > On 2/24/20 21:48, Chris Plummer wrote: >> Hello, >> >> Please review the following: >> >> https://bugs.openjdk.java.net/browse/JDK-8193237 >> http://cr.openjdk.java.net/~cjplummer/8193237/webrev.00/ >> >> The fix is to issue an "echo on" command before the test commands. The >> bug gives an example of how this fix improves the test output. 
>>
>> thanks,
>>
>> Chris
>

From chris.plummer at oracle.com Thu Feb 27 03:49:29 2020
From: chris.plummer at oracle.com (Chris Plummer)
Date: Wed, 26 Feb 2020 19:49:29 -0800
Subject: RFR(XS): 8240142: Fix copyright in ThreadGroupReferenceImpl.h
Message-ID:

Hello,

Please review the following copyright fix. A comma is missing:

https://bugs.openjdk.java.net/browse/JDK-8240142

diff --git a/src/jdk.jdwp.agent/share/native/libjdwp/ThreadGroupReferenceImpl.h b/src/jdk.jdwp.agent/share/native/libjdwp/ThreadGroupReferenceImpl.h
--- a/src/jdk.jdwp.agent/share/native/libjdwp/ThreadGroupReferenceImpl.h
+++ b/src/jdk.jdwp.agent/share/native/libjdwp/ThreadGroupReferenceImpl.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 1998, 2020 Oracle and/or its affiliates. All rights reserved.
+ * Copyright (c) 1998, 2020, Oracle and/or its affiliates. All rights reserved.
  * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
  *
  * This code is free software; you can redistribute it and/or modify it

thanks,

Chris

From david.holmes at oracle.com Thu Feb 27 03:52:27 2020
From: david.holmes at oracle.com (David Holmes)
Date: Thu, 27 Feb 2020 13:52:27 +1000
Subject: RFR(XS): 8240142: Fix copyright in ThreadGroupReferenceImpl.h
In-Reply-To:
References:
Message-ID: <22ec3a90-2a76-fd2f-8a3a-a952358956de@oracle.com>

Looks good and trivial.

Cheers,
David

On 27/02/2020 1:49 pm, Chris Plummer wrote:
> Hello,
>
> Please review the following copyright fix. A comma is missing:
>
> https://bugs.openjdk.java.net/browse/JDK-8240142
>
> diff --git
> a/src/jdk.jdwp.agent/share/native/libjdwp/ThreadGroupReferenceImpl.h
> b/src/jdk.jdwp.agent/share/native/libjdwp/ThreadGroupReferenceImpl.h
> --- a/src/jdk.jdwp.agent/share/native/libjdwp/ThreadGroupReferenceImpl.h
> +++ b/src/jdk.jdwp.agent/share/native/libjdwp/ThreadGroupReferenceImpl.h
> @@ -1,5 +1,5 @@
>  /*
> - * Copyright (c) 1998, 2020 Oracle and/or its affiliates. All rights
> reserved.
> + * Copyright (c) 1998, 2020, Oracle and/or its affiliates. All rights
> reserved.
>   * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
>   *
>   * This code is free software; you can redistribute it and/or modify it
>
> thanks,
>
> Chris
>

From suenaga at oss.nttdata.com Thu Feb 27 05:13:25 2020
From: suenaga at oss.nttdata.com (Yasumasa Suenaga)
Date: Thu, 27 Feb 2020 14:13:25 +0900
Subject: : PING: RFR: 8234624: jstack mixed mode should refer DWARF
In-Reply-To:
References: <3890a0f3-55f4-c9e5-76c3-d3d18db6b79a@oss.nttdata.com> <2515a14d-90f5-da47-c802-966b06f20adc@oracle.com> <3ae29ebb-556f-f8c7-c107-61a5d18fce07@oss.nttdata.com> <5d699d6c-76e6-7846-fa3e-efbbaf29322a@oss.nttdata.com>
Message-ID:

Hi all,

webrev.03 cannot be applied to current jdk/jdk due to 8239224 and 8239462 changes (they updated copyright year).
So I modified webrev (only copyright year changes) to be able to apply to current jdk/jdk.

Could you review it?

  http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.04/

I need one more reviewer to push.


Thanks,

Yasumasa


On 2020/02/17 13:07, Yasumasa Suenaga wrote:
> PING: Could you review it?
>
>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>
> This change has been already reviewed by Serguei.
> I need one more reviewer to push.
>
>
> Thanks,
>
> Yasumasa
>
>
> On 2020/02/03 1:37, Yasumasa Suenaga wrote:
>> PING: Could you review this change?
>>
>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>
>> I believe this change helps troubleshooter to fight to postmortem analysis.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/01/19 3:16, Yasumasa Suenaga wrote:
>>> PING: Could you review it?
>>>
>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.03/
>>>
>>> I updated webrev. I discussed with Serguei in off list, and I refactored webrev.02 .
>>> It has passed tests on submit repo (mach5-one-ysuenaga-JDK-8234624-4-20200118-1353-8149549).
>>>
>>>
>>> Thanks,
>>>
>>> Yasumasa
>>>
>>>
>>> On 2019/12/15 10:51, Yasumasa Suenaga wrote:
>>>> Hi Serguei,
>>>>
>>>> Thanks for your comment!
>>>> I refactored LinuxCDebugger and LinuxAMD64CFrame in new webrev.
>>>> Also I fixed to free lib->eh_frame.data in libproc_impl.c as Dmitry said.
>>>>
>>>>   http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.02/
>>>>
>>>> This change has been passed all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-3-20191214-1527-7538487).
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Yasumasa
>>>>
>>>>
>>>> On 2019/12/14 10:02, serguei.spitsyn at oracle.com wrote:
>>>>> Hi Yasumasa,
>>>>>
>>>>> This is nice move in general.
>>>>> Thank you for working on this!
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/LinuxCDebugger.java.frames.html
>>>>>
>>>>>   96     long libptr = dbg.findLibPtrByAddress(pc);
>>>>>   97     if (libptr == 0L) { // Java frame
>>>>>   98       Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>   99       if (rbp == null) {
>>>>>  100         return null;
>>>>>  101       }
>>>>>  102       return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>  103     } else { // Native frame
>>>>>  104       DwarfParser dwarf;
>>>>>  105       try {
>>>>>  106         dwarf = new DwarfParser(libptr);
>>>>>  107       } catch (DebuggerException e) {
>>>>>  108         Address rbp = context.getRegisterAsAddress(AMD64ThreadContext.RBP);
>>>>>  109         if (rbp == null) {
>>>>>  110           return null;
>>>>>  111         }
>>>>>  112         return new LinuxAMD64CFrame(dbg, rbp, pc, null);
>>>>>  113       }
>>>>>  114       dwarf.processDwarf(pc);
>>>>>  115       Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>  116                      !dwarf.isBPOffsetAvailable())
>>>>>  117                         ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>  118                         : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>  119                                  .addOffsetTo(dwarf.getCFAOffset());
>>>>>  120       if (cfa == null) {
>>>>>  121         return null;
>>>>>  122       }
>>>>>  123       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>  124     }
>>>>>
>>>>>
>>>>> I'd suggest to simplify the logic by refactoring to something like below:
>>>>>
>>>>>       long libptr = dbg.findLibPtrByAddress(pc);
>>>>>       Address cfa = context.getRegisterAsAddress(AMD64ThreadContext.RBP); // Java frame
>>>>>       DwarfParser dwarf = null;
>>>>>
>>>>>       if (libptr != 0L) { // Native frame
>>>>>         try {
>>>>>           dwarf = new DwarfParser(libptr);
>>>>>           dwarf.processDwarf(pc);
>>>>>           Address cfa = ((dwarf.getCFARegister() == AMD64ThreadContext.RBP) &&
>>>>>                          !dwarf.isBPOffsetAvailable())
>>>>>                             ? context.getRegisterAsAddress(AMD64ThreadContext.RBP)
>>>>>                             : context.getRegisterAsAddress(dwarf.getCFARegister())
>>>>>                                      .addOffsetTo(dwarf.getCFAOffset());
>>>>>
>>>>>         } catch (DebuggerException e) { // bail out to Java frame case
>>>>>         }
>>>>>       }
>>>>>       if (cfa == null) {
>>>>>         return null;
>>>>>       }
>>>>>       return new LinuxAMD64CFrame(dbg, cfa, pc, dwarf);
>>>>>
>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/debugger/linux/amd64/LinuxAMD64CFrame.java.frames.html
>>>>>
>>>>>   58     long ofs = useDwarf ? dwarf.getReturnAddressOffsetFromCFA()
>>>>>
>>>>>   Better to rename 'ofs' => 'offs'.
>>>>>
>>>>>   77     nextCFA = nextCFA.addOffsetTo(- nextDwarf.getBasePointerOffsetFromCFA());
>>>>>
>>>>>   Extra space after '-' sign.
>>>>>
>>>>>   71     private Address getNextCFA(DwarfParser nextDwarf, ThreadContext context) {
>>>>>
>>>>>   It feels like the logic has to be somehow refactored/simplified as
>>>>>   several typical fragments appears in slightly different contexts.
>>>>>   But it is not easy to understand what it is.
>>>>>   Could you, please, add some comments to key places explaining this logic.
>>>>>   Then I'll check if it is possible to make it a little bit simpler.
>>>>>
>>>>>  109   private CFrame javaSender(ThreadContext context) {
>>>>>  110     Address nextCFA;
>>>>>  111     Address nextPC;
>>>>>  112
>>>>>  113     nextPC = getNextPC(false);
>>>>>  114     if (nextPC == null) {
>>>>>  115       return null;
>>>>>  116     }
>>>>>  117
>>>>>  118     DwarfParser nextDwarf = null;
>>>>>  119     long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>  120     if (libptr != 0L) { // Native frame
>>>>>  121       try {
>>>>>  122         nextDwarf = new DwarfParser(libptr);
>>>>>  123       } catch (DebuggerException e) {
>>>>>  124         nextCFA = getNextCFA(null, context);
>>>>>  125         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>  126       }
>>>>>  127       nextDwarf.processDwarf(nextPC);
>>>>>  128     }
>>>>>  129
>>>>>  130     nextCFA = getNextCFA(nextDwarf, context);
>>>>>  131     return (nextCFA == null) ? null
>>>>>  132                              : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>  133   }
>>>>>
>>>>>   The above can be simplified if a DebuggerException can not be thrown from processDwarf(nextPC):
>>>>>       private CFrame javaSender(ThreadContext context) {
>>>>>         Address nextPC = getNextPC(false);
>>>>>         if (nextPC == null) {
>>>>>           return null;
>>>>>         }
>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>         DwarfParser nextDwarf = null;
>>>>>
>>>>>         if (libptr != 0L) { // Native frame
>>>>>           try {
>>>>>             nextDwarf = new DwarfParser(libptr);
>>>>>             nextDwarf.processDwarf(nextPC);
>>>>>           } catch (DebuggerException e) { // Bail out to Java frame
>>>>>           }
>>>>>         }
>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>       }
>>>>>
>>>>>  135   public CFrame sender(ThreadProxy thread) {
>>>>>  136     ThreadContext context = thread.getContext();
>>>>>  137
>>>>>  138     if (dwarf == null) { // Java frame
>>>>>  139       return javaSender(context);
>>>>>  140     }
>>>>>  141
>>>>>  142     Address nextPC = getNextPC(true);
>>>>>  143     if (nextPC == null) {
>>>>>  144       return null;
>>>>>  145     }
>>>>>  146
>>>>>  147     Address nextCFA;
>>>>>  148     DwarfParser nextDwarf = dwarf;
>>>>>  149     if (!dwarf.isIn(nextPC)) {
>>>>>  150       long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>  151       if (libptr == 0L) {
>>>>>  152         // Next frame might be Java frame
>>>>>  153         nextCFA = getNextCFA(null, context);
>>>>>  154         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>  155       }
>>>>>  156       try {
>>>>>  157         nextDwarf = new DwarfParser(libptr);
>>>>>  158       } catch (DebuggerException e) {
>>>>>  159         nextCFA = getNextCFA(null, context);
>>>>>  160         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, null);
>>>>>  161       }
>>>>>  162     }
>>>>>  163
>>>>>  164     nextDwarf.processDwarf(nextPC);
>>>>>  165     nextCFA = getNextCFA(nextDwarf, context);
>>>>>  166     return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>  167   }
>>>>>
>>>>>   This one can be also simplified a little:
>>>>>
>>>>>       public CFrame sender(ThreadProxy thread) {
>>>>>         ThreadContext context = thread.getContext();
>>>>>
>>>>>         if (dwarf == null) { // Java frame
>>>>>           return javaSender(context);
>>>>>         }
>>>>>         Address nextPC = getNextPC(true);
>>>>>         if (nextPC == null) {
>>>>>           return null;
>>>>>         }
>>>>>         DwarfParser nextDwarf = null;
>>>>>         if (!dwarf.isIn(nextPC)) {
>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>           if (libptr != 0L) {
>>>>>             try {
>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>             }
>>>>>           }
>>>>>         }
>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>       }
>>>>>
>>>>> Finally, it looks like just one method could replace both
>>>>> sender(ThreadProxy thread) and javaSender(ThreadContext context):
>>>>>
>>>>>       private CFrame commonSender(ThreadProxy thread) {
>>>>>         ThreadContext context = thread.getContext();
>>>>>         Address nextPC = getNextPC(false);
>>>>>         if (nextPC == null) {
>>>>>           return null;
>>>>>         }
>>>>>         DwarfParser nextDwarf = null;
>>>>>
>>>>>         long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>         if (dwarf == null || !dwarf.isIn(nextPC)) {
>>>>>           long libptr = dbg.findLibPtrByAddress(nextPC);
>>>>>           if (libptr != 0L) {
>>>>>             try {
>>>>>               nextDwarf = new DwarfParser(libptr);
>>>>>               nextDwarf.processDwarf(nextPC);
>>>>>             } catch (DebuggerException e) { // Bail out to Java frame
>>>>>             }
>>>>>           }
>>>>>         }
>>>>>         Address nextCFA = getNextCFA(nextDwarf, context);
>>>>>         return (nextCFA == null) ? null : new LinuxAMD64CFrame(dbg, nextCFA, nextPC, nextDwarf);
>>>>>       }
>>>>>
>>>>> I'm still reviewing the dwarf parser files.
>>>>>
>>>>> Thanks,
>>>>> Serguei
>>>>>
>>>>>
>>>>> On 11/28/19 4:39 AM, Yasumasa Suenaga wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I refactored LinuxAMD64CFrame.java . It works fine in serviceability/sa tests and
>>>>>> all tests on submit repo (mach5-one-ysuenaga-JDK-8234624-2-20191128-0928-7059923).
>>>>>> Could you review new webrev?
>>>>>>
>>>>>> http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.01/
>>>>>>
>>>>>> The diff from previous webrev is here:
>>>>>> http://hg.openjdk.java.net/jdk/submit/rev/4bc47efbc90b
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Yasumasa
>>>>>>
>>>>>>
>>>>>> On 2019/11/25 14:08, Yasumasa Suenaga wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Please review this change:
>>>>>>>
>>>>>>>   JBS: https://bugs.openjdk.java.net/browse/JDK-8234624
>>>>>>>   webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8234624/webrev.00/
>>>>>>>
>>>>>>>
>>>>>>> According to 2.7 Stack Unwind Algorithm in System V Application Binary Interface AMD64
>>>>>>> Architecture Processor Supplement [1], we need to use DWARF in .eh_frame or .debug_frame
>>>>>>> for stack unwinding.
>>>>>>>
>>>>>>> As JDK-8022183 said, omit-frame-pointer is enabled by default since GCC 4.6, so system
>>>>>>> library (e.g. libc) might be compiled with this feature.
>>>>>>>
>>>>>>> However `jhsdb jstack --mixed` does not do so, it uses base pointer register (RBP).
>>>>>>> So it might be lack of stack frames.
>>>>>>>
>>>>>>> I guess JDK-8219201 is caused by same issue.
>>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Yasumasa
>>>>>>>
>>>>>>>
>>>>>>> [1] https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
>>>>>

From chiroito107 at gmail.com Thu Feb 27 13:11:19 2020
From: chiroito107 at gmail.com (Chihiro Ito)
Date: Thu, 27 Feb 2020 22:11:19 +0900
Subject: RFR: JDK-8222489: jcmd VM.system_properties gives unusable paths on Windows
In-Reply-To:
References: <2f7bbd0f-75ac-05d3-f97d-9819f56fc98f@oss.nttdata.com> <171d3f8c-e0a6-0edf-8bbe-9fbc4b8f7614@oss.nttdata.com> <54d4e146-58bc-8318-6e27-922616ff4b37@oss.nttdata.com> <2b349369-730e-d92c-f7be-97554aed5387@oss.nttdata.com> <0a2df665-2e08-6139-c131-043a425b4916@oracle.com>
Message-ID:

Hi Ralf,

Thank you for your advice.

1. The comment of serializePropertiesToByteArray in VMSupport is "The
stream written to the byte array is ISO 8859-1 encoded.". But the previous
implementation does not keep this. I think we need to implement encode by
ISO 8859-1.

2. According to help, the feature of VM.system_properties is just "Print
system properties". The users should not use this output for loading. The
users use it when they want to see System Properties soon.

Regards,
Chihiro

On Wed, Feb 26, 2020 at 18:53, Schmelter, Ralf wrote:

> Hi Chihiro,
>
> I have two remarks:
>
> 1. ISO Latin 1 characters which are not ASCII will not work with the code.
> While the Properties.store() method claims to create ISO Latin 1 String, it
> really only will create printable ASCII characters (apart from the
> comment, but it is ASCII too in this case). See Properties.saveConvert,
> where the char is checked for < 0x20 or > 0x7e and then printed as \uxxxx.
> This is important, since the bytes of the ByteArrayOutputStream are then
> sent to the jcmd. And jcmd expects UTF-8 encoded strings, which is OK if we
> only used ASCII characters. But an ISO Latin 1 character >= 0x80 will break
> the encoding. Just try using \u00DC in your test.
>
> 2. Your change makes it impossible to load the output with
> properties.load(). The old output could be loaded, since it was a valid
> properties file. But yours is not. For example, consider the filename
> c:\test\new. Formerly it would be encoded as:
> C\:\\test\\new
> And now it is:
> C:\test\new
> But the properties code would see "\n" as the newline character in your
> encoding. In fact you cannot differentiate between \n, \t, \f and \r
> originally being one or two characters.
>
> Best regards,
> Ralf
>
>
> From: serviceability-dev On
> Behalf Of Chihiro Ito
> Sent: Tuesday, 25 February 2020 04:45
> To: serguei.spitsyn at oracle.com
> Cc: serviceability-dev at openjdk.java.net
> Subject: Re: RFR: JDK-8222489: jcmd VM.system_properties gives unusable
> paths on Windows
>
> Hi Serguei,
>
> Thanks for your review and advice.
>
> I modified these.
> Could you review this again, please?
>
> Webrev : http://cr.openjdk.java.net/~cito/JDK-8222489/webrev.05/
>
> Regards,
> Chihiro
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
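
[Editor's illustration for the JDK-8222489 thread above.] Ralf's two remarks about escaping can be checked with a small standalone program. This is a minimal sketch under stated assumptions, not code from the webrev: the class PropertiesEscapingSketch and its helpers storeToString, loadValue and isAscii are hypothetical names invented for this example, and the only library behaviour relied on is java.util.Properties.store(OutputStream, String) and Properties.load(Reader). It stores the path from Ralf's example, reads it back, then tries to load the same path written without escaping, and finally stores a non-ASCII ISO Latin 1 character.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical helper class, for illustration only (not part of any webrev).
public class PropertiesEscapingSketch {

    // Serializes a single key/value pair with Properties.store(OutputStream)
    // and returns the resulting bytes decoded as ISO 8859-1 text.
    static String storeToString(String key, String value) {
        Properties props = new Properties();
        props.setProperty(key, value);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            props.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Parses properties-file text and returns the value stored under key.
    static String loadValue(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    // True if every character of s is in the 7-bit ASCII range.
    static boolean isAscii(String s) {
        return s.chars().allMatch(c -> c < 0x80);
    }

    public static void main(String[] args) {
        String path = "C:\\test\\new"; // the Windows path C:\test\new

        // Escaped form written by store(): colon and backslashes are escaped.
        String stored = storeToString("path", path);
        System.out.println(stored);
        System.out.println("round trip ok: " + path.equals(loadValue(stored, "path")));

        // The same path written without escaping: load() treats the
        // backslash-t and backslash-n sequences as tab and newline.
        String raw = "path=" + path + "\n";
        System.out.println("raw load ok:   " + path.equals(loadValue(raw, "path")));

        // A non-ASCII ISO Latin 1 character is emitted as an ASCII escape.
        System.out.println("ascii only:    " + isAscii(storeToString("name", "\u00DC")));
    }
}
```

In this sketch the escaped form survives the round trip, the unescaped line does not load back to the original path, and the stored output for U+00DC stays within ASCII, which matches the behaviour Ralf describes for Properties.saveConvert.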